August 15, 2022
2 min

Card Sorting vs Tree Testing: what's the best?

A great information architecture (IA) is essential for a great user experience (UX). And testing your website or app’s information architecture is necessary to get it right.

Card sorting and tree testing are the very best UX research methods for exactly this. But the big question is always: which one should you use, and when? Very possibly you need both. Let’s find out with this quick summary.

What are card sorting and tree testing? 🧐

Card sorting is used to test the information architecture of a website or app. Participants group individual labels (cards) into different categories according to criteria that make the most sense to them. Each label represents an item that needs to be categorized. The results provide deep insights to guide the decisions needed to create intuitive navigation, comprehensive labeling and content that is organized in a user-friendly way.

Tree testing is also used to test the information architecture of a website or app. In a tree test, participants are presented with a site structure and a set of tasks they need to complete. The goal for participants is to find their way through the site and complete each task. The test shows whether the structure of your website corresponds to what users expect and how easily (or not) they can navigate and complete their tasks.

What are the differences? 🂱 👉🌴

Card sorting is a UX research method that helps you gather insights about your content categorization. It focuses on creating an information architecture that responds intuitively to users’ expectations: which items go best together, the best options for labeling, and what categories users expect to find in each menu.

A simple card sort can give you all of that information and much more. You start to understand your users’ thoughts and expectations, and you gather enough insight to develop several information architecture options.
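
To make "which items go best together" concrete, here is a minimal sketch of one common way to summarize card sort results: counting how often each pair of cards lands in the same group. The data format, card names and the `pair_agreement` helper are all assumptions made for illustration; this is not how OptimalSort stores or reports results.

```python
from collections import Counter
from itertools import combinations

def pair_agreement(sorts):
    """Return {(card_a, card_b): share of participants who grouped them together}."""
    counts = Counter()
    for groups in sorts:                      # one participant's sort
        for cards in groups.values():         # one group that participant created
            for pair in combinations(sorted(cards), 2):
                counts[pair] += 1
    return {pair: n / len(sorts) for pair, n in counts.items()}

# Hypothetical data: three participants sorting four cards into their own groups.
sorts = [
    {"Money": ["Pricing", "Invoices"], "Help": ["FAQ", "Contact us"]},
    {"Billing": ["Pricing", "Invoices", "FAQ"], "Support": ["Contact us"]},
    {"Account": ["Invoices", "Pricing"], "Support": ["Contact us", "FAQ"]},
]
agreement = pair_agreement(sorts)

# List the pairs from strongest to weakest agreement.
for pair, share in sorted(agreement.items(), key=lambda item: -item[1]):
    print(pair, f"{share:.0%}")
```

Pairs that most participants group together are strong candidates for sharing a category; pairs with low agreement are worth a closer look.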

Tree testing is a UX research method that is almost a card sort in reverse. Tree testing is used to evaluate an information architecture structure and simply allows you to see what works and what doesn’t. 

Tree testing shows you whether your information architecture is intuitive to navigate, whether the labels are easy to follow and, ultimately, whether your items are categorized in a place that makes sense. Conversely, it will also show where your users get lost and how.

What method should you use? 🤷

If you’ve got this far, fine-tuning your information architecture should be a priority. An intuitive IA is an integral component of a user-friendly product: one that is usable and delivers an experience users will come back for.

If you are still wondering which method you should use, tree testing or card sorting, the answer is pretty simple: use both.

Just like many great things, these methods work best together. They complement each other, giving you much deeper insights and a more rounded view of how your IA performs and where to make improvements than either method used alone. We cover more reasons why card sorting loves tree testing in our article which dives deeper into why to use both.

Ok, I'm using both, but which comes first? 🐓🥚

Wanting full, rounded insights into your information architecture is great. And we know that tree testing and card sorting work well together. But is there an order you should do the testing in? It really depends on the particular context of your research - what you’re trying to achieve and your situation. 

Tree testing is a great tool to use when you have a product that is already up and running. By running a tree test first you can quickly establish where there may be issues or snags: places where users get caught and need help. From there you can try to solve potential issues by moving on to a card sort.

Card sorting is a super useful method that can be used at any stage of the design process, from planning to development and beyond, as long as there is an IA structure that can be tested again. Testing against an existing website navigation can be informative, and testing a reorganization of items (new or existing) can ensure the organization aligns with what users expect.

However, when you decide to implement both of the methods in your research, where possible, tree testing should come before card sorting. If you want a little more on the issue have a read of our article here.

Check out our OptimalSort and Treejack tools. We can help you with your research and the best way forward, wherever you might be in the process.

Author: Optimal Workshop

Related articles

A quick analysis of feedback collected with OptimalSort

Card sorting is an invaluable tool for understanding how people organize information in their minds, making websites more intuitive and content easier to navigate. It’s a useful method outside of information architecture and UX research, too. It can be a useful prioritization technique, or used in a more traditional sense. For example, it’s handy in psychology, sociology or anthropology to inform research and deepen our understanding of how people conceptualize information.

The introduction of remote card sorting has provided many advantages, making it easier than ever to conduct your own research. Tools such as our very own OptimalSort allow you to quickly and easily gather findings from a large number of participants from all around the world. Not having to organize moderated, face-to-face sessions gives researchers more time to focus on their work, and easier access to larger data sets.

One of the main disadvantages of remote card sorting is that it eliminates the opportunity to dive deeper into the choices made by your participants. Human conversation is a great thing, and when conducting a remote card sort with users who could potentially be on the other side of the world, opportunities for our participants to provide direct feedback and voice their opinions are severely limited.

Your survey design may not be perfect. The labels you provide your participants may be incorrect, confusing or redundant. Your users may have their own ideas of how you could improve your products or services beyond what you are trying to capture in your card sort. People may be more willing to provide their feedback than you realize, and limiting their insights to a simple card sort may not capture all that they have to offer.

So, how can you run an unmoderated, remote card sort, but do your best to mitigate this potential loss of insight?

A quick look into the data

In an effort to evaluate the usefulness of the existing “Leave a comment” feature in OptimalSort, I recently asked our development team to pull out some data. You might be asking, “There’s a comment box in OptimalSort?” If you’ve never noticed this feature, I can’t exactly blame you. It’s relatively hidden away as an unassuming hyperlink in the top right corner of your card sort.

Comments left by your participants can be viewed in the “Participants” tab in your results section, and are indicated by a grey speech bubble.

The history of the button is unknown even to long-time Optimal Workshop team members. The purpose of the button is also unspecified. “Why would anyone leave a comment while participating in a card sort?”, I found myself wondering.

As it turns out, 133,303 comments have been left by participants. This means 133,303 insights, opinions, critiques or frustrations. Additionally, these numbers only represent the participants who noticed the feature in the first place. Considering the current button can easily be missed when focusing on the task at hand, I can’t help but wonder how this number might change if we drew more attention to the feature.

Breaking down the comments

To avoid having to manually analyze and code 133,303 open text fields, I decided to only spend enough time to decipher any obvious patterns. Luckily for me, this didn’t take very long. After looking at only a hundred or so random entries, four distinct types of comments started to emerge.

  1. This card/group doesn’t make sense. Comments related to cards and groups dominate. This is a great thing, as it means that the majority of comments made by participants relate specifically to the task they are completing. For closed and hybrid sorts, comments frequently relate to the predefined categories available, and since the participants most likely to leave a comment are those experiencing issues, the majority of the feedback relates to issues with category names themselves. Many comments are related to card labels and offer suggestions for improving naming conventions, while many others draw attention to some terms being confusing, unclear or jargony. Comments on task length can also be found, along with reasons for why certain cards may be left ungrouped, e.g., “I’ve left behind items I think the site could do without”.
  2. Your organization is awesome for doing this/you’re doing it all wrong. A substantial number of participants used the comment box as an opportunity to voice their general feedback on the organization or company running the study. Some of the more positive comments include an appreciation for seeing private companies or public sector organizations conducting research with real users in an effort to improve their services. It’s also nice to see many comments related to general enjoyment in completing the task. On the other hand, some participants used the comment box as an opportunity to comment on what other areas of their services should be improved, or what features they would like to see implemented that may otherwise be missed in a card sort, e.g., “Increased, accurate search functionality is imperative in a new system”.
  3. This isn’t working for me. Taking a closer look at some of the comments reveals some useful feedback for us at Optimal Workshop, too. Some of the comments relate specifically to UI and usability issues. The majority of these issues are things we are already working to improve or have dealt with. However, for researchers, comments that relate to challenges in using the tool or completing the survey itself may help explain some instances of data variability.
  4. #YOLO, hello, ;) And of course, the unrelated. As you may expect, when you provide people with the opportunity to leave a comment online, you can expect just about anything in return.
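
The approach described above (read a random sample, assign each comment a type by hand, then count) can be sketched roughly as follows. Everything here is an assumption for illustration: the category names paraphrase the four types in the list, the comments are invented, and the coding step itself remains manual.

```python
import random
from collections import Counter

# Hypothetical category names, paraphrasing the four comment types listed above.
CATEGORIES = ("card_or_group", "organization_feedback", "tool_issue", "unrelated")

def sample_comments(comments, k=100, seed=0):
    """Draw up to k comments at random; the seed only makes the draw repeatable."""
    rng = random.Random(seed)
    return rng.sample(comments, min(k, len(comments)))

def tally(coded_comments):
    """Count hand-assigned categories in a list of (comment, category) pairs."""
    return Counter(category for _, category in coded_comments)

# Made-up stand-ins for real participant comments.
all_comments = [f"comment {i}" for i in range(500)]
sample = sample_comments(all_comments)

# Coding is done by a person; these four pairs are invented examples.
coded = [
    ("The category names are confusing", "card_or_group"),
    ("Great to see you asking real users", "organization_feedback"),
    ("I could not drag the last card", "tool_issue"),
    ("Some labels are jargon", "card_or_group"),
]
counts = tally(coded)
summary = {category: counts[category] for category in CATEGORIES}
print(summary)
```

A small sample like this is only enough to spot obvious patterns, which is all the analysis above set out to do.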

How to make the most of your user insights in OptimalSort

If you’re running a card sort, chances are you already place a lot of value in the voice of your users. To capture any additional insights, make sure your participants know they have the opportunity to share them. Here are two ways to give your participants a space to voice their feedback:

Adding more context to the “Leave a comment” feature

One way to encourage your participants to leave comments is to promote the use of this feature in your card sort instructions. OptimalSort gives you flexibility to customize your instructions every time you run a survey. By making your participants aware of the feature, or offering ideas around what kinds of comments you may be looking for, you not only make them more likely to use the feature, but also open yourself up to a whole range of additional feedback. An advantage of using this feature is that comments can be added in real time during a card sort, so any remarks can be made as soon as they arise.

Making use of post-survey questions

Adding targeted post-survey questions is the best way to ensure your participants are able to voice any thoughts or concerns that emerged during the activity. Here, you can ask specific questions that touch upon different aspects of your card sort, such as length, labels, categories or any other comments your participants may have. This can not only help you generate useful insights but also inform the design of your surveys in the future.

Make your remote card sorts more human

Card sorts are exploratory by nature. Avoid forcing your participants into choices that may not accurately reflect their thinking by giving them the space to voice their opinions. Providing opportunities to capture feedback opens up the conversation between you and your users, and can lead to surprising insights from unexpected places.

Does the first click really matter? Treejack says yes

In 2009, Bob Bailey and Cari Wolfson published a paper entitled “FirstClick Usability Testing: A new methodology for predicting users’ success on tasks”. They’d analyzed 12 scenario-based user tests and concluded that the first click people make is a strong leading indicator of their ultimate success on a given task. Their results were so compelling that we got all excited and created Chalkmark, a tool especially for first click usability testing. It occurred to me recently that we’ve never revisited the original premise for ourselves in any meaningful way.

And then one day I realized that, as if by magic, we’re sitting on quite possibly the world’s biggest database of tree test results. I wondered: can we use these results to back up Bob and Cari’s findings (and thus the relevance of Chalkmark)? Hell yes we can.

So we’ve analyzed tree testing data from millions of responses in Treejack, and we're thrilled (relieved) that it confirmed the findings from the 2009 paper, convincingly.

What the original study found

Bob and Cari analyzed data from twelve usability studies on websites and products ‘with varying amounts and types of content, a range of subject matter complexity, and distinct user interfaces’. They found that people were about twice as likely to complete a task successfully if they got their first click right, than if they got it wrong:

  • If the first click was correct, the chances of getting the entire scenario correct was 87%
  • If the first click was incorrect, the chances of eventually getting the scenario correct was only 46%

What our analysis of tree testing data has found

We analyzed millions of tree testing responses in our database. We've found that people who get the first click correct are almost three times as likely to complete a task successfully:

  • If the first click was correct, the chances of getting the entire scenario correct was 70%
  • If the first click was incorrect, the chances of eventually getting the scenario correct was 24%

To give you another perspective on the same data, here's the inverse:

  • If the first click was correct, the chances of getting the entire scenario incorrect was 30%
  • If the first click was incorrect, the chances of getting the whole scenario incorrect was 76%
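
For readers who want to check the "about twice" and "almost three times" claims, the arithmetic is a simple ratio of the two success rates. The sketch below uses only the percentages quoted above; the function name is made up for illustration.

```python
def relative_likelihood(success_if_correct, success_if_incorrect):
    """How many times more likely success is after a correct first click."""
    return success_if_correct / success_if_incorrect

# Percentages quoted in this article.
original_study = relative_likelihood(87, 46)   # Bailey and Wolfson, 2009
tree_test_data = relative_likelihood(70, 24)   # our tree testing analysis

# The "inverse" view is just the complement of each success rate.
failure_if_correct = 100 - 70
failure_if_incorrect = 100 - 24

print(f"Original study: {original_study:.2f}x")
print(f"Tree testing data: {tree_test_data:.2f}x")
print(f"Failure rates: {failure_if_correct}% vs {failure_if_incorrect}%")
```

The absolute success rates are lower in the tree testing data than in the original study, but the gap between a correct and an incorrect first click is wider.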

How Treejack measures first clicks and task success

Bob and Cari proved the usefulness of the methodology by linking two key metrics in scenario-based usability studies: first clicks and task success. Chalkmark doesn't measure task success — it's up to the researcher to determine as they're setting up the study what constitutes 'success', and then to interpret the results accordingly. Treejack does measure task success — and first clicks.

In a tree test, participants are asked to complete a task by clicking through a text-only version of a website hierarchy, and then clicking 'I'd find it here' when they've chosen an answer. Each task in a tree test has a pre-determined correct answer (as was the case in Bob and Cari's usability studies) and every click is recorded, so we can see participant paths in detail.

Thus, every single time a person completes an individual Treejack task, we record both their first click and whether they are successful or not. When we came to test the 'correct first click leads to task success' hypothesis, we could therefore mine data from millions of tasks.
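
As a rough illustration of the kind of calculation this makes possible, the sketch below groups task attempts by whether the first click was correct and computes the success rate for each group. The `TaskResult` record and the sample data are hypothetical; they are not Treejack's actual data model or export format.

```python
from dataclasses import dataclass

@dataclass
class TaskResult:
    """One participant's attempt at one task (hypothetical record shape)."""
    first_click_correct: bool
    success: bool

def success_rate_by_first_click(results):
    """Return {True: rate, False: rate}, keyed by first-click correctness."""
    rates = {}
    for flag in (True, False):
        subset = [r for r in results if r.first_click_correct == flag]
        # None signals that no attempts fell into this group.
        rates[flag] = sum(r.success for r in subset) / len(subset) if subset else None
    return rates

# Invented data: 10 attempts with a correct first click, 4 with an incorrect one.
results = (
    [TaskResult(True, True)] * 7
    + [TaskResult(True, False)] * 3
    + [TaskResult(False, True)] * 1
    + [TaskResult(False, False)] * 3
)
rates = success_rate_by_first_click(results)
print(rates)
```

Run over a large enough set of attempts, the two rates can be compared directly, which is the comparison reported in the findings above.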

To illustrate this, have a look at the results for one task. In the overall Task result, you see a score for success and directness, and a breakdown of whether each Success, Fail, or Skip was direct (they went straight to an answer), or indirect (they went back up the tree before they selected an answer):

tree testing results

In the pietree for the same task, you can look in more detail at how many people went the wrong way from a label (each label representing one page of your website):

tree testing results

In the First Click tab, you get a percentage breakdown of which label people clicked first to complete the task:

tree testing results

And in the Paths tab, you can view individual participant paths in detail (including first clicks), and can filter the table by direct and indirect success, fails, and skips (this table is only displaying direct success and direct fail paths):

tree testing results
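
The direct and indirect distinction described above can be expressed as a small rule: a path is indirect if the participant ever moved back up the tree. The sketch below assumes each visited node is written as a slash-separated location such as "Home/Study/Fees"; that representation, the labels and the helper names are invented for illustration and are not Treejack's format.

```python
from collections import Counter

def is_direct(path):
    """A path is direct if its depth in the tree never decreases."""
    depths = [node.count("/") for node in path]
    return not any(later < earlier for earlier, later in zip(depths, depths[1:]))

def breakdown(attempts):
    """Count (outcome, directness) pairs for a list of (outcome, path) attempts."""
    return Counter(
        (outcome, "direct" if is_direct(path) else "indirect")
        for outcome, path in attempts
    )

# Invented attempts at a task whose correct answer is "Home/Study/Fees".
attempts = [
    ("success", ["Home", "Home/Study", "Home/Study/Fees"]),
    ("success", ["Home", "Home/Work", "Home", "Home/Study", "Home/Study/Fees"]),
    ("fail", ["Home", "Home/Live", "Home/Live/Housing"]),
]
summary = breakdown(attempts)
print(summary)
```

In each of these paths the first click is simply the second entry, so the same records can feed a first click breakdown as well.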

How to get busy with first click testing

This analysis reinforces something we already knew: first clicks matter. It is worth your time to get that first impression right.

You have plenty of options for measuring the link between first clicks and task success in your scenario-based usability tests. From simply noting where your participants go during observations, to gathering quantitative first click data via online tools, you'll win either way. And if you want to add the latter to your research, Chalkmark can give you first click data on wireframes and landing pages, and Treejack on your information architecture.

To finish, here are a few invaluable insights from other researchers on getting the most from first click testing:

Card descriptions: Testing the effect of contextual information in card sorts

The key purpose of running a card sort is to learn something new about how people conceptualize and organize the information that’s found on your website. The insights you gain from running a card sort can then help you develop a site structure with content labels or headings that best represent the way your users think about this information. Card sorts are in essence a simple technique; however, it’s the details of the sort that can determine the quality of your results.

Adding context to cards in OptimalSort – descriptions, links and images

In most cases, each item in a card sort has only a short label, but there are instances where you may wish to add additional context to the items in your sort. Currently, the cards tab in OptimalSort allows you to include a tooltip description, a link within the tooltip description or to format the card as an image (with or without a label).

We generally don’t recommend using tooltip descriptions and links, unless you have a specific reason to do so. It’s likely that they’ll provide your participants with more information than they would normally have when navigating your website, which may in turn influence your results by leading participants to a particular solution.

Legitimate reasons that you may want to use descriptions and links include situations where it’s not possible or practical to translate complex or technical labels (for example, medical, financial, legal or scientific terms) into plain language, or if you’re using a card sort to understand your participants’ preferences or priorities.

If you do decide to include descriptions in your sort, it’s important that you follow the same guidelines that you would otherwise follow for writing card labels. They should be easy for your participants to understand and you should avoid obvious patterns, for example repeating words and phrases, or including details that refer to the current structure of the website.

A quick survey of how card descriptions are used in OptimalSort

I was curious to find out how often people were including descriptions in their card sorts, so I asked our development team to look into this data. It turns out that around 15% of cards created in OptimalSort have at least some text entered in the description field. In order to dig into the data a bit further, both Ania and I reviewed a random sample of recent sorts and noted how descriptions were being used in each case.

We found that out of the descriptions that we reviewed, 40% (6% of the total cards) had text that should not have impacted the sort results. Most often, these cards simply had the card label repeated in the description (to be honest, we’re not entirely sure why so many descriptions are being used this way! But it’s now in our roadmap to stop this from happening, so stay tuned!). Approximately 20% (3% of the total cards) used descriptions to add context without obviously leading participants. However, another 40% of cards had descriptions that may well lead to biased results. On occasion, this included linking to the current content or using what we assumed to be the current top level heading within the description.
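
The "of the total cards" figures follow from multiplying each share of reviewed descriptions by the roughly 15% of cards that have a description at all. A quick sketch of that arithmetic is below; it assumes the reviewed sample is representative of all cards with descriptions, and the category labels are shorthand invented for this example.

```python
# Share of all cards with any description text, as reported above.
cards_with_description_pct = 15

# Share of reviewed descriptions in each usage category, as reported above.
usage_pct = {
    "no likely impact": 40,
    "context without leading": 20,
    "potentially leading": 40,
}

# Convert each share of descriptions into a share of all cards.
pct_of_all_cards = {
    label: pct * cards_with_description_pct / 100
    for label, pct in usage_pct.items()
}

for label, pct in pct_of_all_cards.items():
    print(f"{label}: {pct:g}% of all cards")
```

By the same calculation, the potentially leading descriptions work out to about 6% of all cards.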

Use of card descriptions

Testing the effect of card descriptions on sort results

So, how much influence could potentially leading card descriptions have on the results of a card sort? I decided to put it to the test by running a series of card sorts to compare the effect of different descriptions. As I also wanted to test the effect of linking card descriptions to existing content, I had to base the sort on a live website. In addition, I wanted to make sure that the card labels and descriptions were easily comprehensible by a general audience, but not so familiar that participants were highly likely to sort the cards in a similar manner.

I selected the government immigration website New Zealand Now as my test case. This site, which provides information for prospective and new immigrants to New Zealand, fit the above criteria and was likely unfamiliar to potential participants.

Navigating the New Zealand Now website

When I reviewed the New Zealand Now site, I found that the top level navigation labels were clear and easy to understand for me personally. Of course, this is especially important when much of your target audience is likely to be non-native English speaking! On the whole, the second level headings were also well-labeled, which meant that they should translate to cards that participants were able to group relatively easily.

There were, however, a few headings such as “High quality” and “Life experiences”, both found under “Study in New Zealand”, which become less clear when removed from the context of their current location in the site structure. These headings would be particularly useful to include in the test sorts, as I predicted that participants would be more likely to rely on card descriptions in the cases where the card label was ambiguous.

I selected 30 headings to use as card labels from under the sections “Choose New Zealand”, “Move to New Zealand”, “Live in New Zealand”, “Work in New Zealand” and “Study in New Zealand” and tweaked the language slightly, so that the labels were more generic.

I then created four separate sorts in OptimalSort:

  • Round 1: No description. Each card showed a heading only; this functioned as the control sort
  • Round 2: Site section in description. Each card showed a heading with the site section in the description
  • Round 3: Short description. Each card showed a heading with a short description; these were taken from the New Zealand Now topic landing pages
  • Round 4: Link in description. Each card showed a heading with a link to the current content page on the New Zealand Now website

For each sort, I recruited 30 participants. Each participant could only take part in one of the sorts.

What the results showed

An interesting initial finding was that when we queried the participants following the sort, only around 40% said they noticed the tooltip descriptions and even fewer participants stated that they had used them as an aid to help complete the sort.

Participant recognition of descriptions

Of course, what people say they do does not always reflect what they do in practice! To measure the effect that different descriptions had on the results of this sort, I compared how frequently cards were sorted with other cards from their respective site sections across the different rounds.

Let’s take a look at the “Study in New Zealand” section that was mentioned above. Out of the five cards in this section, “Where & what to study”, “Everyday student life” and “After you graduate” were sorted pretty consistently, regardless of whether a description was provided or not. The following charts show the average frequency with which each card was sorted with other cards from this section. For example, in the control round, “Where & what to study” was sorted with “After you graduate” 76% of the time and with “Everyday student life” 70% of the time, but was sorted with “Life experiences” or “High quality” each only 10% of the time. This meant that the average sort frequency for this card was 42%.
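
The average sort frequency in the example above is a plain mean of the four pairing percentages. A minimal sketch using the control-round figures quoted for “Where & what to study”:

```python
from statistics import mean

# How often "Where & what to study" was grouped with each other card
# from the same site section in the control round (percent of participants).
paired_with = {
    "After you graduate": 76,
    "Everyday student life": 70,
    "Life experiences": 10,
    "High quality": 10,
}

average_sort_frequency = mean(list(paired_with.values()))
print(average_sort_frequency)
```

The mean comes to 41.5, which is reported as 42% once rounded.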

On the other hand, the cards “High quality” and “Life experiences” were sorted much less frequently with other cards in this section, with the exception of the second sort, which included the site section in the description.

These results suggest that including the existing site section in the card description did influence how participants sorted these cards, confirming our prediction! Interestingly, this round had the fewest participants who stated that they used the descriptions to help them complete the sort (only 10%, compared to 40% in round 3 and 20% in round 4).

Also of note is that adding a link to the existing content did not seem to increase the likelihood that cards were sorted more frequently with other cards from the same section. Reasons for this could include that participants did not want to navigate to another website (due to time-consciousness in completing the task, or concern that they’d lose their place in the sort) or simply that it can be difficult to open a link from the tooltip pop-up.

What we can take away from these results

This quick investigation into the impact of descriptions illustrates some of the intricacies around using additional context in your card sorts, and why this should always be done with careful consideration. It’s interesting that we correctly predicted some of these results, but that in this case, other uses of the description had little effect at all. And the results serve as a good reminder that participants can often be influenced by factors that they don’t even recognise themselves!

If you do decide to use card descriptions in your card sorts, here are some guidelines that we recommend you follow:

  • Avoid repeating words and phrases, as participants may sort cards by pattern-matching rather than based on the actual content
  • Avoid alluding to a predetermined structure, such as including references to the current site structure
  • If it’s important that participants use the descriptions to complete the sort, you should mention this in your task instructions. It may also be worth asking them a post-survey question to validate if they used them or not

We’d love to hear your thoughts on how we tested the effects of card descriptions and the results that we got. Would you have done anything differently?

Have you ever completed a card sort only to realize later that you’d inadvertently biased your results? Or have you used descriptions in your card sorts to meet a genuine need? Do you think there’s a case to make descriptions more obvious than just a tooltip, so that when they are used legitimately, most participants don’t miss this information?

Let us know by leaving a comment!