A great information architecture (IA) is essential for a great user experience (UX). And testing your website or app’s information architecture is necessary to get it right.
Card sorting and tree testing are the very best UX research methods for exactly this. But the big question is always: which one should you use, and when? Very possibly you need both. Let’s find out with this quick summary.
What is card sorting and tree testing? 🧐
Card sorting is used to test the information architecture of a website or app. Participants group individual labels (cards) into different categories according to criteria that makes best sense to them. Each label represents an item that needs to be categorized. The results provide deep insights to guide decisions needed to create an intuitive navigation, comprehensive labeling and content that is organized in a user-friendly way.
Tree testing is also used to test the information architecture of a website or app. When using tree testing participants are presented with a site structure and a set of tasks they need to complete. The goal for participants is to find their way through the site and complete their task. The test shows whether the structure of your website corresponds to what users expect and how easily (or not) they can navigate and complete their tasks.
What are the differences? 🂱 👉🌴
Card sorting is a UX research method which helps to gather insights about your content categorization. It focuses on creating an information architecture that responds intuitively to the users’ expectations. Things like which items go best together, the best options for labeling, what categories users expect to find on each menu.
Doing a simple card sort can give you all those pieces of information and so much more. You start understanding your user’s thoughts and expectations. Gathering enough insights and information to enable you to develop several information architecture options.
Tree testing is a UX research method that is almost a card sort in reverse. Tree testing is used to evaluate an information architecture structure and simply allows you to see what works and what doesn’t.
Using tree testing will provide insights around whether your information architecture is intuitive to navigate, the labels easy to follow and ultimately if your items are categorized in a place that makes sense. Conversely it will also show where your users get lost and how.
What method should you use? 🤷
You’ve got this far and fine-tuning your information architecture should be a priority. An intuitive IA is an integral component of a user-friendly product. Creating a product that is usable and an experience users will come back for.
If you are still wondering which method you should use - tree testing or card sorting. The answer is pretty simple - use both.
Just like many great things, these methods work best together. They complement each other, allowing you to get much deeper insights and a rounded view of how your IA performs and where to make improvements than when used separately. We cover more reasons why card sorting loves tree testing in our article which dives deeper into why to use both.
Ok, I'm using both, but which comes first? 🐓🥚
Wanting full, rounded insights into your information architecture is great. And we know that tree testing and card sorting work well together. But is there an order you should do the testing in? It really depends on the particular context of your research - what you’re trying to achieve and your situation.
Tree testing is a great tool to use when you have a product that is already up and running. By running a tree test first you can quickly establish where there may be issues, or snags. Places where users get caught and need help. From there you can try and solve potential issues by moving on to a card sort.
Card sorting is a super useful method that can be instigated at any stage of the design process, from planning to development and beyond. As long as there is an IA structure that can be tested again. Testing against an already existing website navigation can be informative. Or testing a reorganization of items (new or existing) can ensure the organization can align with what users expect.
However, when you decide to implement both of the methods in your research, where possible, tree testing should come before card sorting. If you want a little more on the issue have a read of our article here.
Check out our OptimalSort and Treejack tools - we can help you with your research and the best way forward. Wherever you might be in the process.
“Dear Optimal Workshop,I want to test the structure of a university website (well certain sections anyway). My gut instinct is that it's pretty 'broken'. Lots of sections feel like they're in the wrong place. I want to test my hypotheses before proposing a new structure. I'm definitely going to do some card sorting, and was planning a mixture of online and offline. My question is about when to bring in tree testing. Should I do this first to test the existing IA? Or is card sorting sufficient? I do intend to tree test my new proposed IA in order to validate it, but is it worth doing it upfront too?" — Matt
Dear Matt,
Ah, the classic chicken or the egg scenario: Which should come first — tree testing or card sorting? It’s a question that many researchers often ask themselves, but I’m here to help clear the air!You should always use both methods when changing up your information architecture (IA) in order to capture the most information.
Tree testing and card sorting, when used together, can give you fantastic insight into the way your users interact with your site. First of all, I’ll run through some of the benefits of each testing method.
What is card sorting and why should I use it?
Card sorting is a great method to gauge the way in which your users organize the content on your site. It helps you figure out which things go together and which things don’t. There are two main types of card sorting: open and closed.
Closed card sorting involves providing participants with pre-defined categories into which they sort their cards. For example, you might be reorganizing the categories for your online clothing store for women. Your cards would have all the names of your products (e.g., “socks”, “skirts” and “singlets”) and you also provide the categories (e.g.,“outerwear”, “tops” and “bottoms”).
Open card sorting involves providing participants with cards and leaving them to organize the content in a way that makes sense to them. It’s the opposite to closed card sorting, in that participants dictate the categories themselves and also label them. This means you’d provide them with the cards only — no categories.
Card sorting, whether open or closed, is very user focused. It involves a lot of thought, input, and evaluation from each participant, helping you to form the structure of your new IA.
What is tree testing and why should I use it?
Tree testing is a fantastic way to determine how your users are navigating your site and how they’re finding information. Your site is organised into a tree structure, sorted into topics and subtopics, and participants are provided with some tasks that they need to perform. The results will show you how your participants performed those tasks, if they were successful or unsuccessful, and which route they took to complete the tasks. This data is extremely useful for creating a new and improved IA.
Tree testing is an activity that requires participants to seek information, which is quite the contrast to card sorting — an activity that requires participants to sort and organize information. Each activity requires users to behave in different ways, so each method will give its own valuable results.
Should you run a card or tree test first?
In this scenario, I’d recommend running a tree test first in order to find out how your existing IA currently performs. You said your gut instinct is telling you that your existing IA is pretty “broken”, but it’s good to have the data that proves this and shows you where your users get lost.
An initial tree test will give you a benchmark to work with — after all, how will you know your shiny, new IA is performing better if you don’t have any stats to compare it with? Your results from your first tree test will also show you which parts of your current IA are the biggest pain points and from there you can work on fixing them. Make sure you keep these tasks on hand — you’ll need them later!
Once your initial tree test is done, you can start your card sort, based on the results from your tree test. Here, I recommend conducting an open card sort so you can understand how your users organize the content in a way that makes sense to them. This will also show you the language your participants use to name categories, which will help you when you’re creating your new IA.
Finally, once your card sort is done you can conduct another tree test on your new, proposed IA. By using the same (or very similar) tasks from your initial tree test, you will be able to see that any changes in the results can be directly attributed to your new and improved IA.
Once your test has concluded, you can use this data to compare the performance from the tree test for your original information architecture — hopefully it is much better now!
We love getting stuck into scary, hairy problems to make things better here at Trade Me. One challenge for us in particular is how best to navigate customer reaction to any change we make to the site, the app, the terms and conditions, and so on. Our customers are passionate both about the service we provide — an online auction and marketplace — and its place in their lives, and are rightly forthcoming when they're displeased or frustrated. We therefore rely on our Customer Service (CS) team to give customers a voice, and to respond with patience and skill to customer problems ranging from incorrectly listed items to reports of abusive behavior.
The CS team uses a Customer Relationship Management (CRM) system, Trade Me Admin, to monitor support requests and manage customer accounts. As the spectrum of Trade Me's services and the complexity of the public website have grown rapidly, the CRM system has, to be blunt, been updated in ways which have not always been the prettiest. Links for new tools and reports have simply been added to existing pages, and old tools for services we no longer operate have not always been removed. Thus, our latest focus has been to improve the user experience of the CRM system for our CS team.
And though on the surface it looks like we're working on a product with only 90 internal users, our changes will have flow on effects to tens of thousands of our members at any given time (from a total number of around 3.6 million members).
The challenges of designing customer service systems
We face unique challenges designing customer service systems. Robert Schumacher from GfK summarizes these problems well. I’ve paraphrased him here and added an issue of my own:
1. Customer service centres are high volume environments — Our CS team has thousands of customer interactions every day, and and each team member travels similar paths in the CRM system.
2. Wrong turns are amplified — With so many similar interactions, a system change that adds a minute more to processing customer queries could slow down the whole team and result in delays for customers.
3. Two people relying on the same system — When the CS team takes a phone call from a customer, the CRM system is serving both people: the CS person who is interacting with it, and the caller who directs the interaction. Trouble is, the caller can't see the paths the system is forcing the CS person to take. For example, in a previous job a client’s CS team would always ask callers two or three extra security questions — not to confirm identites, but to cover up the delay between answering the call and the right page loading in the system.
4. Desktop clutter — As a result of the plethora of tools and reports and systems, the desktop of the average CS team member is crowded with open windows and tabs. They have to remember where things are and also how to interact with the different tools and reports, all of which may have been created independently (ie. work differently). This presents quite the cognitive load.
5. CS team members are expert users — They use the system every day, and will all have their own techniques for interacting with it quickly and accurately. They've also probably come up with their own solutions to system problems, which they might be very comfortable with. As Schumacher says, 'A critical mistake is to discount the expert and design for the novice. In contact centers, novices become experts very quickly.'
6. Co-design is risky — Co-design workshops, where the users become the designers, are all the rage, and are usually pretty effective at getting great ideas quickly into systems. But expert users almost always end up regurgitating the system they're familiar with, as they've been trained by repeated use of systems to think in fixed ways.
7. Training is expensive — Complex systems require more training so if your call centre has high churn (ours doesn’t – most staff stick around for years) then you’ll be spending a lot of money. …and the one I’ve added:
8. Powerful does not mean easy to learn — The ‘it must be easy to use and intuitive’ design rationale is often the cause of badly designed CRM systems. Designers mistakenly design something simple when they should be designing something powerful. Powerful is complicated, dense, and often less easy to learn, but once mastered lets staff really motor.
Our project focus
Our improvement of Trade Me Admin is focused on fixing the shattered IA and restructuring the key pages to make them perform even better, bringing them into a new code framework. We're not redesigning the reports, tools, code or even the interaction for most of the reports, as this will be many years of effort. Watching our own staff use Trade Me Admin is like watching someone juggling six or seven things.
The system requires them to visit multiple pages, hold multiple facts in their head, pattern and problem-match across those pages, and follow their professional intuition to get to the heart of a problem. Where the system works well is on some key, densely detailed hub pages. Where it works badly, staff have to navigate click farms with arbitrary link names, have to type across the URL to get to hidden reports, and generally expend more effort on finding the answer than on comprehending the answer.
Groundwork
The first thing that we did was to sit with CS and watch them work and get to know the common actions they perform. The random nature of the IA and the plethora of dead links and superseded reports became apparent. We surveyed teams, providing them with screen printouts and three highlighter pens to colour things as green (use heaps), orange (use sometimes) and red (never use). From this, we were able to immediately remove a lot of noise from the new IA. We also saw that specific teams used certain links but that everyone used a core set. Initially focussing on the core set, we set about understanding the tasks under those links.
The complexity of the job soon became apparent – with a complex system like Trade Me Admin, it is possible to do the same thing in many different ways. Most CRM systems are complex and detailed enough for there to be more than one way to achieve the same end and often, it’s not possible to get a definitive answer, only possible to ‘build a picture’. There’s no one-to-one mapping of task to link. Links were also often arbitrarily named: ‘SQL Lookup’ being an example. The highly-trained user base are dependent on muscle memory in finding these links. This meant that when asked something like: “What and where is the policing enquiry function?”, many couldn’t tell us what or where it was, but when they needed the report it contained they found it straight away.
Sort of difficult
Therefore, it came as little surprise that staff found the subsequent card sort task quite hard. We renamed the links to better describe their associated actions, and of course, they weren't in the same location as in Trade Me Admin. So instead of taking the predicted 20 minutes, the sort was taking upwards of 40 minutes. Not great when staff are supposed to be answering customer enquiries!
We noticed some strong trends in the results, with links clustering around some of the key pages and tasks (like 'member', 'listing', 'review member financials', and so on). The results also confirmed something that we had observed — that there is a strong split between two types of information: emails/tickets/notes and member info/listing info/reports.
We built and tested two IAs
After card sorting, we created two new IAs, and then customized one of the IAs for each of the three CS teams, giving us IAs to test. Each team was then asked to complete two tree tests, with 50% doing one first and 50% doing the other first. At first glance, the results of the tree test were okay — around 61% — but 'Could try harder'. We saw very little overall difference between the success of the two structures, but definitely some differences in task success. And we also came across an interesting quirk in the results.
Closer analysis of the pie charts with an expert in Trade Me Admin showed that some ‘wrong’ answers would give part of the picture required. In some cases so much so that I reclassified answers as ‘correct’ as they were more right than wrong. Typically, in a real world situation, staff might check several reports in order to build a picture. This ambiguous nature is hard to replicate in a tree test which wants definitive yes or no answers. Keeping the tasks both simple to follow and comprehensive proved harder than we expected.
For example, we set a task that asked participants to investigate whether two customers had been bidding on each other's auctions. When we looked at the pietree (see screenshot below), we noticed some participants had clicked on 'Search Members', thinking they needed to locate the customer accounts, when the task had presumed that the customers had already been found. This is a useful insight into writing more comprehensive tasks that we can take with us into our next tests.
What’s clear from analysis is that although it’s possible to provide definitive answers for a typical site’s IAs, for a CRM like Trade Me Admin this is a lot harder. Devising and testing the structure of a CRM has proved a challenge for our highly trained audience, who are used to the current system and naturally find it difficult to see and do things differently. Once we had reclassified some of the answers as ‘correct’ one of the two trees was a clear winner — it had gone from 61% to 69%. The other tree had only improved slightly, from 61% to 63%.
There were still elements with it that were performing sub-optimally in our winning structure, though. Generally, the problems were to do with labelling, where, in some cases, we had attempted to disambiguate those ‘SQL lookup’-type labels but in the process, confused the team. We were left with the dilemma of whether to go with the new labels and make the system initially harder to use for staff but easier to learn for new staff, or stick with the old labels, which are harder to learn. My view is that any new system is going to see an initial performance dip, so we might as well change the labels now and make it better.
The importance of carefully structuring questions in a tree test has been highlighted, particularly in light of the ‘start anywhere/go anywhere’ nature of a CRM. The diffuse but powerful nature of a CRM means that careful consideration of tree test answer options needs to be made, in order to decide ‘how close to 100% correct answer’ you want to get.
Development work has begun so watch this space
It's great to see that our research is influencing the next stage of the CRM system, and we're looking forward to seeing it go live. Of course, our work isn't over— and nor would we want it to be! Alongside the redevelopment of the IA, I've been redesigning the key pages from Trade Me Admin, and continuing to conduct user research, including first click testing using Chalkmark.
This project has been governed by a steadily developing set of design principles, focused on complex CRM systems and the specific needs of their audience. Two of these principles are to reduce navigation and to design for experts, not novices, which means creating dense, detailed pages. It's intense, complex, and rewarding design work, and we'll be exploring this exciting space in more depth in upcoming posts.
When it comes to designing and testing in the world of information architecture, it’s hard to beat card sorting. As a usability testing method, card sorting is easy to set up, simple to recruit for and can supply you with a range of useful insights. But there’s a long-standing debate in the world of card sorting, and that’s whether it’s better to run card sorts in person (moderated) or remotely over the internet (unmoderated).
This article should give you some insight into the world of online card sorting. We've included an analysis of the benefits (and the downsides) as well as why people use this approach. Let's take a look!
How an online card sort works
Running a card sort remotely has quickly become a popular option just because of how time-intensive in-person card sorting is. Instead of needing to bring your participants in for dedicated card sorting sessions, you can simply set up your card sort using an online tool (like our very own OptimalSort) and then wait for the results to roll in.
So what’s involved in a typical online card sort? At a very high level, here’s what’s required. We’re going to assume you’re already set up with an online card sorting tool at this point.
Define the cards: Depending on what you’re testing, add the items (cards) to your study. If you were testing the navigation menu of a hotel website, your cards might be things like “Home”, “Book a room”, “Our facilities” and “Contact us”.
Work out whether to run a closed or open sort: Determine whether you’ll set the groups for participants to sort cards into (closed) or leave it up to them (open). You may also opt for a mix, where you create some categories but leave the option open for participants to create their own.
Recruit your participants: Whether using a participant recruitment service or by recruiting through your own channels, send out invites to your online card sort.
Wait for the data: Once you’ve sent out your invites, all that’s left to do is wait for the data to come in and then analyze the results.
Online card sorting has a few distinct advantages over in-person card sorting that help to make it a popular option among information architects and user researchers. There are downsides too (as there are with any remote usability testing option), but we’ll get to those in a moment.
Where remote (unmoderated) card sorting excels:
Time savings: Online card sorting is essentially ‘set and forget’, meaning you can set up the study, send out invites to your participants and then sit back and wait for the results to come in. In-person card sorting requires you to moderate each session and collate the data at the end.
Easier for participants: It’s not often that researchers are on the other side of the table, but it’s important to consider the participant’s viewpoint. It’s much easier for someone to spend 15 minutes completing your online card sort in their own time instead of trekking across town to your office for an exercise that could take well over an hour.
Cheaper: In a similar vein, online card sorting is much cheaper than in-person testing. While it’s true that you may still need to recruit participants, you won’t need to reimburse people for travel expenses.
Analytics: Last but certainly not least, online card sorting tools (like OptimalSort) can take much of the analytical burden off you by transforming your data into actionable insights. Other tools will differ, but OptimalSort can generate a similarity matrix, dendrograms and a participant-centric analysis using your study data.
Where in-person (moderated) card sorting excels:
Qualitative insights: For all intents and purposes, online card sorting is the most effective way to run a card sort. It’s cheaper, faster and easier for you. But, there’s one area where in-person card sorting excels, and that’s qualitative feedback. When you’re sitting directly across the table from your participant you’re far more likely to learn about the why as well as the what. You can ask participants directly why they grouped certain cards together.
Online card sorting: Participant numbers
So that’s online card sorting in a nutshell, as well as some of the reasons why you should actually use this method. But what about participant numbers? Well, there’s no one right answer, but the general rule is that you need more people than you’d typically bring in for a usability test.
This all comes down to the fact that card sorting is what’s known as a generative method, whereas usability testing is an evaluation method. Here’s a little breakdown of what we mean by these terms:
Generative method: There’s no design, and you need to get a sense of how people think about the problem you’re trying to solve. For example, how people would arrange the items that need to go into your website’s navigation. As Nielsen Norman Group explains: “There is great variability in different people's mental models and in the vocabulary they use to describe the same concepts. We must collect data from a fair number of users before we can achieve a stable picture of the users' preferred structure and determine how to accommodate differences among users”.
Evaluation method: There’s already a design, and you basically need to work out whether it’s a good fit for your users. Any major problems are likely to crop up even after testing 5 or so users. For example, you have a wireframe of your website and need to identify any major usability issues.
Basically, because you’ll typically be using card sorting to generate a new design or structure from nothing, you need to sample a larger number of people. If you were testing an existing website structure, you could get by with a smaller group.
Where to from here?
Following on from our discussion of generative versus evaluation methods, you’ve really got a choice of 2 paths from here if you’re in the midst of a project. For those developing new structures, the best course of action is likely to be a card sort. However, if you’ve got an existing structure that you need to test in order to usability problems and possible areas of improvement, you’re likely best to run a tree test. We’ve got some useful information on getting started with a tree test right here on the blog.