July 29, 2021
4 min

How to get started with tree testing 🌱

Are your visitors really getting the most out of your website? Tree testing (or sometimes referred to as reverse card sorting) takes away the guesswork by telling you how easily, or not, people can find information on your website. Discover why Treejack is the tool of choice for website architects.

What’s tree testing and why does it matter? 🌲 👀

Whether you’re building a website from scratch or improving an existing website, tree testing helps you design your website architecture with confidence. How? Tools like Treejack use analysis to help assess how findable your content is for people visiting your website. 

It helps answer burning questions  like:

  • Do my labels make sense?
  • Is my content grouped logically?
  • Can people find what they want easily and quickly?  If not, why not?

Treejack provides invaluable intel for any Information Architect. Why? Knowing where and why people get lost trying to find your content, gives you a much better chance of fixing the actual problem. And the more easily people can find what they’re looking for, the better their experience which is ultimately better for everyone.

How’s tree testing work? 🌲🌳🌿

Tree testing can be broken down into two main parts: 

  • The Tree - Your tree is essentially your site map – a text-only version of your website structure.
  • The Task - Your task is the activity you ask participants to complete by clicking through your tree and choosing the information they think is right. Tools like Treejack analyse the data generated from doing the task to build a picture of how people actually navigated your content in order to try and achieve your task.  It tells you if they got it right or wrong, the path they took and the time it took them.

Whether you’re new to tree testing or already a convert, effective tree testing using Treejack has some key steps.

Step 1.  The ‘ Why’:  Purpose and goals of tree testing

Ask yourself what part of your information architecture needs improvement – is it your whole website or just parts of it? Also think about your audience, they’re the ones you’re trying to improve the website for so the more you know about their needs the better. 

Tip:  Make the most of what tree testing offers to improve your website by building it into your overall design project plan

Step 2.  The ‘How’:   Build your tree

You can build your tree using two main approaches: 

  • Create your tree in spreadsheet and import it into Treejack or
  • Build your tree in Treejack itself, using the labels and structure of your website.

Tip:  Your category labels are known as ‘parent nodes’. Your information labels are known as ‘child nodes’.

Step 3. The ‘What’: Write your tasks

The quality of your tasks will be reflected in the usefulness of your data so it’s worth making sure you create tasks that really test what you want to improve.

Tip:  Use plain language that feels natural and try to write your tasks in a way that reflects the way people who visit your website might actually think when they are trying to find information on your site.

Step 4.  The ‘Who’:  Recruit participants

The quality of your data will largely depend on the quality of your participants. You want people who are as close to your target audience as possible and with the right attitude - willing and committed to being involved.

Tip:  Consider offering some kind of incentive to participants – it shows you value their involvement.

Step 5.  The ‘insights’: Interpret your results

Now for the fun part – making sense of the results. Treejack presents the data from your tree testing as a series of tables and visualizations. You can download them in a spreadsheet in their raw format or customized to your needs.

Tip:  Use the results to gain quick, practical insights you can act on right away or as a starter to dive deeper into the data.

When should I use tree testing? ⌛

Tree testing is useful whenever you want to find out if your website content is labelled and organised in a way that’s easy to understand.  What’s more it can be applied for any website, big (10+ levels with 10000s of labels) or small (3 levels and 22 labels) and any size in between.  Our advice for using Treejack is simply this: test big, test small, test often.

Share this article
Author
Optimal
Workshop

Related articles

View all blog articles
Learn more
1 min read

How to interpret your card sort results Part 2: closed card sorts and next steps

In Part 1 of this series we looked at how to interpret results from open and hybrid card sorts and now in Part 2, we’re going to talk about closed card sorts. In closed card sorts, participants are asked to sort the cards into predetermined categories and are not allowed to create any of their own. You might use this approach when you are constrained by specific category names or as a quick checkup before launching a new or newly redesigned website.In Part 1, we also discussed the two different - but complementary - types of analysis that are generally used together for interpreting card sort results: exploratory and statistical. Exploratory analysis is intuitive and creative while statistical analysis is all about the numbers. Check out Part 1 for a refresher or learn more about exploratory and statistical analysis in Donna Spencer’s book.

Getting started

Closed card sort analysis is generally much quicker and easier than open and hybrid card sorts because there are no participant created category names to analyze - it’s really just about where the cards were placed. There are some similarities about how you might start to approach your analysis process but overall there’s a lot less information to take in and there isn’t much in the way of drilling down into the details like we did in Part 1.Just like with an open card sort, kick off your analysis process by taking an overall look at the results as a whole. Quickly cast your eye over each individual card sort and just take it all in. Look for common patterns in how the cards have been sorted. Does anything jump out as surprising? Are there similarities or differences between participant sorts?

If you’re redesigning an existing information architecture (IA), how do your results compare to the current state? If this is a final check up before launching a live website, how do these results compare to what you learned during your previous research studies?If you ran your card sort using information architecture tool OptimalSort, head straight to the Overview and Participants Table presented in the results section of the tool. If you ran a moderated card sort using OptimalSort’s printed cards, you’ve probably been scanning them in after each completed session, but now is a good time to double check you got them all. And if you didn’t know about this handy feature of OptimalSort, it’s something to keep in mind for next time!

The Participants Table shows a breakdown of your card sorting data by individual participant. Start by reviewing each individual card sort one by one by clicking on the arrow in the far left column next to the Participants numbers. From here you can easily flick back and forth between participants without needing to close that modal window. Don’t spend too much time on this — you’re just trying to get a general impression of how the cards were sorted into your predetermined categories. Keep an eye out for any card sorts that you might like to exclude from the results. For example participants who have lumped everything into one group and haven’t actually sorted the cards.

Don’t worry- excluding or including participants isn’t permanent and can be toggled on or off at anytime.Once you’re happy with the individual card sorts that will and won’t be included in your results visualizations, it’s time to take a look at the Results Matrix in OptimalSort. The Results Matrix shows the number of times each card was sorted into each of your predetermined categories- the higher the number, the darker the shade of blue (see below).

A screenshot of the Results Matrix tab in OptimalSort.
Results Matrix in OptimalSort.

This table enables you to quickly and easily get across how the cards were sorted and gauge the highest and lowest levels of agreement among your participants. This will tell you if you’re on the right track or highlight opportunities for further refinement of your categories.If we take a closer look (see below) we can see that in this example closed card sort conducted on the Dewey Decimal Classification system commonly used in libraries, The Interpretation of Dreams by Sigmund Freud was sorted into ‘Philosophy and psychology’ 38 times in study a completed by 51 participants.

A screenshot of the Results Matrix in OptimalSort zoomed in.
Results Matrix in OptimalSort zoomed in with hover.

In the real world, that is exactly where that content lives and this is useful to know because it shows that the current state is supporting user expectations around findability reasonably well. Note: this particular example study used image based cards instead of word label based cards so the description that appears in both the grey box and down the left hand side of the matrix is for reference purposes only and was hidden from the participants.Sometimes you may come across cards that are popular in multiple categories. In our example study, How to win friends and influence people by Dale Carnegie, is popular in two categories: ‘Philosophy & psychology’ and ‘Social sciences’ with 22 and 21 placements respectively. The remaining card placements are scattered across a further 5 categories although in much smaller numbers.

A screenshot of the Results Matrix in OptimalSort showing cards popular in multiple categories.
Results Matrix showing cards popular in multiple categories.

When this happens, it’s up to you to determine what your number thresholds are. If it’s a tie or really close like it is in this case, you might review the results against any previous research studies to see if anything has changed or if this is something that comes up often. It might be a new category that you’ve just introduced, it might be an issue that hasn’t been resolved yet or it might just be limited to this one study. If you’re really not sure, it’s a good idea to run some in-person card sorts as well so you can ask questions and gain clarification around why your participants felt a card belonged in a particular category. If you’ve already done that great! Time to review those notes and recordings!You may also find yourself in a situation where no category is any more popular than the others for a particular card. This means there’s not much agreement among your participants about where that card actually belongs. In our example closed card sort study, the World Book Encyclopedia was placed into 9 of 10 categories. While it was placed in ‘History & geography’ 18 times, that’s still only 35% of the total placements for that card- it’s hardly conclusive.

A screenshot of the Results Matrix showing a card with a lack of agreement.
Results Matrix showing a card with a lack of agreement.

Sometimes this happens when the card label or image is quite general and could logically belong in many of the categories. In this case, an encyclopedia could easily fit into any of those categories and I suspect this happened because people may not be aware that encyclopedias make up a very large part of the category on the far left of the above matrix: ‘Computer science, information & general works’. You may also see this happening when a card is ambiguous and people have to guess where it might belong. Again - if you haven’t already - if in doubt, run some in-person card sorts so you can ask questions and get to the bottom of it!After reviewing the Results Matrix in OptimalSort, visit the Popular Placements Matrix to see which cards were most popular for each of your categories based on how your participants sorted them (see below 2 images).

A screenshot of the Popular Placements Matrix in OptimalSort, with the top half of the diagram showing.
Popular Placements Matrix in OptimalSort- top half of the diagram.

A screenshot of the Popular Placements Matrix in OptimalSort, with the top half of the diagram showing.
Popular Placements Matrix in OptimalSort- scrolled to show the bottom half of the diagram.

The diagram shades the most popular placements for each category in blue making it very easy to spot what belongs where in the eyes of your participants. It’s useful for quickly identifying clusters and also highlights the categories that didn’t get a lot of card sorting love. In our example study (2 images above) we can see that ‘Technology’ wasn’t a popular card category choice potentially indicating ambiguity around that particular category name. As someone familiar with the Dewey Decimal Classification system I know that ‘Technology’ is a bit of a tricky one because it contains a wide variety of content that includes topics on medicine and food science - sometimes it will appear as ‘Technology & applied sciences’. These results appear to support the case for exploring that alternative further!

Where to from here?

Now that we’ve looked at how to interpret your open, hybrid and closed card sorts, here are some next steps to help you turn those insights into action!Once you’ve analyzed your card sort results, it’s time to feed those insights into your design process and create your taxonomy which goes hand in hand with your information architecture. You can build your taxonomy out in Post-it notes before popping it into a spreadsheet for review. This is also a great time to identify any alternate labelling and placement options that came out of your card sorting process for further testing.From here, you might move into tree testing your new IA or you might run another card sort focussing on a specific area of your website. You can learn more about card sorting in general via our 101 guide.

When interpreting card sort results, don’t forget to have fun! It’s easy to get overwhelmed and bogged down in the results but don’t lose sight of the magic that is uncovering user insights.I’m going to leave you with this quote from Donna Spencer that summarizes the essence of card sort analysis quite nicely:Remember that you are the one who is doing the thinking, not the technique... you are the one who puts it all together into a great solution. Follow your instincts, take some risks, and try new approaches. - Donna Spencer

Further reading

  • Card Sorting 101 – Learn about the differences between open, closed and hybrid card sorts, and how to run your own using OptimalSort.

Learn more
1 min read

How to Spot and Destroy Evil Attractors in Your Tree (Part 1)

Usability guru Jared Spool has written extensively about the 'scent of information'. This term describes how users are always 'on the hunt' through a site, click by click, to find the content they’re looking for. Tree testing helps you deliver a strong scent by improving organisation (how you group your headings and subheadings) and labelling (what you call each of them).

Anyone who’s seen a spy film knows there are always false scents and red herrings to lead the hero astray. And anyone who’s run a few tree tests has probably seen the same thing — headings and labels that lure participants to the wrong answer. We call these 'evil attractors'.In Part 1 of this article, we’ll look at what evil attractors are, how to spot them at the answer end of your tree, and how to fix them. In Part 2, we’ll look at how to spot them in the higher levels of your tree.

The false scent — what it looks like in practice

One of my favourite examples of an evil attractor comes from a tree test we ran for consumer.org.nz, a New Zealand consumer-review website (similar to Consumer Reports in the USA). Their site listed a wide range of consumer products in a tree several levels deep, and they wanted to try out a few ideas to make things easier to find as the site grew bigger.We ran the tests and got some useful answers, but we also noticed there was one particular subheading (Home > Appliances > Personal) that got clicks from participants looking for very different things — mobile phones, vacuum cleaners, home-theatre systems, and so on:

pic1

The website intended the Personal appliance category to be for products like electric shavers and curling irons. But apparently, Personal meant many things to our participants: they also went there for 'personal' items like mobile phones and cordless drills that actually lived somewhere else.This is the false scent — the heading that attracts clicks when it shouldn’t, leading participants astray. Hence this definition: an evil attractor is a heading that draws unwanted traffic across several unrelated tasks.

Evil attractors lead your users astray

Attracting clicks isn’t a bad thing in itself. After all, that’s what a good heading does — it attracts clicks for the content it contains (and discourages clicks for everything else). Evil attractors, on the other hand, attract clicks for things they shouldn’t. These attractors lure users down the wrong path, and when users find themselves in the wrong place they'll either back up and try elsewhere (if they’re patient) or give up (if they’re not). Because these attractor topics are magnets for the user’s attention, they make it less likely that your user will get to the place you intended. The other evil part of these attractors is the way they hide in the shadows. Most of the time, they don’t get the lion’s share of traffic for a given task. Instead, they’ll poach 5–10% of the responses, luring away a fraction of users who might otherwise have found the right answer.

Find evil attractors easily in your data

The easiest attractors to spot are those at the answer end of your tree (where participants ended up for each task). If we can look across tasks for similar wrong answers, then we can see which of these might be evil attractors.In your Treejack results, the Destinations tab lets you do just that. Here’s more of the consumer.org.nz example:

Pic2

Normally, when you look at this view, you’re looking down a column for big hits and misses for a specific task. To look for evil attractors, however, you’re looking for patterns across rows. In other words, you’re looking horizontally, not vertically. If we do that here, we immediately notice the row for Personal (highlighted yellow). See all those hits along the row? Those hits indicate an attractor — steady traffic across many tasks that seem to have little in common. But remember, traffic alone is not enough. We’re looking for unwanted traffic across unrelated tasks. Do we see that here? Well, it looks like the tasks (about cameras, drills, laptops, vacuums, and so on) are not that closely related. We wouldn’t expect users to go to the same topic for each of these. And the answer they chose, Personal, certainly doesn’t seem to be the destination we intended. While we could rationalise why they chose this answer, it is definitely unwanted from an IA perspective. So yes, in this case, we seem to have caught an evil attractor red-handed. Here’s a heading that’s getting steady traffic where it shouldn’t.

Evil attractors are usually the result of ambiguity

It’s usually quite simple to figure out why an item in your tree is an evil attractor. In almost all cases, it’s because the item is vague or ambiguous — a word or phrase that could mean different things to different people. Look at our example above. In the context of a consumer-review site, Personal is too general to be a good heading. It could mean products you wear, or carry, or use in the bathroom, or a number of things. So, when those participants come along clutching a task, and they see Personal, a few of them think 'That looks like it might be what I’m looking for', and they go that way.Individually, those choices may be defensible, but as an information architect, are you really going to group mobile phones with vacuum cleaners? The 'personal' link between them is tenuous at best.

Destroy evil attractors by being specific

Just as it’s easy to see why most attractors attract, it’s usually easy to fix them. Evil attractors trade in vagueness and ambiguity, so the obvious remedy is to make those headings more concrete and specific. In the consumer-site example, we looked at the actual content under the Personal heading. It turned out to be items like shavers, curling irons, and hair dryers. A quick discussion yielded Personal care as a promising replacement — one that should deter people looking for mobile phones and jewellery and the like.In the second round of tree testing, among the other changes we made to the tree, we replaced Personal with Personal Care. A few days later, the results confirmed our thinking. Our former evil attractor was no longer luring participants away from the correct answers:

Pic3

Testing once is good, testing twice is magic

This brings up a final point about tree testing (and about any kind of user testing, really): you need to iterate your testing —  once is not enough.The first round of testing shows you where your tree is doing well (yay!) and where it needs more work so you can make some thoughtful revisions. Be careful though. Even if the problems you found seem to have obvious solutions, you still need to make sure your revisions actually work for users, and don’t cause further problems. The good news is, it’s dead easy to run a second test, because it’s just a small revision of the first. You already have the tasks and all the other bits worked out, so it’s just a matter of making a copy in Treejack, pasting in your revised tree, and hooking up the correct answers. In an hour or two, you’re ready to pilot it again (to err is human, remember) and send it off to a fresh batch of participants.

Two possible outcomes await.

  • Your fixes are spot-on, the participants find the correct answers more frequently and easily, and your overall score climbs. You could have skipped this second test, but confirming that your changes worked is both good practice and a good feeling. It’s also something concrete to show your boss.
  • Some of your fixes didn’t work, or (given the tangled nature of IA work) they worked for the problems you saw in Round 1, but now they’ve caused more problems of their own. Bad news, for sure. But better that you uncover them now in the design phase (when it takes a few days to revise and re-test) instead of further down the track when the IA has been signed off and changes become painful.

Stay tuned for more on evil attractors

In Part 1, we’ve covered what evil attractors are and how to spot them at the answer end of your tree: that is, evil attractors that participants chose as their destination when performing tasks. Hopefully, a future version of Treejack will be able to highlight these attractors to make your analysis that much easier.

In Part 2, we’ll look at how to spot evil attractors in the intermediate levels of your tree, where they lure participants into a section of the site that you didn’t intend. These are harder to spot, but we’ll see if we can ferret them out.Let us know if you've caught any evil attractors red-handed in your projects.

Learn more
1 min read

Which comes first: card sorting or tree testing?

“Dear Optimal Workshop,I want to test the structure of a university website (well certain sections anyway). My gut instinct is that it's pretty 'broken'. Lots of sections feel like they're in the wrong place. I want to test my hypotheses before proposing a new structure. I'm definitely going to do some card sorting, and was planning a mixture of online and offline. My question is about when to bring in tree testing. Should I do this first to test the existing IA? Or is card sorting sufficient? I do intend to tree test my new proposed IA in order to validate it, but is it worth doing it upfront too?" — Matt

Dear Matt,

Ah, the classic chicken or the egg scenario: Which should come first — tree testing or card sorting?

It’s a question that many researchers often ask themselves, but I’m here to help clear the air!You should always use both methods when changing up your information architecture (IA) in order to capture the most information.

Tree testing and card sorting, when used together, can give you fantastic insight into the way your users interact with your site. First of all, I’ll run through some of the benefits of each testing method.

What is card sorting and why should I use it?

Card sorting is a great method to gauge the way in which your users organize the content on your site. It helps you figure out which things go together and which things don’t. There are two main types of card sorting: open and closed.

Closed card sorting involves providing participants with pre-defined categories into which they sort their cards. For example, you might be reorganizing the categories for your online clothing store for women. Your cards would have all the names of your products (e.g., “socks”, “skirts” and “singlets”) and you also provide the categories (e.g.,“outerwear”, “tops” and “bottoms”).

Open card sorting involves providing participants with cards and leaving them to organize the content in a way that makes sense to them. It’s the opposite to closed card sorting, in that participants dictate the categories themselves and also label them. This means you’d provide them with the cards only — no categories.

Card sorting, whether open or closed, is very user focused. It involves a lot of thought, input, and evaluation from each participant, helping you to form the structure of your new IA.

What is tree testing and why should I use it?

Tree testing is a fantastic way to determine how your users are navigating your site and how they’re finding information. Your site is organised into a tree structure, sorted into topics and subtopics, and participants are provided with some tasks that they need to perform. The results will show you how your participants performed those tasks, if they were successful or unsuccessful, and which route they took to complete the tasks. This data is extremely useful for creating a new and improved IA.

Tree testing is an activity that requires participants to seek information, which is quite the contrast to card sorting — an activity that requires participants to sort and organize information. Each activity requires users to behave in different ways, so each method will give its own valuable results.

Should you run a card or tree test first?

In this scenario, I’d recommend running a tree test first in order to find out how your existing IA currently performs. You said your gut instinct is telling you that your existing IA is pretty “broken”, but it’s good to have the data that proves this and shows you where your users get lost.

An initial tree test will give you a benchmark to work with — after all, how will you know your shiny, new IA is performing better if you don’t have any stats to compare it with? Your results from your first tree test will also show you which parts of your current IA are the biggest pain points and from there you can work on fixing them. Make sure you keep these tasks on hand — you’ll need them later!

Once your initial tree test is done, you can start your card sort, based on the results from your tree test. Here, I recommend conducting an open card sort so you can understand how your users organize the content in a way that makes sense to them. This will also show you the language your participants use to name categories, which will help you when you’re creating your new IA.

Finally, once your card sort is done you can conduct another tree test on your new, proposed IA. By using the same (or very similar) tasks from your initial tree test, you will be able to see that any changes in the results can be directly attributed to your new and improved IA.

Once your test has concluded, you can use this data to compare the performance from the tree test for your original information architecture — hopefully it is much better now!

Seeing is believing

Explore our tools and see how Optimal makes gathering insights simple, powerful, and impactful.