The definitive guide to creating effective tree tests


Information architecture is the organization of information (content) on your website. It's the information backbone of the site, typically embodied by a site map or menu. Your website's users rely on information architecture (how you label and organize your content) in order to use the website properly.

Tree testing is a method that tells you how easily users can find information on your website (or application, or any other product where information architecture may also be present). If users get lost, it tells you exactly where that is.

Tree testing can answer the following questions for you:

  • Do users understand labels as they're intended?
  • Is content split into groups that seem natural to users? Is it grouped logically for users?
  • Can users find the information that they want, easily and quickly? Are they looking for it somewhere else? What’s stopping them from finding the content they want?

With the use of UXtweak Tree Testing, you will be able to find answers to these questions and pinpoint any problems with your information architecture. Using the gathered insights on how users interact with your information structures, you can tweak the structures to aim for best performance.

How it works 

First, to prepare for a tree test study in Tree Testing, you will need two things - a tree and tasks for the users.

What is a tree?

Your tree is a text-only version of your website structure (similar to a site map or hierarchical navigation - menu).

Of course, you can test any other information structures or even your whole information architecture, which the main menu is only a part of. Imagine basically a map of your website/app/product that can be tested either whole or in smaller parts.

If we were doing a tree test for a regional bus company, our tree might look something like this:

In the UXtweak Tree Testing editor, it will look like this:

You can create your tree in the Tree Testing editor by making it from scratch, by importing it as a CSV file, or by pulling it automatically from your website.

What are the tasks?

In the tasks, we ask respondents to find the location of some piece of content or functionality within the tree.

During a tree test, the job of the respondents is to click through the tree and find the right solutions to the tasks they've been given. At first, the respondent can only see the top layer of the tree. The lower layers of the tree are revealed as the respondent opens category labels to see their children.

Continuing with our previous example of a tree test for a regional bus company, we can formulate a task such as this:

You'd like to buy a bus ticket tomorrow early in the morning. Where would you look for the time when the ticket sales office opens?

The respondents have to navigate to the right answer by clicking through all the layers of the tree. Of course, if they think they chose a wrong direction, they can always backtrack.

Once some respondents are finished with the tasks, we can view the results, which contain information such as:

  • Success rate - What percentage of people resolved the tasks correctly?
  • Directness rate - How many people found the answer without backtracking?
  • Time taken - How long did it take respondents to fulfill the tasks?
  • Paths - What paths did the respondents take? Where did they click first and what were their final answers?

When should I use it? 

The elegance of tree tests lies in their flexibility regardless of the size of your project or the stage of development. It doesn't matter if you're a fledgling startup just trying to design the first prototypes of your new world-changing app, or if you're a well-established company looking to tune up the user experience of a website almost as old as the internet itself.

In UXtweak Tree Testing, how you set up tree tests is entirely up to you. You can create a simple standard tree with minimum effort in just a few minutes. You can run tests with tens of tasks that cover the most important use cases, or you can do it with just a single task. You can run tests with only a few people, or you can collect results from hundreds of users. It all depends on your wants and needs.

Set your objectives 

So, you've decided to use tree tests to gain insight into your information architecture design. Congratulations! But before you start building your first tree test, you should decide on the goals of your study. The clearer the objectives for the testing, the clearer the insights that you are going to get. The main questions you should ask are:

  • What do you want to test and improve?
  • Who are the tree tests going to be aimed at and why?
  • When in the project's life cycle will the tree tests be used?

What do you want to test and improve? 

To prepare a tree test, you should first settle on what's going to be its intended scope. Maybe your entire website needs Tree Testing, because you want to tweak everything to perfection. Or maybe you just want to test a partial information structure, because you want to know whether users can find that new feature that you just added.

In the case of large scale tests, you might decide to split them into several smaller tree tests, focused on smaller parts of your information architecture. You can also do just one big tree test with broad coverage of the entire website, because you know something in there is just not working and you want to get a general idea of what works and what doesn't.

Running one tree test at a time might work out for you. But you can also test a few alternatives in parallel, to gather more data and to be able to make the most informed decisions possible. A/B tree test studies can help you find the knowledge you need to judge between design options and to find solutions that combine the best aspects of them all. In any case, you should have a clear idea of what you want to test and for what reason.

Who are the tree tests going to be aimed at and why? 

The entire point of tree tests is to help you make the structure of information in your designs more user centered. Knowing your audience, the reason why you’re improving your website or product, will help you focus your tree testing.

You can apply information from your customer personas, analytics data from real customers, customer support, surveys and other sources of feedback. It will help you formulate your tree test tasks, assign them priorities and decide on their order, as well as recruit test respondents. Pay attention to who your users are and what their goals are.

When in the project's life cycle will the tree tests be used? 

Aside from being familiar with both the objects of your tree tests (what is it that you'll be testing) and the subjects of your tree tests (who are you going to be testing it with), you must also think about the way of integrating tree tests into your project.

Thanks to their flexibility, tree tests can be added at any stage of development. They can be used to support informed decisions, but also to provide data about the value of your design changes, in handy visuals that you can show off in front of upper management or shareholders.

If you're just starting to design a menu of a website, you can use Tree Testing to validate and further tweak the results of your card sort (which you can - and should - use as the first step). Learn more about Why card sorting loves tree testing?

If you introduce Tree Testing into a running project to improve the product's usability, you can start the first iteration with a tree test, which will determine the degree of changes that will be needed and pinpoint the most problematic areas of the design. From that point onward, tree tests can be used in each iteration, as an important marker of the effects of changes on the usability. New iterative tree tests may also be added later on, as changes in the project arise.

Build your tree 

The tree in a tree test is the representation of an information structure that the respondents are going to be interacting with. The tree consists of labels, organized in a tree-like structure. Each labeled node represents either a group or content. Groups (categories or subcategories) wrap more labels inside them, allowing the tree to branch, which is why they're also known as parent nodes. Content labels are found at the end of these branches (which is why they're also known as leaf nodes).

Respondents click the group nodes to open them and view their contents. Only content (leaf) nodes can be selected by respondents as the answers to tasks. Learn more about Why you can't select parent nodes as correct answers?
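The difference between group nodes and content nodes can be sketched in a few lines of Python. This is only an illustration, not the format UXtweak uses internally: the `Node` class, the `selectable_answers` helper and the bus company labels are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """One label in the tree. A node with children is a group (parent) node."""
    label: str
    children: list[Node] = field(default_factory=list)

    @property
    def is_leaf(self) -> bool:
        # Content (leaf) nodes have no children.
        return not self.children


def selectable_answers(node: Node, path: tuple[str, ...] = ()) -> list[tuple[str, ...]]:
    """Return the full path to every leaf, since only leaves can be chosen as answers."""
    here = path + (node.label,)
    if node.is_leaf:
        return [here]
    answers = []
    for child in node.children:
        answers.extend(selectable_answers(child, here))
    return answers


# Hypothetical tree for a regional bus company.
tree = Node("Home", [
    Node("Tickets", [Node("Prices"), Node("Ticket office opening hours")]),
    Node("Bus lines", [Node("Timetables")]),
])

for answer in selectable_answers(tree):
    print(" > ".join(answer))
```

Listing the leaves like this is also a quick way to check that every task you plan to write has at least one selectable answer in the tree.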

Trees are defined purely by text labels organized in a tree-like structure, so they're easy to create and are ideal for testing the information structure without the need of it being implemented on a website.

Use the Tree Testing editor for creating completely new information structures or for adjusting existing ones. When you already have a previous copy of the tree that you'd like to reuse and modify, you can import it as a base for a new one.

If you're using the same tree tests as benchmarks in each iteration, with the only adjustments being to the tree, you can clone the entire tree test and then change only what needs to be changed. A cloned Tree Testing study will include the same tree, tasks and questionnaires.

When designing your tree, think about how the respondents will be selecting from the labels to give their answers to the tasks. Since a tree test doesn't show the content of the pages that would be shown to users as they're clicking the equivalent labels in an actual menu, the respondents have only the text of the labels to go on.

If your menu contains a content node that represents a variety of content, the node should be turned into a group node, listing the individual pieces of content within it as its children in the tree. For example, an About page with the information on the company's services, history, board, chief executives and location should be turned into a group. The tree doesn't need to be a replica of your menu. It's more important that it reflects your information architecture, which can but doesn't always have to look the same.

Write your tasks 

The second part of your tree test (which you need to prepare along with the tree) is the tasks. Tasks in your tree tests should represent the user stories on your website or product. Writing your tasks properly is essential in order to have the respondents behave as naturally as possible. You want the respondents to interact with menus and other information structures just like they would in a real life situation.

Writing tasks that cover your objectives 

Consider the objectives that you decided on for the testing and use them as a basis for formulating your tasks. Let's say that you have an internet store menu and one of your objectives is to find out whether your customers can find a refund form. The task you formulate for this objective could say: "One of the items in your order was damaged. Find where you could resolve this issue." The results for this task will then answer questions like:

  • How many people have successfully found the refund form?
  • How directly have the people chosen their answers?
  • How long did it take them to find the refund form?
  • What paths in the menu did the people take to find the refund form?

If you write your tasks so that they thoroughly cover the different areas that you want to improve, the data that you receive from the Tree Testing will also show you to what extent you've succeeded in meeting your objectives. If only 50% of the respondents found the refund form and the other 50% looked for it in other parts of the menu first, these metrics strongly suggest that the refund form should be relocated.

Writing tasks as scenarios 

You want your tasks to gently nudge your respondents towards acting as they would in a real situation that might occur on your website. Reading your tasks should help people get into the right mindset. A good way to do this is to present the tasks as scenarios, using simple and informal language to set up the situation and prompt the respondents to seek a solution.

For example, instead of

Select where you think you'd find the post office business hours.

you'd write

You'd like to send a package by mail, but you don't know when the post office is open. Where would you look for the information that would help you?

With this, the task provides the respondents with context and meaning to focus their attention and to make them process the information more deeply, instead of giving them the exact directions for how to solve the task without any actual rhyme or reason behind their actions.

Writing tasks without giving away the solutions 

When given a task, your respondents might (and will) catch onto any hints that the wording of the tasks might provide them to find the right answer. If you use the same language in both the tree and the tasks, the people might just match the text without even thinking about the task more deeply. Don't use the exact names of your labels when describing the task scenarios.

There is another reason why you should try to avoid using the same language in tasks and labels while trying to simulate real life situations in your tasks. For any given situation, there is no telling what is on the minds of the users when approaching the same problem. Different users can be looking for the exact same piece of information, yet when describing it, they might inadvertently use very different language. Use different language than your tree.

For example:

Let's say that you're designing a website for an indie music distribution platform. A function you call "Explore" is supposed to help the users find new interesting musicians and bands when they themselves don't yet know what exactly they're looking for. However, when put into words, the tasks your target users have on their minds would be very different and probably not involve the word "Explore" at all, despite the fact that they could all use this function:

  • Can I search for some bands that I haven't even heard of yet?
  • What kinds of other music is there? I'm in the mood for something different.
  • How do I look up some random music?

If you want to know whether people can arrive at "Explore" as the solution to a problem through their own understanding of what the problem is, your task shouldn't plant the word "Explore" in their minds in the first place.
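A quick way to catch accidental giveaways is to compare the wording of each task against the labels in the tree. The sketch below is a hypothetical helper, not a UXtweak feature: `leaked_label_words` and the example labels are made up for illustration, and it only catches exact word matches, not synonyms or inflected forms.

```python
import re


def leaked_label_words(task: str, labels: list[str], min_length: int = 4) -> set[str]:
    """Return words from the tree labels that also appear in the task text."""
    task_words = set(re.findall(r"[a-z]+", task.lower()))
    label_words = {
        word
        for label in labels
        for word in re.findall(r"[a-z]+", label.lower())
        if len(word) >= min_length  # ignore short words such as "my" or "the"
    }
    return task_words & label_words


labels = ["Explore", "My library", "Upload"]  # hypothetical tree labels

# A task that plants the label in the respondent's mind.
print(leaked_label_words("Explore the catalogue to find a new band.", labels))  # {'explore'}

# A task phrased the way a user might think about the problem.
print(leaked_label_words("How do I look up some random music?", labels))  # set()
```

An empty result doesn't guarantee a well-written task, but a non-empty one is a good reason to reword it.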

Limiting the number of tasks 

When preparing a tree test, one of the decisions you have to make is how many tasks you want to test. In tree tests, the number of your tasks determines the size of your test. You can have only one task, or you can have more. UXtweak Tree Testing doesn't limit your number of tasks per test. However, while it is correct to want to include tasks for all of the objectives of your testing, as the number of tasks grows, you should consider splitting the tasks between multiple tree tests instead.

Our recommended maximum number of tasks is 10. There is a practical reason not to let your tree tests grow too big. If your respondents complete too many tasks in your tree, they will start solving later tasks with more skill than normal, thanks to getting accustomed to the tree's structure.

Even with fewer than 10 tasks, it might be a good approach to enable randomization of the order of tasks, so the results aren't biased in favor of the later tasks. For example, with more tasks, it becomes more likely that the respondent will remember a solution to their current task from the time when they were trying to solve a different one.
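To illustrate what randomizing the order of tasks means, here is a minimal sketch. It doesn't show how UXtweak implements the option. The `task_order` function and the respondent identifiers are assumptions made for this example.

```python
import random


def task_order(tasks: list[str], respondent_id: str) -> list[str]:
    """Return the tasks in a random order that stays stable for one respondent."""
    rng = random.Random(respondent_id)  # seeded, so the same respondent always gets the same order
    return rng.sample(tasks, k=len(tasks))


tasks = ["Find the ticket office hours", "Find a timetable", "Find the refund form"]

for respondent_id in ("respondent-1", "respondent-2", "respondent-3"):
    print(respondent_id, task_order(tasks, respondent_id))
```

Because each respondent sees the tasks in a different order, whatever they learn about the tree along the way is spread across all tasks instead of always benefiting the last ones.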

A high number of tasks may also lead to people abandoning the study without completing it. Depending on how you recruit your respondents, consider using their time efficiently, so they don't quit on you because you got too greedy.

Prepare your questionnaire 

Apart from the tasks themselves, you may also have other questions for your respondents. When analyzing the results of a tree test study, additional information about the users, such as their demographics, their stances, user experiences (or experiences in general) can be useful. You can later use such data for sorting respondents into groups and filtering out anyone who's irrelevant. Or, if you only want to admit a certain kind of people to take part in the study, you can set up a screening question and filter your target group in advance. This is what the questionnaires are for.

In UXtweak Tree Testing, you can create questionnaires to be used before the study, after the study or even immediately after each task. The answers to the questions can be either voluntary or required. You can use several types of questions in your questionnaire:

  • Single line text
  • Multi-line text
  • Radio button (single answer multi-choice)
  • Checkbox select (multiple answer multi-choice)
  • Dropdown select

When asking the respondent to evaluate something (such as their own computer skills, or how content they are with navigating your website), use multiple choice questions, such as:

  • 5 - Great
  • 4 - Good
  • 3 - Average
  • 2 - Bad
  • 1 - Awful

How to use a pre-study questionnaire? 

The questionnaire before the Tree Testing is usually used to get information about the respondent, such as their age, occupation, level of ICT (information-communication technology) skills, etc. You can also ask about experiences with your company, website or even particular life situations (if you're testing an application for a car renting company, you might ask about their experiences with renting cars).

If you want the respondents to come into the tasks partially uninformed about what exactly it is they're going to be doing, you may want to move the more detailed questions about the domain ("When renting a car online, was the brand of the car important to you?") to the questionnaire at the end of the Tree Testing.

How to use a post-study questionnaire? 

The questionnaire after the Tree Testing is a good place for collecting additional feedback. Give your respondents enough space to express themselves and you might get additional feedback that you will find useful during qualitative Tree Testing analysis. If the respondents have more to say than what you were originally asking, you may even get feedback that, while totally unrelated to the main objectives of the test, is still relevant or even enlightening.

How to use per-task questions? 

When you want to gather some detailed feedback about any specific task, it is usually better to ask about it just after said task is finished (unless the questions could also influence the respondent's natural behavior during the following tasks). Respondents tend to quickly forget the specifics of the tasks and of their own user experience. To avoid losing vital information and various mixups, the sooner you ask for details, the better.

Recruit your respondents 

After you have launched your tree test, it is time to share it with your respondents. UXtweak Tree Testing gives you the option to easily share your tree tests on social networks (Facebook, Twitter, LinkedIn), or you can share the study link to your tree test in any way you like.

Of course, first you need to know how exactly you're going to recruit those respondents. The quality of your results will depend greatly on the quality of your respondents, so the recruitment process is not to be neglected. There are a few things to keep in mind while recruiting:

  • who are the people you want to recruit,
  • how many people you want to recruit,
  • what's the information that you want to share with your respondents.

Recruiting the right people 

In the fourth chapter of this guide, we talked about how important it is to know your users and who they are. When recruiting respondents for tree tests, try to match their demographics with the demographics of your intended users. Depending on what exactly those demographics are, there are several ways to go about achieving this.

If you already have an existing audience who are already using your product or website, you can spread out invitations to your newsletter subscribers, or to people who are in your customer mailing lists.

If you don't have any user lists yet, or if you just want to test an information structure (like a part of a menu) that even first-timers should be able to use, a good way to find respondents would be through social media. You can survey your respondents for details about themselves using the Tree Testing questionnaires and leave anyone who doesn't meet your target demographics out of the results.

If your website already has access to your target audience, you can use the perfect way of recruiting respondents. This one is a UXtweak original - UXtweak Recruiting Widget for recruiting respondents directly from your website.

Recruiting enough people 

Regarding the ideal number of respondents, as with any other aspect of tree tests, one size doesn't fit all. One design might be more suited for a public tree test with a general audience, where generally the more respondents you have, the better. Another design, in an earlier design stage, might be a better fit for a small internal tree test with just a couple of respondents, which could still uncover most of its glaring UX issues while saving the company time and money. Another useful approach is to combine smaller tree tests (which can be done more often) with larger scale ones (more demanding, but providing more reliable quantitative data) to get the benefits of both.

There is, however, a rule of thumb regarding the number of respondents. If you want to rely on the statistical significance of the collected data and on numerical metrics like success rate and directness, we recommend aiming for between 40 and 60 respondents, with 30 as the bare minimum.
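To get a feeling for why the number of respondents matters, you can look at how uncertain a measured success rate is for different group sizes. The sketch below uses the Wilson score interval, a standard statistical technique that is our choice for this illustration and not something the tool or this guide prescribes. The 70% success rate and the group sizes are hypothetical.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95% confidence)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


# The same observed success rate (70%) with different numbers of respondents.
for successes, n in ((7, 10), (21, 30), (42, 60)):
    low, high = wilson_interval(successes, n)
    print(f"n={n}: plausible success rate between {low:.0%} and {high:.0%}")
```

The interval narrows as the number of respondents grows, which is why small studies are better suited to spotting glaring problems than to comparing exact percentages.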

Don't forget that unless you provide an incentive (be it a financial reward, a competition or some other benefit), no one is obliged to feel motivated to spend their time on your testing. This goes for your own users and customers, and even more so for random people on social media. Without an incentive, your invite needs to reach far more people than the minimum number of respondents you require.

Also, if you plan to conduct the same Tree Testing in each iteration as a benchmark of improvement, the number of test users between tests should always be similar (or growing), or else the results might not be comparable. The demographic representation within the selection of respondents should also remain the same. For example, if one tree test has a significantly higher percentage of men than women compared to previous tree tests, the test results might be affected.

Informing and engaging your respondents 

Your respondents should know in advance what they're getting themselves into. Inform them about what will be required of them and how much of their time it's going to take. Don't forget to pepper the test instructions and your invitation with phrases about how important this test is for you and how it will help you to make your website better. Doing this will help you engage the respondents so they feel comfortable while doing the tree test and so they don't leave before finishing their tasks.

Using UXtweak Recruiting Widget 

Good respondents can be difficult to find and incentives can dig deep into your pocket. With the UXtweak Recruiting Widget, recruiting becomes cheap and simple. Recruiting Widget turns visitors into respondents. Does your website already have existing users? Would you like to do tree testing for your e-commerce website with real customers? Then add the Recruiting Widget script to your website and let Recruiter handle the recruiting for you.

"Would you like to help us improve our website and get a nice reward for just a few minutes of your time?"

This question (or something else like it) is what visitors are asked when they come to your website and see the Recruiting Widget. Rewards in the form of coupons can be imported into UXtweak and automatically given out to respondents after they complete the study. This direct recruitment between you and your testers cuts out any middlemen, making the process straightforward and beneficial to both sides. Of course, you can also forgo a reward. The Recruiting Widget is fully customizable, including its looks, messages and when and where on the website it appears. You can have the recruiter appear only on certain pages, have it appear immediately or with some delay (after time passes, after scrolling down, etc.).

Interpret your results 

You can start viewing the results of your tree test from the moment that you launch the test and the first respondents start coming in. The data is presented in such a way as to provide you with quick insights, with ways for digging deeper into the data as you deem necessary. You can use the results overview to quickly pinpoint problems, or you can use a deeper data analysis approach for a more complex user experience evaluation.

Checking the big picture in the results Overview 

In the Overview, you can view the summarized information about the progress of the tree test (number of respondents and last activity) and the overall current aggregated results (success and directness rate, time taken). The success and directness scores are also visualized and comparable between tasks in the Tasks chart.

  • The success rate means how many of the people successfully found the right answer to the task.
  • The directness rate means how many of the people fulfilled the tasks without getting lost in other branches of the tree and backtracking.

Checking and filtering Respondents 

The Respondents list shows you a table of all the Tree Testing study respondents. For each respondent, we have a basic summary - the time it took them to complete the study and the number of successfully completed and skipped tasks.

You can also view the details to see more data on the respondents themselves - their device, browser, resolution and location, their concrete questionnaire and task answers and any comments that they may have left behind.

Before you start working with the collected data, you may want to clean it up by excluding the respondents who don't provide useful data or who haven't met your respondent criteria. They may have skipped too many tasks or taken too much time to complete the study, or their odd behaviour may simply mark them as outliers.

The respondents' questionnaire answers could also point to them not being from the right demographics (see chapter 7 on how to use questionnaires). Respondents who abandoned the study before completing it are excluded by default, but you can decide to re-include them if you find the data from the tasks they did complete to be useful.
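If you export your respondent data and want to apply the same cleaning rules consistently, the decision can be written down as a small function. This is a sketch under assumptions: the `Respondent` fields, the `keep` function and both thresholds are hypothetical, and you should pick limits that make sense for your own study.

```python
from dataclasses import dataclass


@dataclass
class Respondent:
    respondent_id: str
    minutes_taken: float
    tasks_skipped: int
    completed_study: bool


def keep(r: Respondent, total_tasks: int, max_minutes: float = 30.0,
         max_skip_share: float = 0.5) -> bool:
    """Decide whether a respondent stays in the analysis (thresholds are illustrative)."""
    if not r.completed_study:
        return False  # abandoned studies are left out, as in the default behaviour
    if r.minutes_taken > max_minutes:
        return False  # took far too long to be a natural session
    return r.tasks_skipped / total_tasks <= max_skip_share


respondents = [
    Respondent("r1", 6.5, 0, True),
    Respondent("r2", 4.0, 7, True),    # skipped most of the 8 tasks
    Respondent("r3", 95.0, 1, True),   # left the study open for over an hour
    Respondent("r4", 2.0, 0, False),   # abandoned the study
]

cleaned = [r for r in respondents if keep(r, total_tasks=8)]
print([r.respondent_id for r in cleaned])  # ['r1']
```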

Working with Questionnaire results 

Questionnaires provide you with additional information about your respondents.

While the study is still running, you can use a tabled overview of questionnaire answers to keep watch on the demographics-oriented questions so the respondents reflect your target demographic and you can adjust the recruiting strategy if such need arises.

After the study concludes, depending on your questionnaire, you can use the filters to analyze your respondents in groups based on their demographics, experience, personality types, etc.

Detecting UX problems in Task Statistics 

The Task Statistics view is a good place to start analyzing study data. For each task, you can look at the data visualizations to see if the respondents were having any trouble or not.

Example of a high-scoring task 

The overall score of this task is outstanding. A plain look at the results from this task tells us why - 95.2% of people found a right answer. This task is therefore a good representation of when labels on a website do their job well.

Notice, however, that almost the whole pie chart is light green: 95.2% of people chose the correct answer, but not without either looking back or exploring other parts of the tree first. Even though the overall score for this task is outstanding, it would be appropriate to explore the results further and find what caused this, as it would seem something in the tree was confusing to the users. You could go to the Paths view to look at the respondents' various paths, or you could track the flow of respondents to wrong destinations visually, in the PieTree.

Example of a low-scoring task 

Just a quick look reveals that only 14.3% of respondents selected the right answer, one third of whom have taken a different path at first (and so achieved only Indirect Success).

Furthermore, the directness rate tells us that 61.9% of respondents chose their answer without any backtracking, including the 52.4% of respondents who arrived at a wrong destination and chose it as their answer. This means that respondents were actually feeling more sure during this task than in the previous task with the good score.

81% of people picking a wrong answer is an obvious pointer that something is wrong. To examine these results further, we can take a look at the First Clicks and the reached Destinations.
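To see how these percentages fit together, it can help to redo the arithmetic. The text doesn't state how many respondents took part. The counts below are an assumption: they add up to 21 respondents, the smallest group that reproduces the quoted figures.

```python
# Hypothetical outcome counts, chosen to reproduce the percentages quoted above.
outcomes = {
    "direct success": 2,
    "indirect success": 1,
    "direct failure": 11,
    "indirect failure": 6,
    "direct skip": 0,
    "indirect skip": 1,
}
total = sum(outcomes.values())


def share(*names: str) -> float:
    """Percentage of all respondents who fall into the named outcomes."""
    return 100 * sum(outcomes[name] for name in names) / total


success_rate = share("direct success", "indirect success")
failure_rate = share("direct failure", "indirect failure")
directness_rate = share("direct success", "direct failure", "direct skip")

print(f"respondents: {total}")                          # 21
print(f"success rate: {success_rate:.1f}%")             # 14.3%
print(f"failure rate: {failure_rate:.1f}%")             # 81.0%
print(f"directness rate: {directness_rate:.1f}%")       # 61.9%
print(f"direct failures: {share('direct failure'):.1f}%")  # 52.4%
```

Note that the directness rate counts every respondent who didn't backtrack, whether or not their answer was right, which is how a task can be both very direct and very unsuccessful.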

Exploring tree traversal in PieTree

PieTree is a great visual tool for studying how respondents behaved in the tree. Nodes in the tree that are in green circles and connected by green lines represent correct paths in the tree. If they're big and the pies inside them contain a lot of green (and yellow at the end of them, meaning the respondents nominated the right answers), it's a sign that respondents fared well with completing the task.

Conversely, if these green paths are small or not present at all, and if there are lots of larger nodes colored red (incorrect path), blue (backtracking) or grey (task skip), it would seem that the task gave your respondents some trouble. Yellow found outside green paths means that respondents nominated incorrect answers.

Looking at first steps of the respondents via First Click 

The first click often determines the likelihood of whether your users can find what they're looking for. In the First Click view, you will get a quick insight into the first clicks of your respondents.

A wrong first click can either lead the user to a wrong destination altogether, or the confused user will have to backtrack to the very beginning in order to start searching all over again. This is especially true for wide spanning information architectures, like menus of large websites with a lot of content spread into many categories, although it does affect smaller trees as well.

Example of a high-scoring task 

Continuing the evaluation of the high-scoring task from above, we can see that the first move of the overwhelming majority of respondents was to click the incorrect category label Stations. This shows that, at least for this task, Bus lines might be a good label, but Stations is the immediate choice that respondents made.

Example of a low-scoring task 

Only 10% of people first clicked on the intended label Tickets. The rest were distributed among the other top level categories, the Station category being the most prominent. This data makes it apparent that the menu needs to be redesigned from the top down.

When renaming the labels on the top level, the first click data provides us with useful hints about what labels the respondents associated with the tasks the most. One of the labels having a significantly higher total of first clicks points to the fact that this label would be more suitable to house the subject of this task.
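If you work with exported data, tallying first clicks takes only a few lines. The click data below is hypothetical and the "About us" label is made up for this example. Only the 10% share for the intended label mirrors the low-scoring task described above.

```python
from collections import Counter

# Hypothetical first clicks of ten respondents for one task.
first_clicks = ["Stations", "Tickets", "Stations", "Bus lines", "Stations",
                "Stations", "About us", "Stations", "Bus lines", "Stations"]
intended = "Tickets"

counts = Counter(first_clicks)
for label, clicks in counts.most_common():
    marker = " (intended)" if label == intended else ""
    print(f"{label}: {clicks / len(first_clicks):.0%}{marker}")
```

The label at the top of this list is the one respondents most strongly associate with the task, which makes it a natural candidate when you decide where the content should live.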

Analyzing the roads walked by the respondents in Paths 

When the high-level plane of statistics and aggregate visuals just doesn't cut it anymore, the pie charts are flashing red, and you'd just like to know what exactly were those unfortunate souls doing to your poor innocent tree, switch to the Paths view.

In Paths, you will find a table of all the respondents' tree traversals, with the ability to filter them by type (direct success, indirect success, direct failure, indirect failure, direct skip, indirect skip). With this, you can dig deeper even into the successful tasks, filter out the failures and analyze which parts of your tree might have been misleading to the respondents.

The same applies to indirect successes. You can take a look at how far the respondents went in another direction before backtracking, or which parts of the tree they were the most hesitant about.
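The six path types can be derived from two facts about each traversal: whether the respondent ever left a straight downward path, and how the task ended. The sketch below shows one way to express that. It assumes each path is recorded as the list of nodes clicked in order, and the `classify` function, the `parent` map and the labels are all hypothetical.

```python
def classify(path: list[str], parent: dict[str, str], correct: set[str],
             skipped: bool = False) -> str:
    """Classify one traversal as a direct or indirect success, failure or skip.

    The traversal is direct when every click goes one level deeper, i.e. each
    clicked node is a child of the previously clicked node (and the first
    click is a top-level node).
    """
    previous = "root"
    direct = True
    for node in path:
        if parent[node] != previous:
            direct = False  # the respondent went back up or switched branches
        previous = node
    if skipped:
        result = "skip"
    elif path and path[-1] in correct:
        result = "success"
    else:
        result = "failure"
    return f"{'direct' if direct else 'indirect'} {result}"


# Hypothetical tree, written as a child -> parent map.
parent = {
    "Tickets": "root", "Stations": "root",
    "Opening hours": "Tickets", "Prices": "Tickets", "Station map": "Stations",
}
correct = {"Opening hours"}

print(classify(["Tickets", "Opening hours"], parent, correct))              # direct success
print(classify(["Stations", "Tickets", "Opening hours"], parent, correct))  # indirect success
print(classify(["Stations", "Station map"], parent, correct))               # direct failure
print(classify(["Stations", "Tickets"], parent, correct, skipped=True))     # indirect skip
```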

Analyzing the reached Destinations 

The Destinations view provides you with the data on all reached destinations from all tasks in a single table. Use it as an overview of user activity during the study, which shows you which tasks were the most problematic as well as where in the tree the respondents sought correct answers.

Your tree is presented down the vertical axis, and the task number is presented on the horizontal axis.
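The layout of such a table can be sketched as follows, with tree nodes as rows and tasks as columns. The answers and labels are hypothetical and the output is a plain-text approximation, not a reproduction of the Destinations view.

```python
from collections import Counter

# Hypothetical final answers: (task number, chosen leaf node).
answers = [(1, "Opening hours"), (1, "Opening hours"), (1, "Station map"),
           (2, "Prices"), (2, "Station map"), (2, "Station map")]
leaves = ["Opening hours", "Prices", "Station map"]
tasks = sorted({task for task, _ in answers})

counts = Counter(answers)  # a missing (task, leaf) pair counts as 0

# Tree nodes go down the vertical axis, task numbers across the horizontal axis.
print("node".ljust(16) + "".join(f"task {t}".rjust(8) for t in tasks))
for leaf in leaves:
    row = "".join(str(counts[(t, leaf)]).rjust(8) for t in tasks)
    print(leaf.ljust(16) + row)
```

A node that collects many answers for a task it wasn't meant to solve stands out immediately in this layout.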