How do I interpret the Task Statistics in Tree Testing?

In this help article, we'll discuss how to use the Task Statistics analysis to:

  • View Task Statistics per task
  • Evaluate Time Taken
  • Interpret Success and Directness scores and gauge their statistical accuracy

View Task Statistics per task 

  • This analysis displays Task Statistics one task at a time.
  • Use the task selection dropdown list in the top section of the screen to switch to the task you want to analyze.
  • Alternatively, use the arrow buttons to the left and right of the dropdown list to move to the previous/next task.

For each task, Task Statistics contain:

  • The task's text and the correct answers
  • The task's Overall score, calculated as a rounded 3-to-1 weighted average of the squares of the Success and Directness scores. The score is colored according to how acceptable it is considered to be: green (great score - 8 and higher), light green (good score with some caveats - 6 or 7), orange (bad score - 4 or 5) or red (terrible score - 3 and lower)
  • A pie chart of task results
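The color bands of the Overall score can be sketched as a simple lookup. This is only an illustration of the thresholds listed above; the function name `score_band` is hypothetical, and the sketch assumes the score has already been calculated and rounded to a whole number.

```python
def score_band(overall_score: int) -> str:
    """Map an already rounded Overall score to its color band.

    The thresholds come from the list above; the function itself is a
    hypothetical helper, not part of the tool.
    """
    if overall_score >= 8:
        return "green"        # great score
    if overall_score >= 6:
        return "light green"  # good score with some caveats
    if overall_score >= 4:
        return "orange"       # bad score
    return "red"              # terrible score


for score in (9, 6, 5, 2):
    print(score, score_band(score))
```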

The pie chart and its legend display the following types of results:

  • Direct success - the respondent went directly for the correct answer
  • Indirect success - the respondent took a different path and backtracked before going for the correct answer
  • Direct failure - the respondent went directly for a wrong answer
  • Indirect failure - the respondent took a different path and backtracked before going for a wrong answer
  • Direct skip - the respondent skipped the task immediately, without clicking anywhere in the tree first
  • Indirect skip - the respondent explored the tree before choosing to skip the task
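If you export raw results and want to reproduce the six categories yourself, the rules above could be sketched roughly as follows. The `TaskResult` record, its field names and the outcome labels are assumptions made for this example and do not describe the tool's actual export format.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class TaskResult:
    # All fields are hypothetical; adapt them to your own data.
    outcome: str       # "correct", "wrong" or "skipped"
    backtracked: bool  # True if the respondent went back up the tree before answering
    clicks: int        # number of clicks in the tree before answering or skipping


def classify(result: TaskResult) -> str:
    """Return the pie chart category for a single task result."""
    if result.outcome == "skipped":
        # A skip is direct only if the respondent never clicked in the tree.
        return "Direct skip" if result.clicks == 0 else "Indirect skip"
    kind = "success" if result.outcome == "correct" else "failure"
    path = "Indirect" if result.backtracked else "Direct"
    return f"{path} {kind}"


results = [
    TaskResult("correct", False, 3),
    TaskResult("correct", True, 6),
    TaskResult("wrong", False, 2),
    TaskResult("skipped", False, 0),
    TaskResult("skipped", False, 4),
]
counts = Counter(classify(r) for r in results)
print(counts)
```

The resulting counts correspond to the slices of the pie chart for one task.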

Evaluate Time Taken 

The bottom section of the Task Statistics contains the Time Taken chart and the Success and Directness scores.

  • Time Taken - The time from when the respondent sees the task until they submit their answer. If you selected the option "Prompt respondents to start each task with a button", the time is measured from the moment the button is pressed.
  • Time Taken is represented by a candlestick chart. The bar spans from the lower quartile to the upper quartile, and the line inside it marks the median value.
  • T-shaped 'wicks' in the candlestick chart mark the longest and shortest times that it took to complete the task.
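The five values shown in the Time Taken chart (shortest time, lower quartile, median, upper quartile, longest time) can be approximated from a list of completion times. This is a minimal sketch: the function name `time_summary` is hypothetical, and the quartile method used here (Python's "inclusive" method) is an assumption, so the values may differ slightly from those in the chart.

```python
import statistics


def time_summary(times_seconds):
    """Return the five values drawn in a candlestick (box) chart.

    Requires at least two times. The quartile method is an assumption;
    other methods give slightly different quartiles on small samples.
    """
    ordered = sorted(times_seconds)
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return {
        "min": ordered[0],    # lower wick
        "q1": q1,             # bottom of the bar
        "median": median,     # line inside the bar
        "q3": q3,             # top of the bar
        "max": ordered[-1],   # upper wick
    }


print(time_summary([50, 10, 30, 20, 40]))
```

A wide bar or long wicks suggest that respondents differed a lot in how long the task took them.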

Interpret Success and Directness scores and gauge their statistical accuracy 

  • Success is the percentage of answers that were correct, calculated for all answers from all respondents. It shows how well the respondents handled the task.
  • Directness is the percentage of answers that were direct (without backtracking), calculated for all answers from all respondents. It shows how sure the respondents were of their answers.
  • Aside from the scores themselves, the charts also contain information about the statistical accuracy of the scores:

The upper and lower limits represent the bounds of the confidence interval - an interval which tells you how accurately the metric represents all of your respondents. The narrower this interval is, the more sure we can be that the Success or Directness score is statistically accurate. (The confidence interval is calculated using the adjusted Wald method with a confidence level of 95%. Adjusted Wald was chosen because it’s the best method to use when there are 150 or fewer result samples. For more information, try out this calculator and read the text below)

Naturally, to achieve results with more statistical accuracy, you need to recruit more respondents to your study.
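To see how the interval narrows as the number of respondents grows, here is a minimal sketch of the adjusted Wald interval in its standard textbook form. The function name and parameters are hypothetical, and the tool's own implementation may differ in details such as rounding.

```python
import math
from statistics import NormalDist


def adjusted_wald_interval(successes, total, confidence=0.95):
    """Adjusted Wald confidence interval for a proportion.

    Returns (lower, upper) as fractions between 0 and 1.
    """
    if total <= 0 or not 0 <= successes <= total:
        raise ValueError("need total > 0 and 0 <= successes <= total")
    # Two-sided critical value, about 1.96 for a 95% confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # The adjustment adds z^2 / 2 successes and z^2 / 2 failures.
    n_adj = total + z ** 2
    p_adj = (successes + z ** 2 / 2) / n_adj
    margin = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    # Clamp the bounds so they stay within 0% and 100%.
    return max(0.0, p_adj - margin), min(1.0, p_adj + margin)


# Same 70% success rate, different numbers of respondents.
for correct, answers in ((7, 10), (70, 100)):
    low, high = adjusted_wald_interval(correct, answers)
    print(f"{correct}/{answers}: {low:.1%} to {high:.1%}")
```

With the same 70% score, the interval for 100 answers is much narrower than the one for 10 answers, which is why recruiting more respondents improves statistical accuracy.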