
Usability testing: step-by-step instructions using Yelp as an example

Most owners create sites either to promote their products to the market or to profit directly by monetizing the site itself: advertisements are placed on its pages, and the owner earns revenue from users' views.

That is why administrators care about high traffic: the larger it is, the greater the advertising revenue. Steady traffic, in turn, depends on a high level of page usability. So what is usability, and how can you make your site as attractive as possible to users and potential customers?

Introduction to the concept

"Usability" is essentially a measure of the degree to which a site visitor is comfortable interacting with its interface. Methods that allow you to improve the site at the stage of its development are also called this term.

Suppose a user opens a web page. What does he expect to see? First of all, a clear menu, and with it a clear structure and readable content. Yet many resources today are a jumble of hyperlinks, pictures and buttons: all the necessary elements seem to be there, but figuring out how to use them takes time.

Hardly any client wants to waste precious time on this. Usability is a serious matter: research by Forrester Research showed that almost half of shoppers leave an online store if they cannot quickly find the product they need, and after such an experience about half refuse to return to the site at all.

Research related to usability

Sociologists also note that people today are impatient, especially Internet users. Another study, by the Nielsen Norman Group, showed that the average user spends only about half a minute on a page, and in most cases does not even scan it to the end.

Jakob Nielsen believes the reason for this behavior is the sheer amount of genuinely useless information: people constantly have to filter a wide stream of information to pick out only the fragments that really interest them.

Of course, studying a site in detail can take so much time that none is left for anything else. Customers will not wait for a slow-loading site, and they will not use one with a convoluted interface. This is where usability comes in: it helps to deal with many of these problems.

Many experts in the field say the same. A clear parallel can be drawn between a website that is simple and understandable for any user and a well-lit storefront; usability is the "amount of light" in this case. The developer must understand that he has only a few seconds to interest a visitor and subtly make it clear that his site is better than the others.

This is achieved through a well-thought-out interface. Within a short time it should make clear to the user, on an intuitive level, what to do next. A person should become interested in the site and feel that he will find what he needs here; otherwise, he will leave.

Testing

This process is in any case a mandatory step in usability work. Its essence is straightforward: several respondents solve a set of predefined tasks using only a prototype of the system, while a specialist nearby records all their actions and words. When data collection is finished, the results are analyzed and, if necessary, adjustments are made to the project.

Usability testing is done to discover gaps and problems. They may well surface while respondents work with the site, and there is no need to fear this; the main thing is to make the right adjustments.

What does site analysis include?

A usability analysis provides a comprehensive picture of the entire site. It usually includes design conclusions, test results, a comparison of prototypes of the analyzed pages, and recommendations that help eliminate the problems found.

The design opinion assesses the usability of the interface itself, that is, how convenient and comfortable working with the website will be for the user. Many services specialize in such assessments and are always ready to offer their help.

Differences in resource levels

With equal advertising costs, the site with the higher level of usability will attract more potential customers, and they will be more loyal to the company. This matters, because such customers will visit the site and make purchases on an ongoing basis.

It also becomes possible to unload company managers: with a high level of usability, clients can find answers to their questions themselves on the resource's pages. Consequently, management can cut the costs of maintaining those managers, which has a positive impact on the business.

What is studied to assess usability?

Usability is a whole complex of factors, including the following criteria:

  1. Efficiency of use. How quickly can a client resolve his issues once he has learned to use the resource?
  2. Ease of learning. How quickly will a first-time visitor learn to use the interface?
  3. Memorability. Will the client be able to reproduce the once-learned usage pattern when he returns to the pages?
  4. Errors. How often does a visitor make mistakes on the site? How serious are they, and how can they be corrected?
  5. Satisfaction. Did the user like the resource, and if so, how much?

Conclusions

Statistics show that out of 10 users, more than half (6, to be precise) cannot immediately find the information they need online. The reason lies in poorly constructed usability. The Internet was created for exchanging information, but many sites do not follow the usual rules of such exchange, so the search for useful information takes far longer than it should, and the efficiency of the work drops proportionally.

Testing is recommended for all site owners. It benefits them first of all because correctly tuned usability can grow the site's regular audience, which in turn makes the site easier to monetize.

From the author: Let's be honest: any site is created to make a profit. Online store owners dream of selling more goods; bloggers are constantly searching for new audiences; even charity websites are created to derive benefit, in particular to draw the attention of sponsors and society to a problem. However, the maximum impact can only be achieved if the site is convenient to use. What is the best way to check a website's usability? That is what we will talk about today.

Agree: you want to come back again and again to a site with easy navigation, beautiful design, high-quality information and good functionality. The same cannot be said of inconvenient web resources that ignore user comfort. Visitors, feeling disrespect for themselves and their needs, quickly retreat from such sites without buying anything. To prevent this, it is imperative to check the site's usability.

Site usability (from the English "usability") is convenience of use. The higher the usability, the higher the quality of the site, and the easier and faster a visitor can achieve his goal, for example, placing an order or finding some information.

A usability check can reveal a whole range of problems, and the presence of even one of them can have a deplorable effect on the site's place in search results, its traffic from potential customers and, in general, their loyalty to the resource.

What is usability testing?

This is a study designed to determine how usable and fit for its intended purpose a web page or user interface is. The process, also called ergonomics testing, uses real users as testers to produce objective data.

How does the testing process go in a “laboratory environment”?

An example of usability testing: the user is asked to solve some of the tasks the web resource was created for and to record his comments as he works. Comments are usually captured on video or audio so they can later be analyzed in detail.

Observers may be involved to monitor the session and take notes on its progress. Another variant of checking a site's usability and functionality is possible: an "ideal" scenario of the user's behavior and his "travel" around the site is developed, then deviations from this route are recorded and the appropriate edits are made.

After several circles of the test-and-revise hell, you end up with an interface that is acceptable for solving the problem at hand, convenient for the client, and profitable for the site owner.

Using a focus group

One variation of the method described above is testing the product with a focus group. Its essence is that real users visit the site and carry out various actions, up to and including conversion.

As a result, you receive outside feedback, a list of questions and difficulties encountered, and possibly even suggestions for improvement. Such a study should involve representatives of the target audience who differ in computer skills, profession, socio-demographic and other characteristics.

This type of online usability check lets you analyze the product from the end user's perspective and answer the following questions:

  • does the structure and design of the interface meet the goals of the business;
  • which elements the user pays the most attention to;
  • is navigation convenient;
  • can the user quickly adapt to the interface and its functionality;
  • does the feedback mechanism work properly;
  • does the product make the right impression;
  • are there differences between users' real routes and the intended conversion routes.

Where do you get these focus groups? There are many services for usability testing with focus groups (askusers.ru, usabilla.com, sitepolice.ru, etc.); you can also turn to specialized companies.

The disadvantage of this method is its cost: to get a decent sample (more than a couple of people) you need a substantial budget. In addition, by involving focus groups you will most likely receive only a list of problems, whose solutions you will have to find either on your own or with the help of specialists.

Usability Assessment Tools

How else can you check a website's usability online? The Internet offers a huge number of tools, both paid and free. Here are the most useful ones, in my opinion:

Google Analytics Content Experiments. A free tool that lets you test almost any change on your website to see whether it lowers the bounce rate or increases conversions.

To connect the tool, go to Google Analytics and find the "Behavior" - "Experiments" tab. Setting up an experiment takes several steps:

  • set the name and goal of the experiment. The goal may be visiting a web page, purchasing a product, or staying on the site for a certain time; you can create your own goal or choose from those offered by Google Analytics;
  • select the traffic distribution percentage, that is, the share of visitors who will see a particular variant;
  • set the minimum duration of the experiment, usually about two weeks;
  • adjust the confidence threshold. The higher it is, the more reliably the winning variant will be determined, but the longer the experiment will take.
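To make the last step concrete, here is a minimal sketch of how a winner might be declared at a given confidence threshold in a two-variant experiment. It is purely illustrative (a plain two-proportion z-test in Python), not a reproduction of Google's actual decision algorithm:

```python
import math

def normal_cdf(z: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def pick_winner(conv_a, n_a, conv_b, n_b, confidence=0.95):
    """Declare a winner only if a two-sided two-proportion z-test
    clears the chosen confidence threshold."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2.0 * (1.0 - normal_cdf(abs(z)))
    if p_value < 1.0 - confidence:
        return ("B" if p_b > p_a else "A"), p_value
    return None, p_value  # not enough evidence yet

# 2,000 visitors split 50/50 between the variants
print(pick_winner(conv_a=90, n_a=1000, conv_b=120, n_b=1000))
```

With a 0.95 threshold this sample is already conclusive; at 0.99 the same data would return no winner yet, which is exactly why a higher threshold lengthens the experiment.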

The disadvantages of this tool include the inability to check several tasks at once and the lack of multivariate testing and traffic segmentation.

Visual Website Optimizer. It helps to increase conversions effectively, has a simple online editor for customizing many options of your landing page, and creates unique targeting links. For example, you can tell it: if a visitor came to us from VKontakte and uses Windows, show him this web page; if he came from a MacBook, show this one. Visual Website Optimizer also boasts a convenient reporting system and plenty of extra features.

Qualaroo. A free user survey tool that lets you gather unbiased opinions about a web resource. With it you can identify problems and increase your bottom line; this feedback tool greatly eases the work of usability analysts.

Yandex.Metrica. Lets you analyze the stages of filling out a form and find out which of them gave visitors the most difficulty, which users abandoned a particular field, and so on.

Heatmaps. These include the Yandex.Metrica "Click map", Google Analytics in-page statistics, usabilitytools.com, feng-gui.com, etc. These tools let you assess aggregate user behavior on the site: with heatmaps you can see where most people click and where they do not, and make adjustments depending on the result.
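Under the hood, a click map is just raw click coordinates aggregated into a grid. A minimal sketch of the idea, assuming hypothetical coordinates rather than any particular service's data format:

```python
import numpy as np

# Hypothetical click coordinates collected from a 1200x800 page
rng = np.random.default_rng(seed=1)
clicks_x = rng.integers(0, 1200, size=5000)
clicks_y = rng.integers(0, 800, size=5000)

# Bin the clicks into a coarse 12x8 grid; each cell counts clicks
heat, _, _ = np.histogram2d(
    clicks_x, clicks_y,
    bins=[12, 8],
    range=[[0, 1200], [0, 800]],
)

# The highest-count cells are the "hottest" areas of the page
col, row = np.unravel_index(np.argmax(heat), heat.shape)
print(f"hottest cell: column {col}, row {row}, {int(heat[col, row])} clicks")
```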

Checklist for verification

Do web pages load fast? Overloading pages with animation, heavy images and Flash intros can hurt your site's search rankings and, of course, user loyalty.

Is there a logo on every page? It should link to the site's main page.

How is the content presented? The text should have a good structure: a heading, subheadings, lists, and so on, and it should be free of mistakes. Sloppy text full of errors reads poorly for both people and search engine robots. Headings and subheadings should be succinct and consistent with the advertising posted for the site.

I hope this article has helped you assess the usability of your site. Subscribe to our blog updates and learn more about usability, web design and other areas. See you soon!


Preparation, interviews and data collection


Natalia Sprogis, Head of UX Research at Mail.Ru Group, described in the company's blog on Habrahabr how to prepare and conduct usability testing: what to include in the test scenario, how to choose a data collection method, how to compose tasks, and how to collect respondents' impressions.

A test plan is, on the one hand, the set of tasks, questions and questionnaires that you give to each respondent, and on the other, the methodological basis of the research: the metrics and hypotheses you test and record, and the instruments you select.

Is testing really necessary?

First, make sure that the project needs usability testing at this stage, so clarify why the project team is contacting you. Usability testing is not omnipotent, and you need to understand from the start what it can bring to the product. Prepare the project team right away for which questions you can answer and which you cannot. There have been cases when we either offered customers a different method (for example, an in-depth interview or a diary study suited them better) or recommended abandoning the research altogether and running a split test instead.

For example, we never undertake qualitative research to test the "attractiveness" of a feature or design option. We can collect feedback from users, but the risk is too great that their responses will be influenced by social desirability: people are always inclined to say they would use even what they never will, and the small sample size does not allow such answers to be trusted. We once had a bad experience testing game landing pages: the landing page chosen as the most attractive in the test performed much worse in A/B testing.

Testing prototypes and concepts also has a number of limitations; when planning, you must understand what you can really "squeeze" out of such a test. It is great when a project has the opportunity to test prototypes or designs before implementation. However, the less detailed and functional the prototype, the higher the level of abstraction for the respondent, and the less data the test can yield. Prototype testing best reveals problems with naming and icon metaphors, that is, all questions of clarity. Whether you can test anything beyond that depends strongly on the essence of the project and the detail of the prototype.

Basis for writing a usability test script

Test planning begins not with drafting the text of the tasks, but with a detailed study of the goals and research questions together with the project team. Here is the basis for making your plan:

Important scenarios. These are the user scenarios (tasks, or use cases) that affect the business or relate to the purpose of testing. Even if the team suspects problems in specific places, it is often worth checking the main cases. The following scenarios can be considered important for a test:

  • the most frequent (for example, sending a message in a messenger);
  • affecting business goals (for example, working with a payment form);
  • related to the update (those that were affected by the redesign or the introduction of new functionality).

Known issues. Often research is needed to find the root cause of a business problem with the service. For example, a producer is worried about a large churn of players after the first hour of play. Sometimes the problem areas of the interface are already known to the team, and you need to collect details and specifics; for example, the support service is often asked about the payment form.

Questions. The team may also have research questions: for example, do users notice a banner advertising additional services; is a specific section clearly named.

Hypotheses. This is what the team's known issues and questions translate into. It is good if the customer comes to you with ready-made hypotheses, for example: "Our customers pay only from the phone, with a commission. Perhaps users do not see the choice of a more advantageous payment method." If there are no hypotheses, only a desire to test the project abstractly "for usability", your task is to formulate these hypotheses.

Think with the project team about places where users do not behave as expected (if there are any). Find out if there are design elements that were controversial and may be problematic. Do your own product audit to find potential user issues that are important to test. All this will help you compile the list of elements (tasks, questions, checks) to include in the final scenario.

Data collection method

It is important to consider how you will collect data about what happens during the test for later analysis. The following options are traditionally used:

Observation. While completing tasks, the respondent is left alone with the product and behaves as he sees fit; his comments are collected through questionnaires and conversation with the moderator after the test. This is the "cleanest" method: it produces more natural behavior and allows a number of metrics (for example, task completion time) to be measured correctly.

However, a lot of useful qualitative data remains behind the scenes. Having seen this or that behavior, you cannot understand why the respondent acts this way. You can of course ask at the end of the test, but most likely the respondent will remember only the last task well. Besides, his opinion of the system may change while completing the tasks, and you will get only the final picture, not the first impressions.

Think Aloud (thinking out loud). For a long time this method was used most often in usability testing; Jakob Nielsen once called it the main tool for evaluating usability. The idea is that you ask the respondent to voice all the thoughts that arise while working with the interface and to comment on all his actions. It sounds like this: "Now I'm going to add this product to the cart. Where is the button? Oh, here it is. Oh, I forgot to check what color it was."

The method helps you understand why the user behaves one way or another and what emotions the interaction evokes in him. It is cheap and simple; even an inexperienced researcher can handle it.

However, it has its drawbacks. First, it is not natural for people to "think out loud" all the time: they will often fall silent, and you will have to keep reminding them to talk. Second, tasks take a little longer with this method than in real life. In addition, some respondents begin to use the product more deliberately: voicing the reasons for their actions, they try to act more rationally, simply because they do not want to look foolish, so you may miss some intuitive moments of their behavior.

Active moderator intervention. The method is ideal for testing concepts and prototypes. During the tasks, the moderator actively interacts with the user: at the right moments he probes the reasons for the user's behavior and asks clarifying questions. In some cases the moderator can even give unscheduled tasks arising from the dialogue.

This method collects the maximum amount of qualitative data. However, it can only be used if you trust the professionalism of your moderator: questions worded incorrectly or asked at the wrong time can greatly affect the respondent's behavior and impressions and even invalidate the test results. Also, almost no metrics can be measured with this method.

Retrospective think aloud (RTA). This is a combination of the first two methods. The user first performs all tasks without interference; then a video of his work is played back to him, and he comments on his behavior and answers the moderator's questions. The main disadvantage is that testing time increases greatly, but there are cases when it is optimal.

For example, we once had to test several types of mobs (game monsters) in an RPG. Naturally, we could neither distract the respondents with questions nor force them to comment on their actions during battle: that would make it impossible to play where concentration is needed to win. On the other hand, after a series of fights the user would hardly remember whether he noticed, say, that the first rat's axe was burning red. So in this test we used the RTA method: with each user we reviewed his fights and discussed which monster effects he noticed and how he understood them.

Try to think about how to get enough data while keeping the respondent's behavior as natural as possible. Despite the simplicity and versatility of "thinking out loud", long the most popular method in usability testing, we increasingly try to replace it with observation. If the moderator sees interesting behavior, he waits until the respondent completes the task and asks about it afterwards: immediately after the task, the respondent is more likely to remember why he acted as he did.

An eye tracker helps a lot here. Seeing the current focus of the respondent's attention, you can understand his behavior better without asking unnecessary questions. In general, the eye tracker significantly improves the quality of moderation, and this role, in my opinion, is no less important than its ability to build heatmaps.

Metrics

Metrics are quantitative indicators of usability. Testing always yields a set of problems found in the interface; metrics let you understand how good or bad things are overall and compare the product with another project or with previous versions of the design.

What are the metrics

According to ISO 9241-11, the main characteristics of usability are effectiveness, efficiency and satisfaction. Different metrics may matter for different projects, but they are all tied in one way or another to these three characteristics. I will describe the most commonly used metrics.

Task success. You can use a binary score: the respondent either completed the task or did not. We often follow the Nielsen approach and distinguish three levels of success:

  • coped with the task with almost no problems - 100%;
  • encountered problems but completed the task on their own - 50%;
  • did not cope with the task - 0%.

If 4 out of 12 respondents coped with the task easily, 6 with problems, and 2 failed, the average success for this task is about 58%.
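As a sanity check, the same average can be computed mechanically. A small sketch using the three-level scale and the example figures above:

```python
# Nielsen-style three-level success scores for one task,
# using the example above: 4 easy, 6 with problems, 2 failed
SCORES = {"easy": 1.0, "with_problems": 0.5, "failed": 0.0}

results = ["easy"] * 4 + ["with_problems"] * 6 + ["failed"] * 2

average_success = sum(SCORES[r] for r in results) / len(results)
print(f"average task success: {average_success:.0%}")  # -> 58%
```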

Sometimes respondents who differ greatly in the degree of "problematicity" fall into the middle group: for example, one respondent struggled with every field of the form, while another made only a slight mistake at the very end. You can assign the grade at your own discretion, depending on what happened in the test: for example, 25% if the respondent barely started the task, or 80% if he made only a minor mistake.

To avoid too much subjectivity, think through the rating scales in advance rather than deciding for each respondent after the test. It is also worth deciding beforehand what to do with errors. Suppose you gave the task of buying cinema tickets on the "Kino Mail.Ru" project, and one respondent accidentally bought a ticket for today instead of tomorrow and did not notice. He is confident he has coped and has the ticket in hand, but his mistake is so critical that he will not get into the cinema, so I would score it 0% even though the ticket was bought.

Success rate is a very simple and straightforward metric, and I recommend using it whenever your tasks have clear goals. A glance at the task success graph quickly identifies the most problematic areas of the interface.

Task completion time. This metric is meaningful only in comparison. How do you know whether 30 seconds to complete a task is good or bad? But if the time has decreased compared to the previous version of the design, that is already good; likewise if registration on your project takes less time than on competitors' sites. There are interfaces where reducing task time is critical - for example, the working interface of a call-center employee.

However, this metric does not apply to all tasks. Take choosing a product in an online store: users should quickly find filters and other interface elements related to product search, but the selection itself will take each of them a different amount of time, and that is completely normal. When choosing shoes, women are ready to look through 20 pages of results, and that does not necessarily mean there were no matching products on the first pages or that they did not see the filters; often they just want to see all the options.
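Where the metric does apply, comparing versions is straightforward. A sketch with invented timings; the geometric mean is used here because small samples of task times are usually skewed by one or two slow respondents:

```python
import math

def geometric_mean(times):
    """A robust summary for skewed task-time samples."""
    return math.exp(sum(math.log(t) for t in times) / len(times))

# Hypothetical completion times in seconds, same task, two designs
old_design = [42, 35, 60, 51, 38, 95, 47]
new_design = [30, 28, 41, 33, 29, 55, 36]

print(f"old design: {geometric_mean(old_design):.1f} s")
print(f"new design: {geometric_mean(new_design):.1f} s")
```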

Problem frequency. Any usability test report contains a list of problems encountered by respondents. The number of respondents who hit a problem indicates its frequency within the test. This metric can only be used if all your users performed exactly the same tasks.

If the test had variations, or the tasks were not fixed but compiled on the basis of interviews, frequency is harder to calculate: you must count not only those who hit the problem, but also estimate how many respondents could have hit it (performed a similar task, entered the same section). Even so, this characteristic lets the team understand which problems to fix first.
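The bookkeeping itself is simple; a minimal sketch with hypothetical problem names and respondent IDs:

```python
# Which respondents hit each problem, and which respondents were
# even exposed to it (performed a similar task, entered the section)
encountered = {
    "missed payment-method switch": {"r01", "r04", "r07", "r10"},
    "confusing section name": {"r02", "r04"},
}
exposed = {
    "missed payment-method switch": {"r01", "r02", "r04", "r05",
                                     "r07", "r08", "r10", "r11"},
    "confusing section name": {"r01", "r02", "r03", "r04", "r05", "r06"},
}

for problem, hit in encountered.items():
    base = exposed[problem]
    print(f"{problem}: {len(hit)}/{len(base)} = {len(hit) / len(base):.0%}")
```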

Subjective satisfaction. This is the user's subjective assessment of the convenience and comfort of working with the system. It is gathered through questionnaires that respondents fill out during or after testing. There are standard questionnaires - for example, the System Usability Scale, the Post-Study Usability Questionnaire, or the Game Experience Questionnaire for games - or you can create your own.
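Of those, the System Usability Scale has a fixed, well-known scoring rule: ten statements rated 1 to 5, odd (positively worded) items contribute the rating minus 1, even (negatively worded) items contribute 5 minus the rating, and the sum is multiplied by 2.5 to give a 0-100 score. A sketch:

```python
def sus_score(ratings):
    """Score one SUS questionnaire (ten ratings on a 1-5 scale)."""
    assert len(ratings) == 10 and all(1 <= r <= 5 for r in ratings)
    total = 0
    for item, r in enumerate(ratings, start=1):
        # Odd items are positively worded, even items negatively
        total += (r - 1) if item % 2 == 1 else (5 - r)
    return total * 2.5  # scale to 0..100

print(sus_score([4, 2, 5, 1, 4, 2, 4, 2, 5, 1]))  # -> 85.0
```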

These are far from the only possible metrics; see, for example, the list of 10 UX metrics that Jeff Sauro highlights. For your product the metrics may differ: say, how well respondents understand the rules of the game, or how many mistakes they make filling out long forms. Remember that the decision to use several metrics imposes a number of restrictions on the testing: respondents should act as naturally as possible and under the same conditions. Therefore it is good to provide:

  • Uniform starting points. The same tasks should begin from the same point in the interface for every respondent; you can ask respondents to return to the home page after each task.
  • No intervention. Any communication with the moderator can affect the performance metrics: the moderator may unwittingly prompt the respondent, and the exchange itself increases task time.
  • Task order. To compensate for the learning effect in comparative testing, be sure to alternate the order in which the compared products are presented: have half the respondents start with your project and half with the competitor's.
  • Success criteria. Decide in advance what behavior counts as success for each task: for example, is it acceptable for the respondent not to use filters when picking a product in an online store?

Interpretation of metrics

Remember that classic usability testing is qualitative research, and the metrics you get are primarily illustrative. They give an overview of the different scenarios in the product and let you see pain points - for example, that account settings are harder to work with than registration. They can show the dynamics of change if you measure them regularly; that is, metrics make it possible to see that a task has become faster in the new design. Such relationships are far more indicative and reliable than the absolute values of the metrics.

Jeff Sauro, a statistician specializing in UX research, advises not reporting metrics as bare averages but always giving confidence intervals. This is much more correct, especially when respondents' results vary widely. You can use his free online calculators for this: one for task success and one for task time. Statistical processing is also indispensable when comparing results.
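To illustrate the idea (this is not a reimplementation of Sauro's calculators): the adjusted Wald interval is often recommended for completion rates at the small sample sizes typical of usability tests. A sketch:

```python
import math

def adjusted_wald_ci(successes: int, n: int, z: float = 1.96):
    """Adjusted Wald confidence interval (default 95%) for a
    completion rate; behaves well even for small samples."""
    p_adj = (successes + z * z / 2) / (n + z * z)
    n_adj = n + z * z
    half = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    return max(0.0, p_adj - half), min(1.0, p_adj + half)

# 10 of 12 respondents completed the task
low, high = adjusted_wald_ci(10, 12)
print(f"observed 83%, 95% CI roughly {low:.0%}..{high:.0%}")
```

On 12 respondents the interval is wide (roughly 54% to 97% here), which is precisely the honesty a bare average hides.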

When metrics are needed

Not every usability test report contains metrics: collecting and analyzing them takes time and imposes restrictions on the test method. Here are the cases when they are really needed:

  • Proof. You often need to prove that changes must be made to the product, especially in large companies. For decision-makers, numbers are clear, understandable and familiar. Showing that 10 out of 12 respondents could not pay for an item, or that registration takes on average twice as long as on competitors' sites, gives the research results more weight.
  • Comparison. If you are comparing your product with others on the market, you also need metrics; otherwise you will see the advantages and disadvantages of the different projects but will not be able to assess where your product stands among them.
  • Tracking changes. Metrics are good for regularly testing the same product after changes are made. They show progress after a redesign and draw attention to places left without improvement. You can reuse these indicators as an evidence base that shows management the value of investing in the redesign, or simply to confirm that you have achieved results and are moving in the right direction.
  • Illustration and emphasis. Numbers illustrate important issues well. Sometimes we calculate them for the brightest and most important points of the test, even if we do not use metrics in every task.

However, we do not use metrics in every test. You can do without them if the researcher works closely with the project team, there is internal trust, and the team is mature enough to prioritize problem fixes correctly.

Data capture method

It would seem: what is wrong with a notebook and pen, or just an open Word document? In today's agile development world, UX researchers should try to deliver their observations to the team as quickly as possible.

To cut analysis time, prepare a template for entering notes during the test in advance. We tried specialized software for this (for example, Noldus Observer or Morae Manager), but in practice plain spreadsheets proved the most flexible and versatile. Mark in the table beforehand the questions you definitely plan to ask, the places for recording the problems found in each task, and the hypotheses (for each respondent you note whether it was confirmed).

What else can you use:

  • ... A customizable Excel template for entering observations for each respondent. A built-in timer measures task execution time, and time and success graphs are generated automatically.
  • Rainbow Spreadsheet by Tomer Sharon of Google. A visual table for collaboration between the researcher and the team. The link leads to an article describing the method, which in turn links to a Google spreadsheet with a template.
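If you prefer rolling your own, a per-respondent observation log is easy to pre-generate. A minimal sketch; the column set is an assumption to be adapted to your protocol:

```python
import csv

# Hypothetical observation-log columns; adjust to your test plan
COLUMNS = ["respondent", "task", "start", "end", "success_%",
           "problems_observed", "hypothesis_confirmed", "notes"]
TASKS = ["find product", "apply price filter", "checkout"]

with open("observation_log.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(COLUMNS)
    # One pre-filled row per respondent and task, so the moderator
    # only types into the blank cells during the session
    for resp in range(1, 13):
        for task in TASKS:
            writer.writerow([f"r{resp:02d}", task, "", "", "", "", "", ""])
```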

With experience, you can make most of the notes right during the test. If you cannot keep up, write down everything you remember immediately after the session; if you return to the analysis a few days later, you will most likely have to re-watch the video and spend much more time.

Preparing for testing

Besides the method, metrics and the testing protocol itself, you need to decide on the following things:

Format of communication with the moderator. The moderator can sit in the same room as the test participant, which makes it easy to ask questions at the right time. However, the moderator's presence can influence the respondent: he may start asking the moderator questions, provoking hints either explicitly or implicitly.

We try to leave the respondent alone with the product for at least part of the test; his behavior becomes more relaxed and natural. And to avoid running back and forth if something goes wrong, you can leave a messenger with an audio connection turned on so the moderator can contact the respondent from the observation room.

Method of setting tasks. Tasks can be read aloud by the moderator, but in that case, despite a uniform testing protocol, the task text may come out slightly different each time, especially if several moderators run the test. Sometimes even small differences in wording put respondents in different starting conditions.

To avoid this, you can either train the moderators to always read the task texts verbatim, or give respondents the tasks on paper or on screen. Differences in wording stop being a problem if you use a flexible scenario, where tasks are formulated during the test on the basis of an interview with the moderator.

You can also use the product's own tools to set tasks. For example, when testing ICQ, respondents received tasks through a chat window with the moderator, and when testing Mail.Ru Mail, they received them in letters. This way of setting tasks was as natural as possible for these projects, and it also let us test the basic correspondence scenarios many times over.

Creating a natural context. Even in laboratory research, think about how to bring product use in the test closer to real conditions. For example, if you are testing mobile devices, how will respondents hold them? For a good video image it is better when the phone or tablet is fixed on a stand or lying on a table, but then you cannot tell whether all zones are reachable and comfortable to tap, because phones are usually held in one hand, and people lie on the couch with tablets.

It is also worth thinking about the environment in which the product will be used: whether something distracts the person, whether it is noisy, whether the Internet connection is good. All of this can be simulated in the laboratory.

Test plan for the customer. This is an important preparation step because it involves the project team. You need not tell the customer about all the methodological details of the test (how you will communicate with the respondent, record data, and so on), but be sure to show him what the tasks will be and what you are going to check with them. Perhaps you missed some features of the project, or the project team has additional ideas and hypotheses. We usually prepare a summary table of this kind for the customer.

Report outline. Naturally, the report is written from the research results, but it is good practice to draft a report plan before the tests, based on the goals and objectives of the study. With such a plan before your eyes, you can check your scenario for completeness and prepare convenient forms for recording data for later analysis. You may even decide that a report is not needed and a shared observation file is enough for you and the team; if you can motivate the team to fill it in with you, all the better.

Of course, you can simply "let a friend use the product" and watch what difficulties he runs into. But a well-written scenario keeps you from missing important problems and from accidentally pushing the respondent toward the answers you want. After all, usability testing is a simplified experiment, and in any experiment preliminary preparation matters.

Any usability testing protocol consists of the following parts:

  • Briefing (greeting, description of the session, signing of documents).
  • Introductory interview (screening check, short interview about product use, context and scenarios).
  • Working with the product (testing tasks).
  • Collecting final product impressions based on testing experience.

Briefing

Regardless of the subject of testing, any study starts the same way. What you should do:

Create an atmosphere. Get to know the person; offer tea, coffee or water; show where the toilet is. Try to relax the respondent a little, as he may be nervous before the session. Ask whether it was easy to find you, and how his mood is.

Describe the process. Tell the respondent what kind of session awaits him, how long it will take, what parts it consists of and what you will do. Be sure to point out that his input will help improve the product and that you are not testing his abilities. If you are recording video, warn the respondent and tell him the footage will not appear online. I say something like this:

We are in the office of Mail.Ru Group. Today we will talk about the XXX project. It will take about an hour. First we will talk a little, then I will ask you to try doing a few things in the project itself, and then we will discuss your impressions. We will record what happens in the room and on the computer screen. The recording is needed solely for analysis; you will not see yourself on the Internet.

We are conducting this research to make the XXX project better, to understand what needs fixing and in which direction it should develop. So please express any comments openly, both positive and negative; don't be afraid to offend us. If something does not work out while you study the project, take it easy: it means we have found a problem for the project team to fix. The main thing to remember is that we are not testing you; you are testing the product. If you're ready, I suggest we get started.

Sign the documents. As a rule, this is consent to the processing of personal data, and sometimes also a non-disclosure agreement about the testing. For tests with minors, parental consent is required for the child to participate; we usually send it to the parents in advance and ask them to bring it along. Be sure to explain why you are asking for signatures, and give time to read the papers: in Russia people are wary of any documents they need to sign.

Configure the equipment. Whether you use eye tracking, biometric equipment or simply record video, now is the time to turn it on. Warn the respondent when you start recording.

Introductory interview

The introductory interview solves the following tasks:

Check the recruiting. Just in case, always start with this, even if you trust the agency or the person who found the respondent. More than once we discovered mid-test that a respondent had misunderstood the screening questions and did not actually use the product the way we needed. Try to avoid formality and not repeat the questions from the screening questionnaire verbatim: the person may already know what to answer.

Product use scenarios and context. Even if you have little time, do not skip this point. At a minimum, ask what tasks the respondent solves with the product, whether he uses similar projects, under what conditions and from what devices. The answers will help you understand the reasons behind his behavior and, if you use flexible scenarios, formulate appropriate tasks. If there is enough time, ask the respondent to show what he usually does and how; this is a source of further questions and insights.

Expectations and attitudes. The start of testing is a good time to find out what the respondent knows about the product, how he feels about it and what he expects from it. After the test you can compare these expectations with the final impression.

This introductory interview structure works for most tests. If you are testing a brand-new product, you may want to skip the introductory questions: going into too much detail on the topic can create particular expectations of the product. In that case leave only a couple of general questions to establish contact, proceed straight to the tasks, and discuss scenarios, attitudes and context after the user has first explored the product.

Working with the product, drawing up tasks

What are the tasks

Let's say you want to test an online store. You have important scenarios (product search and selection, the checkout process), known problems (frequent mistakes in the payment form), and even a hypothesis that the designer got too tricky with the price filter. How do you formulate the tasks?

Focused tasks. The obvious move is something like: "Choose a dishwasher 45 centimeters wide with a 'beam on the floor' function, costing no more than 30 thousand rubles." This motivates the respondent to use filters and compare products, so you can check the price filter on every respondent and watch the key product-selection scenario. Such tasks are quite true to life and are good for testing specific hypotheses (as with the price filter).

However, if the entire test consists of them, you risk the following:

  • A point check of the interface. You will only find problems related to the task details (the filters by price and width). You will not see other problems - say, with product sorting or other filters - unless you specify those too, and you are unlikely to write tasks for every element of the site.
  • Lack of involvement. Users often perform such tasks mechanically: on seeing the first item that matches the criteria, they stop. Perhaps the respondent has never chosen a dishwasher in his life and does not care what a "beam on the floor" is. The more a task resembles a real situation and the more context it contains that the user understands, the higher the chances of engaging the respondent so that he imagines he is actually choosing a product. An involved user "lives" the interface more fully, leaves more comments, and raises your chances of finding problems and learning something useful about the audience's behavior.
  • A narrowed spectrum of insights. In real life the user might have chosen the product entirely differently: he might not have used filters at all (and here you pointed him to them), or he might have searched by criteria that are not on the site. By setting rigid, focused tasks, you will not learn the real context of product use, will not find scenarios the project team did not foresee, and will not collect data on content and functionality needs.

Tasks with context. One way to engage users better is to add a real story and context to a dry task. For example, instead of "Find a recipe for a plum cake on the website", suggest: "Guests are coming in an hour. Find something you can bake in that time. You have everything for a sponge cake in your fridge, plus a few plums, but unfortunately no butter."

A similar approach works for an online store. For example: "Imagine you are choosing a gift for your sister. Her hair dryer recently broke, and she would be delighted with a new one. You need to stay within 7 thousand rubles." It is important that the respondent actually picks a real person to "buy" the gift for (if there is no sister, suggest another relative or a friend). The key factor in such tasks is a realistic, understandable context: it is easy to imagine choosing a gift for your family, much harder to imagine being "an accountant preparing an annual report".

A striking example of this approach is the "Bollywood Method" invented by the Indian UX expert Apala Lahiri Chavan. She argues that Indians, like many Asians, find it difficult to openly express opinions about an interface, but when they cast themselves as heroes of fictional dramatic situations (as in their favorite films), they open up and participate actively in testing. A task for Indian respondents should therefore look something like this:

Imagine that your beloved young niece is about to get married, and you find out that her future husband is a swindler, and married at that. You urgently need to buy two flight tickets to Bangalore, for yourself and for the cheater's wife, to upset the wedding and save the family from shame. Hurry up!

Tasks based on the respondents' experience. Recall that for successful testing, respondents must match the project's audience; to check an online store of household appliances, we recruit people who have recently chosen appliances or are choosing them now. We can draw on this when compiling tasks based on the respondents' experience. There are two options:

  • Respondent parameters. You adapt fixed tasks to each respondent. For example, in the case of the home-appliance store and the filter task, ask the person what exactly he bought recently, find out his criteria (price, functions), and offer to repeat the "purchase" on your site.
  • Respondent scenarios. The tasks are formed entirely from the participants' experience. To understand which scenarios to test, the moderator finds out exactly how the person solved the problem in real life and suggests doing the same on the site. For example, a buyer compared several models at length before making his choice. Even if the site has no comparison function, ask the respondent to compare products to see what parameters he relies on; you may get an idea of what the compare feature should look like, and you can adapt the product page to this scenario.

Such tasks provide many real-life examples of how basic operations are performed in the product, which often surfaces a much wider range of problems and findings. They also let you test the product against new scenarios that you did not consider basic or had not thought of at all.

When we tested the "Mail.Ru Real Estate" project, tasks based on the respondents' experience led to many discoveries. We saw that people searching for an apartment in the Moscow region specify the terminal metro stations in the geofilter, meaning stations reachable from the region, while we had assumed the metro filter was for finding an apartment near a station. We also learned how search scenarios for new buildings differ from those for secondary housing, which helped us move the new-buildings search into a separate section of the site, with its own filters and its own way of describing apartments. I also recommend Jared Spool's excellent article on the benefits of such tasks.

Tasks without tasks. Sometimes it is better not to give users any tasks at all, but to watch how they begin to explore the product on their own. Give the respondent an introduction: "Imagine you decided to try this product. I'll leave you for a few minutes. Do what you would do in real life; I'm not giving you any tasks."

It is important that the moderator actually leaves the room, otherwise the user is tempted to immediately ask things like "Do I need to register? How do I do that?" and so on.

This type of task is useful for completely new products. We often use it for mobile apps and games: it shows whether users read the tutorial materials, what details attract attention right away, what people grasp of the product concept, and how they later describe its capabilities. Specific scenarios are planned after the free task.

Another application of free tasks is content projects. If you want to understand how your articles are read (where readers linger, what they skip, which page elements they notice), just leave the respondent alone with the project for a few minutes. Only without a moderator looking over his shoulder will the user relax and read the text the way he usually does. This is how we test "News Mail.Ru", "Lady Mail.Ru" and other projects. The approach let us identify different patterns of behavior on the site and of reading articles, and understand which types of material should be laid out differently.

Writing good tasks

Start with a simple task. Begin testing with introductory, easy tasks. The respondent needs to get comfortable with the test format, especially if you use the "thinking out loud" method, where he must get used to voicing his thoughts and feelings. Do not dump all the pain and suffering of the interface on him at once.

Don't give hints. Word the tasks so that they do not prompt the respondent toward the right action. If you want to test adding products to favorites in an online store, avoid the task "Let's add this TV to favorites", especially if the button is labeled exactly that. Having read such a task, the respondent will simply find the button with the matching caption on the screen, perhaps without even understanding what he is doing.

It is better to explain the point of the task without using the interface's own terms. For example: "The site can save products you like so you can later choose which of them to order. Let's try that with this TV."

Watch the terminology. Do not use obscure words and labels. It seems obvious, but having grown used to certain terms, we often forget that few people outside the IT community know them. For example, when testing the new threads functionality (message chains) in Mail.Ru Mail, we had a hard time: users unfamiliar with the feature simply have no word in their heads for threads.

In the end we did not name them at all: we simply showed respondents a mailbox with linked chains, discussed the new feature, and let users choose their own word for threads. This later helped us use the most understandable wording in educational and promotional materials.

Watch not only the tasks but also the moderator's questions, especially those passed in from the team during testing. For example, when discussing features, you should not say "toolbar": not everyone knows the word. A few years ago not all users even knew the word "browser". How best to word the tasks depends on the testing audience, so do not rush to the other extreme of explaining every term in a row: experienced players do not need "buff", "frag" or "respawn" explained.

Less test data. It is often tempting to create a test account for the respondent and run the session in it. You can check everything in the account in advance, avoid surprises, and save the time spent on registration or login; it is also often technically much easier to build a new design on test data than on real data.

However, with this approach you risk getting far less useful results, because test actions have no real consequences. The situation becomes completely artificial, and it is hard for users to project it onto real experience.

For example, when working in their own social network accounts, respondents, as in real life, are careful with everything their friends can see (posting links, sending messages); when setting up their own mailbox, they try not to delete important letters. When testing online stores, one approach is to have the respondent spend the reward right during the test: then he will not point at the first product that fits the task, but pick what he really needs.

With test data alone you will find only the problems tied to that data, and will not exercise the functionality in its different variations. For example, when we tested the social panel of the Amigo browser, one respondent who connected his VKontakte account immediately noted that reading it that way was inconvenient: almost his entire feed consisted of subscriptions to groups with erotic photos, and in the narrow panel it was simply impossible to make anything out in the pictures.

Another problem with test data is that the system is harder to understand when everything around is unfamiliar; a social network user, for instance, recognizes his page by his own photo. Even when testing prototypes we try to personalize them as much as possible: for clickable prototypes in Odnoklassniki we always adapt them for each user, inserting his name and photo, and sometimes the latest news on his page.

Don't be limited by the interface. Keep in mind that interaction with a product is often more than just its interface. If possible, test related products and services and the links between them. When testing games, we try to check not only the game itself but also its website, the download and registration flow, and searching for information on the forum. When testing one online store, I also checked the operator's follow-up call after an order was placed, which produced recommendations for the call center.

Think about timing. For a good scenario, it is important to prioritize the tasks. If the system is large and the test has many goals, you will likely want to include a lot of tasks, but a tired respondent is no longer useful. A good test lasts no more than an hour and a half, two at the maximum; games are the only exception. And remember that the session is not just tasks but also interviews, questionnaires, equipment setup and document signing, which usually take at least half an hour.

If there are too many tasks, and you don't want to give up some, you can put the least priority ones in rotation, that is, show only a part of the respondents. Or make part of the test compulsory for everyone, and watch the rest only with those with whom there is enough time. But these will most likely be the most successful respondents.

Evaluate the usefulness of the assignment. Consider if it really matches your hypotheses. For example, you want to test the news subscription feature on a website. The task "Subscribe to the newsletter" will only check whether those who will search for it will be able to find the newsletter. However, people rarely come to the site to subscribe to news. The assignment does not apply to real life. You need to understand if the subscription option is noticed by those who perform completely different tasks.

You can check this in different ways, depending on the implementation of the function. If the person was engaged in tasks in which he might have seen the possibility of a subscription, ask him if it is on the site. Just be sure to clarify where he saw this opportunity or how it was implemented to make sure that the respondent does not just agree with you.

If a subscription offer is built into the registration or checkout process, see if the respondent will use it, and then discuss it after the assignment. There is very little chance that in a laboratory setting people will actually subscribe to mailing lists, but you can check whether a person has paid attention to this possibility, what he expects from the mailing list, and so on.

Collecting final impressions

The goal of the final testing phase is to collect impressions of working with the product, to understand what the user liked and what upset him, to assess subjective satisfaction. Typically, this part of the test uses a combination of an interview with a moderator and filling out formal questionnaires.

Moderator interview

In the final interview, we always ask the respondents about the same questions: "What impressions did you have?", "What did you like and what did not?" , "What would you like to change in the product?" It's time to clarify the incomprehensible moments of the respondent's behavior, if you did not do this during the test. If, before the test, you learned from users about the attitude and expectations of a brand or product, find out if anything has changed. When interviewing, pay attention to the following:

Social desirability. Handle your interview results very carefully. If during the test you often hear impulsive comments under the influence of problems, then in the final interview, social desirability flourishes with might and main.

Some people think that when they talk about problems in the product, they admit their own incompetence. Others just don't want to upset an enjoyable moderator. Very often the respondents (especially women), who suffered through the whole test, say that everything is, in principle, normal. Negative reviews can also be dictated by social desirability: if the respondent is confident that the purpose of the test is to find flaws, he diligently tries to find them.

Quotes and Priorities. Although all the words of the test participants in the final interview often need to be divided by two or even ten, this does not mean that they are useless. By the way respondents summarize their impressions, you can infer priorities. The product sucks? What exactly influenced this? Which of the many problems did the respondent remember the most and consider the most annoying?

However, make allowances for the last task that is remembered best. It is also very useful to keep track of what adjectives the respondents use to describe the product, to what they compare their experience.

Let's not forget about the good. Very often, a usability test report is a long list of problems found during the test. In general, the search for problems is one of the main tasks of research. But don't forget about the positive aspects of the product.

First, a report with no positive results simply demotivates the team. And secondly, it is useful to know what users like about the product: what if, at the next redesign, they decide to remove the function that everyone liked so much. Therefore, be sure to ask respondents about the positive aspects of the product, even if they scolded the interface during the entire testing.

Attitude towards "Wishlist". Most likely, the respondents, in addition to their impressions, will express wishes and ideas. Your task is to understand what is the problem behind the proposals. Because the solutions that users suggest will most likely not work for you. After all, test participants are not designers, they are not aware of the features and limitations of development. However, there is a need behind any such request that you must capture. If a respondent says that he definitely needs a big green button here, be sure to ask: why?

Satisfaction measure

Often, according to the respondent in the final interview, it is difficult to understand whether he liked the product or not, and even more so it is difficult to compare the attitude of several respondents who noted both advantages and disadvantages. Here questionnaires come to the aid of the researcher. First, when filling out the questionnaire (especially before talking with the moderator), the influence of the notorious social desirability is slightly less, although you will not get rid of it completely. Second, the questionnaire gives you clear parameters for comparing scenarios, products, or project stages.

Writing a good questionnaire is a separate and very large topic. Formulations, scales, and much more are important here. Ready-made and tested questionnaires can be a good help: they have already been refined and tested many times. The only problem is that almost all of these questionnaires do not have official translations into Russian. Naturally, you can translate them yourself, but from a methodological point of view, translations need to be tested in order to check the correctness of the wording. Nonetheless, the questionnaires can serve as a guideline for the compilation of your own questionnaires.

There are questionnaires that are given after each assignment to assess satisfaction with specific scenarios. For example:

  • After Scenario Questionnaire (ASQ). Three questions about complexity, productivity, and prompts in the system.
  • Single Ease Question (SEQ). One question about the complexity of the script.

And there are questionnaires that are used in the final phase of testing. Here are some examples that we use when needed:

  • System Usability Scale and Post-Study System Usability Questionnaire. Two classic and popular questionnaires created over 20 years ago. Both are made up of statements. The respondents should indicate the degree of agreement with them. All these statements characterize the usability of the product from different angles. For example: “I could easily find the information I needed”, “Various system capabilities are easily accessible” and so on.
  • ... A questionnaire that often helps us on tests. The user is provided with a set of adjectives, from which he chooses those that can characterize the product. As a result, you get a cloud of words - the characteristics of your project. This technique often produces very interesting results.
  • Game Experience Questionnaire. Classic usability questionnaires cannot be applied to games: engagement in the game process is much more important than the clarity of interfaces. Therefore, for games, you should always compose special questionnaires or use the Game Experience Questionnaire. The questionnaire contains several modules: a basic module, an in-game block, a post-questionnaire and a questionnaire of social possibilities of the game.
  • The material was published by the user. Click the "Write" button to share your opinion or tell about your project.

Preparation, interviews and data collection


Natalia Sprogis, Head of UX Research at Mail.Ru Group, wrote in the company's blog on Habrahabr about preparing and conducting usability testing: what to include in the test scenario, how to choose a data collection method, how to compose tasks, and how to collect respondents' impressions.

A test plan is, on the one hand, the set of tasks, questions, and questionnaires that you give to each respondent, and on the other, the methodological basis of the research: the metrics and hypotheses you test and record, and the instruments you select.

Is testing really necessary

First, you need to be sure that the project needs usability testing at this stage, so clarify why the project team is coming to you. Usability testing is not omnipotent, and you need to understand from the start what it can bring to the product. Prepare the project team right away for which questions you can answer and which you cannot. There have been cases when we offered customers a different method (for example, an in-depth interview or a diary study suited the situation better), or recommended abandoning the research altogether and running a split test instead.

For example, we never undertake qualitative research to test the "attractiveness" of a feature or design option. We can collect feedback from users, but the risk is too great that their answers will be influenced by social desirability. People tend to say they would use something even when they never will, and the small sample size does not allow such answers to be trusted. For example, we once had a bad experience testing game landing pages: the landing page chosen as the most attractive in the test performed much worse in A/B testing.

Testing prototypes and concepts also has a number of limitations. When planning, you must understand what you can really "squeeze" out of such a test. It is great when a project has the opportunity to test prototypes or designs before implementation. However, the less detailed and functional the prototype, the higher the level of abstraction for the respondent, and the less data the test yields. Prototype testing is best at revealing problems with naming and icon metaphors, that is, issues of clarity. Whether you can test anything beyond that depends strongly on the essence of the project and the detail of the prototype.

Basis for writing a usability test script

Test planning begins not with drafting the text of the assignments, but with a detailed study of the goals and research questions together with the project team. Here's the basis for making your plan:

Important scenarios. These are those user scenarios (tasks, or use cases) that affect the business or are related to the purpose of testing. Even if the team suspects problems in specific locations, it is often worth checking the main cases. In this case, the following scenarios can be considered important for the test:

  • the most frequent (for example, sending a message in a messenger);
  • affecting business goals (for example, working with a payment form);
  • related to the update (those that were affected by the redesign or the introduction of new functionality).

Known issues. Often research is needed to find the root cause of a business problem with the service. For example, a producer is worried about a large churn of players after the first hour of play. And sometimes the problem areas of the interface are already known to the team, and you need to collect details and specifics. For example, the support service is often asked about the payment form.

Questions. The team may also have research questions: for example, do users notice a banner advertising additional services; whether a specific section is clearly named.

Hypotheses. This is what the team's known issues and questions translate into. It is good if the customer comes to you with ready-made hypotheses - for example: "Our customers pay only from the phone, with a commission. Perhaps users do not see the choice of a more advantageous payment method." If there are no hypotheses, only a desire to test the project abstractly "for usability", your task is to formulate the hypotheses yourself.

Think with the project team about the places where users do not behave as expected (if such data is available). Find out whether there are design elements that were controversial and may be problematic. Do your own product audit to find potential user issues that are important to test. All this will help you make a list of the elements (tasks, questions, checks) that should be included in the final scenario.

Data collection method

It is important to consider how you will collect data about what happens during the test for later analysis. The following options are traditionally used:

Observation. While completing tasks, the respondent is left alone with the product and behaves as he sees fit. The respondent's comments are collected through questionnaires and communication with the moderator after the test. This is the "cleanest" method, it provides a more natural behavior of the respondent and the ability to correctly measure a number of metrics (for example, task execution time).

However, a lot of useful qualitative data remains behind the scenes. Having seen this or that behavior, you cannot understand why the respondent acts this way. Of course, you can ask about it at the end of the test, but most likely the respondent will remember only the last task well. In addition, his opinion of the system may change while completing the tasks, and you will get only the final picture, not the first impressions.

Think Aloud (thinking out loud). For a long time this method was used most often in usability testing; Jakob Nielsen once called it the main tool for evaluating usability. The essence is that you ask the respondent to voice all the thoughts that arise while working with the interface and to comment on all his actions. It looks like this: "Now I'm going to add this product to the cart. Where is the button? Oh, here it is. Oops, I forgot to check what color it was."

The method helps to understand why the user behaves in one way or another and what emotions the current interaction evokes in him. It is cheap and simple, even an inexperienced researcher can handle it.

However, it has its drawbacks. First, it is not natural for people to "think out loud" all the time: they will often fall silent, and you will have to keep reminding them to talk. Second, tasks take somewhat longer with this method than in real life. In addition, some respondents begin to use the product more deliberately: verbalizing the reasons for their actions, they try to act more rationally (nobody wants to look foolish), so you may miss some intuitive moments of their behavior.

Active moderator intervention. The method is well suited to testing concepts and prototypes. While the tasks are being performed, the moderator actively interacts with the user: at the right moments he finds out the reasons for the user's behavior and asks clarifying questions. In some cases the moderator can even give unscheduled tasks arising from the dialogue.

This method allows you to collect the maximum amount of qualitative data. However, it can only be used if you trust the professionalism of your moderator. Questions worded incorrectly or asked at the wrong moment can greatly affect the behavior and impressions of the respondent and even invalidate the test results. Also, with this method almost no metrics can be measured.

Retrospective Think Aloud (RTA). This is a combination of the first two methods. The user first performs all the tasks without interference; then a video of his work is played back to him, and he comments on his behavior and answers the moderator's questions. The main disadvantage of the method is that testing time increases greatly. However, there are cases when it is optimal.

For example, we were once faced with the task of testing several types of mobs (game monsters) in an RPG. Naturally, we could neither distract the respondents with questions nor force them to comment on their actions during combat: that would make it impossible to play where concentration is needed to win. On the other hand, after a series of fights the user would hardly remember whether he noticed that the first rat's axe was glowing red. So in this test we used the RTA method: with each user we reviewed his fights and discussed which monster effects he noticed and how he understood them.

Try to think about how to get enough data while keeping the respondent's behavior as natural as possible. Despite the simplicity and versatility of thinking aloud, which has long been the most popular method in usability testing, we increasingly try to replace it with observation. If the moderator sees interesting behavior, he waits until the respondent completes the task and asks about it afterwards: immediately after the task, the respondent is still likely to remember why he acted that way.

The eye tracker helps a lot here. Seeing the current focus of the respondent's attention, you can better understand his behavior without asking unnecessary questions. In general, the eye tracker significantly improves the quality of moderation, and this role, in my opinion, is no less important than its ability to build heatmaps.

Metrics

Metrics are quantitative indicators of usability. As a result of testing, you always get a set of problems found in the interface. Metrics allow you to understand how good or bad everything is, as well as compare with another project or previous versions of the design.

What are the metrics

According to ISO 9241-11, the main characteristics of usability are effectiveness, efficiency, and satisfaction. Different metrics may be relevant for different projects, but they are all tied in one way or another to these three characteristics. Below are the most commonly used metrics.

Task success. You can use a binary scale: the respondent either completed the task or did not. We often follow Nielsen's approach and distinguish three levels of success:

  • coped with the task with almost no problems - 100%;
  • encountered problems, but completed the task on their own - 50%;
  • did not cope with the task - 0%.

If 4 out of 12 respondents coped with the task easily, 6 - with problems, and 2 failed, then the average success on this task will be 58%.
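For illustration, this average can be computed mechanically; a minimal Python sketch, using the counts from the example above:

    # Three-level success scale: 1.0 = coped easily, 0.5 = coped with problems, 0.0 = failed.
    scores = [1.0] * 4 + [0.5] * 6 + [0.0] * 2  # the 12 respondents from the example

    average_success = sum(scores) / len(scores)
    print(f"Average task success: {average_success:.0%}")  # -> 58%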

Sometimes respondents who differ greatly in how much trouble they had will fall into the same middle group. For example, one respondent struggled with every field of the form, while another made only a slight mistake at the very end. You can assign the grade at your own discretion, depending on what happened in the test: for example, 25% if the respondent barely struggled through the task, or 80% if he made only a minor mistake.

To avoid too much subjectivity, think through the rating scale in advance rather than deciding for each respondent after the test. It is also worth considering what to do with errors. For example, you gave the task of buying cinema tickets on the project "Kino Mail.Ru". One of the respondents accidentally bought a ticket for today instead of tomorrow and did not notice it. He is confident that he has coped with the task and has the ticket in hand. But his mistake is so critical that he will not get into the cinema, so I would give 0%, even though the ticket was bought.

Success rate is a very simple and straightforward metric, and I recommend using it if your assignments have clear goals. A glance at the assignment success graph allows you to quickly identify the most problematic areas of the interface.

Task completion time. This metric is meaningful only in comparison. How do you know whether it is good or bad that a user completes a task in 30 seconds? But the fact that the time has decreased compared to the previous version of the design is already good. Or that registration on our project takes less time than on competitors'. There are interfaces where reducing task time is critical - for example, the working interface of a call-center employee.

However, this metric is not applicable to every task. Take choosing a product in an online store. Users should quickly find the filters and other interface elements related to product search, but the selection itself will take them different amounts of time, and that is completely normal. When choosing shoes, women are ready to look through 20 pages of search results. That does not necessarily mean there were no suitable products on the first pages or that they do not see the filters; often they just want to see all the options.

Frequency of problems. Any usability test report contains a list of issues encountered by respondents. The number of respondents who encountered a problem is an indicator of its frequency within the test. This metric can only be used if your users performed exactly the same tasks.

If there were variations in the test, or the tasks were not rigidly scripted but composed on the basis of an interview, then the frequency will be harder to calculate: you will have to count not only those who encountered the problem, but also estimate how many respondents could have encountered it (performed a similar task, entered the same section). Still, this characteristic lets the team understand which problems should be fixed first.
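A minimal sketch of this bookkeeping (the problem names, respondent IDs, and counts are invented for illustration):

    # Frequency = respondents who hit the problem / respondents who could have hit it
    # ("exposed" = performed a task in which the problem could occur at all).
    problems = {
        "missed the cheaper payment method": {
            "hit": {"r01", "r04", "r07"},
            "exposed": {"r01", "r02", "r03", "r04", "r05", "r06", "r07"},
        },
        "confused by the price filter": {
            "hit": {"r02"},
            "exposed": {"r01", "r02", "r03", "r05"},
        },
    }

    for name, d in sorted(problems.items()):
        freq = len(d["hit"]) / len(d["exposed"])
        print(f"{name}: {len(d['hit'])}/{len(d['exposed'])} ({freq:.0%})")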

Subjective satisfaction. This is the user's subjective assessment of the convenience or comfort of working with the system. It is captured with questionnaires that respondents fill out during or after testing. There are standard questionnaires - for example, the System Usability Scale, the Post-Study System Usability Questionnaire, or the Game Experience Questionnaire for games. Or you can create your own.

These are far from the only possible metrics. For example, here is a list of 10 UX metrics that Jeff Sauro highlights. For your product the metrics may be different: say, how well respondents understand the rules of the game, or how many mistakes they make when filling out long forms. Remember that the decision to measure metrics imposes a number of limitations on testing: respondents should act as naturally as possible and under the same conditions. Therefore, it is good to provide:

  • Uniform starting points. The same tasks should start from the same point in the interface for every respondent. You can ask respondents to return to the home page after each task.
  • No intervention. Any communication with the moderator can affect performance metrics if the moderator unwittingly prompts the respondent, and it also increases task completion time.
  • Order of tasks. To compensate for the learning effect in comparative testing, be sure to alternate the order in which the compared products are presented: have half of the respondents start with your project and half with the competitor's (see the sketch after this list).
  • Success criteria. Think in advance what kind of behavior you consider successful for the assignment: for example, is it permissible for the respondent not to use filters when selecting a product in an online store.
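The counterbalancing mentioned above is trivial to plan in advance; a sketch with invented respondent IDs:

    # Alternate which product each respondent sees first, to cancel out the learning effect.
    respondents = [f"r{i:02d}" for i in range(1, 13)]
    order = {
        rid: ("our product", "competitor") if i % 2 == 0 else ("competitor", "our product")
        for i, rid in enumerate(respondents)
    }
    # r01 starts with our product, r02 with the competitor's, and so on.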

Interpretation of metrics

Remember that classic usability testing is qualitative research, and the metrics you get are primarily illustrative. They give an overview of the different scenarios in the product and let you see the pain points: for example, that account settings are harder to deal with than registration. They can also show the dynamics of change if you measure them regularly; that is, metrics make it possible to see that a task has become faster in the new design. Such relative comparisons are much more indicative and reliable than the absolute values of the metrics.

Jeff Sauro, a statistician specializing in UX research, advises not to present metrics as bare averages but to always report confidence intervals. This is much more correct, especially when there is variance in the respondents' results. You can use his free online calculators for this: one for task success and one for task time. Statistical processing is also indispensable when comparing results.
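For illustration, here is roughly what such calculations do: an Adjusted Wald interval for completion rates and a log-transformed interval for task times (times are skewed, so the interval is built around the geometric mean). This is a sketch with invented numbers, not Sauro's calculators themselves; for very small samples a t critical value would give slightly wider time intervals:

    import math
    from statistics import NormalDist

    z = NormalDist().inv_cdf(0.975)  # ~1.96 for a 95% confidence level

    def adjusted_wald(successes, n):
        """95% CI for a task completion rate (Adjusted Wald)."""
        p = (successes + z * z / 2) / (n + z * z)
        margin = z * math.sqrt(p * (1 - p) / (n + z * z))
        return max(0.0, p - margin), min(1.0, p + margin)

    def log_time_ci(times):
        """Approximate 95% CI for typical task time via a log-transform."""
        logs = [math.log(t) for t in times]
        mean = sum(logs) / len(logs)
        sd = math.sqrt(sum((x - mean) ** 2 for x in logs) / (len(logs) - 1))
        half = z * sd / math.sqrt(len(logs))
        return math.exp(mean - half), math.exp(mean + half)

    print(adjusted_wald(10, 12))                  # 10 of 12 respondents completed the task
    print(log_time_ci([34, 41, 28, 73, 55, 39]))  # invented task times, in seconds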

When metrics are needed

Not every usability test report contains metrics. Collecting and analyzing them takes time and imposes restrictions on the test method. Here are the cases when they are really needed:

  • Prove. There is often a need to prove that changes must be made to the product - especially in large companies. For decision-makers, numbers are clear, understandable, and familiar. When you show that 10 out of 12 respondents could not pay for an item, or that registration takes on average twice as long as on competitors' sites, the research results carry more weight.
  • Compare. If you are comparing your product with others on the market, you also need metrics. Otherwise you will see the advantages and disadvantages of the different projects, but you will not be able to assess what place your product occupies among them.
  • See changes. Metrics are good for regularly testing the same product after changes are made. They let you see progress after a redesign and draw attention to the places left without improvement. You can again use these indicators as evidence to show management the payoff of investing in the redesign. Or simply to understand that you have achieved results and are moving in the right direction.
  • Illustrate, accentuate. The numbers help illustrate important issues well. Sometimes we count them for the brightest and most important points of the test, even if we do not use metrics in all tasks.

However, we do not use metrics in every test. You can do without them if the researcher works closely with the project team, there is internal trust and the team is mature enough to correctly prioritize problem solving.

Data capture method

It would seem, what's wrong with a notebook and a pen or just an open Word document? In today's Agile development world, UX researchers should try to deliver their observations to the team as quickly as possible.

To reduce analysis time, it is a good idea to prepare a template for notes in advance. We tried specialized software for this (for example, Noldus Observer or Morae Manager), but in practice ordinary spreadsheets turned out to be the most flexible and versatile. Mark up the table in advance: the questions you definitely plan to ask, places for recording the problems found in each task, and the hypotheses (for each respondent you note whether it was confirmed). Our tables are organized along these lines.
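As a rough code approximation (the column names and tasks are invented, not the template from the original post), such a per-respondent log can be generated like this:

    import csv

    # One row per (respondent, task); the empty cells are filled in live during the session.
    columns = ["respondent", "task", "success (0/50/100)", "time_sec",
               "problems observed", "hypothesis confirmed?", "quotes / notes"]
    tasks = ["find product", "use price filter", "checkout"]

    with open("observation_log.csv", "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(columns)
        for task in tasks:
            writer.writerow(["r01", task, "", "", "", "", ""])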

What else can you use:

  • ... A customizable Excel template for entering observations for each respondent. A built-in timer measures task completion time, and time and success graphs are generated automatically.
  • Rainbow Spreadsheet by Tomer Sharon of Google. A visual table for collaboration between the researcher and the team. The link leads to an article describing the method, and there is also a link to a Google spreadsheet with a template.

With experience, most of the notes can be made right during the test. If you do not manage that, write down everything you remember immediately after the session. If you return to the analysis a few days later, you will most likely have to rewatch the video and spend much more time.

Preparing for testing

In addition to the method, metrics and the testing protocol itself, you need to decide on the following things:

Format of communication with the moderator. The moderator can sit in the same room as the test participant; in that case it is easy for him to ask questions at the right time. However, the moderator's presence can influence the respondent, who will start asking the moderator questions and provoking him into prompting, explicitly or implicitly.

We try to leave the respondent alone with the product for at least part of the test: his behavior becomes more relaxed and natural. And so as not to run back and forth if something goes wrong, you can leave a messenger with an audio connection switched on, so the moderator can contact the respondent from the observation room.

Method of setting tasks. Tasks can be read aloud by the moderator. But in this case, despite a uniform testing protocol, the text of the task may be pronounced slightly differently each time, especially if the test is conducted by several moderators. Sometimes even small differences in wording put respondents in different starting conditions.

To avoid this, you can either "train" the moderators to always read the texts of the assignment, or give the respondents assignments on pieces of paper or on the screen. The difference in wording ceases to be a problem if you use a flexible scenario, when tasks are formulated during the test, based on an interview with a moderator.

You can use the product tools for setting assignments. For example, when testing ICQ, respondents received tasks through a chat window with a moderator, and when testing Mail.Ru Mail, they received them in letters. This way of setting tasks was as natural as possible for these projects, and we also tested the basic correspondence scripts many times.

Creating a natural context. Even in laboratory research, think about how to bring the use of the product in the test closer to real conditions. For example, if you are testing mobile devices, how will respondents hold them? For a good video image it is better when the phone or tablet is fixed on a stand or lying on a table. However, that will not show you whether all the zones are reachable and comfortable to tap, because phones are often held in one hand, and people use tablets lying on the couch.

It is worth thinking about the environment in which the product will be used: whether something distracts a person, is it noisy, is the Internet good. All of this can be simulated in the laboratory.

Test plan for the customer. This is also an important preparation step, since it involves the project team. You do not have to tell the customer about all the methodological details of the test (how you will communicate with the respondent, record data, and so on). But be sure to show him what the tasks will be and what you are going to check with them. Perhaps you did not take into account some features of the project, or the team will have additional ideas and hypotheses. We usually put this summary together in a similar table.

Report outline. Naturally, the report is written from the research results. But it is good practice to draft a report outline before the tests, based on the goals and objectives of the study. With such a plan in front of you, you can check your scenario for completeness and prepare the most convenient forms for recording data for later analysis. Perhaps you will decide that a report is not needed at all and a shared observation file is enough for you and the team. And if you motivate the team to fill it in together with you, even better.

Of course, you can just “let your friend use the product” and watch what difficulties they have. But a well-written scenario will allow you not to miss important problems and not accidentally push the respondent to the answers you need. After all, usability testing is a simplified experiment, and in any experiment, preliminary preparation is important.

Any usability testing protocol consists of the following parts (a rough time-budget sketch follows the list):

  • Briefing (greeting, description of the event, signing of documents).
  • Introductory interview (screening check, short interview about product use, context and scenarios).
  • Working with the product (testing tasks).
  • Collecting final product impressions based on testing experience.
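As a back-of-the-envelope illustration of how these parts add up (all durations here are assumptions, consistent with the hour-and-a-half guideline discussed below):

    # Illustrative session plan; the durations are assumed, not prescribed.
    protocol = [
        ("Briefing: greeting, process description, documents", 10),
        ("Introductory interview: recruiting check, context",  10),
        ("Working with the product: test tasks",               55),
        ("Final impressions: interview and questionnaires",    15),
    ]
    for stage, minutes in protocol:
        print(f"{minutes:3d} min  {stage}")
    print(f"{sum(m for _, m in protocol):3d} min total")  # 90 minutes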

Briefing

Regardless of the subject matter of testing, any research starts the same way. What should be done:

Create an atmosphere. Greet the person, offer tea, coffee or water, show where the toilet is. Try to relax the respondent a little: he may be nervous before the session. Ask whether it was easy to find you; ask about his mood.

Describe the process. Tell the respondent what kind of session awaits him, how long it will take, what parts it consists of, and what you will be doing. Be sure to point out that his input will help improve the product and that you are not testing his abilities. If you are recording video, warn the respondent and tell him that the recording will not appear online. I say something like this:

We are located in the office of Mail.Ru Group. Today we will talk about the XXX project. It will take about an hour. First, we will talk a little, then I will ask you to try to do something in the project itself, and then we will discuss your impressions. We will record what is happening in the room and on the computer screen. The recording is needed solely for analysis, you will not see yourself on the Internet.

We are conducting research to make the XXX project better, to understand what needs to be corrected in it and in which direction it should develop. Therefore, I ask you to openly express any comments: both positive and negative. Don't be afraid to offend us. If, when studying the project, something does not work out for you, take it easy. This means that we have found a problem that the project team needs to fix. The main thing is to remember that we are not testing you, you are testing the product. If you're ready, I suggest you get started.

Sign the documents. As a rule, this is consent to the processing of personal data, and sometimes also a non-disclosure agreement about the testing. For tests with minors, parental consent is required for the child to participate in the study; we usually send it to the parents in advance and ask them to bring it along. Be sure to explain why you are asking for signatures, and give people time to read the papers. In Russia, people are wary of any documents that need to be signed.

Configure equipment. Whether you are using eye tracking, biometric equipment, or simply recording video, it's time to turn it on. Warn the respondent when you start recording.

Introductory interview

It solves the following tasks:

Check the recruiting. Just in case, always start with this, even if you trust the agency or the person who found the respondent. More than once we discovered during a test that the respondent had misunderstood the screening questions and did not actually use the product the way we needed. Try to move away from formality and not ask the questions from the screening questionnaire verbatim: the person may already know what to answer.

Product use scenarios and context. Even if you have little time for the test, do not skip this point. At a minimum, ask the respondent what tasks he solves with the product, whether he uses similar projects, under what conditions he interacts with them and from what devices. The answers will help you better understand the reasons for the respondent's behavior and, if you are using flexible scenarios, formulate appropriate tasks. If there is enough time, ask the respondent to show what he usually does and how; this serves as a source of further questions and insights.

Expectations and attitudes. The beginning of testing is a good time to find out what the respondent knows about the product, how he feels about it and what he expects from it. After the test you will be able to compare those expectations with the final impression.

For most tests, this introductory interview structure will work. If you are testing a brand-new product, you may want to skip the introductory questions: going into too much detail on the topic can create certain expectations about the product. Leave only a couple of general questions to establish contact with the respondent and proceed straight to the tasks; scenarios, attitudes, and context are better discussed after the user has explored the product for the first time.

Working with the product, drawing up tasks

What are the tasks

Let's say you want to test an online store. You have important scenarios (product search and selection, the checkout process), known problems (frequent mistakes in the payment form), and even a hypothesis that the designer got too clever with the price filter. How should the tasks be formulated?

Focused tasks. It seems obvious to write something like: "Choose a dishwasher 45 centimeters wide with a beam-on-floor function that costs no more than 30 thousand rubles." This motivates the respondent to use the filters and compare products with each other. You can check the price filter on every respondent and observe the key scenario of product selection. Such tasks are quite true to life and are good for testing specific hypotheses (as with the price filter).

However, if the entire test consists of them, then you risk the following:

  • A spot check of the interface. You will only find problems related to the details of the task (the filters by price and width). You will not see other problems - say, with sorting or the other filters - unless you specify them too. And you are unlikely to manage to write tasks for every element of the site.
  • Lack of involvement. Users often perform such tasks mechanically: when they see the first item matching the criteria, they stop. Perhaps the respondent has never chosen a dishwasher in his life and does not care what a "beam on the floor" is. The more the task resembles a real-life situation and the more context it contains that the user understands, the higher the chance of engaging the respondent, who will imagine he is actually choosing a product. And an engaged user "lives" the interface more fully, leaves more comments, and raises your chances of finding problems and gaining useful knowledge about the behavior and characteristics of the audience.
  • A narrowed range of insights. In real life the user might have chosen the product quite differently. For example, he might not have used the filters at all (and here you pointed him to them). Or he might have searched by criteria that do not exist on the site. By giving rigid, focused tasks, you will not learn the real context of using the product, you will not discover scenarios the project team may not have foreseen, and you will not collect data on content and functionality needs.

Tasks with context. One way to engage users better is to add a real story and context to a dry task. For example, instead of "Find a recipe for a plum cake on the site", suggest: "Guests are coming in an hour. Find something you can bake in that time. You have everything for a sponge cake in your fridge, plus a few plums. But, unfortunately, there is no butter."

A similar approach can be used with an online store. For example: "Imagine you are choosing a gift for your sister. Her hair dryer recently broke, and she would be delighted to have a new one. You need to stay within 7 thousand rubles." It is important that the respondent actually pick a real person for whom he will "buy" the gift (if there is no sister, suggest another relative or a friend). The key factor in such tasks is the reality and clarity of the context. It is easy to imagine choosing a gift for your family; it is much harder to imagine being "an accountant preparing an annual report".

A striking example of this approach is the "Bollywood Method" invented by the Indian UX expert Apala Lahiri Chavan. She argues that it is difficult for Indians, as for many Asians, to openly express opinions about an interface. But when they imagine themselves as heroes of fictional dramatic situations (as in their favorite films), they open up and begin to participate actively in testing. So tasks for Indian respondents should look something like this:

Imagine that your beloved young niece is about to get married. And then you find out that her future husband is a swindler, and even married. You urgently need to buy two tickets for the flight to Bangalore for yourself and for the cheater's wife in order to upset the wedding and save the family from shame. Hurry up!

Tasks based on the respondents' experience. Recall: for successful testing, respondents must match the project's audience. So, to check an online store of household appliances, we recruit people who have recently chosen appliances or are choosing them now, and this is what we use when composing tasks from the respondents' experience. There are two ways to apply the approach:

  • Respondent parameters. Here you adapt fixed tasks to each respondent. For example, in the case of the home-appliance store and the filter task, ask the person what exactly he recently purchased, find out the criteria (price, functions), and offer to repeat the "purchase" on your site.
  • Respondent scenarios. The tasks are formed entirely from the participants' experience. To understand which scenarios to test, the moderator finds out exactly how the person solved the problem in real life and suggests doing the same on the site. For example, a buyer compared several models at length before making a choice. Even if the site has no suitable feature, ask the respondent to compare products, to understand what parameters he relies on. You might get an idea of what a compare feature should look like, and you can also adapt the product page to this scenario.

Such tasks provide many real-life examples of how basic operations are performed in the product, which often yields a much wider range of problems and findings. They also let you test the product on scenarios you did not consider basic or had not even thought of.

When we tested the project "Mail.Ru Real Estate", many discoveries came precisely from tasks based on the respondents' experience. We saw that when people look for an apartment in the Moscow region, they select the end stations of metro lines in the geo-filter, meaning stations reachable from the region, while we had expected the metro filter to be used to find an apartment near a station. We also learned how search scenarios for new buildings differ from those for resale housing, which helped us move the search for new buildings into a separate section of the site, with its own filters and its own way of describing apartments. I also recommend Jared Spool's excellent article on the benefits of such tasks.

Tasks without tasks. Sometimes it is better not to give users any tasks at all, but to watch how they begin to explore the product on their own. Give the respondent an introduction: "Imagine you decided to try this product. I will leave you for a few minutes. Do what you would do in real life. I am not giving you any assignments."

It is important that the moderator actually leave the room. Otherwise the user is tempted to ask something right away, to clarify: "Do I need to register? How do I do that?" and so on.

This type of task is useful for completely new products. We often use it for mobile apps and games. This is how we find out whether users read the tutorial materials, what details immediately attract attention, what people understand about the product concept, and how they later describe its capabilities. After the free task, the planned specific scenarios follow.

Another area where free tasks work is content projects. If you want to understand how your articles are read (where readers linger, what they skip, which elements on the page they notice), just leave the respondent alone with the project for a few minutes. Only without a moderator looking over his shoulder will the user relax and read the text the way he usually does. This is how we test the projects "News Mail.Ru", "Lady Mail.Ru" and others. The approach allowed us to identify different patterns of behavior on the site and of reading articles, and to understand which types of materials should be laid out differently.

Writing good tasks

The first task should be simple. Start the test with introductory, easy tasks. The respondent needs to get comfortable with the format, especially if you are using the think-aloud method: he has to get used to voicing his thoughts and feelings. Do not dump all the pain and suffering of the interface on him right away.

Don't prompt. Formulate tasks so that you do not nudge the respondent toward the right actions. If you want to test adding products to favorites in an online store, do without the task "Let's add this TV to favorites", especially if the button is labeled exactly that. Having read the task, the respondent will simply find the button with the matching caption on the screen, perhaps without even understanding what he is doing.

It is better to explain the meaning of the task without using the interface's own terms. For example: "The site lets you save the products you like and then choose which of them to order. Let's try that with such-and-such a TV."

Watch the terminology. Do not use obscure words and labels. It seems obvious, but, having grown used to certain terms, we often forget that few people outside the IT community know them. For example, when testing the new threads functionality (message chains) in Mail.Ru Mail, we had a hard time: users unfamiliar with the feature simply have no word in their heads for threads.

As a result, we did not name them at all. We simply showed respondents a mailbox with the chains connected, discussed the new feature, and let users choose their own word for threads. This later helped us use the most understandable wording in educational and promotional materials.

Watch not only the tasks but also the moderator's questions, especially those that come from the team during testing. For example, when discussing features you should not use the word "toolbar": not everyone is familiar with it. A few years ago not all users even knew the word "browser". How best to formulate the tasks depends on the testing audience. But do not rush to the other extreme, explaining every term in a row: experienced gamers do not need "buff", "frag" or "respawn" explained.

Less test data. It is often tempting to create a test account for the respondent and run the testing in it. After all, you can check everything in that account in advance, avoid collisions, and not waste time on registering or authorizing the respondent. It is also often technically much easier to roll out a new design on test data than on real data.

However, with this approach you risk getting much less useful results, because test actions have no real consequences. The situation becomes completely artificial, and it is hard for users to project it onto real experience.

For example, when working in their own social network accounts, respondents, as in real life, are careful about everything their friends can see (posting links, sending messages). When setting up their own mailbox, they try not to delete important letters. When testing online stores, an approach is sometimes used where the reward must be spent right during the test: in that case the respondent will not point at the first product that fits the task, but will pick what he really needs.

With test data alone you will find only the problems related to that data, and you will not exercise the functionality on different variations. For example, when we tested the social panel of the Amigo browser, one respondent who connected his VKontakte account to the panel immediately noted that it was inconvenient to read this way: almost his entire feed consisted of subscriptions to groups with erotic photos, and in the narrow panel it was simply impossible to make anything out in the pictures.

Another problem with test data is that it makes the system harder to understand, because everything around is unfamiliar. For example, a social network user is used to recognizing his page by his own photo. Even when testing prototypes we try to personalize them as much as possible: when testing clickable prototypes for Odnoklassniki, we adapt them to each user, inserting his name and photo, and sometimes the latest news on his page.

Don't limit yourself to the interface. Keep in mind that interaction with a product often goes beyond its interface. If possible, test related products and services and the links between them. When testing games, we try to check not only the game itself but also its website, the downloads, registration in the game, and searching for information on the forum. And when testing one online store, we also checked the operator's call after the order was placed, which yielded recommendations for the call center.

Think about timing. For a good scenario it is important to prioritize the tasks. If the system is large and the test has many goals, you will most likely want a lot of tasks. However, a tired respondent is no longer useful. A good test lasts no more than an hour and a half; two hours is the maximum, with games as the only exception. And remember that the session is not just the tasks: there are also interviews, questionnaires, equipment setup, and signing documents. All of this usually takes at least half an hour.

If there are too many tasks and you do not want to give any of them up, you can put the lowest-priority ones into rotation, that is, show them to only some of the respondents. Or make part of the test compulsory for everyone and go through the rest only with those for whom there is enough time - though these will most likely be the most successful respondents.

Evaluate the usefulness of each task. Consider whether it really matches your hypotheses. Say you want to test the news subscription feature on a site. The task "Subscribe to the newsletter" will only check whether those who deliberately look for the subscription can find it. But people rarely come to a site in order to subscribe to news, so the task does not correspond to real life. What you need to understand is whether the subscription option is noticed by people busy with completely different tasks.

You can check this in different ways, depending on how the feature is implemented. If the person performed tasks during which he might have seen the subscription option, ask him whether the site has one. Just be sure to clarify where he saw it or how it is implemented, to make sure the respondent is not simply agreeing with you.

If the subscription offer is built into the registration or checkout flow, watch whether the respondent uses it, and then discuss it after the task. The chance that people will actually subscribe to mailing lists in a laboratory setting is very small, but you can check whether a person noticed the option, what he expects from the mailing list, and so on.

Collecting final impressions

The goal of the final testing phase is to collect impressions of working with the product, to understand what the user liked and what upset him, to assess subjective satisfaction. Typically, this part of the test uses a combination of an interview with a moderator and filling out formal questionnaires.

Moderator interview

In the final interview we always ask respondents roughly the same questions: "What are your impressions?", "What did you like and what did you not?", "What would you like to change in the product?" This is the time to clarify puzzling moments in the respondent's behavior, if you did not do so during the test. If before the test you asked users about their attitude to and expectations of the brand or product, find out whether anything has changed. When interviewing, pay attention to the following:

Social desirability. Handle interview results very carefully. If during the test you often hear impulsive comments triggered by problems, in the final interview social desirability flourishes in full force.

Some people think that by talking about problems in the product they admit their own incompetence. Others simply do not want to upset a pleasant moderator. Very often respondents (especially women) who suffered through the whole test say that everything is basically fine. Negative reviews can also be dictated by social desirability: if the respondent is sure that the purpose of the test is to find flaws, he will diligently try to find them.

Quotes and priorities. Although everything the test participants say in the final interview often needs to be divided by two or even by ten, that does not make it useless. From the way respondents summarize their impressions, you can infer priorities. The product is bad? What exactly made it so? Which of the many problems did the respondent remember most and find most annoying?

However, make allowance for the fact that the last task is remembered best. It is also very useful to track which adjectives respondents use to describe the product and what they compare their experience to.

Let's not forget about the good. Very often, a usability test report is a long list of problems found during the test. In general, the search for problems is one of the main tasks of research. But don't forget about the positive aspects of the product.

First, a report with no positive findings simply demotivates the team. Second, it is useful to know what users like about the product: what if, at the next redesign, someone decides to remove the very feature everyone liked so much? So be sure to ask respondents about the positive aspects of the product, even if they scolded the interface throughout the testing.

Attitude toward wishes. Most likely, respondents will express wishes and ideas in addition to their impressions. Your task is to understand what problem stands behind each proposal, because the solutions users suggest will most likely not work for you: test participants are not designers and do not know the features and constraints of development. However, behind any such request there is a need that you must capture. If a respondent says he definitely needs a big green button here, be sure to ask why.

Measuring satisfaction

It is often hard to tell from the final interview whether the respondent liked the product, and harder still to compare the attitudes of several respondents who each noted both advantages and disadvantages. Here questionnaires come to the researcher's aid. First, when filling out a questionnaire (especially before talking with the moderator), the influence of the notorious social desirability is somewhat smaller, although you will never get rid of it completely. Second, a questionnaire gives you clear parameters for comparing scenarios, products, or project stages.

Writing a good questionnaire is a separate and very large topic: wording, scales, and much more matter here. Ready-made, validated questionnaires can be a great help, since they have been refined and tested many times. The only problem is that almost none of them have official Russian translations. You can, of course, translate them yourself, but methodologically such translations need to be validated to check the wording. Even so, these questionnaires can serve as a guide when compiling your own.

There are questionnaires that are given after each assignment to assess satisfaction with specific scenarios. For example:

  • After Scenario Questionnaire (ASQ). Three questions covering the ease of the task, the time it took, and the system's supporting information.
  • Single Ease Question (SEQ). A single question about how difficult the scenario was.

And there are questionnaires that are used in the final phase of testing. Here are some examples that we use when needed:

  • System Usability Scale (SUS) and Post-Study System Usability Questionnaire (PSSUQ). Two classic and popular questionnaires created over 20 years ago. Both consist of statements with which respondents indicate their degree of agreement; the statements characterize the product's usability from different angles. For example: "I could easily find the information I needed", "The various system capabilities are easy to access", and so on. (A SUS scoring sketch follows this list.)
  • ... A questionnaire that often helps us in tests. The user is given a set of adjectives and picks the ones that characterize the product; the result is a word cloud of your project's characteristics. This technique often produces very interesting results.
  • Game Experience Questionnaire. Classic usability questionnaires do not apply to games, where engagement in play matters far more than interface clarity. For games, either compose a special questionnaire or use the Game Experience Questionnaire, which contains several modules: a core module, an in-game version, a post-game module, and a module on the game's social possibilities.
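The SUS mentioned above has a standard published scoring scheme (Brooke, 1996): each of the ten items is answered on a 1-5 agreement scale, odd items contribute (score - 1), even items contribute (5 - score), and the sum is multiplied by 2.5 to land on a 0-100 scale. A minimal sketch:

```typescript
// Standard SUS scoring. `answers` holds the ten responses on a
// 1-5 agreement scale, in questionnaire order (item 1 first).
function susScore(answers: number[]): number {
  if (answers.length !== 10) throw new Error("SUS has exactly 10 items");
  const rawSum = answers.reduce((acc, a, i) => {
    // Items 1, 3, 5, 7, 9 (index 0, 2, ...) are positively worded: score - 1.
    // Items 2, 4, 6, 8, 10 are negatively worded: 5 - score.
    return acc + (i % 2 === 0 ? a - 1 : 5 - a);
  }, 0);
  return rawSum * 2.5; // map the 0-40 raw sum onto 0-100
}

// Example: one respondent's answers.
console.log(susScore([4, 2, 5, 1, 4, 2, 5, 1, 4, 2])); // 85
```

Scores from different respondents, products, or test rounds can then be compared directly on the same 0-100 scale.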

It is estimated that about half of all sites lose customers because of inconvenient design and other usability shortcomings. Just imagine: a clumsy order form or an unclickable picture, and a potential buyer takes their money to a competitor. To find out whether visitors like your site and whether it is convenient for them to shop on it, conduct usability testing.

What should be tested?

If the site is already live, analytics will show which elements need testing. A real example of a usability test by Imaginary Landscape: Google Analytics showed that users were visiting the page with the feedback form but not submitting a request for a call. Usability testing made it clear that people simply did not want to fill in that many fields. The form was reworked, the number of fields was cut from 11 to 4, and conversion grew by 140% (that is, to 2.4 times the original rate). Profit.


Images are tested too. They should grab users' attention, evoke the right emotions, and nudge them toward the desired action. If the heatmap shows that a picture gets clicked but leads nowhere, it is logical to turn it into a link to something useful and relevant. Or the reverse: clicking the image reveals interesting information, only users do not know about it. That problem is easy to solve: make the picture react to the cursor hovering over it, slightly enlarging or lighting up, as if inviting the click (see the sketch just below). But first you need to discover that the problem exists, and for that you conduct usability testing.
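As a sketch of that hover affordance (the `.promo-image` selector and inline styles are placeholders; a pure-CSS `:hover` rule would do the same job):

```typescript
// Make candidate images visibly react to the cursor so users
// understand they are clickable.
const images = document.querySelectorAll<HTMLImageElement>(".promo-image");
images.forEach((img) => {
  img.style.transition = "transform 0.15s ease";
  img.addEventListener("mouseenter", () => {
    img.style.transform = "scale(1.05)"; // enlarge slightly on hover
  });
  img.addEventListener("mouseleave", () => {
    img.style.transform = "scale(1)"; // restore the original size
  });
});
```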

Among other things, testers check:

  • the site structure, to make sure it is simple and understandable for the user;
  • buttons for clickability, an issue that became especially acute with the arrival of so-called flat design;
  • menus, to determine why users do not enter certain sections.

In short, any element a visitor interacts with on the site, or fails to interact with even though, by our plan, they should.

Ideally, testing is carried out by specialists: they collect data, segment the target audience, form focus groups, draw up questionnaires, and analyze the results after the research itself. That is time-consuming and expensive, though well worth it. But you can also get a general idea of how users interact with your site on your own, and for free. There are at least five ways to do it.


1. Yandex Metrica tools

Among free usability-testing tools in the Runet, nothing more convenient and functional than the Yandex Metrica counters has yet appeared. Webvisor alone is worth a lot.

It lets you replay a user's cursor movements, clicks, form filling, and text selection. It is as if you were watching a video of a visitor's session on your site: where he clicked, what he typed into the search bar, or, say, at which checkout step he stopped.




This is what the Webvisor report looks like. The real magic, replaying user actions, begins when you click the play icon.

Webvisor is not the only useful Yandex tool that can help you track user reactions to your site.


A click heatmap visually shows the most clicked places on a page. For example, on the Yandex click map we can see that the "Get counter" button gets pressed, while the news in the right column is not particularly popular. The hotter the color at a point, the more often users click there.

The scroll map shows how visitors scroll the page and where they linger longest, and therefore which parts interest them. Form analytics helps you figure out how users interact with the forms on the site: which fields of an application they fill in, what they type into the search bar, and so on. All these tools are free, work well and, applied competently, will help you analyze your site's usability effectively.
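Yandex Metrica collects all of this for you; still, to make the mechanics less magical, here is a generic illustration (not Metrica's actual code) of the raw data a click map is built from: every click's page coordinates are recorded and shipped off to a collection endpoint. The `/collect` URL is a placeholder.

```typescript
// Record raw click points for a click heatmap (generic sketch).
document.addEventListener("click", (e: MouseEvent) => {
  const point = {
    x: e.pageX,              // click position, including scroll offset
    y: e.pageY,
    page: location.pathname, // which page the click happened on
    ts: Date.now(),
  };
  // sendBeacon survives page unloads, so clicks on outbound links are counted too.
  navigator.sendBeacon("/collect", JSON.stringify(point));
});
```

Aggregated over many visitors, these points become the "hot" and "cold" areas on the map.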

2. UsabilityHub

The online service offers 5 simple tests:

  • Five Second Test. The principle is as follows: you upload a screenshot of the page under test, participants look at it for 5 seconds and then answer your questions. You can ask different things: which element attracted the most attention, what was memorable, what the site is about, and the like. At the end you receive the participants' full answers plus an automatically generated word cloud.

The flow in four steps: upload a screenshot of the page, ask the participants questions, receive a report with their answers, and study the word cloud for clarity.

  • Click Test. The procedure is the same, only instead of written answers you get a heatmap of clicks. On it you may see that users, for example, ignore a strategically important (to us) button but actively click on a picture. The service also reports the number of clicks and the average time to click.
  • Question Test. You ask questions about your site, and real people answer them.
  • Navigation Test. It shows how easy it is for users to find their way around your site and whether its architecture and navigation are clear.
  • Preference Test. It helps run A/B testing of website designs, applications, or leaflets: you upload two design options, and users choose the one they like best. Simple and efficient (a small tallying sketch follows this list).
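Tallying a preference test is simple enough to sketch in a few lines; the rough 95% interval below uses the normal approximation and mainly shows why small samples give wide margins (the function and data here are invented for illustration).

```typescript
// Tally a two-variant preference test and attach a rough 95%
// confidence interval (normal approximation; illustration only).
function tallyPreference(votes: ("A" | "B")[]) {
  const n = votes.length;
  const p = votes.filter((v) => v === "A").length / n;
  const margin = 1.96 * Math.sqrt((p * (1 - p)) / n); // 95% CI half-width
  return { shareA: p, shareB: 1 - p, margin };
}

console.log(tallyPreference(["A", "A", "B", "A", "B", "A", "A", "B"]));
// { shareA: 0.625, shareB: 0.375, margin: ~0.34 }: eight votes tell you little
```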

UsabilityHub is an English-language service, but Russian is among the available testing languages. In the paid version you choose the number of participants and soon receive the results. In the free version you have to take part in other people's tests yourself, answering questions and evaluating designs; collect enough participation points and you earn the right to run your own test.

3. Usabilla

The service offers three products: website, mobile app, and email testing. Testing here means live feedback from real people: participants give ratings, write comments, point out mistakes, and take screenshots for clarity. All the data is collected into handy statistics.

Russian is among the testing languages, but the program itself is in English. In the free demo version, you can test 2 pages with 10 people.


4. Optimal Workshop

The following tools are available on the service:

  • Treejack tests the site "tree". It helps you understand how users navigate your site and whether they perform the actions you expect of them or get lost in its thickets. For testing, you need to enter your site's information architecture into a Treejack form. It sounds scary, but every line of the form is labeled, so with a little English you can handle it easily. You also set tasks for the participants, for example, to find a mobile phone on the site. As a result, you get comprehensive statistics in the form of tables and charts. The free demo version lets you survey 10 people.
  • OptimalSort tests with the card-sorting method. It helps you find out how users think, what decisions they make, and what the easiest path to their goal on your site looks like. How it works: every element of the site's content is written out on a separate card, and participants are invited to sort the cards in whatever way seems understandable and convenient to them. As a result, you get reports in the form of tables, matrices, and dendrograms, plus an understanding of your users' mental model. In the free version, you can test 30 cards with 10 respondents (a small analysis sketch follows at the end of this section).
  • Chalkmark helps you capture first impressions of a design: it shows a click heatmap and first-click analysis. To test, upload a screenshot of the page, set tasks for the participants, and wait for the results in the form of a click map, a color grid, and click-count statistics.

Website layout test results using Chalkmark

In the paid version, you can conduct online surveys with quick feedback.
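As promised above, here is a sketch of the kind of analysis behind card sorting (not OptimalSort's actual algorithm): counting how often each pair of cards lands in the same group across participants. High counts hint at cards users mentally group together, which is the raw material for the matrices and dendrograms the service draws.

```typescript
// Pairwise co-occurrence counts from card-sorting sessions.
type Sort = string[][]; // one participant's grouping: arrays of card names

function coOccurrence(sorts: Sort[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const sort of sorts) {
    for (const group of sort) {
      for (let i = 0; i < group.length; i++) {
        for (let j = i + 1; j < group.length; j++) {
          // Order-independent key for the pair of cards.
          const key = [group[i], group[j]].sort().join(" + ");
          counts.set(key, (counts.get(key) ?? 0) + 1);
        }
      }
    }
  }
  return counts;
}

// Example: two participants sorting four cards.
console.log(coOccurrence([
  [["Delivery", "Payment"], ["Phones", "Laptops"]],
  [["Delivery", "Payment", "Phones"], ["Laptops"]],
]));
// "Delivery + Payment" -> 2, "Payment + Phones" -> 1, "Phones + Laptops" -> 1, ...
```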

5. Feng-GUI

Unlike the previous tools, this one involves no feedback from living people. Based on its algorithms, the application itself generates a map of the user's attention: it shows where the user will look immediately after the page loads and in what sequence (according to the program) the gaze will move from one element to another. To test, enter the URL of the desired page, click Analyze, and a few seconds later you get the coveted attention map.

Each of the listed tools is good in its own way, but the free versions are mostly limited in functionality, and the lion's share of the services are in English.

Don't want to puzzle out the interfaces and statistics of English-language resources? Go to the people. For example, to evaluate a page design, post a screenshot on revision.ru or similar sites and ask the opinion of the locals: web designers, developers, and other specialists. Run polls on social networks, for example in thematic communities, or gather a focus group from among your acquaintances. For a more or less objective usability analysis, it is enough to interview 5 respondents (by some accounts, 8) from your target audience.
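That figure of five respondents traces back to Nielsen and Landauer's problem-discovery model: the share of usability problems found by n testers is roughly 1 - (1 - L)^n, where L is the probability that a single tester hits a given problem (about 0.31 in the oft-cited estimate). A quick sketch of the curve:

```typescript
// Nielsen-Landauer problem-discovery model: share of problems found
// by n testers, each independently finding a given problem with
// probability L (~0.31 in the commonly cited estimate).
function problemsFound(n: number, L = 0.31): number {
  return 1 - Math.pow(1 - L, n);
}

for (const n of [1, 3, 5, 8, 15]) {
  console.log(n, problemsFound(n).toFixed(2));
}
// 1 -> 0.31, 3 -> 0.67, 5 -> 0.84, 8 -> 0.95, 15 -> 1.00
```

Five testers already uncover roughly 85% of the problems, which is why going far beyond eight rarely pays off within a single test round.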

Test your site's usability. Use free services, interview acquaintances, hire professionals: all the methods are valid. Sometimes you only need to move a button, change a few design elements or the application form, and the site starts bringing noticeably more value, simply because it has become more convenient and understandable for visitors.
