I am a starer. It doesn’t help that my eyes are on the large side. Yesterday, sitting in the airport, I was struck by how many people assumed I was looking at them when I was really just staring out into space. So, I have a natural bias to question eye-tracking studies. But there is a real difference between the ways your face (and your eyes) react in open-ended learning situations and in information-seeking moments.
On websites and mobile devices, you are using these tools for a certain end. You are seeking something specific. Much of your interaction with the interface could be summarized by the phrase, “how do I get to the next place, page, part, link, etc.” In other words, your gaze is often the moment before you take a navigational step.
Eye-tracking studies hold real promise for understanding usage in an unmediated way. Even the smoothest researcher puts their participant on the spot; with eye tracking, the participant acts in a somewhat normal way. Tools like the Tobii do require participants to sit very still, which is not terribly real-world. But, at least, participants are not being artificially prompted by a person.
Eye-tracking studies are not just about where people look, but also about understanding this in relation to time. What did they look at first? What patterns did their gaze follow? What didn’t they look at? In other words, one is assessing behavior. This can then be correlated with attitudinal data, from their think-alouds, for example. But, at the core, eye-tracking is about behavior.
I have continued to ruminate on mobile testing. Given the pervasiveness of mobile, getting mobile right is imperative. But, at the same time, the testing options have major limitations. After all, no one actually hugs a laptop while searching for the ideal episode of Gilmore Girls on Netflix on their Surface. And they probably don’t use a sled when flipping through Pandora on their iPhone. Most testing scenarios just don’t mimic the real world; in fact, they are very different from it.
It makes me more sure that there has to be someone out there who can create the ideal mobile testing software. The big challenge is the fact that there are many different types of mobile. There is iOS, both phone and tablet. There are Android devices, and then there are the Windows tablets. Given the diversity, one might need to create a number of different mobile testing systems. (Apple has a vested interest in locking down their system. They have a controlled access mechanism, i.e. their developer program.)
Mobile is ubiquitous. We use phones to check the weather, to read the paper, and to take pictures. There are now more phones than adults on Earth. Despite the complete diffusion of mobile, there are still challenges to creating ideal mobile experiences.
Testing remotely has some powerful pluses. Being a fly on the wall helps you understand the unmediated, natural course of action of your users. Services like Loop11 make remote testing on a computer easy. But there isn’t a perfect solution on mobile. Many resourceful testers have figured out workarounds to capture similar feedback.
It does make me feel that a resourceful entrepreneur needs to figure out a way to do remote testing of mobile apps the way one uses Loop11. After all, remote testing is a way to understand how people might really use something.
Remote testing is incredibly useful for websites. After all, the World Wide Web is just that: global. Remote testing means that one can get feedback unencumbered by participants’ locations. Rather than intercepting people physically, one can recruit people as they go about their business on the site you are testing, for example. Recruitment is no longer bound to location. And, with services like Loop11, it is super easy to recruit users: just one link, and you are ready. Without the need for a synchronous appointment, you can rack up numerous user tests.
There are drawbacks to remote testing. The most important is that one loses much in the way of emotion, expressions, and verbal feedback from users. This can make it challenging to understand the reasons that users click the buttons they click.
However, remote user testing can offer high-volume feedback and identify trends. In other words, while you might not be able to say why someone did something, you can say pretty clearly that certain trends are there.
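As a sketch of what spotting trends in high-volume remote results can look like, here is a small Python example; the task names and records are invented for illustration, not drawn from any real study.

```python
from collections import Counter

# Hypothetical remote-test records: which task each participant
# attempted and whether they completed it.
results = [
    {"task": "find pricing", "completed": False},
    {"task": "find pricing", "completed": False},
    {"task": "find pricing", "completed": True},
    {"task": "create account", "completed": True},
    {"task": "create account", "completed": True},
]

# Tally failures per task. A task that fails often is a clear trend,
# even though the data never tells us *why* participants failed.
failures = Counter(r["task"] for r in results if not r["completed"])
for task, count in failures.most_common():
    print(f"{task}: {count} failures")
```

With enough sessions, this kind of tally makes the “what” categorical, even while the “why” stays out of reach.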
Moderated user testing is a useful way for testers to work with users who are not in the same location as they are. There are certain challenges, such as passing on incentives, but there are also enormous benefits, such as being able to reach testers globally.
For the tester, videotaping the session is essential. Moderated user testing can capture facial expressions and user quotes, but it is often challenging to read and assess all of that in real time.
One drawback is that appointments need to be made to run the test; this isn’t an asynchronous experience, in other words. Scheduling with a remote tester can be challenging, and on certain projects you might find that you have a lot of no-shows. So, this can end up being time consuming.
But, once everything is captured, even with a small subset of users, one can gain quite a lot of feedback, particularly attitudinal feedback. Moderated user testing is also useful in that it allows for the correlation of attitudinal and behavioral feedback.
Testing can be incredibly useful, even essential, to rolling out a new product. But it can be cost-prohibitive. Small firms might not have the resources to find the right users, employ testers, set up a room with specialized one-way glass, etc. Of course, people do testing in this way for important reasons: if you have a specialized setup, you need to have the testers on-site.
But remote research often has significant advantages. If you are developing a website with global reach, testing remotely allows you to create a diverse testing base. You save money on space and setup. Newer tools allow testers to connect remotely and see not only all the keystrokes but also the facial expressions of the participants.
Remote testing isn’t without its challenges. If your connection to your remote tester fails, you are out of luck. You might not be able to observe facial expressions clearly through the interface. You do have to find a way to send incentives to remote testers. And you might get pushback from your stakeholders, as remote testing isn’t universally accepted.
Even with the possible downsides, the significant positive points make remote research an important tool for user experience researchers.
I remember feeling like my first semiotics class was eye-opening. I had never considered that there could be an order to language, or that there was a science to understanding this order. Now, all this is a bit of an aside, but I bring it up because there is a parallel with usability testing. There is both an order to how people act and a tandem act in which evaluators observe to make sense of what people do.
Video helps this latter act considerably. Without it, the evaluator needs to scribble notes and inevitably misses things. With video, the tester has the audio, including all the verbal responses, the movements of the mouse, and the facial expressions. All of these signals are helpful in assessing usability.
The key is for the tester to create a framework where users feel comfortable testing the site and sharing their ideas. Once that framework is in place, one will find very useful information; without it, the user won’t feel comfortable sharing. A script assures the tester that they are saying the same thing each time. The script also helps the tester feel ready to put their user at ease.
Once that is done, one has the long task of making sense of the data. Often wading through all the information is almost as much fun as generating the data. Interpreting evaluation data is the process of bringing order to disorder by noticing patterns. Once the patterns are clear, a good tester develops a scheme to make sure that these patterns are obvious to anyone who reads the deliverable.
Talking is my occupation. Teaching is, in a manner of speaking, about talking and talking and talking. Or, I should say that teaching is about attempting to communicate an idea in multiple ways. Some of those ways are about your voice, others are about hearing the voices of others, and sometimes it’s about reiterating their voices.
This week, I have found my voice increasingly muted by laryngitis, and it has made me think a little about the role of voice in my work, both in teaching and in evaluation. It almost seems as if you might not need a voice at all in order to allow your participants to share theirs. But, really, evaluation and testing aren’t just about listening; they are about sharing, framing, and positioning as well. To honor the time spent by participants, one must create a situation that sets up the participant’s experience.
It isn’t just about the words one says, but also about the tone of voice, the pacing of what is said, and even the inherent emotion in the phrases. The evaluator or user experience tester is not unlike a hostess, setting up everything to put their guest at ease. In a carefully organized situation, the participant is then able to share their ideas.
Testing and scholarly research are sort of similar. You have a problem, and you want to understand why that problem is occurring, for example. Both use quantitative and qualitative data. But, in research, you want to be conclusive, exhaustive, and categorical.
In testing, you just want to make the problem better. So, in testing, you don’t choose all the ways of understanding the problem, just a few methods. The key is to choose methods that actually help you assess the problem accurately. Success rate, for example, can help you assess whether people are accomplishing a particular task. But if your goal is for users to explore much of your site, then you want to measure how many pages they are viewing.
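To make those two metrics concrete, here is a minimal Python sketch; the session records are hypothetical, invented purely to show how each metric answers a different question.

```python
# Hypothetical session logs: did the participant finish the task,
# and how many pages did they view along the way?
sessions = [
    {"completed": True,  "pages_viewed": 4},
    {"completed": False, "pages_viewed": 2},
    {"completed": True,  "pages_viewed": 7},
    {"completed": True,  "pages_viewed": 5},
]

# Success rate answers "are people accomplishing the task?"
success_rate = sum(s["completed"] for s in sessions) / len(sessions)

# Average pages viewed answers "are people exploring the site?"
avg_pages = sum(s["pages_viewed"] for s in sessions) / len(sessions)

print(f"Success rate: {success_rate:.0%}")   # 75%
print(f"Avg pages viewed: {avg_pages:.1f}")  # 4.5
```

The point is that the metric has to match the goal: the same four sessions look fine by one measure and unremarkable by the other.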
There is a useful diagram on the Nielsen Norman Group website that illustrates how particular testing tools relate to behavioral or attitudinal data. The article also illustrates which issues are best answered by quantitative data, for example, how much or how often.
Quantitative data should likely be paired with qualitative data. After all, if you know that most of the people using your app stop at a certain point, you don’t know why. It might be because it is so terribly boring, or because it is so terribly interesting. Or it could be that the text becomes illegible. Or… well, it could be anything. So, pairing the quantitative data, often found in analytics, with qualitative data gives you the information you need to understand the problem.
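This pairing can be sketched in a few lines of Python; the step names, drop-off counts, and participant comments below are all invented for illustration.

```python
# Hypothetical analytics: how many users abandoned the app at each step.
# Quantitative data like this says *where* people stop, never *why*.
drop_offs = {"step_1": 2, "step_2": 1, "step_3": 9, "step_4": 0}

# Hypothetical qualitative data: participant comments coded by step.
comments_by_step = {
    "step_3": ["text too small to read", "couldn't find the next button"],
}

# Quantitative: which step loses the most users?
worst_step = max(drop_offs, key=drop_offs.get)

# Qualitative: what did participants actually say at that step?
reasons = comments_by_step.get(worst_step, [])
print(worst_step, reasons)
```

The analytics point the finger at a step; the comments suggest an explanation. Neither alone is enough to decide what to fix.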
To go back to my original statement, testing helps you know enough to fix a particular app or website. You can make the situation better for the user. Quantitative and qualitative data are the tools you use to make these improvement decisions. But, in terms of scholarship, you would likely need many, many more points of feedback to make a categorical assessment. So, while you might be able to use a small study to fix a particular mobile app, this doesn’t necessarily help you make broad generalizations about all mobile apps.