The Dell Research team needed to better understand the state of its chatbot, Ava, and its competitive posture versus HP. My team and I were asked to suggest possible future-state designs for the Ava chatbot.
I screened and recruited participants, designed the tasks and scripts, moderated and observed the tests, analyzed and synthesized the data, and reported and presented the findings.
We first met with the Dell Research Lead to discuss the current state of Ava. We learned the chatbot was still in its infancy, and the study would be exploratory. The study objective was to learn how customers interact with Ava while shopping to identify potential opportunity spaces for the next iteration of Ava.
Next, we created an interaction map of the different ways customers could engage with Ava. Customers could begin a conversation through a popup window or by clicking the "chat with Ava" link in the header bar. Once engaged, they could either select a suggested link or type their own messages. Either way, Ava guides customers through finding, selecting, and purchasing a product on the Dell site. This map helped us understand how Ava works and the various ways customers could interact with it.
Primary research question
What is Ava’s current efficacy as an online shopping assistant, and in what direction should its design evolve for better usability?
Secondary research questions
How (if at all) does Ava currently fit into customers' shopping processes?
How effectively does Ava identify customers' needs and preferences when recommending products?
How do customers prefer to interact with Ava (e.g., clicking links vs. submitting custom messages)?
With these answers, the Dell team could decide whether, and how heavily, to continue investing resources in Ava.
Recruiting and testing logistics
Target participant (minimum qualities):
Performing IT work for a business or organization
Purchased an IT system in the past year
Ideal participants (nice to have):
Have used an automated chat feature before in the context of making purchases, managing accounts, or seeking technical support
Possess experience configuring hardware on data systems and platforms
Make or advise on direct purchase of IT equipment
We recruited five participants for the study. Because the project included no compensation to incentivize or reward participants, we reached out by e-mail to IT employees in my network and at local organizations (mostly participants from school), asking if they would volunteer their time.
We conducted a comparative usability test of Dell's chatbot, Ava, and HP's chatbot, the HP Virtual Agent. The two chatbots are designed for different tasks: Ava helps customers shop, while the HP Virtual Agent provides technical support. We ran the usability testing remotely using Validately, which provided flexibility for our team and our participants. It also enabled us to record and revisit each session, and to extract and share clips directly with the research team.
Prior to starting, we asked our participants to answer a few pre-test questions describing their recent IT purchase experiences. We then engaged our participants with task-based scenarios to complete the following:
Dell: You need to purchase a computer for your home office. First, purchase your computer on your own. Then, purchase your computer again using Ava.
HP: You are having trouble connecting your wireless printer. Using the HP Virtual Agent, find a solution to your problem.
We had participants think aloud while engaging with the chatbots. Our evaluation criteria included perceived ease of use, confidence, and degree of satisfaction. After each task, participants rated the chatbot by completing a System Usability Scale (SUS) survey. We concluded each session with a debrief, discussing the participant's impressions and likes/dislikes and clarifying any questions they had. Each session lasted around one hour.
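For readers unfamiliar with SUS, the ten Likert responses convert to a 0–100 score by a standard formula. The sketch below shows that conversion; the function name `sus_score` is our own for illustration and does not come from the study materials.

```python
def sus_score(responses):
    """Convert ten 1-5 Likert responses into a 0-100 SUS score.

    Standard SUS scoring: odd-numbered items are positively worded
    (contribution = response - 1); even-numbered items are negatively
    worded (contribution = 5 - response). The summed contributions
    (0-40) are multiplied by 2.5 to yield a 0-100 score.
    """
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 responses")
    total = 0
    for i, r in enumerate(responses):
        if i % 2 == 0:          # items 1, 3, 5, 7, 9 (positive)
            total += r - 1
        else:                   # items 2, 4, 6, 8, 10 (negative)
            total += 5 - r
    return total * 2.5

# Example: a moderately positive response set
print(sus_score([4, 2, 4, 2, 4, 2, 4, 2, 4, 2]))  # -> 75.0
```

A score around 68 is commonly treated as the benchmark average, which is why SUS is convenient for comparing two chatbots on the same scale.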
We evaluated the delights and pain points and incorporated those findings into our recommendations for Ava.
The research produced dozens of findings, many of which were new to the Dell team.
We discovered that participants doubted the competence and credibility of Ava (and of chatbots in general). Participants criticized Ava for being unable to distinguish between high-priority and low-priority needs, and questioned whether Ava's recommendations served the customer or Dell's profit.
It’s not allowing me to put in things I think are important to me...I can answer her questions but it doesn’t let me prioritize the things I want.
You obviously are trying to sell me something. Are you really telling me the best thing for me, or is this a machine that Dell will make the most money on if I buy it?
Participants experienced similar issues with the HP Virtual Agent, but less severely. This may be because tech support has more objective criteria for success and failure: the solution either works or it does not. Dell's chatbot, by contrast, is designed to support customers with shopping, which requires more subjective judgments. This is challenging for Ava.
Additionally, participants often reported that Ava's chat box obstructed their view of the web page. When Ava automatically redirected participants to a new page, they were unsure whether to continue chatting with Ava or to begin interacting with the page itself.
I'm confused. Which one should I interact with, the website or the chatbox?
We recommend considering the following in the next iteration of Ava:
Enable Ava to address a wider spectrum of shopping needs and distinguish between “must-haves” and “nice-to-haves”
Demonstrate better credibility by referring to reviews and making explicit statements that show how Ava’s recommended product meets the customers' needs
Provide additional affordances to improve the interactions as users switch between chatting with Ava and navigating the Dell web page
Study limitations
No compensation or incentives were available to reward participants for completing the study
Technical issues encountered during the remote usability sessions
The Dell and HP chat features being compared are designed for, and operate in, different tasks
What I learned
I learned to be scrappy and to hustle when recruiting participants. I got comfortable reaching out to people in the community.
I learned to build rapport with participants when conducting user studies remotely. It helped to start the conversation by learning about their backgrounds in advance, such as their experiences with chatbots.
What I would do differently
Add more pre-test/screening items, particularly around which products participants currently use (Dell, HP, Apple, etc.), to help check whether participants are biased toward a brand.
Re-think the HP task, since it may have been too straightforward; an easier task risks producing an artificially easier or better experience.
Recruit more, and more varied, participants. It would be interesting to test beyond IT business users and get the opinions of a diverse group.