The truth in fiction

July 21, 2023

Relative & Absolute

Emerging technologies often seem to require computer-literate users. Voice-user interfaces (VUIs), however, offer an opportunity to create truly human-centric, or human-literate, devices. Voice is an interesting medium to design for because it introduces so many variables into the user’s experience: gender, tone, accent, semantics, and emotional intelligence are just a few of the aspects that must be understood when designing a voice system. Integrating VUIs into users’ lives will require careful evaluation of human-computer interaction principles, and anticipating how people will interact with VUIs will require thoughtful conversational design. Academic researchers have already started to explore the use of VUIs in various contexts. This brief review of the literature looks at what they have found and how those findings apply to the design of future VUIs.

Socrates’ Divided Line

Voice assistants like Google Assistant, Amazon Alexa, and Siri all have similar capabilities but vary in how well they perform them. For instance, Perficient Digital (2019) measured the accuracy of answers provided by seven different voice assistants, asking each one 4,999 questions. Looking at factors such as the source of the information, the number of answers attempted, and the accuracy of each answer, they found that Google Assistant gave the most accurate and complete answers, which is perhaps unsurprising given the vast and popular search engine behind it. In terms of language capabilities, Google Assistant currently supports 13 languages, with dialects in four of them, as well as bilingual features, and plans to expand to 30 languages. Siri’s language support is somewhat broader, at 21 languages with dialects in seven of them. Amazon’s Alexa trails with seven languages and dialects in three of them. Other functions of the voice assistants depend heavily on their compatibility with other Internet-of-Things devices, such as smart TVs, music streaming services, smart light bulbs, and smart doorbells. In this category, Amazon leads with 7,400 compatible brands, followed by Google Assistant at over 1,000 brands and Siri at over 50 (Amazon, 2020; Google, 2020; Apple, 2020). These counts should be interpreted with care: the number of compatible brands indicates only flexibility and convenience, not how well the integrations work. Further research into the quality and performance of these brands may be useful.

These functions also have limitations. Simple commands like setting a reminder, turning off a light, or making an appointment become useless if a voice assistant cannot understand users across different demographics, such as second-language (L2) learners and people with diverse accessibility needs. Differences in cadence, slang, and accent can render the voice assistant unusable, which keeps potential users out of the market for voice assistants.

All We Know

To begin defining a set of principles for designing VUIs, researchers looked at the usability heuristics that have already been developed for graphical user interfaces (GUIs). For example, visibility of system status can be translated into vocal feedback from a device, and aesthetic and minimalist design can mean unobtrusive speakers that blend into any room. A literature review categorizing VUI studies by GUI usability heuristics found that although some heuristics transfer to some aspects of VUIs, others are not very applicable (Murad et al., 2019). For example, recognition rather than recall was discussed extensively because VUIs tend to operate without displays, whereas consistency was discussed less, probably because natural conversations are not always consistent. Across the literature, many of the reported issues fell under recognition rather than recall, user control and freedom, and error recovery. The authors also introduced two additional heuristics for issues that the existing set did not cover: transparency/privacy and social context. They acknowledged the limitations of applying GUI guidelines to VUIs, but stated that these guidelines provide a map for starting to think about the usability of VUIs in the same way that most designers already think about GUIs.

References

Amazon. (n.d.). Develop Skills in Multiple Languages. https://developer.amazon.com/en-US/docs/alexa/custom-skills/develop-skills-in-multiple-languages.html

Amazon. (n.d.). Smart Home Products Compatible with Alexa. https://developer.amazon.com/en-US/alexa/connected-devices/compatible

Apple. (n.d.). Home accessories. The list keeps getting smarter. https://www.apple.com/ios/home/accessories/

Apple. (n.d.). Siri: Translations. https://www.apple.com/ios/feature-availability/#siri-translations

Elkins, A. C., & Derrick, D. C. (2013). The Sound of Trust: Voice as a Measurement of Trust During Interactions with Embodied Conversational Agents. Group Decision and Negotiation, 22(5), 897–913. doi:10.1007/s10726-012-9339-x

Enge, E. (2019, October 24). Rating the Smarts of the Digital Personal Assistants in 2019. Perficient Digital. https://www.perficientdigital.com/insights/our-research/digital-personal-assistants-study

Google. (n.d.). Change the language of your Google Assistant. https://support.google.com/googlenest/answer/7550584?hl=en

Google. (n.d.). Services and smart devices that work with Google Home. https://support.google.com/googlenest/answer/7639952?hl=en

Han, S., & Yang, H. (2018). Understanding Adoption of Intelligent Personal Assistants: A Parasocial Relationship Perspective. Industrial Management & Data Systems, 118(3), 618-636. doi:10.1108/IMDS-05-2017-0214

Lee, S., & Choi, J. (2017). Enhancing User Experience with Conversational Agent for Movie Recommendation: Effects of Self-disclosure and Reciprocity. International Journal of Human-Computer Studies, 103, 95-105. doi:10.1016/j.ijhcs.2017.02.005

Lopatovska, I., Griffin, A., Gallagher, K., Ballingall, C., Rock, C., & Velazquez, M. (2019). User Recommendations for Intelligent Personal Assistants. Journal of Librarianship and Information Science. doi:10.1177/0961000619841107

Morris, R., Kouddous, K., Kshirsagar, R., & Schueller, S. (2018). Towards an Artificially Empathic Conversational Agent for Mental Health Applications: System Design and User Perceptions. Journal of Medical Internet Research, 20(6). doi:10.2196/10148

Murad, C., Munteanu, C., Cowan, B., & Clark, L. (2019). Revolution or evolution? Speech Interaction and HCI Design Guidelines. IEEE Pervasive Computing, 18(2), 33-45. doi:10.1109/MPRV.2019.2906991

Seeger, A. M., Pfeiffer, J., & Heinz, A. (2017). When Do We Need a Human? Anthropomorphic Design and Trustworthiness of Conversational Agents. SIGHCI 2017 Proceedings, 15. Retrieved February 5, 2020, from https://aisel.aisnet.org/sighci2017/15/

Wei, Z., & Landay, J. (2018). Evaluating Speech-Based Smart Devices Using New Usability Heuristics. IEEE Pervasive Computing, 17(2), 84-96. doi:10.1109/MPRV.2018.022511249

Yamashita, K., Kubota, H., & Nishida, T. (2006). Designing Conversational Agents: Effect of Conversational Form on our Comprehension. AI & Society: The Journal of Human-Centred Systems, 20(2), 125-137. doi:10.1007/s00146-005-0011-8