Abstract
Conversational interfaces have a long history, starting in the 1960s with text-based dialog systems for question answering and chatbots that simulated casual conversation. Speech-based dialog systems began to appear in the late 1980s, and spoken dialog technology became a key area of research within the speech and language communities. At the same time, commercially deployed spoken dialog systems, known in the industry as voice user interfaces (VUIs), began to emerge. Embodied conversational agents (ECAs) and social robots were also being developed. These systems combine facial expression, body stance, hand gestures, and speech to provide a more human-like and more engaging interaction. In this chapter we review developments in spoken dialog systems, VUIs, embodied conversational agents, social robots, and chatbots, and outline the findings and achievements from this work that will be important for the next generation of conversational interfaces.
Copyright information
© 2016 Springer International Publishing Switzerland
About this chapter
Cite this chapter
McTear, M., Callejas, Z., Griol, D. (2016). Conversational Interfaces: Past and Present. In: The Conversational Interface. Springer, Cham. https://doi.org/10.1007/978-3-319-32967-3_4
DOI: https://doi.org/10.1007/978-3-319-32967-3_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-32965-9
Online ISBN: 978-3-319-32967-3
eBook Packages: Engineering, Engineering (R0)