Research Article | Public Access

RoboCOP: A Robotic Coach for Oral Presentations

Published: 30 June 2017

Abstract

Rehearsing in front of a live audience is invaluable when preparing for important presentations. However, not all presenters take the opportunity to engage in such rehearsal, due to time constraints, the limited availability of listeners who can provide constructive feedback, or public speaking anxiety. We present RoboCOP, an automated anthropomorphic robot head that acts as a coach to provide spoken feedback during presentation rehearsals at both the individual slide and overall presentation level. The robot offers conversational coaching on three key aspects of presentations: speech quality, content coverage, and audience orientation. The design of the feedback strategies was informed by findings from an exploratory study with academic professionals who were experienced in mentoring students on their presentations. In a within-subjects study comparing RoboCOP to visual feedback and spoken feedback without a robot, the robotic coach was shown to lead to significant improvement in the overall experience of presenters. Results of a second within-subjects evaluation study comparing RoboCOP with existing rehearsal practices show that our system creates a natural, interactive, and motivating rehearsal environment that leads to improved presentation quality.




Published In

Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies (IMWUT), Volume 1, Issue 2
June 2017, 665 pages
EISSN: 2474-9567
DOI: 10.1145/3120957
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 30 June 2017
Published in IMWUT Volume 1, Issue 2


Author Tags

  1. Presentation rehearsal
  2. coaching
  3. feedback
  4. robot

Qualifiers

  • Research-article
  • Research
  • Refereed

Article Metrics

  • Downloads (last 12 months): 290
  • Downloads (last 6 weeks): 30
Reflects downloads up to 21 September 2024.


Cited By

  • (2024) Comuniqa: Exploring Large Language Models For Improving English Speaking Skills. In Proceedings of the 7th ACM SIGCAS/SIGCHI Conference on Computing and Sustainable Societies, 256–267. DOI: 10.1145/3674829.3675082. Published online: 8 July 2024.
  • (2023) "Enjoy, but Moderately!": Designing a Social Companion Robot for Social Engagement and Behavior Moderation in Solitary Drinking Context. Proceedings of the ACM on Human-Computer Interaction 7, CSCW2, 1–24. DOI: 10.1145/3610028. Published online: 4 October 2023.
  • (2023) Improving Multiparty Interactions with a Robot Using Large Language Models. In Extended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems, 1–8. DOI: 10.1145/3544549.3585602. Published online: 19 April 2023.
  • (2023) What Do People Think of Social Robots and Voice Agents as Public Speaking Coaches? In 2023 32nd IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), 996–1003. DOI: 10.1109/RO-MAN57019.2023.10309583. Published online: 28 August 2023.
  • (2023) Multimodal Transfer Learning for Oral Presentation Assessment. IEEE Access 11, 84013–84026. DOI: 10.1109/ACCESS.2023.3295832.
  • (2023) ‘Um, so like, is this how I speak?’: Design implications for automated visual feedback systems on speech. Behaviour & Information Technology, 1–20. DOI: 10.1080/0144929X.2023.2271997. Published online: 9 November 2023.
  • (2022) Sharing the Spotlight. In Proceedings of the 2022 ACM/IEEE International Conference on Human-Robot Interaction, 551–560. DOI: 10.5555/3523760.3523832. Published online: 7 March 2022.
  • (2022) Designing and Evaluating Presentation Avatar for Promoting Self-Review. IEICE Transactions on Information and Systems E105.D, 9, 1546–1556. DOI: 10.1587/transinf.2021EDP7210. Published online: 1 September 2022.
  • (2022) Supporting Self-development of Speech Delivery for Education Professionals. In Proceedings of the 21st International Conference on Mobile and Ubiquitous Multimedia, 251–253. DOI: 10.1145/3568444.3570588. Published online: 27 November 2022.
  • (2022) Effect of repetitive motion intervention on self-avatar on the sense of self-individuality. In Proceedings of the 10th International Conference on Human-Agent Interaction, 167–175. DOI: 10.1145/3527188.3561916. Published online: 5 December 2022.
