
Talking about Large Language Models

Communications of the ACM, Volume 67, Issue 2 (February 2024). Published: 25 January 2024.

Abstract

Interacting with a contemporary LLM-based conversational agent can create an illusion of being in the presence of a thinking creature. Yet, in their very nature, such systems are fundamentally not like us.



