
Talking about Large Language Models

Communications of the ACM, Volume 67, Issue 2 (February 2024). Published: 25 January 2024.

Abstract

Interacting with a contemporary LLM-based conversational agent can create an illusion of being in the presence of a thinking creature. Yet, in their very nature, such systems are fundamentally not like us.



