skip to main content
10.1145/3626772.3657869acmconferencesArticle/Chapter ViewAbstractPublication PagesirConference Proceedingsconference-collections
research-article
Open access

ProCIS: A Benchmark for Proactive Retrieval in Conversations

Published: 11 July 2024 Publication History

Abstract

The field of conversational information seeking, which is rapidly gaining interest in both academia and industry, is changing how we interact with search engines through natural language interactions. Existing datasets and methods are mostly evaluating reactive conversational information seeking systems that solely provide response to every query from the user. We identify a gap in building and evaluating proactive conversational information seeking systems that can monitor a multi-party human conversation and proactively engage in the conversation at an opportune moment by retrieving useful resources and suggestions. In this paper, we introduce a large-scale dataset for proactive document retrieval that consists of over 2.8 million conversations. We conduct crowdsourcing experiments to obtain high-quality and relatively complete relevance judgments through depth-k pooling. We also collect annotations related to the parts of the conversation that are related to each document, enabling us to evaluate proactive retrieval systems. We introduce normalized proactive discounted cumulative gain (npDCG) for evaluating these systems, and further provide benchmark results for a wide range of models, including a novel model we developed for this task. We believe that the developed dataset, called ProCIS, paves the path towards developing proactive conversational information seeking systems.

References

[1]
Tabish Ahmed and Sahan Bulathwela. 2022. Towards Proactive Information Retrieval in Noisy Text with Wikipedia Concepts. arXiv preprint arXiv:2210.09877 (2022).
[2]
Mohammad Aliannejadi, Leif Azzopardi, Hamed Zamani, Evangelos Kanoulas, Paul Thomas, and Nick Craswell. 2021. Analysing Mixed Initiatives and Search Strategies during Conversational Search. In Proceedings of the 30th ACM International Conference on Information & Knowledge Management (Virtual Event, Queensland, Australia) (CIKM '21). Association for Computing Machinery, New York, NY, USA, 16--26. https://doi.org/10.1145/3459637.3482231
[3]
Mohammad Aliannejadi, Hamed Zamani, Fabio Crestani, and W. Bruce Croft. 2019. Asking Clarifying Questions in Open-Domain Information-Seeking Conversations. In Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval (Paris, France) (SIGIR'19). Association for Computing Machinery, New York, NY, USA, 475--484. https: //doi.org/10.1145/3331184.3331265
[4]
Salvatore Andolina, Valeria Orso, Hendrik Schneider, Khalil Klouche, Tuukka Ruotsalo, Luciano Gamberini, and Giulio Jacucci. 2018. Investigating Proactive Search Support in Conversations. In Proceedings of the 2018 Designing Interactive Systems Conference (Hong Kong, China) (DIS '18). Association for Computing Machinery, New York, NY, USA, 1295--1307. https://doi.org/10.1145/3196709. 3196734
[5]
Avi Arampatzis. 2001. Unbiased sd threshold optimization, initial query degradation, decay, and incrementality, for adaptive document filtering. In TREC.
[6]
Romaric Besançon, Stéphane Chaudiron, Djamel Mostefa, Olivier Hamon, Ismaïl Timimi, and Khalid Choukri. 2009. Overview of CLEF 2008 INFILE Pilot Track. In Evaluating Systems for Multilingual and Multimodal Information Access, Carol Peters, Thomas Deselaers, Nicola Ferro, Julio Gonzalo, Gareth J. F. Jones, Mikko Kurimo, Thomas Mandl, Anselmo Peñas, and Vivien Petras (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 939--946.
[7]
Pavel Braslavski, Denis Savenkov, Eugene Agichtein, and Alina Dubatovka. 2017. What Do You Mean Exactly? Analyzing Clarification Questions in CQA. In Proceedings of the 2017 Conference on Conference Human Information Interaction and Retrieval (Oslo, Norway) (CHIIR '17). Association for Computing Machinery, New York, NY, USA, 345--348. https://doi.org/10.1145/3020165.3022149
[8]
Eunsol Choi, He He, Mohit Iyyer, Mark Yatskar, Wen-tau Yih, Yejin Choi, Percy Liang, and Luke Zettlemoyer. 2018. QuAC: Question Answering in Context. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Brussels, Belgium, 2174--2184. https://doi.org/10.18653/v1/D18-1241
[9]
Jeffrey Dalton, Chenyan Xiong, and Jamie Callan. 2019. TREC CAsT 2019: The conversational assistance track overview. TREC (2019).
[10]
Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Association for Computational Linguistics, Minneapolis, Minnesota, 4171--4186. https://doi.org/10.18653/v1/N19-1423
[11]
Thibault Formal, Benjamin Piwowarski, and Stéphane Clinchant. 2021. SPLADE: Sparse Lexical and Expansion Model for First Stage Ranking. In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (Virtual Event, Canada) (SIGIR '21). Association for Computing Machinery, New York, NY, USA, 2288--2292. https://doi.org/10.1145/ 3404835.3463098
[12]
Debasis Ganguly, Dipasree Pal, Manisha Verma, and Procheta Sen. 2020. Overview of RCD-2020, the FIRE-2020 track on Retrieval from Conversational Dialogues. In FIRE 2020: Forum for Information Retrieval Evaluation, Hyderabad, India, December 16-20, 2020, Prasenjit Majumder, Mandar Mitra, Surupendu Gangopadhyay, and Parth Mehta (Eds.). ACM, 33--36. https://doi.org/10.1145/3441501.3441518
[13]
Luyu Gao, Xueguang Ma, Jimmy J. Lin, and Jamie Callan. 2022. Tevatron: An Efficient and Flexible Toolkit for Dense Retrieval. ArXiv abs/2203.05765 (2022).
[14]
Susan Gauch, Mirco Speretta, Aravind Chandramouli, and Alessandro Micarelli. 2007. User Profiles for Personalized Information Access. Springer Berlin Heidelberg, Berlin, Heidelberg, 54--89. https://doi.org/10.1007/978-3-540-72079-9_2
[15]
Amit Hattimare, Arkin Dharawat, Yelman Khan, Yen-Chieh Lien, Chris Samarinas, George Z. Wei, Yulin Yang, and Hamed Zamani. 2023. MarunaBot: Multi-Modal Augmentation for Large Language Models with Applications to Task-Oriented Dialogues. In Alexa Prize TaskBot Challenge 2 Proceedings. https://www.amazon.science/alexa-prize/proceedings/marunabot-v2-towards-end-to-end-multi-modal-task-oriented-dialogue-systems
[16]
Pengcheng He, Jianfeng Gao, and Weizhu Chen. 2023. DeBERTaV3: Improving DeBERTa using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing. In The Eleventh International Conference on Learning Representations, ICLR 2023, Kigali, Rwanda, May 1-5, 2023. OpenReview.net. https: //openreview.net/pdf?id=sE7-XhLxHA
[17]
Geoffrey E Hinton and Sam Roweis. 2002. Stochastic neighbor embedding. Advances in neural information processing systems 15 (2002).
[18]
Albert Q. Jiang, Alexandre Sablayrolles, Arthur Mensch, Chris Bamford, Devendra Singh Chaplot, Diego de las Casas, Florian Bressand, Gianna Lengyel, Guillaume Lample, Lucile Saulnier, Lélio Renard Lavaud, Marie-Anne Lachaux, Pierre Stock, Teven Le Scao, Thibaut Lavril, Thomas Wang, Timothée Lacroix, and William El Sayed. 2023. Mistral 7B. arXiv:2310.06825 [cs.CL]
[19]
Omar Khattab and Matei Zaharia. 2020. ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval (Virtual Event, China) (SIGIR '20). Association for Computing Machinery, New York, NY, USA, 39--48. https://doi.org/10.1145/3397271.3401075
[20]
Vaibhav Kumar and Jamie Callan. 2020. Making Information Seeking Easier: An Improved Pipeline for Conversational Search. In Findings of the Association for Computational Linguistics: EMNLP 2020, Trevor Cohn, Yulan He, and Yang Liu (Eds.). Association for Computational Linguistics, Online, 3971--3980. https: //doi.org/10.18653/v1/2020.findings-emnlp.354
[21]
Lizi Liao, Grace Hui Yang, and Chirag Shah. 2023. Proactive Conversational Agents. In Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining (Singapore, Singapore) (WSDM '23). Association for Computing Machinery, New York, NY, USA, 1244--1247. https://doi.org/10.1145/ 3539597.3572724
[22]
Fengran Mo, Chen Qu, Kelong Mao, Tianyu Zhu, Zhan Su, Kaiyu Huang, and Jian-Yun Nie. 2024. History-Aware Conversational Dense Retrieval. arXiv:2401.16659 [cs.IR]
[23]
Tri Nguyen, Mir Rosenberg, Xia Song, Jianfeng Gao, Saurabh Tiwary, Rangan Majumder, and Li Deng. 2016. MS MARCO: A Human Generated MAchine Reading COmprehension Dataset. (November 2016). https://www.microsoft.com/en-us/research/publication/ms-marco-humangenerated-machine-reading-comprehension-dataset/
[24]
OpenAI. 2023. GPT-4 Technical Report. arXiv:2303.08774 [cs.CL]
[25]
Paul Owoicho, Jeffrey Dalton, Mohammad Aliannejadi, Leif Azzopardi, Johanne R Trippas, and Svitlana Vakulenko. 2022. TREC CAsT 2022: Going beyond user ask and system retrieve with initiative and response generation. NIST Special Publication (2022), 500--338.
[26]
Gustavo Penha, Alexandru Balan, and Claudia Hauff. 2019. Introducing mantis: a novel multi-domain information seeking dialogues dataset. arXiv preprint arXiv:1912.04639 (2019).
[27]
Chen Qu, Liu Yang, Cen Chen, Minghui Qiu,W. Bruce Croft, and Mohit Iyyer. 2020. Open-Retrieval Conversational Question Answering. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval (Virtual Event, China) (SIGIR '20). Association for Computing Machinery, New York, NY, USA, 539--548. https://doi.org/10.1145/3397271.3401110
[28]
Filip Radlinski and Nick Craswell. 2017. A Theoretical Framework for Conversational Search. In Proceedings of the 2017 Conference on Conference Human Information Interaction and Retrieval (Oslo, Norway) (CHIIR '17). Association for Computing Machinery, New York, NY, USA, 117--126. https://doi.org/10.1145/ 3020165.3020183
[29]
Sudha Rao and Hal Daumé III. 2018. Learning to Ask Good Questions: Ranking Clarification Questions using Neural Expected Value of Perfect Information. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Association for Computational Linguistics, Melbourne, Australia, 2737--2746. https://doi.org/10.18653/v1/P18-1255
[30]
Siva Reddy, Danqi Chen, and Christopher D. Manning. 2019. CoQA: A Conversational Question Answering Challenge. Transactions of the Association for Computational Linguistics 7 (2019), 249--266. https://doi.org/10.1162/tacl_a_00266
[31]
S Robertson and Ian Soboroff. 2003. The TREC-2002 Filtering Track Report. https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=50769
[32]
Kevin Ros, Matthew Jin, Jacob Levine, and Cheng Xiang Zhai. 2023. Retrieving Webpages Using Online Discussions. In Proceedings of the 2023 ACM SIGIR International Conference on Theory of Information Retrieval (Taipei, Taiwan) (ICTIR '23). Association for Computing Machinery, New York, NY, USA, 159--168. https://doi.org/10.1145/3578337.3605139
[33]
Chris Samarinas, Pracha Promthaw, Atharva Nijasure, Hansi Zeng, Julian Killingback, and Hamed Zamani. 2024. Simulating Task-Oriented Dialogues with State Transition Graphs and Large Language Models. arXiv:2404.14772 [cs.CL]
[34]
Ivan Sekulić, Mohammad Aliannejadi, and Fabio Crestani. 2022. Evaluating Mixed-initiative Conversational Search Systems via User Simulation. In Proceedings of the Fifteenth ACM International Conference onWeb Search and Data Mining (Virtual Event, AZ, USA) (WSDM '22). Association for Computing Machinery, New York, NY, USA, 888--896. https://doi.org/10.1145/3488560.3498440
[35]
Procheta Sen, Debasis Ganguly, and Gareth Jones. 2018. Procrastination is the Thief of Time: Evaluating the Effectiveness of Proactive Search Systems. In The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval (Ann Arbor, MI, USA) (SIGIR '18). Association for Computing Machinery, New York, NY, USA, 1157--1160. https://doi.org/10.1145/3209978.3210114
[36]
Chirag Shah. 2018. Information fostering-being proactive with information seeking and retrieval: Perspective paper. In Proceedings of the 2018 conference on human information interaction & retrieval. 62--71.
[37]
Svetlana Stoyanchev, Alex Liu, and Julia Hirschberg. 2014. Towards Natural Clarification Questions in Dialogue Systems. In AISB '14, Vol. 20.
[38]
Yi Tay, Vinh Tran, Mostafa Dehghani, Jianmo Ni, Dara Bahri, Harsh Mehta, Zhen Qin, Kai Hui, Zhe Zhao, Jai Gupta, Tal Schuster, William W Cohen, and Donald Metzler. 2022. Transformer Memory as a Differentiable Search Index. In Advances in Neural Information Processing Systems, Vol. 35. Curran Associates, Inc., 21831--21843. https://proceedings.neurips.cc/paper_files/paper/2022/file/ 892840a6123b5ec99ebaab8be1530fba-Paper-Conference.pdf
[39]
Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, Aurelien Rodriguez, Armand Joulin, Edouard Grave, and Guillaume Lample. 2023. LLaMA: Open and Efficient Foundation Language Models. arXiv:2302.13971 [cs.CL]
[40]
Somin Wadhwa and Hamed Zamani. 2021. Towards System-Initiative Conversational Information Seeking. In DESIRES. 102--116.
[41]
Guan Wang, Sijie Cheng, Xianyuan Zhan, Xiangang Li, Sen Song, and Yang Liu. 2023. Openchat: Advancing open-source language models with mixed-quality data. arXiv preprint arXiv:2309.11235 (2023).
[42]
Ledell Wu, Fabio Petroni, Martin Josifoski, Sebastian Riedel, and Luke Zettlemoyer. 2020. Scalable Zero-shot Entity Linking with Dense Entity Retrieval. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Bonnie Webber, Trevor Cohn, Yulan He, and Yang Liu (Eds.). Association for Computational Linguistics, Online, 6397--6407. https: //doi.org/10.18653/v1/2020.emnlp-main.519
[43]
Lee Xiong, Chenyan Xiong, Ye Li, Kwok-Fung Tang, Jialin Liu, Paul N. Bennett, Junaid Ahmed, and Arnold Overwijk. 2021. Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval. In 9th International Conference on Learning Representations, ICLR 2021, Virtual Event, Austria, May 3-7, 2021. OpenReview.net. https://openreview.net/forum?id=zeFrfgyZln
[44]
Shi Yu, Zhenghao Liu, Chenyan Xiong, Tao Feng, and Zhiyuan Liu. 2021. Few-Shot Conversational Dense Retrieval. In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (Virtual Event, Canada) (SIGIR '21). Association for Computing Machinery, New York, NY, USA, 829--838. https://doi.org/10.1145/3404835.3462856
[45]
Hamed Zamani, Susan Dumais, Nick Craswell, Paul Bennett, and Gord Lueck. 2020. Generating Clarifying Questions for Information Retrieval. In Proceedings of The Web Conference 2020 (Taipei, Taiwan) (WWW '20). Association for Computing Machinery, New York, NY, USA, 418--428. https://doi.org/10.1145/ 3366423.3380126
[46]
Hamed Zamani, Gord Lueck, Everest Chen, Rodolfo Quispe, Flint Luu, and Nick Craswell. 2020. MIMICS: A Large-Scale Data Collection for Search Clarification. In Proceedings of the 29th ACM International Conference on Information & Knowledge Management. Association for Computing Machinery, New York, NY, USA, 3189--3196. https://doi.org/10.1145/3340531.3412772
[47]
Hamed Zamani, Bhaskar Mitra, Everest Chen, Gord Lueck, Fernando Diaz, Paul N. Bennett, Nick Craswell, and Susan T. Dumais. 2020. Analyzing and Learning from User Interactions for Search Clarification. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval (Virtual Event, China) (SIGIR '20). Association for Computing Machinery, New York, NY, USA, 1181--1190. https://doi.org/10.1145/3397271.3401160
[48]
Hamed Zamani and Azadeh Shakery. 2018. A language model-based framework for multi-publisher content-based recommender systems. Information Retrieval Journal 21 (2018), 369--409. https://doi.org/10.1007/s10791-018-9327-0
[49]
Hamed Zamani, Johanne R. Trippas, Jeff Dalton, and Filip Radlinski. 2023. Conversational Information Seeking. Foundations and Trends® in Information Retrieval 17, 3--4 (2023), 244--456. https://doi.org/10.1561/1500000081
[50]
Hansi Zeng, Chen Luo, Bowen Jin, Sheikh Muhammad Sarwar, Tianxin Wei, and Hamed Zamani. 2024. Scalable and Effective Generative Information Retrieval. In Proceedings of The Web Conference 2024 (Singapore, Singapore) (WWW '24). Association for Computing Machinery, New York, NY, USA.

Index Terms

  1. ProCIS: A Benchmark for Proactive Retrieval in Conversations

    Recommendations

    Comments

    Information & Contributors

    Information

    Published In

    cover image ACM Conferences
    SIGIR '24: Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval
    July 2024
    3164 pages
    ISBN:9798400704314
    DOI:10.1145/3626772
    This work is licensed under a Creative Commons Attribution International 4.0 License.

    Sponsors

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 11 July 2024

    Check for updates

    Author Tags

    1. conversational search
    2. dialogue systems
    3. proactive search systems

    Qualifiers

    • Research-article

    Funding Sources

    Conference

    SIGIR 2024
    Sponsor:

    Acceptance Rates

    Overall Acceptance Rate 792 of 3,983 submissions, 20%

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • 0
      Total Citations
    • 126
      Total Downloads
    • Downloads (Last 12 months)126
    • Downloads (Last 6 weeks)65
    Reflects downloads up to 21 Sep 2024

    Other Metrics

    Citations

    View Options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Get Access

    Login options

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media