DVGAN: A Minimax Game for Search Result Diversification Combining Explicit and Implicit Features

Published: 25 July 2020

Abstract

Search result diversification aims to retrieve diverse results that cover as many subtopics related to the query as possible. Recent studies have shown that supervised diversification models can outperform heuristic approaches by automatically learning a diversification function rather than relying on manually designed scoring functions. The main challenge in training a diversification model is the lack of high-quality training samples. Because the ranker must account for dependence between documents, it is very hard for training algorithms to select effective positive and negative ranking lists to train a reliable ranking model, given a large number of candidate documents in which different documents are relevant to different subtopics. To tackle this problem, we propose a supervised diversification framework based on a Generative Adversarial Network (GAN). It consists of a generator and a discriminator that interact with each other in a minimax game. Specifically, the generator generates more confusing negative samples for the discriminator, and the discriminator sends back complementary ranking signals to the generator. Furthermore, we explicitly exploit subtopics in the generator, while focusing on modeling document similarity in the discriminator. Through such a minimax game, we obtain better ranking models by combining the ranking signals learned by the generator and the discriminator. Experimental results on the TREC Web Track dataset show that the proposed method significantly outperforms existing diversification methods.
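
The generator/discriminator interaction described in the abstract can be illustrated with a small document-level toy. This is only a sketch under stated assumptions, not the DVGAN model: the abstract does not give the objective, the paper works with ranking lists rather than single documents, and every function name, score, and parameter below is hypothetical. The generator is assumed to sample negatives from a softmax over its own scores, the discriminator is assumed to use a sigmoid with a cross-entropy loss, and the signal sent back to the generator is assumed to be the discriminator's probability for each sampled negative.

```python
import math
import random


def softmax(scores):
    """Turn raw scores into a probability distribution (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sample_negatives(gen_scores, positives, k, rng):
    """Assumed generator step: sample k non-positive documents, favouring
    those the generator scores highly (the 'confusing' negatives)."""
    candidates = [i for i in range(len(gen_scores)) if i not in positives]
    probs = softmax([gen_scores[i] for i in candidates])
    return rng.choices(candidates, weights=probs, k=k)


def discriminator_loss(disc_scores, positives, negatives):
    """Assumed discriminator step: binary cross-entropy with positives
    labelled 1 and generator-sampled negatives labelled 0."""
    loss = 0.0
    for i in positives:
        loss -= math.log(sigmoid(disc_scores[i]))
    for i in negatives:
        loss -= math.log(1.0 - sigmoid(disc_scores[i]))
    return loss / (len(positives) + len(negatives))


def generator_rewards(disc_scores, negatives):
    """Assumed feedback signal: the discriminator's probability that each
    sampled negative is a positive, i.e. how well it fooled the discriminator."""
    return [sigmoid(disc_scores[i]) for i in negatives]


if __name__ == "__main__":
    rng = random.Random(0)
    gen_scores = [2.0, 0.5, 1.5, -1.0, 0.0]   # hypothetical generator scores
    disc_scores = [1.0, 0.2, 0.8, -0.5, 0.1]  # hypothetical discriminator scores
    positives = {0, 2}                        # hypothetical relevant documents
    negatives = sample_negatives(gen_scores, positives, k=3, rng=rng)
    print("sampled negatives:", negatives)
    print("discriminator loss:", discriminator_loss(disc_scores, positives, negatives))
    print("generator rewards:", generator_rewards(disc_scores, negatives))
```

In a full training loop, the two steps would alternate: the discriminator would be updated to lower its loss on the sampled negatives, and the generator would be updated (for example with a policy-gradient rule) to raise the expected reward. Both updates are omitted here because the abstract does not specify them.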

Supplementary Material

MP4 File (3397271.3401084.mp4)
This is the presentation for "DVGAN: A Minimax Game for Search Result Diversification Combining Explicit and Implicit Features", summarizing the paper's main contributions and experiments.

Published In

SIGIR '20: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval
July 2020
2548 pages
ISBN:9781450380164
DOI:10.1145/3397271

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. generative adversarial network
  2. search result diversification

Qualifiers

  • Research-article

Funding Sources

  • Beijing Outstanding Young Scientist Program
  • National Key R&D Program
  • National Natural Science Foundation of China

Conference

SIGIR '20

Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%

Cited By

  • (2024) Passage-aware Search Result Diversification. ACM Transactions on Information Systems 42:5 (1-29). DOI: 10.1145/3653672. Online publication date: 13-May-2024
  • (2024) Multi-grained Document Modeling for Search Result Diversification. ACM Transactions on Information Systems 42:5 (1-22). DOI: 10.1145/3652852. Online publication date: 27-Apr-2024
  • (2024) CL4DIV: A Contrastive Learning Framework for Search Result Diversification. Proceedings of the 17th ACM International Conference on Web Search and Data Mining (171-180). DOI: 10.1145/3616855.3635851. Online publication date: 4-Mar-2024
  • (2023) Personalized and Diversified: Ranking Search Results in an Integrated Way. ACM Transactions on Information Systems 42:3 (1-25). DOI: 10.1145/3631989. Online publication date: 9-Nov-2023
  • (2023) Search Result Diversification Using Query Aspects as Bottlenecks. Proceedings of the 32nd ACM International Conference on Information and Knowledge Management (3040-3051). DOI: 10.1145/3583780.3615050. Online publication date: 21-Oct-2023
  • (2023) Controllable Multi-Objective Re-ranking with Policy Hypernetworks. Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (3855-3864). DOI: 10.1145/3580305.3599796. Online publication date: 6-Aug-2023
  • (2023) GDESA: Greedy Diversity Encoder with Self-attention for Search Results Diversification. ACM Transactions on Information Systems 41:2 (1-36). DOI: 10.1145/3544103. Online publication date: 3-Apr-2023
  • (2023) Incorporating Explicit Subtopics in Personalized Search. Proceedings of the ACM Web Conference 2023 (3364-3374). DOI: 10.1145/3543507.3583488. Online publication date: 30-Apr-2023
  • (2023) Attentive Adversarial Collaborative Filtering. IEEE Transactions on Systems, Man, and Cybernetics: Systems 53:7 (4064-4076). DOI: 10.1109/TSMC.2023.3241083. Online publication date: Jul-2023
  • (2023) Integrated Personalized and Diversified Search Based on Search Logs. IEEE Transactions on Knowledge and Data Engineering (1-14). DOI: 10.1109/TKDE.2023.3291006. Online publication date: 2023
