DOI: 10.1145/2348283.2348302
research-article

Evaluating aggregated search pages

Published: 12 August 2012

Abstract

Aggregating search results from a variety of heterogeneous sources or verticals such as news, image and video into a single interface is a popular paradigm in web search. Although various approaches exist for selecting relevant verticals or optimising the aggregated search result page, evaluating the quality of an aggregated page is an open question.
This paper proposes a general framework for evaluating the quality of aggregated search pages. We evaluate our approach by collecting annotated user preferences over a set of aggregated search pages for 56 topics and 12 verticals. We empirically demonstrate the fidelity of metrics instantiated from our proposed framework by showing that they strongly agree with the annotated user preferences of pairs of simulated aggregated pages.
Furthermore, we show that our metrics agree with the majority preference more often than current diversity-based information retrieval metrics. Finally, we demonstrate the flexibility of our framework by showing that personalised historical preference data can be used to improve the performance of our proposed metrics.
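
The abstract's notion of a metric "agreeing with the majority preference" over pairs of pages can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the page representation (a list of per-rank gains), the `toy_metric` scoring function, the vote labels "A" and "B", and the choice to skip pairs with tied votes.

```python
import math
from collections import Counter


def majority_preference(votes):
    """Return 'A' or 'B' for the page most assessors preferred, or None on a tie."""
    counts = Counter(votes)
    a, b = counts.get("A", 0), counts.get("B", 0)
    if a == b:
        return None
    return "A" if a > b else "B"


def toy_metric(page):
    """Hypothetical page score: rank-discounted sum of per-rank gains."""
    return sum(gain / math.log2(rank + 1) for rank, gain in enumerate(page, start=1))


def metric_agreement(pairs, metric):
    """Fraction of non-tied pairs where the metric prefers the majority's page."""
    agree = total = 0
    for page_a, page_b, votes in pairs:
        majority = majority_preference(votes)
        if majority is None:
            continue  # assumption: pairs without a majority are skipped
        score_a, score_b = metric(page_a), metric(page_b)
        if score_a == score_b:
            predicted = None  # the metric expresses no preference
        else:
            predicted = "A" if score_a > score_b else "B"
        total += 1
        agree += predicted == majority
    return agree / total if total else 0.0


# Made-up example data: (page A gains, page B gains, assessor votes).
pairs = [
    ([1, 0, 1], [0, 1, 1], ["A", "A", "B"]),
    ([0, 0, 1], [1, 0, 0], ["B", "B", "B"]),
    ([1, 1, 0], [1, 0, 1], ["B", "A", "B"]),
    ([1, 0, 0], [0, 1, 0], ["A", "B"]),  # tied votes, skipped
]

print(metric_agreement(pairs, toy_metric))
```

Comparing two candidate metrics then amounts to calling `metric_agreement` with each scoring function on the same set of annotated pairs and comparing the resulting fractions.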

    Published In

    SIGIR '12: Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval
    August 2012
    1236 pages
    ISBN: 9781450314725
    DOI: 10.1145/2348283

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. aggregated search
    2. diversity
    3. evaluation
    4. performance metric

    Qualifiers

    • Research-article

    Conference

    SIGIR '12

    Acceptance Rates

    Overall Acceptance Rate 792 of 3,983 submissions, 20%
