short-paper

A Retrieval System for Images and Videos based on Aesthetic Assessment of Visuals

Authors: Daniel Vera Nieto, Saikishore Kalloori, Clara Fernandez Labrador, Severin Klingler, Markus Gross

SIGIR '23: Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval

Pages 3180 - 3184

https://doi.org/10.1145/3539618.3591817

Published: 18 July 2023

Abstract

Attractive images and videos are the visual backbone of journalism and social media, where they serve to capture the user's attention. From trailers to teaser images to image galleries, appealing visuals have only grown in importance over the years. However, selecting eye-catching shots from a video or the perfect image from a large image collection is a challenging and time-consuming task. We present a tool that assesses image and video content from an aesthetic standpoint. We found that such an assessment is possible by combining expert knowledge with data-driven information. We combine the relevant aesthetic features and machine learning algorithms into an aesthetics retrieval system, which enables users to sort uploaded visuals by an aesthetic score and to interact with additional photographic, cinematic, and person-specific features.

Index Terms

1. Information systems
  1. Information retrieval
    1. Users and interactive retrieval
      1. Search interfaces
  2. Information systems applications
    1. Multimedia information systems
      1. Multimedia content creation

Published In

SIGIR '23: Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval

July 2023

3567 pages

ISBN:9781450394086

DOI:10.1145/3539618

General Chairs:
Hsin-Hsi Chen (National Taiwan University),
Wei-Jou (Edward) Duh (National Taiwan University),
Hen-Hsen Huang (Academia Sinica)

Program Chairs:
Makoto P. Kato (Spotify),
Josiane Mothe (Universite de Toulouse),
Barbara Poblete (University of Chile and Amazon Visiting Academic)

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGIR: ACM Special Interest Group on Information Retrieval

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

SIGIR '23

Sponsor:

SIGIR

SIGIR '23: The 46th International ACM SIGIR Conference on Research and Development in Information Retrieval

July 23 - 27, 2023

Taipei, Taiwan

Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%
