Robust Spammer Detection in Microblogs: Leveraging User Carefulness

Published: 18 August 2017

Abstract

Microblogging Web sites, such as Twitter and Sina Weibo, have become popular platforms for socializing and sharing information in recent years. Spammers have also discovered this new opportunity to unfairly overpower normal users with unsolicited content, namely social spam. Although it is intuitive for everyone to follow legitimate users, recent studies show that both legitimate users and spammers follow spammers, for different reasons. Evidence of users seeking out spammers on purpose has also been observed. We regard this behavior as useful information for spammer detection. In this article, we approach the problem of spammer detection by leveraging the “carefulness” of users, which indicates how careful a user is when she is about to follow a potential spammer. We propose a framework to measure carefulness and develop a supervised learning algorithm to estimate it based on known spammers and legitimate users. We illustrate how the robustness of detection algorithms can be improved with the aid of the proposed measure. We evaluate the approach on two real datasets from Sina Weibo and Twitter with millions of users, and with an online test on Sina Weibo. The results show that our approach indeed captures carefulness and that it is effective for detecting spammers. In addition, we find that our measure is also beneficial for other applications, such as link prediction.

    Published In

    ACM Transactions on Intelligent Systems and Technology  Volume 8, Issue 6
    Survey Paper, Regular Papers and Special Issue: Social Media Processing
    November 2017
    265 pages
    ISSN:2157-6904
    EISSN:2157-6912
    DOI:10.1145/3127339
    Editor: Yu Zheng
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 18 August 2017
    Accepted: 01 March 2017
    Revised: 01 May 2016
    Received: 01 December 2015
    Published in TIST Volume 8, Issue 6

    Author Tags

    1. Spammer detection
    2. microblog
    3. social network

    Qualifiers

    • Research-article
    • Research
    • Refereed
