DOI: 10.1145/3501247.3539504

Documenting Web Data for Social Research (#DocuWeb22): A participatory workshop for developing structured and reusable practices

Published: 26 June 2022

Abstract

With this half-day, in-person workshop, we aim to collaboratively identify best practices as well as frequent pitfalls encountered when working with Web data. Participants will first be presented with different perspectives on the significance of data quality in this specific context and familiarized with existing, structured approaches for critically reflecting on and documenting data collection processes, before being invited to share their own experiences with the collection, use, and documentation of Web data. We hope thereby to inspire participants to further integrate data documentation practices into their research processes, and to learn from the participants’ experiences in order to improve existing documentation frameworks for Web data. More details of the workshop, including the planned activities, can be found at https://frohleon.github.io/DocuWeb22/.

Published In

WebSci '22: Proceedings of the 14th ACM Web Science Conference 2022
June 2022
479 pages
ISBN: 9781450391917
DOI: 10.1145/3501247
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

1. data collection
2. data quality
3. dataset documentation
4. guidelines
5. web data

Qualifiers

• Introduction
• Research
• Refereed limited

Conference

WebSci '22: 14th ACM Web Science Conference 2022
June 26 - 29, 2022
Barcelona, Spain

Acceptance Rates

Overall Acceptance Rate 245 of 933 submissions, 26%
