DOI: 10.1145/3397271.3401411
research-article

QuAChIE: Question Answering based Chinese Information Extraction System

Published: 25 July 2020

Abstract

In this paper, we present the design of QuAChIE, a Question Answering based Chinese Information Extraction system. QuAChIE relies mainly on a well-trained question answering model to extract high-quality triples. Each pair of head entity and relation is treated as a question, with the input text as the context. To train and evaluate each model in the system, we build a large-scale information extraction dataset from Wikidata and Wikipedia pages by distant supervision. Advanced models built on top of a pre-trained language model, together with the large volume of distant supervision data, enable QuAChIE to extract relation triples from documents with cross-sentence correlations. Experimental results on the test set and a case study based on the interactive demonstration show satisfactory information extraction quality on Chinese document-level texts.
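
The question answering formulation and the distant supervision step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `QAExample` class, the `build_question` template (plain concatenation of head entity and relation), the reading of the tail entity as the answer, and the exact-string matching used for alignment are all assumptions made for illustration, and the sample sentence and triples are invented.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class QAExample:
    question: str          # built from the head entity and the relation
    context: str           # the input text
    answer: Optional[str]  # tail entity, or None when it is not in the context
    answer_start: int      # character offset of the answer, -1 when absent


def build_question(head: str, relation: str) -> str:
    # Assumed template: the abstract only says that head entity and relation
    # together form the question, not how they are combined.
    return f"{head} {relation}"


def distant_label(context: str,
                  triples: List[Tuple[str, str, str]]) -> List[QAExample]:
    """Align knowledge-base triples with a text by exact string matching.

    This is a deliberately naive form of distant supervision: a triple yields
    a positive example only when its tail string occurs in the context.
    """
    examples = []
    for head, relation, tail in triples:
        if head not in context:
            continue  # the text does not mention this head entity at all
        start = context.find(tail)
        examples.append(QAExample(
            question=build_question(head, relation),
            context=context,
            answer=tail if start >= 0 else None,
            answer_start=start,
        ))
    return examples


# Invented example text and triples (hypothetical knowledge-base entries).
context = "鲁迅出生于浙江绍兴。"
triples = [
    ("鲁迅", "出生地", "绍兴"),    # tail appears in the text: answerable
    ("鲁迅", "配偶", "许广平"),    # tail missing: unanswerable question
    ("老舍", "出生地", "北京"),    # head missing: skipped
]
examples = distant_label(context, triples)
```

Keeping unanswerable questions (answer `None`) is one possible design choice; a system could equally drop them or use them as negative training examples.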

      Published In

      SIGIR '20: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval
      July 2020
      2548 pages
      ISBN:9781450380164
      DOI:10.1145/3397271
      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. information extraction
      2. question answering

      Qualifiers

      • Research-article

      Funding Sources

      • NSFC

      Conference

      SIGIR '20
      Acceptance Rates

      Overall Acceptance Rate 792 of 3,983 submissions, 20%

      Cited By

      • (2024) Combining Sentence-based Relational Features with Biaffine Mechanism for Triple Extraction. Proceedings of the 2024 16th International Conference on Machine Learning and Computing, 121-127. DOI: 10.1145/3651671.3651747. Online publication date: 2-Feb-2024
      • (2023) Aspect-level Information Discrepancies across Heterogeneous Vulnerability Reports: Severity, Types and Detection Methods. ACM Transactions on Software Engineering and Methodology 33:2, 1-38. DOI: 10.1145/3624734. Online publication date: 22-Dec-2023
      • (2023) Do-GOOD: Towards Distribution Shift Evaluation for Pre-Trained Visual Document Understanding Models. Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 569-579. DOI: 10.1145/3539618.3591670. Online publication date: 19-Jul-2023
      • (2023) BT-CKBQA: An efficient approach for Chinese knowledge base question answering. Data & Knowledge Engineering 147, 102204. DOI: 10.1016/j.datak.2023.102204. Online publication date: Sep-2023
