DOI: 10.3115/1119250.1119262

Building a large Chinese corpus annotated with semantic dependency

Published: 11 July 2003

Abstract

At present, most corpora are annotated mainly with syntactic knowledge. In this paper, we attempt to build a large corpus annotated with semantic knowledge using dependency grammar. We believe that words are the basic units of semantics, and that the structure and meaning of a sentence consist mainly of a series of semantic dependencies between individual words. A 1,000,000-word-scale corpus annotated with semantic dependencies has been built. Compared with syntactic knowledge, semantic knowledge is more difficult to annotate because the ambiguity problem is more serious. We describe a strategy for improving annotation consistency and define congruence as a measure of the consistency of the tagged corpus. Finally, we compare our corpus with other well-known corpora.

Cited By

  • (2014) A Semantics Oriented Grammar for Chinese Treebanking. Proceedings of the 15th International Conference on Computational Linguistics and Intelligent Text Processing - Volume 8403, pp. 366-378. DOI: 10.1007/978-3-642-54906-9_30. Online publication date: 6-Apr-2014
  • (2012) SemEval-2012 task 5. Proceedings of the First Joint Conference on Lexical and Computational Semantics - Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation, pp. 378-384. DOI: 10.5555/2387636.2387696. Online publication date: 7-Jun-2012
  • (2006) A Chinese corpus with word sense annotation. Proceedings of the 21st international conference on Computer Processing of Oriental Languages: beyond the orient: the research challenges ahead, pp. 414-421. DOI: 10.1007/11940098_43. Online publication date: 17-Dec-2006

Published In

SIGHAN '03: Proceedings of the second SIGHAN workshop on Chinese language processing - Volume 17
July 2003
193 pages

Publisher

Association for Computational Linguistics

United States

Qualifiers

  • Article
