Abstract
The frequent use of null arguments is one of the most critical issues in pro-drop languages. When Korean is translated into other languages, omitted elements must be replaced with appropriate pronouns for the target sentences to be grammatical. A key step in handling zero pronouns is determining their referentiality: like expletive ‘it’ in English, omitted elements do not always have explicit referents, so it is important to decide whether a zero pronoun is referential or not. In this paper, we focus on identifying non-referential zero pronouns. Because non-referential zero pronouns tend to occur in similar contexts, we treat referentiality determination as the identification of clauses that contain non-referential zero pronouns. Our method outperforms baseline systems that use n-grams and bag of words, and achieves F-measures of 0.51 and 0.78.
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kim, KS., Park, SB., Song, HJ., Park, S.Y., Lee, SJ. (2010). Identification of Non-referential Zero Pronouns for Korean-English Machine Translation. In: Zhang, BT., Orgun, M.A. (eds) PRICAI 2010: Trends in Artificial Intelligence. PRICAI 2010. Lecture Notes in Computer Science, vol 6230. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15246-7_13
DOI: https://doi.org/10.1007/978-3-642-15246-7_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15245-0
Online ISBN: 978-3-642-15246-7
eBook Packages: Computer Science, Computer Science (R0)