DOI: 10.3115/980845.980961

Large scale collocation data and their application to Japanese word processor technology

Published: 10 August 1998

Abstract

Word processors and computers used in Japan rely on a Japanese input method in which keyboard strokes are combined with conversion from Kana (phonetic) characters to Kanji (ideographic, Chinese) characters. The key factor in Kana-to-Kanji conversion is how to raise conversion accuracy through homophone processing, since Japanese has a great many homophonic Kanji. In this paper, we report the results of Kana-to-Kanji conversion experiments that incorporate homophone processing based on large-scale collocation data. Approximately 135,000 collocations yield a 9.1% improvement in conversion accuracy over a prototype system that has no collocation data.
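
As a rough illustration of the idea in the abstract, the sketch below picks among homophonic Kanji candidates by checking whether a candidate pair appears in a collocation table. It is a toy example under stated assumptions, not the system described in the paper: the names `HOMOPHONES`, `COLLOCATIONS`, and `convert_pair` are hypothetical, the three-entry dictionary is invented for illustration, and particles and morphological analysis are omitted.

```python
# Toy illustration only: the dictionary, the collocation set, and the
# selection rule are hypothetical and are not taken from the paper.

# Kana reading -> candidate Kanji words sharing that reading (homophones).
HOMOPHONES = {
    "はし": ["橋", "箸", "端"],  # bridge, chopsticks, edge
    "わたる": ["渡る"],          # to cross
    "つかう": ["使う"],          # to use
}

# (noun, verb) pairs assumed to be attested collocations.
COLLOCATIONS = {("橋", "渡る"), ("箸", "使う")}


def convert_pair(noun_kana, verb_kana):
    """Return the (noun, verb) Kanji pair found in COLLOCATIONS, if any.

    Falls back to the first candidate of each word, which mimics a
    baseline converter that has no collocation data.
    """
    nouns = HOMOPHONES[noun_kana]
    verbs = HOMOPHONES[verb_kana]
    for noun in nouns:
        for verb in verbs:
            if (noun, verb) in COLLOCATIONS:
                return noun, verb
    return nouns[0], verbs[0]
```

With this table, `convert_pair("はし", "つかう")` selects 箸 (chopsticks) rather than the first-listed candidate 橋 (bridge), because only the pair (箸, 使う) is recorded as a collocation. A real converter would need to rank many competing candidates across a whole sentence rather than test a single pair.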

Cited By

  • (2011) A comprehensive dictionary of multiword expressions. Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1, pp. 161-170. DOI: 10.5555/2002472.2002494. Online publication date: 19-Jun-2011

Published In

ACL '98/COLING '98: Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics - Volume 1
August 1998
768 pages

Sponsors

  • Government of Canada
  • Université de Montréal

Publisher

Association for Computational Linguistics, United States

Acceptance Rates

Overall Acceptance Rate: 85 of 443 submissions, 19%
