research-article

A Neural Joint Model with BERT for Burmese Syllable Segmentation, Word Segmentation, and POS Tagging

Authors:

Shengxiang Gao,

Hongbin Wang

Transactions on Asian and Low-Resource Language Information Processing, Volume 20, Issue 4

Article No.: 54, Pages 1 - 23

https://doi.org/10.1145/3436818

Published: 26 May 2021

Abstract

The syllable is the smallest semantic unit of the Burmese language. This study proposes the first neural joint learning model for Burmese syllable segmentation, word segmentation, and part-of-speech (POS) tagging with BERT. The proposed model alleviates the error propagation problem caused by syllable segmentation. More specifically, it extends the neural joint model for Vietnamese word segmentation, POS tagging, and dependency parsing [28] with pre-trained Burmese character, syllable, and word embeddings and BiLSTM-CRF-based neural layers. To evaluate the proposed model, we carry out experiments on Burmese benchmark datasets and fine-tune multilingual BERT. The results show that the proposed joint model achieves excellent performance.
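
To make the idea of joint learning concrete, the sketch below shows one way the three tasks could be cast as a single character-level labeling problem, so that no hard syllable segmentation decision is passed on to later stages. It is only an illustration under stated assumptions: the `SYLLABLE-WORD-POS` label format, the helper names `bies` and `joint_labels`, and the Latin placeholder "syllables" are hypothetical and are not taken from the paper, whose actual tagging scheme is not given on this page.

```python
from typing import List, Tuple


def bies(n: int) -> List[str]:
    """Return Begin/Inside/End/Single position tags for a unit of n parts."""
    if n == 1:
        return ["S"]
    return ["B"] + ["I"] * (n - 2) + ["E"]


def joint_labels(words: List[Tuple[List[str], str]]) -> List[Tuple[str, str]]:
    """Map a syllable-split, POS-tagged sentence to per-character joint labels.

    Each word is (list_of_syllables, pos_tag). Each character receives a
    label 'SYL-WORD-POS' (hypothetical format), where SYL is the character's
    position inside its syllable and WORD is the syllable's position inside
    its word.
    """
    out = []
    for syllables, pos in words:
        word_tags = bies(len(syllables))
        for syllable, word_tag in zip(syllables, word_tags):
            # Python iterates over code points, which stands in for
            # "characters" in this toy example.
            for ch, syl_tag in zip(syllable, bies(len(syllable))):
                out.append((ch, f"{syl_tag}-{word_tag}-{pos}"))
    return out


# Placeholder input: a two-syllable noun followed by a one-syllable verb.
sentence = [(["ab", "c"], "n"), (["de"], "v")]
labels = joint_labels(sentence)
for ch, label in labels:
    print(ch, label)
```

A sequence labeler such as a BiLSTM-CRF could then predict one such label per character, and the syllable boundaries, word boundaries, and POS tags would all be read off the same predicted sequence.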

References

[1]

Chris Alberti, Kenton Lee, and Michael Collins. 2019. A BERT baseline for the natural questions. arXiv: Computation and Language.

[2]

Cunli Mao, Zhibo Man, Zhengtao Yu, Zhenhan Wang, Shengxiang Gao, and Yafei Zhang. 2020. A Burmese dependency parsing method based on transfer learning. In Proceedings of the 2020 International Conference on Asian Language Processing (IALP’20). IEEE, 92–97.

[3]

Bernd Bohnet, Ryan McDonald, Goncalo Simoes, Daniel Andor, Emily Pitler, and Joshua Maynez. 2018. Morphosyntactic tagging with a meta-BiLSTM model over context sensitive token encodings. arXiv:1805.08237.

[4]

Wanxiang Che, Yijia Liu, Yuxuan Wang, Bo Zheng, and Ting Liu. 2018. Towards better UD parsing: Deep contextualized word embeddings, ensemble, and treebank concatenation. arXiv:1807.03121.

[5]

Xinchi Chen, Xipeng Qiu, and Xuanjing Huang. 2016. A feature-enriched neural model for joint Chinese word segmentation and part-of-speech tagging. arXiv:1611.05384.

[6]

Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv:1810.04805.

[7]

Chenchen Ding, Hnin Thu Zar Aye, Win Pa Pa, Khin Thandar Nwet, Khin Mar Soe, Masao Utiyama, and Eiichiro Sumita. 2019. Towards Burmese (Myanmar) morphological analysis: Syllable-based tokenization and part-of-speech tagging. ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP) 19, 1 (2019), 1–34.

[8]

Chenchen Ding, Ye Kyaw Thu, Masao Utiyama, and Eiichiro Sumita. 2016. Word segmentation for Burmese (Myanmar). ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP) 15, 4 (2016), 1–10.

[9]

Chenchen Ding, Masao Utiyama, and Eiichiro Sumita. 2018. NOVA: A feasible and flexible annotation system for joint tokenization and part-of-speech tagging. ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP) 18, 2 (2018), 1–18.

[10]

Chenchen Ding, Sann Su Su Yee, Win Pa Pa, Khin Mar Soe, Masao Utiyama, and Eiichiro Sumita. 2020. A Burmese (Myanmar) treebank: Guideline and analysis. ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP) 19, 3 (2020), 1–13.

[11]

Jun Hatori, Takuya Matsuzaki, Yusuke Miyao, and Jun’ichi Tsujii. 2012. Incremental joint approach to word segmentation, POS tagging, and dependency parsing in Chinese. In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers-Volume 1. Association for Computational Linguistics, 1045–1053.

[12]

Tin Htay Hlaing and Yoshiki Mikami. 2014. Automatic syllable segmentation of Myanmar texts using finite state transducer. ICTer 6, 2 (2014).

[13]

Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long short-term memory. Neural Computation 9, 8 (1997), 1735–1780.

[14]

Hla Hla Htay and Kavi Narayana Murthy. 2008. Myanmar word segmentation using syllable level longest matching. In Proceedings of the 6th Workshop on Asian Language Resources.

[15]

Zhiheng Huang, Wei Xu, and Kai Yu. 2015. Bidirectional LSTM-CRF models for sequence tagging. arXiv:1508.01991.

[16]

Wenbin Jiang, Liang Huang, Qun Liu, and Yajuan Lü. 2008. A cascaded linear model for joint Chinese word segmentation and part-of-speech tagging. In Proceedings of ACL-08: HLT. 897–904.

[17]

Zhanming Jie and Wei Lu. 2019. Dependency-guided LSTM-CRF for named entity recognition. arXiv:1909.10148.

[18]

Dan Kondratyuk and Milan Straka. 2019. 75 languages, 1 model: Parsing universal dependencies universally. arXiv:1904.02099.

[19]

Canasai Kruengkrai, Kiyotaka Uchimoto, Jun’ichi Kazama, Yiou Wang, Kentaro Torisawa, and Hitoshi Isahara. 2009. An error-driven word-character hybrid model for joint Chinese word segmentation and POS tagging. In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1-Volume 1. Association for Computational Linguistics, 513–521.

[20]

John Lafferty, Andrew McCallum, and Fernando C. N. Pereira. 2001. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proceedings of ICML.

[21]

Guillaume Lample, Miguel Ballesteros, Sandeep Subramanian, Kazuya Kawakami, and Chris Dyer. 2016. Neural architectures for named entity recognition. arXiv:1603.01360.

[22]

Yang Liu. 2019. Fine-tune BERT for extractive summarization. arXiv:1903.10318.

[23]

Zin Maung Maung and Yoshiki Mikami. 2008. A rule-based syllable segmentation of Myanmar text. In Proceedings of the IJCNLP-08 Workshop on NLP for Less Privileged Languages.

[24]

Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Efficient estimation of word representations in vector space. arXiv:1301.3781.

[25]

Aye Myat Mon, Soe Lai Phyue, Myint Myint Thein, Su Su Htay, and Thinn Thinn Win. 2010. Analysis of Myanmar word boundary and segmentation by using statistical approach. In Proceedings of the 2010 3rd International Conference on Advanced Computer Theory and Engineering (ICACTE’10), Vol. 5. IEEE, V5-233–V5-237.

[26]

Cynthia Myint. 2011. A hybrid approach for part-of-speech tagging of Burmese texts. In Proceedings of the 2011 International Conference on Computer and Management (CAMAN’11). IEEE, 1–4.

[27]

Phyu Hninn Myint, Tin Myat Htwe, and Ni Lar Thein. 2011. Bigram part-of-speech tagger for Myanmar language. In Proceedings of 2011 International Conference on Information Communication and Management, Singapore. 147–152.

[28]

Dat Quoc Nguyen. [n.d.]. A neural joint model for Vietnamese word segmentation, POS tagging and dependency parsing.

[29]

Rodrigo Nogueira and Kyunghyun Cho. 2019. Passage re-ranking with BERT.

[30]

Matthew E. Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, and Luke Zettlemoyer. 2018. Deep contextualized word representations. arXiv:1802.05365.

[31]

Myat Lay Phyu and Kiyota Hashimoto. 2017. Burmese word segmentation with character clustering and CRFs. In Proceedings of the 2017 14th International Joint Conference on Computer Science and Software Engineering (JCSSE’17). IEEE, 1–6.

[32]

Tao Qian, Yue Zhang, Meishan Zhang, Yafeng Ren, and Donghong Ji. 2015. A transition-based model for joint segmentation, POS-tagging and normalization. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. 1837–1846.

[33]

Xian Qian and Yang Liu. 2012. Joint Chinese word segmentation, POS tagging and parsing. In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning. Association for Computational Linguistics, 501–511.

[34]

Xipeng Qiu, Tianxiang Sun, Yige Xu, Yunfan Shao, Ning Dai, and Xuanjing Huang. 2020. Pre-trained models for natural language processing: A survey. Science China Technological Sciences (2020), 1–26.

[35]

Nuo Qun, Hang Yan, Xipeng Qiu, and Xuanjing Huang. 2020. Chinese word segmentation via BiLSTM+Semi-CRF with relay node. Journal of Computer Science and Technology 35, 5 (2020), 1115–1126.

[36]

Lin Songkai, Mao Cunli, Yu Zhengtao, Guo Jianyi, Wang Hongbin, and Zhang Jiafu. 2018. A method of Myanmar word segmentation based on convolution neural network. Journal of Chinese Information Processing 6 (2018), 8.

[37]

Chen Sun, Austin Myers, Carl Vondrick, Kevin Murphy, and Cordelia Schmid. 2019. VideoBERT: A joint model for video and language representation learning. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 7464–7473.

[38]

Chi Sun, Xipeng Qiu, Yige Xu, and Xuanjing Huang. 2019. How to fine-tune BERT for text classification? arXiv:1905.05583.

[39]

Weiwei Sun. 2011. A stacked sub-word model for joint Chinese word segmentation and part-of-speech tagging. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies-Volume 1. Association for Computational Linguistics, 1385–1394.

[40]

Ian Tenney, Dipanjan Das, and Ellie Pavlick. 2019. BERT rediscovers the classical NLP pipeline. arXiv:1905.05950.

[41]

Tun Thura Thet, Jin-Cheon Na, and Wunna Ko Ko. 2008. Word segmentation for the Myanmar language. Journal of Information Science 34, 5 (2008), 688–704.

[42]

Ye Kyaw Thu, Andrew Finch, Eiichiro Sumita, and Yoshinori Sagisaka. 2014. Integrating dictionaries into an unsupervised model for Myanmar word segmentation. In Proceedings of the Fifth Workshop on South and Southeast Asian Natural Language Processing. 20–27.

[43]

Ye Kyaw Thu, Win Pa Pa, Masao Utiyama, Andrew Finch, and Eiichiro Sumita. 2016. Introducing the Asian language treebank (ALT). In Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC’16). 1574–1578.

[44]

Yuxuan Wang, Wanxiang Che, Jiang Guo, Yijia Liu, and Ting Liu. 2019. Cross-lingual BERT transformation for zero-shot dependency parsing. arXiv:1909.06775.

[45]

Liner Yang, Meishan Zhang, Yang Liu, Maosong Sun, Nan Yu, and Guohong Fu. 2017. Joint POS tagging and dependence parsing with transition-based neural networks. IEEE/ACM Transactions on Audio, Speech, and Language Processing 26, 8 (2017), 1352–1358.

[46]

Meishan Zhang, Nan Yu, and Guohong Fu. 2018. A simple and effective neural model for joint word segmentation and POS tagging. IEEE/ACM Transactions on Audio, Speech, and Language Processing 26, 9 (2018), 1528–1538.

[47]

Meishan Zhang, Yue Zhang, Wanxiang Che, and Ting Liu. 2014. Character-level Chinese dependency parsing. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 1326–1336.

[48]

Shaoning Zhang, Cunli Mao, Zhengtao Yu, Hongbin Wang, Zhongwei Li, and Jiafu Zhang. 2018. Word segmentation for Burmese based on dual-layer CRFs. ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP) 18, 1 (2018), 1–11.

[49]

Yue Zhang and Stephen Clark. 2008. Joint word segmentation and POS tagging using a single perceptron. In Proceedings of ACL-08: HLT. 888–896.

[50]

Yue Zhang and Stephen Clark. 2010. A fast decoder for joint word segmentation and POS-tagging using a single discriminative model. In Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 843–852.

[51]

Xiaoqing Zheng, Hanyang Chen, and Tianyu Xu. 2013. Deep learning for Chinese word segmentation and POS tagging. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing. 647–657.

Cited By

Jayanth K, Bharathi Mohan G, Kumar R (2023) Indian Language Analysis with XLM-RoBERTa: Enhancing Parts of Speech Tagging for Effective Natural Language Preprocessing. 2023 Seventh International Conference on Image Information Processing (ICIIP), 850-854. Online publication date: 22-Nov-2023. https://doi.org/10.1109/ICIIP61524.2023.10537689

Guven Z (2022) The Comparison of Language Models with a Novel Text Filtering Approach for Turkish Sentiment Analysis. ACM Transactions on Asian and Low-Resource Language Information Processing 22:2, 1-16. Online publication date: 27-Dec-2022. https://dl.acm.org/doi/10.1145/3557892

Jin Y, Tao S, Liu Q, Liu X (2022) A BiLSTM-CRF Based Approach to Word Segmentation in Chinese. 2022 IEEE Intl Conf on Dependable, Autonomic and Secure Computing, Intl Conf on Pervasive Intelligence and Computing, Intl Conf on Cloud and Big Data Computing, Intl Conf on Cyber Science and Technology Congress (DASC/PiCom/CBDCom/CyberSciTech), 1-4. Online publication date: 12-Sep-2022. https://doi.org/10.1109/DASC/PiCom/CBDCom/Cy55231.2022.9927991

Index Terms

A Neural Joint Model with BERT for Burmese Syllable Segmentation, Word Segmentation, and POS Tagging
1. Computer systems organization
  1. Dependable and fault-tolerant systems and networks
    1. Redundancy
  2. Embedded and cyber-physical systems
    1. Embedded systems
    2. Robotics
2. Networks
  1. Network properties
    1. Network reliability

Published In

ACM Transactions on Asian and Low-Resource Language Information Processing Volume 20, Issue 4

July 2021

419 pages

ISSN:2375-4699

EISSN:2375-4702

DOI:10.1145/3465463

Editor:
Imed Zitouni
Google, USA

Copyright © 2021 Association for Computing Machinery.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 26 May 2021

Accepted: 01 November 2020

Revised: 01 July 2020

Received: 01 March 2020

Qualifiers

Research-article
Refereed

Funding Sources

Key Program of National Natural Science Foundation of China
National Natural Science Foundation of China
Key Project of Natural Science Foundation of Yunnan Province
Candidates of the Young and Middle Aged Academic and Technical Leaders of Yunnan Province
