DOI: 10.1145/3366423.3380147

Adaptive Probabilistic Word Embedding

Published: 20 April 2020

Abstract

Word embeddings have been widely used and proven effective in many natural language processing and text modeling tasks. An ambiguous word can carry very different meanings in different contexts, a phenomenon known as polysemy. Most existing work generates a single embedding for each word, while a few approaches build a fixed number of embeddings to represent each word's different senses. However, it is hard to determine the exact number of senses for a word, because word meaning depends on context. To address this problem, we propose a novel Adaptive Probabilistic Word Embedding (APWE) model, in which word polysemy is defined over a latent interpretable semantic space. Specifically, each word is first represented by an embedding in the latent semantic space; the APWE model then adaptively adjusts and updates this embedding according to the surrounding context, yielding a tailored, context-specific word embedding. Empirical comparisons with state-of-the-art models demonstrate the superiority of the proposed APWE model.
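To make the adaptation idea concrete, here is a minimal NumPy sketch of a context-adaptive embedding lookup. It is an illustration only, not the paper's method: the toy vocabulary, the K latent dimensions, and the softmax reweighting heuristic in adapt() are all assumptions introduced for this example, whereas APWE itself defines the adjustment probabilistically over a latent interpretable semantic space.

import numpy as np

# Toy setup: each word gets one base embedding in a K-dimensional
# latent semantic space (hypothetical vocabulary and dimensions).
rng = np.random.default_rng(0)
VOCAB = ["bank", "river", "money", "deposit", "shore"]
K = 4
base = {w: rng.normal(size=K) for w in VOCAB}

def softmax(x):
    # Numerically stable softmax over a 1-D array.
    e = np.exp(x - x.max())
    return e / e.sum()

def adapt(word, context):
    # Illustrative heuristic (not APWE's actual inference): reweight the
    # word's latent dimensions by how strongly each agrees with the mean
    # context embedding, so the same word yields a different tailored
    # embedding in each context.
    ctx = np.mean([base[c] for c in context], axis=0)
    weights = softmax(base[word] * ctx)
    return weights * base[word]

# "bank" near river/shore vs. near money/deposit gets different vectors.
print(adapt("bank", ["river", "shore"]))
print(adapt("bank", ["money", "deposit"]))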




Published In

WWW '20: Proceedings of The Web Conference 2020
April 2020, 3143 pages
ISBN: 9781450370233
DOI: 10.1145/3366423

Publisher

Association for Computing Machinery, New York, NY, United States


Author Tags

  1. Adaptive Word Representations
  2. Probabilistic Word Embedding
  3. Word Embedding
  4. Word Polysemy

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

WWW '20: The Web Conference 2020
April 20-24, 2020, Taipei, Taiwan

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%


Cited By

  • (2025) Probabilistic deep metric learning for hyperspectral image classification. Pattern Recognition 157:110878 (Jan 2025). DOI: 10.1016/j.patcog.2024.110878
  • (2024) A transformer-based neural network framework for full names prediction with abbreviations and contexts. Data & Knowledge Engineering 150:C (Mar 2024). DOI: 10.1016/j.datak.2023.102275
  • (2024) A topic detection method based on KM-LSH fusion algorithm and improved BTM model. Soft Computing (Aug 2024). DOI: 10.1007/s00500-024-09874-x
  • (2023) A collaborative filtering recommendation algorithm based on embedding representation. Expert Systems with Applications 215:C (Feb 2023). DOI: 10.1016/j.eswa.2022.119380
  • (2022) A context-enhanced sentence representation learning method for close domains with topic modeling. Information Sciences 607:186-210 (Aug 2022). DOI: 10.1016/j.ins.2022.05.113
  • (2022) Bi-directional Bayesian probabilistic model based hybrid grained semantic matchmaking for Web service discovery. World Wide Web (Feb 2022). DOI: 10.1007/s11280-022-01004-7
  • (2021) Adaptive cross-contextual word embedding for word polysemy with unsupervised topic modeling. Knowledge-Based Systems 218:C (Dec 2021). DOI: 10.1016/j.knosys.2021.106827
  • (2020) USR-MTL: an unsupervised sentence representation learning framework with multi-task learning. Applied Intelligence (Nov 2020). DOI: 10.1007/s10489-020-02042-2
