Sound-Word2Vec: Learning Word Representations Grounded in Sounds

Vijayakumar, Ashwin K; Vedantam, Ramakrishna; Parikh, Devi

Computer Science > Computation and Language

arXiv:1703.01720v1 (cs)

[Submitted on 6 Mar 2017 (this version), latest version 29 Aug 2017 (v4)]


Abstract: Sound and vision are the primary modalities that influence how we perceive the world around us. Thus, it is crucial to incorporate information from these modalities into language to help machines interact better with humans. While existing works have explored incorporating visual cues into language embeddings, the task of learning word representations that respect auditory grounding remains under-explored. In this work, we propose a new embedding scheme, sound-word2vec, that learns language embeddings by grounding them in sound. For example, two seemingly unrelated concepts, leaves and paper, are closer in our embedding space because they produce similar rustling sounds. We demonstrate that the proposed embeddings perform better than language-only word representations on two purely textual tasks that require reasoning about aural cues: sound retrieval and foley-sound discovery. Finally, we analyze nearest neighbors to highlight the unique dependencies captured by sound-w2v as compared to language-only embeddings.

Comments:	5 pages
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Sound (cs.SD)
Cite as:	arXiv:1703.01720 [cs.CL]
	(or arXiv:1703.01720v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.1703.01720

Submission history

From: Ashwin Vijayakumar
[v1] Mon, 6 Mar 2017 04:30:12 UTC (43 KB)
[v2] Fri, 28 Apr 2017 06:35:16 UTC (59 KB)
[v3] Thu, 10 Aug 2017 04:26:57 UTC (227 KB)
[v4] Tue, 29 Aug 2017 15:54:31 UTC (227 KB)

