Independent language modeling architecture for end-to-end ASR

Pham, Van Tung; Xu, Haihua; Khassanov, Yerbolat; Zeng, Zhiping; Chng, Eng Siong; Ni, Chongjia; Ma, Bin; Li, Haizhou

arXiv:1912.00863 (cs)

[Submitted on 25 Nov 2019]

Abstract: The attention-based end-to-end (E2E) automatic speech recognition (ASR) architecture allows for joint optimization of acoustic and language models within a single network. However, in a vanilla E2E ASR architecture, the decoder sub-network (subnet), which incorporates the role of the language model (LM), is conditioned on the encoder output. This means that the acoustic encoder and the language model are entangled, which prevents the language model from being trained separately on external text data. To address this problem, we propose a new architecture that separates the decoder subnet from the encoder output. In this way, the decoupled subnet becomes an independently trainable LM subnet, which can easily be updated using external text data. We study two strategies for updating the new architecture. Experimental results show that 1) the independent LM architecture benefits from external text data, achieving 9.3% and 22.8% relative character and word error rate reduction on the Mandarin HKUST and English NSC datasets, respectively; and 2) the proposed architecture works well with an external LM and generalizes to different amounts of labelled data.

Subjects:	Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:1912.00863 [cs.CL]
	(or arXiv:1912.00863v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.1912.00863

Submission history

From: Van Tung Pham
[v1] Mon, 25 Nov 2019 07:35:16 UTC (752 KB)
