
An Expectation-Maximization Algorithm for Blind Separation of Noisy Mixtures Using Gaussian Mixture Model

Circuits, Systems, and Signal Processing



In this paper, we propose a new expectation-maximization (EM) algorithm, named GMM-EM, for the blind separation of noisy instantaneous mixtures, in which the non-Gaussianity of the independent sources is exploited by modeling their distributions with a Gaussian mixture model (GMM). The compatibility between the incomplete-data structure of the GMM and the hidden-variable nature of the source separation problem leads to an efficient hierarchical learning scheme and an alternative method for estimating the sources and the mixing matrix. Compared with conventional blind source separation algorithms, the proposed GMM-EM algorithm performs better on noisy mixtures because the covariance matrix of the additive Gaussian noise is treated as a parameter. Furthermore, the GMM-EM algorithm works well in underdetermined cases, since it can incorporate any available prior information and jointly estimates the mixing matrix and the source signals in a Bayesian framework. Systematic simulations with both synthetic and real speech signals show the advantage of the proposed algorithm over conventional independent component analysis techniques, such as FastICA, especially for noisy and/or underdetermined mixtures. Moreover, it achieves performance similar to that of a recent technique called null space component analysis, at lower computational complexity.
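To make the setting concrete, the following is a minimal sketch of how a noisy, underdetermined instantaneous mixture of this kind might be simulated. It assumes the linear model \(\mathbf{x}(t) = \mathbf{A}\mathbf{s}(t) + \mathbf{w}(t)\) with zero-mean Gaussian noise of covariance \(\mathbf{R}_w\), and draws each source independently from a one-dimensional Gaussian mixture. The function names, the mixing matrix, and all numerical values are illustrative assumptions, not the settings used in the paper.

```python
import numpy as np


def sample_gmm_source(rng, T, weights, means, stds):
    """Draw T samples of one scalar source from a 1-D Gaussian mixture."""
    # Pick a mixture component for every time index, then sample from it.
    labels = rng.choice(len(weights), size=T, p=weights)
    return rng.normal(np.asarray(means)[labels], np.asarray(stds)[labels])


def make_noisy_mixture(rng, A, sources, noise_cov):
    """Return x(t) = A s(t) + w(t), with w(t) ~ N(0, noise_cov)."""
    n_sensors, T = A.shape[0], sources.shape[1]
    noise = rng.multivariate_normal(np.zeros(n_sensors), noise_cov, size=T).T
    return A @ sources + noise


rng = np.random.default_rng(0)
T = 1000

# Three non-Gaussian sources (hypothetical parameter choices).
s = np.vstack([
    sample_gmm_source(rng, T, [0.5, 0.5], [-1.0, 1.0], [0.3, 0.3]),   # bimodal
    sample_gmm_source(rng, T, [0.8, 0.2], [0.0, 0.0], [0.2, 1.5]),    # heavy-tailed
    sample_gmm_source(rng, T, [0.3, 0.4, 0.3], [-2.0, 0.0, 2.0], [0.5, 0.5, 0.5]),
])

# Underdetermined case: 2 sensors observe 3 sources.
A = np.array([[1.0, 0.6, -0.4],
              [0.3, 1.0, 0.8]])
R_w = 0.01 * np.eye(2)          # noise covariance, treated as unknown by the estimator

x = make_noisy_mixture(rng, A, s, R_w)
print(x.shape)                  # (sensors, samples)
```

Only `x` would be available to a separation algorithm; `A`, `R_w`, and `s` are the quantities to be recovered.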





  1. For a given likelihood function, a conjugate prior is a prior for which the posterior and the prior belong to the same family of distributions.

  2. Matlab codes can be found at:

  3. Available at:




This work is supported in part by the National Natural Science Foundation of China under Grant 61601477 and the Engineering and Physical Sciences Research Council (EPSRC) Grant No. EP/K014307/1.

Author information

Authors and Affiliations


Corresponding author

Correspondence to Shan Wang.


Appendix 1: Proof of Equation (13)


Assuming that the samples are independent over time, the posterior density of the hidden data factorizes as

$$\begin{aligned}&f\left( \mathbf{s},Y|\mathbf{x},{\mathbf{A}}^g,\mathbf{R}_w^g,\varTheta ^g\right) \\&\quad = \prod \limits _{t = 1}^T {f\left( \mathbf{s}(t)|\mathbf{x},{\mathbf{A}}^g ,\mathbf{R}_w^g,\varTheta ^g \right) f\left( y(t)|\mathbf{s}(t),\mathbf{x},{\mathbf{A}}^g ,\mathbf{R}_w^g,\varTheta ^g \right) } \end{aligned} \tag{39}$$

On the other hand,

$$\begin{aligned}&f(\mathbf{x},\mathbf{s},Y|{\mathbf{A}},\mathbf{R}_w,\varTheta ) \\&\quad = \prod \limits _{t = 1}^T {f\left( \mathbf{x}(t)|\mathbf{s}(t),{\mathbf{A}},\mathbf{R}_w \right) f\left( \mathbf{s}(t)|y(t),\varTheta \right) f\left( y(t)|\varTheta \right) } \end{aligned} \tag{40}$$

where \(f(y(t) = m|\varTheta ) = \omega _m\) is the weight of the mth mixture component.

Substituting (39) and (40) into (11) and taking the expectation over the hidden data yields

$$\begin{aligned} J= & {} \sum \limits _{t = 1}^T \sum \limits _{m = 1}^M \int _\mathbf{s} f(\mathbf{s}(t)|\mathbf{x},{\mathbf{A}}^g,\mathbf{R}_w^g,\varTheta ^g ) f(y(t) = m|\mathbf{s}(t),\mathbf{x},{\mathbf{A}}^g,\mathbf{R}_w^g,\varTheta ^g ) \\&\quad \times \log \left[ \omega _m f(\mathbf{s}(t)|y(t) = m,\varTheta ) \right] \mathrm{d}\mathbf{s} \\&\quad + \sum \limits _{t = 1}^T \int _\mathbf{s} f(\mathbf{s}(t)|\mathbf{x},{\mathbf{A}}^g,\mathbf{R}_w^g,\varTheta ^g )\log f(\mathbf{x}(t)|\mathbf{s}(t),{\mathbf{A}},\mathbf{R}_w ) \mathrm{d}\mathbf{s} \end{aligned}$$

Appendix 2: Proof of Equation (14)

By Bayes' theorem, the posterior density of the sources is proportional to the product of the noise likelihood and the GMM prior on the sources:

$$\begin{aligned}&f\left( \mathbf{s}(t)|\mathbf{x}(t),{\mathbf{A}}^g,\mathbf{R}_w^g,\varTheta ^g \right) \\&\quad \propto f\left( \mathbf{x}(t)|\mathbf{s}(t),{\mathbf{A}}^g,\mathbf{R}_w^g \right) \sum \limits _{m = 1}^M \omega _m^g f\left( \mathbf{s}(t)|y(t) = m,\varTheta ^g \right) \end{aligned}$$

The right-hand side is the joint density of the observation and the sources:

$$\begin{aligned}&f(\mathbf{x}(t),\mathbf{s}(t)|{\mathbf{A}}^g,\mathbf{R}_w^g,\varTheta ^g ) \\&\quad = \frac{1}{{\left| {2\pi \mathbf{R}_w^g } \right| ^{1/2} }} \exp \left\{ { - \frac{1}{2}(\mathbf{x}(t) - {\mathbf{A}}^g \mathbf{s}(t))^{\mathrm{T}} (\mathbf{R}_w^g )^{ - 1} (\mathbf{x}(t) - {\mathbf{A}}^g \mathbf{s}(t))} \right\} \\&\qquad \times \sum _{m = 1}^M {\omega _m^g \frac{1}{{\left| {2\pi \mathbf{C}_m^g } \right| ^{1/2} }}} \exp \left\{ { - \frac{1}{2}(\mathbf{s}(t) - {\varvec{\mu }}_m^g )^{\mathrm{T}} (\mathbf{C}_m^g )^{ - 1} (\mathbf{s}(t) - {\varvec{\mu }}_m^g )} \right\} \\&\quad = \sum _{m = 1}^M {\omega _m^g \frac{1}{{\left| {2\pi \mathbf{R}_w^g } \right| ^{1/2} }}\frac{1}{{\left| {2\pi \mathbf{C}_m^g } \right| ^{1/2} }}} \\&\qquad \times \exp \left\{ { - \frac{1}{2}(\mathbf{x}(t) - {\mathbf{A}}^g \mathbf{s}(t))^{\mathrm{T}} (\mathbf{R}_w^g )^{ - 1} (\mathbf{x}(t) - {\mathbf{A}}^g \mathbf{s}(t))} \right\} \\&\qquad \times \exp \left\{ { - \frac{1}{2}(\mathbf{s}(t) - {\varvec{\mu }}_m^g )^{\mathrm{T}} (\mathbf{C}_m^g )^{ - 1} (\mathbf{s}(t) - {\varvec{\mu }}_m^g )} \right\} \end{aligned}$$

Completing the square in \(\mathbf{s}(t)\) within each mixture component, (42) can be simplified as

$$\begin{aligned} f(\mathbf{s}(t)|\mathbf{x}(t),{\mathbf{A}}^g,\mathbf{R}_w^g,\varTheta ^g ) = \sum _{m = 1}^M {\tilde{\omega } _{mt}^g \mathcal{N}\left[ {\mathbf{s}(t);{\varvec{\tilde{\mu }}}_{mt}^g,{\tilde{\mathbf{C}}}_{mt}^g } \right] } \end{aligned}$$


where

$$\begin{aligned} \left\{ \begin{array}{l} {\tilde{\mathbf{C}}}_{mt}^g = \left( ({\mathbf{A}}^g )^{\mathrm{T}} (\mathbf{R}_w^g )^{ - 1} {\mathbf{A}}^g + (\mathbf{C}_m^g )^{ - 1} \right) ^{ - 1} \\ {\varvec{\tilde{\mu }}}_{mt}^g = {\tilde{\mathbf{C}}}_{mt}^g \left( ({\mathbf{A}}^g )^{\mathrm{T}} (\mathbf{R}_w^g )^{ - 1} \mathbf{x}(t) + (\mathbf{C}_m^g )^{ - 1} {\varvec{\mu }}_m^g \right) \\ \tilde{\omega } _{mt}^g = \omega _m^g \dfrac{\left| {\tilde{\mathbf{C}}}_{mt}^g \right| ^{1/2} }{\left| 2\pi \mathbf{R}_w^g \right| ^{1/2} \left| \mathbf{C}_m^g \right| ^{1/2} } \exp \left\{ - \frac{1}{2}\left[ \mathbf{x}^{\mathrm{T}} (t)(\mathbf{R}_w^g )^{ - 1} \mathbf{x}(t) + ({\varvec{\mu }}_m^g )^{\mathrm{T}} (\mathbf{C}_m^g )^{ - 1} {\varvec{\mu }}_m^g - ({\varvec{\tilde{\mu }}}_{mt}^g )^{\mathrm{T}} ({\tilde{\mathbf{C}}}_{mt}^g )^{ - 1} {\varvec{\tilde{\mu }}}_{mt}^g \right] \right\} \end{array} \right. \end{aligned}$$

The expression for \(\tilde{\omega } _{mt}^g\) gives each weight up to a common normalizing factor. Since \(f(\mathbf{x}(t)|{\mathbf{A}}^g,\mathbf{R}_w^g,\varTheta ^g ) = \sum _{m = 1}^M \tilde{\omega } _{mt}^g\), dividing every weight by this sum makes the posterior weights sum to one.
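The posterior parameters above can be checked numerically. The following NumPy sketch evaluates \({\tilde{\mathbf{C}}}_{mt}^g\), \({\varvec{\tilde{\mu }}}_{mt}^g\) and \(\tilde{\omega } _{mt}^g\) for a single observation and normalizes the weights by their sum. The function names and the toy values (one sensor, two sources, two mixture components) are assumptions chosen for illustration; this is not the authors' Matlab implementation.

```python
import numpy as np


def gaussian_pdf(x, mean, cov):
    """Multivariate normal density N(x; mean, cov)."""
    d = x - mean
    _, logdet = np.linalg.slogdet(2.0 * np.pi * cov)
    return np.exp(-0.5 * (logdet + d @ np.linalg.solve(cov, d)))


def source_posterior(x, A, R_w, weights, means, covs):
    """Posterior f(s(t) | x(t)) as a Gaussian mixture.

    Returns (normalized weights, component means, component covariances,
    sum of the unnormalized weights).
    """
    R_inv = np.linalg.inv(R_w)
    raw_weights, post_means, post_covs = [], [], []
    for w_m, mu_m, C_m in zip(weights, means, covs):
        C_inv = np.linalg.inv(C_m)
        # Posterior covariance and mean of component m.
        C_t = np.linalg.inv(A.T @ R_inv @ A + C_inv)
        mu_t = C_t @ (A.T @ R_inv @ x + C_inv @ mu_m)
        # Unnormalized posterior weight of component m.
        quad = (x @ R_inv @ x + mu_m @ C_inv @ mu_m
                - mu_t @ np.linalg.solve(C_t, mu_t))
        scale = np.sqrt(np.linalg.det(C_t)
                        / (np.linalg.det(2.0 * np.pi * R_w) * np.linalg.det(C_m)))
        raw_weights.append(w_m * scale * np.exp(-0.5 * quad))
        post_means.append(mu_t)
        post_covs.append(C_t)
    raw = np.array(raw_weights)
    evidence = raw.sum()
    return raw / evidence, np.array(post_means), np.array(post_covs), evidence


# Toy underdetermined setup (hypothetical values): 1 sensor, 2 sources.
A = np.array([[1.0, 0.5]])
R_w = np.array([[0.1]])
x = np.array([0.7])
weights = np.array([0.3, 0.7])
means = np.array([[0.0, 0.0], [1.0, -1.0]])
covs = np.array([np.eye(2), 0.5 * np.eye(2)])

post_w, post_mu, post_C, evidence = source_posterior(x, A, R_w, weights, means, covs)

# Cross-check: the summed weights should equal the marginal density of x(t).
marginal = sum(w * gaussian_pdf(x, A @ mu, A @ C @ A.T + R_w)
               for w, mu, C in zip(weights, means, covs))

# Posterior-mean estimate of the sources for this observation.
s_mean = post_w @ post_mu
```

In this sketch `evidence` and `marginal` agree up to rounding, which is one way to confirm that the weight formula has been transcribed consistently.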

Appendix 3: Definition of Similarity Score

The similarity score is used to evaluate the separation performance of the proposed algorithm:

$$\begin{aligned} \rho _{ii} = {{\sum _{t = 1}^T {s_i (t)\hat{s}_i (t)} } \Big / {\sqrt{\sum _{t = 1}^T {(s_i (t))^2 } \sum _{t = 1}^T {(\hat{s}_i (t))^2 } } }} \end{aligned}$$

where \({{\hat{s}}_i(t)}\) is the ith recovered source signal. \(\rho _{ii}\) depicts the similarity between the ith original source signal and the corresponding recovered source signal. It is clear that the larger the value of \(\rho _{ii}\), the higher the degree of similarity between the original sources and the recovered sources.
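As a small illustration, the similarity score can be computed as a normalized inner product between an original source and its estimate. The function name `similarity_score` and the sample values are assumptions made for this sketch.

```python
import math


def similarity_score(s, s_hat):
    """Normalized correlation between a source s and its estimate s_hat."""
    if len(s) != len(s_hat):
        raise ValueError("signals must have the same length")
    num = sum(a * b for a, b in zip(s, s_hat))
    den = math.sqrt(sum(a * a for a in s) * sum(b * b for b in s_hat))
    if den == 0.0:
        raise ValueError("signals must have non-zero energy")
    return num / den


original = [1.0, 2.0, 3.0]
scaled_copy = [2.0, 4.0, 6.0]       # same waveform, different gain
flipped_copy = [-1.0, -2.0, -3.0]   # same waveform, opposite sign
unrelated = [3.0, 0.0, -1.0]        # orthogonal to the original

score_scaled = similarity_score(original, scaled_copy)
score_flipped = similarity_score(original, flipped_copy)
score_unrelated = similarity_score(original, unrelated)
```

The score is unaffected by a positive gain on the estimate, whereas a sign flip changes its sign, so the magnitude of the score may be the more informative quantity when the estimate is only recovered up to sign.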


Cite this article

Gu, F., Zhang, H., Wang, W. et al. An Expectation-Maximization Algorithm for Blind Separation of Noisy Mixtures Using Gaussian Mixture Model. Circuits Syst Signal Process 36, 2697–2726 (2017).


