A Gaussian mixture model (GMM) is useful for modeling data which comes from one of several groups. The groups might be different from each other, but data points within the same group can be well-modeled by a Gaussian distribution.
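To make the "several groups" idea concrete, here is a minimal sketch of how data could be generated from a mixture: first pick a group at random according to its weight, then draw the point from that group's Gaussian. The two groups and all of their parameter values are invented for illustration, and the one-dimensional case is assumed for brevity.

```python
import random

# Hypothetical mixture: (weight, mean, standard deviation) per group.
# These numbers are made up for illustration, not taken from any data set.
COMPONENTS = [
    (0.3, 0.0, 1.0),
    (0.7, 5.0, 1.5),
]


def sample_point(rng):
    """Pick a group according to its weight, then draw from its Gaussian."""
    group_weights = [w for w, _, _ in COMPONENTS]
    index = rng.choices(range(len(COMPONENTS)), weights=group_weights, k=1)[0]
    _, mean, std = COMPONENTS[index]
    return index, rng.gauss(mean, std)


rng = random.Random(0)
samples = [sample_point(rng) for _ in range(10_000)]
labels = [index for index, _ in samples]
values = [value for _, value in samples]

sample_mean = sum(values) / len(values)
share_of_group_1 = labels.count(1) / len(labels)
```

When fitting a GMM, only `values` would be observed; the `labels` are the hidden group memberships that the model has to infer.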

For example, the heights of men and of women are each approximately normally distributed, and the probability density function (PDF) of each group is shown below.

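A multidimensional GMM is specified by a weighted sum of $K$ Gaussians,

$$p(\mathbf{x}) = \sum_{k=1}^{K} w_k \, \mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k),$$

where each component is a $d$-dimensional Gaussian density,

$$\mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}, \boldsymbol{\Sigma}) = \frac{1}{(2\pi)^{d/2} \, |\boldsymbol{\Sigma}|^{1/2}} \exp\left(-\frac{1}{2} (\mathbf{x} - \boldsymbol{\mu})^\top \boldsymbol{\Sigma}^{-1} (\mathbf{x} - \boldsymbol{\mu})\right).$$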

A GMM is parameterized by the weight, mean and covariance of each component. The weights are non-negative and sum to one.
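As a rough illustration of how these parameters define a density, the sketch below evaluates a mixture in the one-dimensional case, where each covariance reduces to a scalar variance. The function names and the parameter values are assumptions chosen for the example.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a one-dimensional Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def gmm_pdf(x, weights, means, variances):
    """Mixture density: the weighted sum of the component densities."""
    return sum(
        w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)
    )


# Hypothetical two-component mixture (illustrative values only).
weights = [0.3, 0.7]
means = [0.0, 5.0]
variances = [1.0, 2.25]

density_at_2 = gmm_pdf(2.0, weights, means, variances)

# Because the weights sum to one, the mixture density integrates to one.
# A coarse Riemann sum over [-15, 25) is enough to see this numerically.
step = 0.01
area = sum(gmm_pdf(-15.0 + i * step, weights, means, variances) for i in range(4000)) * step
```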

The model parameters (weight, mean and covariance) are estimated with the expectation-maximization (EM) algorithm.

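In the E step, the probabilistic assignment (responsibility) of each point $\mathbf{x}_i$ to each component $k$ is estimated using the current model parameters,

$$\gamma_{ik} = \frac{w_k \, \mathcal{N}(\mathbf{x}_i \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k)}{\sum_{j=1}^{K} w_j \, \mathcal{N}(\mathbf{x}_i \mid \boldsymbol{\mu}_j, \boldsymbol{\Sigma}_j)},$$

where $w_k$, $\boldsymbol{\mu}_k$ and $\boldsymbol{\Sigma}_k$ are the weight, mean and covariance of component $k$.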
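The E step can be sketched in a few lines for the one-dimensional case. This is an illustrative version, not a reference implementation: `e_step` and its argument names are assumptions, and it does no protection against numerical underflow for points far from every component.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a one-dimensional Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def e_step(data, weights, means, variances):
    """Return resp, where resp[i][k] is the probability that point i
    was generated by component k under the current parameters."""
    resp = []
    for x in data:
        # Weighted density of x under each component.
        joint = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
        total = sum(joint)
        # Normalize so the assignments of each point sum to one.
        resp.append([value / total for value in joint])
    return resp


# Hypothetical current parameters and three data points.
resp = e_step(
    data=[-0.5, 2.5, 6.0],
    weights=[0.5, 0.5],
    means=[0.0, 5.0],
    variances=[1.0, 1.0],
)
```

The point halfway between the two means (2.5) is assigned to both components equally, while the other two points are assigned almost entirely to the nearer component.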

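In the M step, the model parameters are updated using the assignments of the individual points. Writing $\gamma_{ik}$ for the E-step probability that point $\mathbf{x}_i$ belongs to component $k$, and $N_k = \sum_{i=1}^{N} \gamma_{ik}$ for the effective number of points in component $k$,

$$w_k = \frac{N_k}{N}, \qquad \boldsymbol{\mu}_k = \frac{1}{N_k} \sum_{i=1}^{N} \gamma_{ik} \, \mathbf{x}_i, \qquad \boldsymbol{\Sigma}_k = \frac{1}{N_k} \sum_{i=1}^{N} \gamma_{ik} \, (\mathbf{x}_i - \boldsymbol{\mu}_k)(\mathbf{x}_i - \boldsymbol{\mu}_k)^\top.$$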
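A matching sketch of the M step, again restricted to one dimension so that the covariance is a scalar variance. The name `m_step` and the toy inputs are assumptions; the assignments are deliberately "hard" (all 0 or 1) so the result is easy to check by hand.

```python
def m_step(data, resp):
    """Re-estimate weights, means and variances from soft assignments.

    resp[i][k] is the probability that point i belongs to component k.
    """
    n = len(data)
    num_components = len(resp[0])
    weights, means, variances = [], [], []
    for k in range(num_components):
        # Effective number of points assigned to component k.
        n_k = sum(row[k] for row in resp)
        mean = sum(row[k] * x for row, x in zip(resp, data)) / n_k
        var = sum(row[k] * (x - mean) ** 2 for row, x in zip(resp, data)) / n_k
        weights.append(n_k / n)
        means.append(mean)
        variances.append(var)
    return weights, means, variances


# Toy example: the first three points belong fully to component 0,
# the last three fully to component 1.
data = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
resp = [[1.0, 0.0]] * 3 + [[0.0, 1.0]] * 3
weights, means, variances = m_step(data, resp)
```

With hard assignments the update reduces to the per-group sample mean and (biased) sample variance; with soft assignments every point contributes to every component in proportion to its responsibility.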

The likelihood never decreases from one iteration to the next, so when the increase in the log-likelihood falls below a small threshold, we can assume that the EM algorithm has converged.
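Putting the two steps together with this stopping rule gives a loop like the one below. It is a sketch under several assumptions: one-dimensional data, a hypothetical function name `fit_gmm_1d`, an arbitrary tolerance of `1e-6`, and an optional `init_means` argument added so the example is reproducible (by default the means are drawn at random from the data).

```python
import math
import random


def normal_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(data, k, init_means=None, tol=1e-6, max_iter=500, seed=0):
    """Fit a one-dimensional GMM with EM; returns parameters and the
    log-likelihood recorded at each iteration."""
    n = len(data)
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n
    weights = [1.0 / k] * k
    # Random initialization unless starting means are supplied.
    means = list(init_means) if init_means else random.Random(seed).sample(data, k)
    variances = [overall_var] * k
    history = []
    for _ in range(max_iter):
        # E step: responsibilities, plus the log-likelihood of the current fit.
        resp, log_lik = [], 0.0
        for x in data:
            joint = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(joint)
            log_lik += math.log(total)
            resp.append([value / total for value in joint])
        history.append(log_lik)
        # Stop when the log-likelihood has (almost) stopped increasing.
        if len(history) > 1 and history[-1] - history[-2] < tol:
            break
        # M step: update each component from its weighted points.
        for j in range(k):
            n_j = sum(row[j] for row in resp)
            means[j] = sum(row[j] * x for row, x in zip(resp, data)) / n_j
            variances[j] = sum(row[j] * (x - means[j]) ** 2 for row, x in zip(resp, data)) / n_j
            weights[j] = n_j / n
    return weights, means, variances, history


# Synthetic data from two well-separated groups (illustrative values).
rng = random.Random(1)
data = [rng.gauss(0.0, 1.0) for _ in range(200)] + [rng.gauss(10.0, 1.0) for _ in range(200)]

# Start from deliberately rough guesses for the two means.
weights, means, variances, history = fit_gmm_1d(data, k=2, init_means=[2.0, 7.0])
```

Plotting `history` against the iteration number gives a curve that rises and then flattens, which is the convergence behaviour described above. A production implementation would normally also guard against collapsing variances and work with log-densities to avoid underflow.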

An animation demonstrating GMM fitting using the EM algorithm is shown below. The algorithm steps through from a random initialization to convergence.

Hi AHilan,

Good explanation, and thanks for sharing.

If I want to model the latent variable with a multinomial distribution,

could you share any good resources, notes, or sample code (preferably in R)?

Thank you in advance