I'm trying to model a dataset as a mixture of two Gaussian distributions in MATLAB and find the Bhattacharyya distance between the two components. Using MATLAB's fitgmdist function, I was able to fit this mixture and produce this plot:
gmdist = fitgmdist(data, 2);
gmsigma = gmdist.Sigma;
gmmu = gmdist.mu;
gmwt = gmdist.ComponentProportion;
histogram(data, 'Normalization', 'pdf', 'EdgeColor', 'none')
x = min_val:0.0001:max_val;
xlim([min_val max_val])
hold on;
plot(x, pdf(gmdist, x'), 'k')
hold on;
However, when I was debugging my distance code I realized that my two individual distributions did not match their components in the mixture.
p = pdf('Normal', x, gmmu(1), gmsigma(1));
q = pdf('Normal', x, gmmu(2), gmsigma(2));
plot(x, p*gmwt(1))
hold on;
plot(x, q*gmwt(2))
My expectation was that the code above would produce two density curves matching their components in the plot of the original mixture model, but it does not. This post clued me in that the PDFs I calculated each integrate to 1 individually (rather than together), but I'm unsure how to obtain the "non-integrated" components of the mixture model so that I can plot them to match the original plot.
Best Answer
There is definitely a mistake in the implementation of the decomposition. Here is my rendering of the same problem with both weighted components appearing as they should:
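On the MATLAB side, one likely culprit is worth sketching. The Sigma property of a model returned by fitgmdist stores covariances (plain variances for one-dimensional data), whereas pdf('Normal', ...) and normpdf expect a standard deviation, so each component needs the square root of its Sigma entry. The sketch below is a check under stated assumptions rather than a definitive fix: the synthetic data is a stand-in for the original dataset, and gmsd, mix and maxdiff are names introduced only for illustration. Each weighted component integrates to its mixing weight, and the two together should reproduce pdf(gmdist, x').

```matlab
% Synthetic stand-in for the real data (an assumption, not the asker's dataset).
rng(0);
data = [randn(500, 1); 4 + 0.5 * randn(500, 1)];

gmdist = fitgmdist(data, 2);
gmmu = gmdist.mu;                       % component means, 2-by-1
gmsd = sqrt(squeeze(gmdist.Sigma));     % Sigma holds variances (1-by-1-by-2), so take sqrt
gmwt = gmdist.ComponentProportion;      % mixing weights, 1-by-2

x = linspace(min(data), max(data), 1000);

% Weighted component densities; normpdf takes a standard deviation.
p = gmwt(1) * normpdf(x, gmmu(1), gmsd(1));
q = gmwt(2) * normpdf(x, gmmu(2), gmsd(2));

mix = pdf(gmdist, x')';                 % full mixture density as a row vector
maxdiff = max(abs(mix - (p + q)));      % largest gap between mixture and summed components

histogram(data, 'Normalization', 'pdf', 'EdgeColor', 'none')
hold on;
plot(x, mix, 'k')
plot(x, p)
plot(x, q)
hold off;
```

If the two weighted curves sum to the black mixture curve (maxdiff close to zero), the decomposition is consistent, and the same gmmu and gmsd values can then be fed into the distance calculation.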
My R code is as follows: