There is a robust detection framework to test for such outliers, i.e., significantly different ratios. It is quite easy here because the Gaussian mixture model is a sub-model of the epsilon-contamination model, and there exist least favourable densities with which to perform the likelihood ratio test.
To find these densities, and hence the likelihood ratio test, you need to solve two non-linear equations for two constants. With these constants in hand, you obtain the least favourable densities and the corresponding likelihood ratio test.
The best part is that robust detection incurs no loss of performance under outliers: performance can never degrade, thanks to the boundedness of the set of densities considered around the nominal density.
It is also known that the likelihood ratio test based on the least favourable densities is a censored (clipped) version of the original likelihood ratio test. If you don't want to deal with solving non-linear equations, I suggest simply clipping the likelihood ratio and checking the performance.
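As a sketch of the clipping idea: for a simple $\mathcal{N}(0,1)$ vs. $\mathcal{N}(1,1)$ test the log-likelihood ratio has the closed form $x - 1/2$, and each sample's contribution can be bounded. The clipping constants `c_lower` and `c_upper` below are hypothetical placeholders; in the robust formulation they would come from solving the two non-linear equations.

```python
import numpy as np

def clipped_llr_test(x, c_lower=-2.0, c_upper=2.0, threshold=0.0):
    """Censored (clipped) log-likelihood ratio test for N(0,1) vs N(1,1).

    c_lower / c_upper are illustrative clipping constants; in the
    least-favourable-density formulation they are obtained by solving
    two non-linear equations.
    """
    # Closed-form LLR for N(0,1) vs N(1,1): log f1(x) - log f0(x) = x - 1/2
    llr = x - 0.5
    # Clipping bounds the influence of any single (possibly outlying) sample
    clipped = np.clip(llr, c_lower, c_upper)
    return clipped.sum() > threshold  # decide H1 if the clipped sum exceeds threshold

# A single huge outlier contributes at most c_upper, not its raw LLR
print(clipped_llr_test(np.array([0.5, 100.0])))
```

Without clipping, the single value $100$ would contribute a log-likelihood ratio of $99.5$ and dominate the decision on its own; with clipping it contributes at most `c_upper`.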
$\newcommand{\N}{\mathcal{N}}\newcommand{\Var}{\mathrm{Var}}\newcommand{\E}{\Bbb{E}}$Assume that the two personas are represented by distributions $X_1\sim \N\left(\mu_1, \sigma_1^2\right)$ and $X_2\sim \N\left(\mu_2, \sigma_2^2\right)$, where $\mu_k$ and $\sigma_k^2$ are the mean and variance respectively of $X_k$, for $k=1,2$. Assume that $X_1$ and $X_2$ are independent.
We can model the overall persona as coming from $X_1$ with some probability $p$, or coming from $X_2$ otherwise (with probability $1-p$).
That is, if $Z$ is the overall persona, then $Z = IX_1 + (1-I)X_2$, where $I$ is a random variable that is $1$ with probability $p$ and $0$ with probability $1-p$, and $I,X_1,X_2$ are independent.
In this case, $Z$ (the overall persona) is modelled as a Gaussian Mixture Model, with probability density function $f_Z(z) = pf_{X_{1}}(z)+(1-p)f_{X_{2}}(z)$, where $f_{X_{k}}$ is the probability density function of $X_k$, $k=1,2$.
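A minimal sketch of this construction in code: sampling $Z = IX_1 + (1-I)X_2$ and evaluating the mixture density $f_Z$. The parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_mixture(n, p, mu1, sigma1, mu2, sigma2):
    """Draw n samples of Z = I*X1 + (1-I)*X2 with I ~ Bernoulli(p)."""
    I = rng.random(n) < p                  # indicator: True with probability p
    x1 = rng.normal(mu1, sigma1, n)
    x2 = rng.normal(mu2, sigma2, n)
    return np.where(I, x1, x2)             # pick X1 where I=1, else X2

def mixture_pdf(z, p, mu1, sigma1, mu2, sigma2):
    """f_Z(z) = p*f_X1(z) + (1-p)*f_X2(z)."""
    def normal_pdf(z, mu, s):
        return np.exp(-0.5 * ((z - mu) / s) ** 2) / (s * np.sqrt(2.0 * np.pi))
    return p * normal_pdf(z, mu1, sigma1) + (1 - p) * normal_pdf(z, mu2, sigma2)
```

Note that $Z$ is a per-draw selection between the two components, not the sum $pX_1+(1-p)X_2$; the two have the same mean but different distributions.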
If you just want the mean and variance of the overall persona $Z$ (to use for a Gaussian model), the formulas are:
$\Bbb{E}[Z] = p \mu_1 + (1-p)\mu_2$
and
$\Var(Z) = p\sigma_1^2 +(1-p)\sigma_2^2 + p(1-p)\left(\mu_1-\mu_2\right)^2.$
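These formulas are easy to sanity-check numerically. A sketch with illustrative parameter values, comparing the closed-form moments against a Monte Carlo estimate:

```python
import numpy as np

p, mu1, s1, mu2, s2 = 0.3, -1.0, 0.5, 2.0, 1.5

# Closed-form mixture moments
mean_z = p * mu1 + (1 - p) * mu2
var_z = p * s1**2 + (1 - p) * s2**2 + p * (1 - p) * (mu1 - mu2) ** 2

# Monte Carlo estimate: should agree to roughly two decimal places at 10^6 samples
rng = np.random.default_rng(1)
I = rng.random(1_000_000) < p
z = np.where(I, rng.normal(mu1, s1, I.size), rng.normal(mu2, s2, I.size))
print(mean_z, z.mean())
print(var_z, z.var())
```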
Some hints to proving the formulas for the mean and variance of $Z$ are to recall the following facts:
$\E[Z] = \E[\E[Z\mid I]]$ by the Law of Total Expectation
$\Var(Z) = \E[\Var(Z\mid I)] + \Var(\E[Z\mid I])$ by the Law of Total Variance
If $Y$ is a random variable that takes value $a$ with probability $p$ and value $b$ with probability $1-p$ (where $a,b$ are constants), then $\E[Y] = pa+(1-p)b$ and $\Var(Y) = p(1-p)(a-b)^2$.
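Putting the hints together (a sketch of the derivation): conditionally on $I$, $\E[Z\mid I] = I\mu_1 + (1-I)\mu_2$ and $\Var(Z\mid I) = I\sigma_1^2 + (1-I)\sigma_2^2$, so
$$\E[Z] = \E[\E[Z\mid I]] = p\mu_1 + (1-p)\mu_2$$
and
$$\Var(Z) = \underbrace{p\sigma_1^2 + (1-p)\sigma_2^2}_{\E[\Var(Z\mid I)]} + \underbrace{p(1-p)(\mu_1-\mu_2)^2}_{\Var(\E[Z\mid I])},$$
where the last term applies fact 3 to $\E[Z\mid I]$ with $a=\mu_1$ and $b=\mu_2$.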
The dimension is simply the dimension of the data: if each data point is a scalar, the dimension is 1; if each sample is of the form $(x,y)$, the dimension is 2. The number of components is the number of independent Gaussians the model is a mixture of. The dimension and the number of components are not related to each other.
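A short illustration of the distinction, assuming scikit-learn is available (the component count and data here are arbitrary):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# 500 two-dimensional samples: dimension = 2
X = rng.normal(size=(500, 2))

# Fit a mixture of 3 Gaussians: n_components = 3, chosen independently of the dimension
gmm = GaussianMixture(n_components=3, random_state=0).fit(X)

print(gmm.means_.shape)        # (3, 2): one 2-D mean per component
print(gmm.covariances_.shape)  # (3, 2, 2): one 2x2 covariance per component
```

You could just as well fit 3 components to 1-D data or 10 components to 2-D data; the two numbers vary independently.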