The straight answer to Q1 is "yes": it is definitely possible to cut up an underlying normally distributed continuous variable into an ordinal variable with levels 1 to 10. You need something that can give you the cumulative distribution function (CDF) of a normal distribution with a given mean and variance (you only need these two parameters to characterise a normal distribution). Then you calculate the differences between the values it returns for your successive bin cutoffs, because the raw return value is the cumulative probability of a value at X or lower.
I'm sorry, I don't use C#, but in R it would look something like the code below. This is a 10-point example in which the normal distribution you take as your underlying latent variable has a mean of 5 and a standard deviation of 2 (pnorm takes the standard deviation, not the variance, as its third argument), and the bins are minus infinity to 1.5, 1.5 to 2.5, 2.5 to 3.5, ... , 9.5 to infinity.
> options(digits=2)
> x <- pnorm(1:10+0.5, 5, 2)*100
> x[10] <- 100 # otherwise is just 9.5 to 10.5, not infinity
> x # ie cumulative prob (in %) to each bin
[1] 4 11 23 40 60 77 89 96 99 100
> c(x[1], diff(x)) # differences between the cumulative probs
[1] 4.0 6.6 12.1 17.5 19.7 17.5 12.1 6.6 2.8 1.2
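For readers who want the same calculation outside R, the following is a minimal Java sketch of the idea, not a definitive implementation. The class name OrdinalBins and its method names are made up for illustration. Standard Java has no normal CDF, so the sketch assumes the Abramowitz and Stegun polynomial approximation (formula 26.2.17, absolute error below about 1e-7) is accurate enough for this purpose.

```java
import java.util.Locale;

/** Sketch: probability of each ordinal level under a latent normal variable. */
class OrdinalBins {

    /** Standard normal CDF, approximated with Abramowitz-Stegun formula 26.2.17. */
    static double standardNormalCdf(double z) {
        if (z < 0) {
            return 1.0 - standardNormalCdf(-z); // use symmetry for negative z
        }
        double t = 1.0 / (1.0 + 0.2316419 * z);
        double poly = t * (0.319381530 + t * (-0.356563782 + t * (1.781477937
                + t * (-1.821255978 + t * 1.330274429))));
        double density = Math.exp(-0.5 * z * z) / Math.sqrt(2.0 * Math.PI);
        return 1.0 - density * poly;
    }

    /**
     * Returns cutoffs.length + 1 probabilities: below the first cutoff,
     * between consecutive cutoffs, and above the last cutoff.
     * Note that sd is the standard deviation, not the variance.
     */
    static double[] binProbabilities(double mean, double sd, double[] cutoffs) {
        double[] probs = new double[cutoffs.length + 1];
        double previous = 0.0; // cumulative probability at minus infinity
        for (int i = 0; i < cutoffs.length; i++) {
            double cumulative = standardNormalCdf((cutoffs[i] - mean) / sd);
            probs[i] = cumulative - previous; // difference of cumulative probabilities
            previous = cumulative;
        }
        probs[cutoffs.length] = 1.0 - previous; // last bin runs to plus infinity
        return probs;
    }

    public static void main(String[] args) {
        double[] cutoffs = {1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5, 8.5, 9.5};
        double[] probs = binProbabilities(5.0, 2.0, cutoffs);
        for (int i = 0; i < probs.length; i++) {
            System.out.printf(Locale.ROOT, "level %d: %.1f%%%n", i + 1, 100.0 * probs[i]);
        }
    }
}
```

Handling the last bin as one minus the final cumulative probability plays the same role as setting x[10] to 100 in the R session: it makes the top level run to infinity, so the probabilities sum to one.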
Similarly, the straight answer to Q2 is also "yes": such methods definitely exist, but they should be used with caution, and it is difficult to summarise here all the pros and cons of the different ways of doing this.
It's also worth knowing that there are other methods for analysing this sort of ordinal data.
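In case anyone is still interested, I have managed to implement Aristizabal's formulae in Java. This is more proof-of-concept than the requested "robust" code, but it is a starting point. The methods below belong inside a class and need java.util.Arrays and java.util.function.DoubleUnaryOperator to be imported.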
/**
* Computes the point estimate of the shift offset (gamma) from the given sample. The sample array will be sorted by this method.<p>
* Cf. Aristizabal section 2.2 ff.
* @param sample {@code double[]}, will be sorted
* @return gamma point estimate
*/
public static double pointEstimateOfGammaFromSample(double[] sample) {
Arrays.sort(sample);
DoubleUnaryOperator func = x->calculatePivotalOfSortedSample(sample, x)-1.0;
double upperLimit = sample[0];
double lowerLimit = 0;
double gamma = bisect(func, lowerLimit, upperLimit);
return gamma;
}
/**
* Cf. Aristizabal's equation (2.3.1)
* @param sample {@code double[]}, should be sorted in ascending order
* @param gamma shift offset
* @return pivotal value of sample
*/
private static double calculatePivotalOfSortedSample(final double[] sample, double gamma) {
final int n=sample.length;
final int n3=n/3;
final double mid = avg(sample, gamma, n3+1, n-n3);
final double low = avg(sample, gamma, 1, n3);
final double upp = avg(sample, gamma, n-n3+1, n);
final double result = (mid-low)/(upp-mid);
return result;
}
/**
* Computes average of sample values from {@code sample[l-1]} to {@code sample[u-1]}.
* @param sample {@code double[]}, should be sorted in ascending order
* @param gamma shift offset
* @param l lower limit
* @param u upper limit
* @return average
*/
private static double avg(double[] sample, double gamma, int l, int u) {
double sum = 0.0;
for (int i=l-1;i<u;sum+=Math.log(sample[i++]-gamma));
final int n = u-l+1;
return sum/n;
}
/**
* Naive bisection implementation. Should always complete if the given values actually straddle the root.
* Will call {@link #secant(DoubleUnaryOperator, double, double)} if they do not, in which case the
* call may not complete.
* @param func Function whose root is sought
* @param lowerLimit Some value for which the given function evaluates < 0
* @param upperLimit Some value for which the given function evaluates > 0
* @return x value, somewhere between the lower and upper limits, which evaluates close enough to zero
*/
private static double bisect(DoubleUnaryOperator func, double lowerLimit, double upperLimit) {
final double eps = 0.000001;
double low=lowerLimit;
double valAtLow = func.applyAsDouble(low);
double upp=upperLimit;
double valAtUpp = func.applyAsDouble(upp);
if (valAtLow*valAtUpp>0) {
// Switch to secant method
return secant(func, lowerLimit, upperLimit);
}
System.out.printf("bisect %f@%f -- %f@%f%n", valAtLow, low, valAtUpp, upp);
double mid;
while(true) {
mid = (upp+low)/2;
if (Math.abs(upp-low)/low<eps)
break;
double val = func.applyAsDouble(mid);
if (Math.abs(val)<eps)
break;
if (val<0)
low=mid;
else
upp=mid;
}
return mid;
}
/**
* Naive secant root solver implementation. May not complete if root not found.
* @param f Function whose root is sought
* @param a First starting value at which the given function can be evaluated
* @param b Second starting value at which the given function can be evaluated
* @return x value which evaluates close enough to zero
*/
static double secant(final DoubleUnaryOperator f, double a, double b) {
double fa = f.applyAsDouble(a);
if (fa==0)
return a;
double fb = f.applyAsDouble(b);
if (fb==0)
return b;
System.out.printf("secant %f@%f -- %f@%f%n", fa, a, fb, b);
if (fa*fb<0) {
return bisect(f, a, b);
}
while ( Math.abs(b-a) > Math.abs(0.00001*a) ) {
final double m = (a+b)/2;
final double k = (fb-fa)/(b-a);
final double fm = f.applyAsDouble(m);
final double x = m-fm/k;
if (Math.abs(fa)<Math.abs(fb)) {
// |f(a)|<|f(b)|; keep a and replace b with x
b=x;
fb=f.applyAsDouble(b);
} else {
// |f(a)|>=|f(b)|; keep b and replace a with x
a=x;
fa=f.applyAsDouble(a);
}
if (fa==0)
return a;
if (fb==0)
return b;
if (fa*fb<0) {
// Straddling root; switch to bisect method
return bisect(f, a, b);
}
}
return (a+b)/2;
}
Best Answer
The answer is no, at least not exactly.
If you have two quartiles of a normal population then you can find $\mu$ and $\sigma.$ For example the lower and upper quartiles of $\mathsf{Norm}(\mu = 100,\, \sigma = 10)$ are $93.255$ and $106.745,$ respectively.
Then $P\left(\frac{X-\mu}{\sigma} < -0.6745\right) = 0.25$ and $P\left(\frac{X-\mu}{\sigma} < 0.6745\right) = 0.75$ provide two equations that can be solved to find $\mu$ and $\sigma.$
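Solving those two equations gives $\mu = (Q_1 + Q_3)/2$ and $\sigma = (Q_3 - Q_1)/(2 \times 0.6745).$ A minimal Java sketch of that arithmetic follows; the class and method names are hypothetical, and it assumes the inputs are exact population quartiles of a normal distribution.

```java
import java.util.Locale;

/** Sketch: recover mu and sigma of a normal population from its two quartiles. */
class NormalFromQuartiles {

    /** 75th percentile of the standard normal distribution (about 0.6745). */
    static final double Z_75 = 0.6744897501960817;

    /** Returns {mu, sigma} given the lower quartile q1 and upper quartile q3. */
    static double[] muSigma(double q1, double q3) {
        double mu = (q1 + q3) / 2.0;             // quartiles are symmetric about the mean
        double sigma = (q3 - q1) / (2.0 * Z_75); // q3 - q1 = 2 * 0.6745 * sigma
        return new double[] {mu, sigma};
    }

    public static void main(String[] args) {
        double[] ms = muSigma(93.255, 106.745);
        System.out.printf(Locale.ROOT, "mu = %.3f, sigma = %.3f%n", ms[0], ms[1]);
    }
}
```

With the quartiles from the example above, this recovers values very close to $\mu = 100$ and $\sigma = 10.$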
However, sample quartiles are not population quartiles. No normal sample contains enough information to determine $\mu$ and $\sigma$ precisely.
And you are not really sure your sample is from a normal population. If the population has mean $\mu$ and median $\eta,$ then the sample mean and median, respectively, are estimates of these two parameters. If the population is symmetrical, then $\mu = \eta,$ but you say the sample mean and median do not agree. So you cannot be sure the population is symmetrical, much less normal.