Estimating CVs. The coefficient of variation (CV) is $\kappa = \sigma/\mu.$ It can be
estimated by $\hat \kappa = K = S/\bar X,$ where $\bar X$ and $S$
are the sample mean and SD, respectively. For small $n,$ this estimate
is biased on the low side, but for moderate and large samples
the bias is small. Methods of finding confidence intervals (CIs)
for the CV depend on the nature of the underlying distribution.
Because the type of population distribution may be unknown, it may
be useful to use a nonparametric bootstrap CI for $\kappa.$
Because the population may be skewed (especially right-skewed) in
practice, the bootstrap must anticipate skewness.
Because I found the literature on CIs for the CV to be partly
hidden behind paywalls, and partly poorly explained, I'm
wondering if bootstrap CIs may be the best solution for your application. I give
two examples of bootstrap CIs below, one using a sample from a
normal population and one using a sample from a gamma population.
At least, you can compare these results with results from formulas
you may find in your Internet searches.
Bootstrap CIs. If we knew the distribution of $V = K - \kappa,$ we could find
bounds $L$ and $U$ cutting 2.5% from its lower and upper tails,
respectively, to get $P(L < K - \kappa < U) = 0.95,$ from which
we would obtain the 95% CI $(K - U, K - L)$ for $\kappa.$
Not knowing the distribution of $V,$ we re-sample from our data
$X = (X_1, X_2, \dots, X_n).$ Iteratively we find re-samples
of size $n$ with replacement from $X,$ find $K^* = S^*/\bar X^*$
and then $V^* = K^* - \kappa^*$ for each re-sample, where
the observed CV $K_{obs}$ from the original sample $X$ is used
for $\kappa^*.$ Finally, we get $L^*$ and $U^*$ by cutting 2.5%
from each tail of the $V^*$'s, the 'bootstrapped' values of $V$,
and use these estimated bounds to get the a 95% bootstrap CI.
Examples of Bootstrap CIs. As a demonstration, I use a sample $X$ of size $n = 100$ from
$\mathsf{Norm}(\mu = 200, \sigma=25)$ with $\kappa = 0.125.$
In the outline above of the bootstrap procedure, $*$'s represented
quantities based on re-sampling. In the R program below we use .re
for the same purpose.
Note: It is important to understand that re-sampling does not
create additional information. It merely exploits the information
already present in the data to do statistical analysis.
Normal. For the particular normal sample we used, $K_{obs} = 0.118$, and
the 95% nonparametric bootstrap CI obtained is $(0.102, 0.135).$
Because bootstrap procedures involve random re-sampling, each run
of the program may give a slightly different CI, but not much
different with as many as $B = 10^5 = 100,000$ iterations.
x = rnorm(100, 200, 25)            # sample of n = 100 from Norm(200, 25)
k.obs = sd(x)/mean(x); k.obs       # observed CV
## 0.1180088
B = 10^5; v.re = numeric(B)
for (i in 1:B) {
  x.re = sample(x, 100, replace = TRUE)  # re-sample with replacement
  k.re = sd(x.re)/mean(x.re)
  v.re[i] = k.re - k.obs
}
UL = quantile(v.re, c(.975, .025))
k.obs - UL                         # 95% bootstrap CI for kappa
## 97.5% 2.5%
## 0.1018754 0.1350186
Gamma. This bootstrap procedure is called 'nonparametric' because it does
not assume any particular type of distribution for the data. A
second sample of size $n = 100$ was taken from the distribution
$\mathsf{Gamma}(shape=\alpha = 4, rate=\lambda=.1)$ with
$\kappa = \sqrt{\alpha}/\alpha = 1/2.$ This sample has $K = 0.507$
and the 95% nonparametric bootstrap CI is $(0.442, 0.579).$
A second run of the bootstrap program with the same data gave
the CI $(0.442, 0.580).$
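The gamma example uses the same bootstrap program; only the data-generation line changes. A minimal sketch (the seed here is arbitrary and was not used for the results quoted above, so the exact interval will differ somewhat):

```r
set.seed(2024)                          # arbitrary seed, reproducibility only
x = rgamma(100, shape = 4, rate = 0.1)  # sample from Gamma(4, 0.1); true CV = 1/2
k.obs = sd(x)/mean(x)                   # observed CV of the gamma sample

B = 10^5; v.re = numeric(B)
for (i in 1:B) {
  x.re = sample(x, 100, replace = TRUE) # re-sample with replacement
  k.re = sd(x.re)/mean(x.re)
  v.re[i] = k.re - k.obs
}
UL = quantile(v.re, c(.975, .025))
k.obs - UL                              # 95% bootstrap CI for kappa
```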
I don't have Ross's book at hand. Without the context of what has
been covered recently, it is difficult to know what approach the
author expects you to take.
Perhaps you are supposed to assume that the nine observations are
from a nearly normal population. Then you could use the t-interval
you mentioned.
Perhaps use of a nonparametric procedure is intended. Then one
possibility would be the one-sided nonparametric confidence interval
that often accompanies a Wilcoxon signed-rank test. Here is an
example of that using R statistical software. (Other statistical software
will do much the same thing, but the syntax of the code will be
somewhat different.)
Given the following nine (fictional) housing prices in thousands of dollars,
y
## 410.4 413.3 430.4 460.4 473.9 508.3 533.6 546.7 555.3
we find that the 95% upper bound (for the population median) is $521,000, as shown below:
summary(y); sd(y)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 410.4 430.4 473.9 481.4 533.6 555.3
## 56.95459 # standard deviation
wilcox.test(y, conf.int=TRUE, alternative="less")
Wilcoxon signed rank test
data: y
V = 45, p-value = 1
alternative hypothesis: true location is less than 0
95 percent confidence interval:
-Inf 520.95
sample estimates:
(pseudo)median
482
Note:
(1) Using R software, the corresponding t-interval gives the bound \$516,700. It is not surprising that the t and Wilcoxon bounds are about the same
because I generated my fake housing prices from a normal distribution.
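For reference, a one-sided t bound like the one in Note (1) can be computed directly from the summary statistics, or with `t.test`; a sketch using the fictional prices above:

```r
y = c(410.4, 413.3, 430.4, 460.4, 473.9, 508.3, 533.6, 546.7, 555.3)

# Upper 95% t bound for the mean: ybar + t(.95, n-1) * s / sqrt(n)
n = length(y)
ub = mean(y) + qt(0.95, n - 1) * sd(y) / sqrt(n)
round(ub, 1)                          # 516.7, i.e., about $516,700

# Equivalent one-liner:
t.test(y, alternative = "less")$conf.int
```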
(2) Another nonparametric procedure uses the confidence
interval corresponding to a sign test. From Minitab 17 software the
bound for the population median is $537,200.
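Minitab interpolates between order statistics to reach exactly 95%; in R one can read off the exact (conservative) sign-test bound directly from the order statistics. A rough sketch, assuming the same nine prices:

```r
y = c(410.4, 413.3, 430.4, 460.4, 473.9, 508.3, 533.6, 546.7, 555.3)
n = length(y)

# An upper bound at the k-th order statistic has confidence level
# P(Bin(n, 1/2) <= k - 1), because the count of observations below
# the median is binomial under the sign-test model.
pbinom(6:8, n, 0.5)   # achievable levels for y_(7), y_(8), y_(9)

sort(y)[8]            # 546.7: exact bound at level ~0.980 (conservative)
```

The exact bound must use an order statistic, so its level jumps from 91.0% (at $y_{(7)} = 533.6$) to 98.0% (at $y_{(8)} = 546.7$); Minitab's \$537,200 comes from nonlinear interpolation between those two values.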
Best Answer
I do not recommend using the Student approximation. Instead, it is better to observe that $X_1 + \dots + X_n \sim \Gamma(n,\lambda)$ (shape $n$, rate $\lambda$). Therefore, $2\lambda(X_1 + \dots + X_n) \sim \Gamma(n,\tfrac12)=\chi^2_{2n}$. Hence you can construct the required interval from the quantiles of $\chi^2_{2n}$.
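Since the sum of $n$ i.i.d. exponential variables with rate $\lambda$ satisfies $2\lambda \sum X_i \sim \chi^2_{2n}$, inverting this pivot gives an exact 95% CI for $\lambda$: $\bigl(\chi^2_{2n,.025}/(2\sum X_i),\ \chi^2_{2n,.975}/(2\sum X_i)\bigr).$ A sketch in R (the sample here is simulated purely for illustration):

```r
set.seed(1)                 # arbitrary seed, illustration only
n = 9
x = rexp(n, rate = 0.01)    # fake exponential data with true rate 0.01

s = sum(x)
# 2*lambda*sum(x) ~ chi-square with 2n df, so invert the pivot:
ci = qchisq(c(0.025, 0.975), df = 2 * n) / (2 * s)
ci                          # exact 95% CI for the rate lambda
```

Unlike the t-interval, this interval is exact for any sample size, because it uses the true sampling distribution of the sum rather than a normal approximation.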