The coefficient of variation (\(CV\)) is a measure of relative
dispersion that expresses the degree of variability relative to the mean
(Albatineh, Kibria, Wilcox, & Zogheib,
2014). Because the \(CV\) is unitless, it is useful for comparing
variables measured in different units (Albatineh et al.,
2014). It is also a measure of homogeneity. The
population coefficient of variation is:
\[CV = \frac{\sigma}{\mu},\] where
\(\sigma\) is the population standard
deviation and \(\mu\) is the population
mean. In practice, we almost always analyze sample data and use it to
estimate the population parameter (Albatineh et
al., 2014). The sample estimate is:
\[cv = \frac{sd}{\bar{X}}\]
where \(sd\) is the sample standard
deviation, the square root of the unbiased estimator of population
variance, and \(\bar{X}\) is the sample
mean. The cv corrected for sample size is: \[
cv_{corr} = cv \biggl(1 - \frac{1}{4(n-1)}
+ \frac{1}{n}cv^2
+ \frac{1}{2 (n-1)^2} \biggr)
\] where \(n\) is the sample size. There are various methods for calculating
confidence intervals (CI) for cv, each with its own use cases.
Some of them are model-based, so their suitability depends on the
assumptions made about the distribution of the data. For the sake of
versatility, the cvcqv package covers almost all of these methods.
They are explained below, along with some
examples:
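Before turning to the interval methods, the two point estimates defined above can be sketched in base R. This is a minimal illustration, not the cvcqv implementation: `cv_point()` is a hypothetical helper name, and the result is multiplied by 100 on the assumption that the estimates are reported as percentages.

```r
# Sample data reused from the examples in this document
x <- c(
  0.2, 0.5, 1.1, 1.4, 1.8, 2.3, 2.5, 2.7, 3.5, 4.4,
  4.6, 5.4, 5.4, 5.7, 5.8, 5.9, 6.0, 6.6, 7.1, 7.9
)

# cv_point() is a hypothetical helper, not a cvcqv function
cv_point <- function(x, correction = FALSE) {
  x <- x[!is.na(x)]               # drop missing values
  n <- length(x)
  cv <- sd(x) / mean(x)           # sd() uses the n - 1 denominator
  if (correction) {
    # sample-size correction from the formula above
    cv <- cv * (1 - 1 / (4 * (n - 1)) + cv^2 / n + 1 / (2 * (n - 1)^2))
  }
  100 * cv                        # assumed percentage scale
}

round(cv_point(x), 3)                     # uncorrected estimate
round(cv_point(x, correction = TRUE), 3)  # corrected estimate
```

The two values can be compared with the `est` columns printed in the outputs of this document.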
When the parent population of the scores is normally distributed,
\(\sqrt{n}/cv\) follows a noncentral t distribution with \(v = n - 1\)
degrees of freedom and noncentrality parameter \(\lambda = \sqrt{n}/CV\),
which is estimated by:
\[
\hat{\lambda} = \frac{\sqrt{n}}{cv}
\] Let \(1 - \alpha\) be the CI coverage with \(\alpha_L + \alpha_U = \alpha\), in which
\(\alpha_L\) is the proportion of
times that \(CV\) will be less than the lower confidence bound and
\(\alpha_U\) the proportion of times
that \(CV\) will be greater than the upper confidence bound in the
CI procedure (Kelley, 2007). The lower
confidence limit for \(\lambda\) is
the noncentrality parameter \(\lambda_L\) that results in \(t_{(1-\alpha_L,v,\lambda_L)}=\hat{\lambda}\),
and the upper confidence limit for \(\lambda\) is the noncentrality parameter \(\lambda_U\)
that results in \(t_{(\alpha_U,v,\lambda_U)}=\hat{\lambda}\),
where \(t_{(1-\alpha_L,v,\lambda_L)}\)
is the \(1-\alpha_L\) quantile of the noncentral t distribution with
noncentrality parameter \(\lambda_L\)
and \(t_{(\alpha_U,v,\lambda_U)}\)
is the \(\alpha_U\) quantile of the noncentral t distribution with
noncentrality parameter \(\lambda_U\)
(Kelley, 2007).
The confidence limits for \(\lambda\) are then divided by \(\sqrt{n}\) and inverted,
which gives the CI limits of \(CV\):
\[
p\left[\biggl(\frac{\lambda_U}{\sqrt{n}}\biggr)^{-1}
\le CV \le \biggl(\frac{\lambda_L}{\sqrt{n}}\biggr)^{-1}\right] =
1-\alpha
\] where \(p\) stands for
probability. Using the function conf.limits.nct from the MBESS package
(Kelley, 2018), which computes confidence
limits for the noncentrality parameter of a t distribution,
the CI for \(cv\) is obtained as:
library(cvcqv)  # provides cv_versatile()

x <- c(
  0.2, 0.5, 1.1, 1.4, 1.8, 2.3, 2.5, 2.7, 3.5, 4.4,
  4.6, 5.4, 5.4, 5.7, 5.8, 5.9, 6.0, 6.6, 7.1, 7.9
)
# Corrected cv with the noncentral t (Kelley) 95% CI
cv_versatile(
  x,
  na.rm = TRUE,
  digits = 3,
  method = "kelley",
  correction = TRUE,
  alpha = 0.05
)
## $method
## [1] "Corrected cv with Kelley 95% CI"
##
## $statistics
## est lower upper
## 58.058 41.467 98.51
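The inversion of the noncentral t distribution described above can also be sketched without MBESS. The following base R example is only an illustration under stated assumptions: it uses `uniroot()` and `pt()` in place of conf.limits.nct, takes \(\alpha_L = \alpha_U = \alpha/2\), works with the uncorrected cv, and `solve_ncp()` is a hypothetical helper name. The search interval `c(0, 30)` is an arbitrary choice that stays inside the range of noncentrality values supported by `pt()`.

```r
x <- c(
  0.2, 0.5, 1.1, 1.4, 1.8, 2.3, 2.5, 2.7, 3.5, 4.4,
  4.6, 5.4, 5.4, 5.7, 5.8, 5.9, 6.0, 6.6, 7.1, 7.9
)
n <- length(x)
v <- n - 1
alpha <- 0.05

cv <- sd(x) / mean(x)
lambda_hat <- sqrt(n) / cv   # estimated noncentrality parameter

# Find the noncentrality parameter for which lambda_hat is the
# `target` quantile of the noncentral t distribution with v df
solve_ncp <- function(target) {
  uniroot(
    function(ncp) pt(lambda_hat, df = v, ncp = ncp) - target,
    interval = c(0, 30), tol = 1e-10
  )$root
}

lambda_L <- solve_ncp(1 - alpha / 2)  # lower limit for lambda
lambda_U <- solve_ncp(alpha / 2)      # upper limit for lambda

# Divide by sqrt(n) and invert; the limits swap roles
ci <- 100 * c(lower = sqrt(n) / lambda_U, upper = sqrt(n) / lambda_L)
round(ci, 3)
```

Because a larger \(\lambda\) corresponds to a smaller \(CV\), the upper limit for \(\lambda\) yields the lower limit for \(CV\), and vice versa.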
McKay (McKay, 1932) introduced the
following CI for \(cv\), where
\(u_1 = \chi_{v,1-\alpha/2}^2\) and
\(u_2 = \chi_{v,\alpha/2}^2\) are the
\(100(1-\alpha/2)\) and \(100(\alpha/2)\) percentiles of the \(\chi^2\) distribution with \(v = n-1\) degrees of freedom, respectively
(Albatineh et al., 2014):
\[
\biggl(cv\left[\biggl(\frac{u_1}{v+1}-1\biggr)(cv)^{2}+\frac{u_1}{v}\right]^{-1/2}
\le CV \le cv
\left[\biggl(\frac{u_2}{v+1}-1\biggr)(cv)^{2}+\frac{u_2}{v}\right]^{-1/2}\biggr)
\] Let us calculate the 95% CI for our variable \(x\) according to McKay’s method (McKay, 1932):
cv_versatile(
  x,
  na.rm = TRUE,
  digits = 3,
  method = "mckay",
  correction = TRUE,
  alpha = 0.05
)
## $method
## [1] "Corrected cv with McKay 95% CI"
##
## $statistics
## est lower upper
## 58.058 41.622 109.367
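McKay's interval needs only two \(\chi^2\) quantiles, so it is easy to sketch in base R. The example below is an illustration rather than the cvcqv code: `mckay_bound()` is a hypothetical helper, the uncorrected cv is used, and the first term is divided by \(v + 1\), an assumption chosen because it reproduces the uncorrected `mckay` row of the summary table at the end of this document.

```r
x <- c(
  0.2, 0.5, 1.1, 1.4, 1.8, 2.3, 2.5, 2.7, 3.5, 4.4,
  4.6, 5.4, 5.4, 5.7, 5.8, 5.9, 6.0, 6.6, 7.1, 7.9
)
n <- length(x)
v <- n - 1
alpha <- 0.05
cv <- sd(x) / mean(x)

u1 <- qchisq(1 - alpha / 2, df = v)  # upper chi-square percentile
u2 <- qchisq(alpha / 2, df = v)      # lower chi-square percentile

# One bound of McKay's interval for a given chi-square percentile u
mckay_bound <- function(u) {
  cv / sqrt((u / (v + 1) - 1) * cv^2 + u / v)
}

ci <- 100 * c(lower = mckay_bound(u1), upper = mckay_bound(u2))
round(ci, 3)
```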
Miller (Edward Miller, 1991) introduced
the following CI for \(cv\),
where \(Z_{\alpha/2}\) is the
\(100(1-\alpha/2)\) percentile of the
standard normal distribution (Albatineh et al.,
2014): \[
\biggl(cv - Z_{\alpha/2}\sqrt{
\biggl(\frac{cv^2}{v}\biggr)\biggl(\frac{1}{2}+cv^2\biggr)} \le
CV \le cv + Z_{\alpha/2}\sqrt{
\biggl(\frac{cv^2}{v}\biggr)\biggl(\frac{1}{2}+cv^2\biggr)}
\biggr)
\] where \(v = n-1\) is the
number of degrees of freedom.
Let us calculate the 95% CI for \(x\)
according to Miller’s method (Edward Miller,
1991):
cv_versatile(
  x,
  na.rm = TRUE,
  digits = 3,
  method = "miller",
  correction = TRUE,
  alpha = 0.05
)
## $method
## [1] "Corrected cv with Miller 95% CI"
##
## $statistics
## est lower upper
## 58.058 34.173 81.942
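Miller's interval is symmetric around cv, which makes it simple to check by hand. The following base R sketch applies the formula above to the uncorrected cv; the variable names are illustrative and the percentage scaling is an assumption made to match the printed outputs.

```r
x <- c(
  0.2, 0.5, 1.1, 1.4, 1.8, 2.3, 2.5, 2.7, 3.5, 4.4,
  4.6, 5.4, 5.4, 5.7, 5.8, 5.9, 6.0, 6.6, 7.1, 7.9
)
n <- length(x)
v <- n - 1
alpha <- 0.05
cv <- sd(x) / mean(x)

z <- qnorm(1 - alpha / 2)                      # standard normal percentile
half_width <- z * sqrt((cv^2 / v) * (0.5 + cv^2))

ci <- 100 * c(lower = cv - half_width, upper = cv + half_width)
round(ci, 3)
```

Since the interval is built from a normal approximation, nothing prevents the lower limit from falling below zero when cv is large or the sample is small.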
Vangel (Vangel, 1996) proposed the following CI for \(cv\), which is a modification of McKay’s CI:
\[
\biggl(cv\left[\biggl(\frac{u_1+1}{v+1}-1\biggr)(cv)^{2}+\frac{u_1}{v}\right]^{-1/2}
\le CV \le cv
\left[\biggl(\frac{u_2+1}{v+1}-1\biggr)(cv)^{2}+\frac{u_2}{v}\right]^{-1/2}\biggr)
\] Let us calculate the 95% CI for \(x\) according to Vangel’s method (Vangel, 1996):
cv_versatile(
  x,
  na.rm = TRUE,
  digits = 3,
  method = "vangel",
  correction = TRUE,
  alpha = 0.05
)
## $method
## [1] "Corrected cv with Vangel 95% CI"
##
## $statistics
## est lower upper
## 58.058 41.443 106.237
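The modified interval differs from McKay's only in the first term, which a short base R sketch makes visible. This is an illustration, not the cvcqv source: `vangel_bound()` is a hypothetical helper, the uncorrected cv is used, and the first term is taken as \((u + 1)/(v + 1) - 1\), an assumption chosen because it reproduces the uncorrected `vangel` row of the summary table at the end of this document.

```r
x <- c(
  0.2, 0.5, 1.1, 1.4, 1.8, 2.3, 2.5, 2.7, 3.5, 4.4,
  4.6, 5.4, 5.4, 5.7, 5.8, 5.9, 6.0, 6.6, 7.1, 7.9
)
n <- length(x)
v <- n - 1
alpha <- 0.05
cv <- sd(x) / mean(x)

u1 <- qchisq(1 - alpha / 2, df = v)
u2 <- qchisq(alpha / 2, df = v)

# Same structure as McKay's bound, with u + 1 in the first term
# (assumed variant, see the note above)
vangel_bound <- function(u) {
  cv / sqrt(((u + 1) / (v + 1) - 1) * cv^2 + u / v)
}

ci <- 100 * c(lower = vangel_bound(u1), upper = vangel_bound(u2))
round(ci, 3)
```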
Mahmoudvand and Hassani (Mahmoudvand &
Hassani, 2009) proposed the following CI for \(cv\), which is obtained using ranked set
sampling (RSS):
\[
\biggl(\frac{cv}{2-C_n+Z_{1-\alpha/2}\sqrt{1-C_n^2}}
\le CV \le
\frac{cv}{2-C_n-Z_{1-\alpha/2}\sqrt{1-C_n^2}}
\biggr)
\] where \[
C_n=\sqrt{\frac{2}{n-1}}\,\frac{\Gamma(n/2)}{\Gamma((n-1)/2)}
\] and \(\Gamma(\cdot)\) is the gamma function, with \(\Gamma(k)=(k-1)!\) for a positive integer \(k\). Let us now calculate the 95% CI for \(x\) according to the Mahmoudvand-Hassani
method (Mahmoudvand & Hassani,
2009):
cv_versatile(
  x,
  na.rm = TRUE,
  digits = 3,
  method = "mahmoudvand_hassani",
  correction = TRUE,
  alpha = 0.05
)
## $method
## [1] "Corrected cv with Mahmoudvand-Hassani 95% CI"
##
## $statistics
## est lower upper
## 58.058 43.69 83.264
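The constant \(C_n\) involves a ratio of gamma functions, which is safer to evaluate on the log scale. The base R sketch below is an illustration under that design choice: it uses `lgamma()` to avoid overflow for large \(n\), works with the uncorrected cv, and its variable names are not taken from cvcqv.

```r
x <- c(
  0.2, 0.5, 1.1, 1.4, 1.8, 2.3, 2.5, 2.7, 3.5, 4.4,
  4.6, 5.4, 5.4, 5.7, 5.8, 5.9, 6.0, 6.6, 7.1, 7.9
)
n <- length(x)
alpha <- 0.05
cv <- sd(x) / mean(x)

# C_n via log-gamma, so the ratio stays finite for large n
c_n <- sqrt(2 / (n - 1)) * exp(lgamma(n / 2) - lgamma((n - 1) / 2))
z <- qnorm(1 - alpha / 2)
margin <- z * sqrt(1 - c_n^2)

ci <- 100 * c(
  lower = cv / (2 - c_n + margin),
  upper = cv / (2 - c_n - margin)
)
round(ci, 3)
```

For \(n = 20\) the constant \(C_n\) is already close to 1, so the interval is driven mainly by the `margin` term.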
Wararit Panichkitkosolkul (Panichkitkosolkul,
2013) proposed the following CI for \(cv\), which is based on a normal
approximation: \[
\biggl(\frac{cv}{C_{n+1}+Z_{1-\alpha/2}\sqrt{1-C_{n+1}^2}}
\le CV \le
\frac{cv}{C_{n+1}-Z_{1-\alpha/2}\sqrt{1-C_{n+1}^2}}
\biggr)
\] where \(C_{n+1}=\sqrt{1-\frac{1}{2n}}\).
Now we calculate the normal approximation 95% CI for \(x\) according to Panichkitkosolkul (Panichkitkosolkul, 2013):
cv_versatile(
  x,
  na.rm = TRUE,
  digits = 3,
  method = "normal_approximation",
  correction = TRUE,
  alpha = 0.05
)
## $method
## [1] "Corrected cv with Normal Approximation 95% CI"
##
## $statistics
## est lower upper
## 58.058 44.752 85.691
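The normal approximation interval replaces the gamma-function constant with the closed form \(C_{n+1}\), so it needs nothing beyond `qnorm()`. The following base R sketch applies the formula to the uncorrected cv, reading \(1/2n\) as \(1/(2n)\); the variable names are illustrative.

```r
x <- c(
  0.2, 0.5, 1.1, 1.4, 1.8, 2.3, 2.5, 2.7, 3.5, 4.4,
  4.6, 5.4, 5.4, 5.7, 5.8, 5.9, 6.0, 6.6, 7.1, 7.9
)
n <- length(x)
alpha <- 0.05
cv <- sd(x) / mean(x)

c_n1 <- sqrt(1 - 1 / (2 * n))          # C_{n+1}
z <- qnorm(1 - alpha / 2)
margin <- z * sqrt(1 - c_n1^2)

ci <- 100 * c(
  lower = cv / (c_n1 + margin),
  upper = cv / (c_n1 - margin)
)
round(ci, 3)
```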
Panichkitkosolkul (Panichkitkosolkul,
2013) has also introduced the following CI for \(cv\):
\[
\biggl(\frac{cv\sqrt{v}}{\sqrt{b}}
\le CV \le
\frac{cv\sqrt{v}}{\sqrt{a}}
\biggr)
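\] where \(a\) and \(b\) are cut-off points of the \(\chi^2\) distribution with \(v = n-1\) degrees of
freedom, chosen so that the interval has coverage \(1-\alpha\) and the shortest possible length. The shortest-length 95% CI for \(x\) is then: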
cv_versatile(
  x,
  na.rm = TRUE,
  digits = 3,
  method = "shortest_length",
  correction = TRUE,
  alpha = 0.05
)
## $method
## [1] "Corrected cv with Shortest-Length 95% CI"
##
## $statistics
## est lower upper
## 58.058 42.221 81.411
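One way to obtain the constants \(a\) and \(b\) is by direct numerical search. The base R sketch below assumes that they are the \(\chi^2\) cut-offs with total coverage \(1-\alpha\) that minimize \(1/\sqrt{a} - 1/\sqrt{b}\), which is the interval length up to the factor \(cv\sqrt{v}\). It is an illustration only: the search with `optimize()` and the helper `rel_length()` are choices made here, not taken from cvcqv, and the uncorrected cv is used.

```r
x <- c(
  0.2, 0.5, 1.1, 1.4, 1.8, 2.3, 2.5, 2.7, 3.5, 4.4,
  4.6, 5.4, 5.4, 5.7, 5.8, 5.9, 6.0, 6.6, 7.1, 7.9
)
n <- length(x)
v <- n - 1
alpha <- 0.05
cv <- sd(x) / mean(x)

# Relative interval length when the lower tail receives probability p
# and the upper tail receives alpha - p
rel_length <- function(p) {
  a <- qchisq(p, df = v)
  b <- qchisq(p + 1 - alpha, df = v)
  1 / sqrt(a) - 1 / sqrt(b)
}

opt <- optimize(rel_length, interval = c(1e-6, alpha - 1e-6), tol = 1e-10)
a <- qchisq(opt$minimum, df = v)
b <- qchisq(opt$minimum + 1 - alpha, df = v)

ci <- 100 * cv * sqrt(v) / sqrt(c(lower = b, upper = a))
round(ci, 3)
```

Setting `p` to `alpha / 2` in `rel_length()` gives the equal-tailed split, so the minimized length can never exceed the equal-tailed one.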
The \(100(1-\alpha)\%\) equal-tailed
CI for \(cv\) can be obtained as: \[
\biggl(\frac{cv\sqrt{v}}{\sqrt{\chi_{v,1-\alpha/2}^2}}
\le CV \le
\frac{cv\sqrt{v}}{\sqrt{\chi_{v,\alpha/2}^2}}
\biggr)
\] where \(\chi_{v,\alpha/2}^2\)
and \(\chi_{v,1-\alpha/2}^2\) are the
\(100(\alpha/2)\) and \(100(1-\alpha/2)\) percentiles of the
central \(\chi^2\) distribution with
\(v\) degrees of freedom, respectively
(Panichkitkosolkul, 2013).
The equal-tailed 95% CI for \(x\)
is then:
cv_versatile(
  x,
  na.rm = TRUE,
  digits = 3,
  method = "equal_tailed",
  correction = TRUE,
  alpha = 0.05
)
## $method
## [1] "Corrected cv with Equal-Tailed 95% CI"
##
## $statistics
## est lower upper
## 58.058 44.152 84.797
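The equal-tailed interval follows directly from two calls to `qchisq()`. The base R sketch below applies the formula above to the uncorrected cv; the variable names are illustrative and the percentage scaling is an assumption made to match the printed outputs.

```r
x <- c(
  0.2, 0.5, 1.1, 1.4, 1.8, 2.3, 2.5, 2.7, 3.5, 4.4,
  4.6, 5.4, 5.4, 5.7, 5.8, 5.9, 6.0, 6.6, 7.1, 7.9
)
n <- length(x)
v <- n - 1
alpha <- 0.05
cv <- sd(x) / mean(x)

chi_upper <- qchisq(1 - alpha / 2, df = v)  # 100(1 - alpha/2) percentile
chi_lower <- qchisq(alpha / 2, df = v)      # 100(alpha/2) percentile

ci <- 100 * cv * sqrt(v) / sqrt(c(lower = chi_upper, upper = chi_lower))
round(ci, 3)
```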
Using the boot package (Canty
& Ripley, 2017), we can obtain a bootstrap CI for \(cv\):
## $method
## [1] "Corrected cv with Basic Bootstrap 95% CI"
##
## $statistics
## est lower upper
## 58.058 35.372 77.899
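The idea behind the basic bootstrap interval can be sketched by hand in base R, without the boot package. The example below is a simplified illustration under stated assumptions: the seed and the number of resamples (`n_boot`) are arbitrary, plain `quantile()` is used for the percentiles, and the uncorrected cv is resampled, so the limits will not match the output above exactly.

```r
set.seed(2024)  # arbitrary seed, only for reproducibility of this sketch

x <- c(
  0.2, 0.5, 1.1, 1.4, 1.8, 2.3, 2.5, 2.7, 3.5, 4.4,
  4.6, 5.4, 5.4, 5.7, 5.8, 5.9, 6.0, 6.6, 7.1, 7.9
)
alpha <- 0.05
n_boot <- 2000  # number of bootstrap resamples (assumed value)

cv_pct <- function(z) 100 * sd(z) / mean(z)
est <- cv_pct(x)

# Recompute cv on resamples drawn with replacement
boot_cv <- replicate(n_boot, cv_pct(sample(x, replace = TRUE)))

# Basic bootstrap: reflect the bootstrap percentiles around the estimate
q <- unname(quantile(boot_cv, c(1 - alpha / 2, alpha / 2)))
ci <- c(lower = 2 * est - q[1], upper = 2 * est - q[2])
round(c(est = est, ci), 3)
```

Unlike the percentile interval, which reads the limits off the bootstrap distribution directly, the basic interval reflects those percentiles around the original estimate.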
Finally, the CIs calculated by all available methods can be compared side by side:
## $method
## [1] "All methods"
##
## $statistics
## est lower upper
## kelley 57.774 41.467 98.510
## mckay 57.774 41.441 108.482
## miller 57.774 34.053 81.494
## vangel 57.774 41.264 105.424
## mahmoudvand_hassani 57.774 43.476 82.857
## equal_tailed 57.774 43.936 84.382
## shortest_length 57.774 42.014 81.012
## normal_approximation 57.774 44.533 85.272
## norm 57.774 39.254 78.560
## basic 57.774 37.798 77.753
## description
## kelley cv with Kelley 95% CI
## mckay cv with McKay 95% CI
## miller cv with Miller 95% CI
## vangel cv with Vangel 95% CI
## mahmoudvand_hassani cv with Mahmoudvand-Hassani 95% CI
## equal_tailed cv with Equal-Tailed 95% CI
## shortest_length cv with Shortest-Length 95% CI
## normal_approximation cv with Normal Approximation 95% CI
## norm cv with Normal Approximation Bootstrap 95% CI
## basic cv with Basic Bootstrap 95% CI