Coefficient of Variation

The coefficient of variation (\(CV\)) is a measure of relative dispersion that expresses the degree of variability relative to the mean (Albatineh, Kibria, Wilcox, & Zogheib, 2014). Because cv is unitless, it is useful for comparing variables measured in different units (Albatineh et al., 2014). It is also a measure of homogeneity. The population coefficient of variation is:
\[CV = \frac{\sigma}{\mu},\] where \(\sigma\) is the population standard deviation and \(\mu\) is the population mean. Almost always, we analyze data from samples but want to generalize the result to the population parameter (Albatineh et al., 2014). The sample estimate is given as:
\[cv = \frac{sd}{\bar{X}}\]
where \(sd\) is the sample standard deviation (the square root of the unbiased estimator of the population variance) and \(\bar{X}\) is the sample mean. The corrected cv, which accounts for the sample size \(n\), is:
\[ cv_{corr} = cv \times \biggl(1 - \frac{1}{4(n-1)} + \frac{1}{n}cv^2 + \frac{1}{2 (n-1)^2} \biggr) \]
There are various methods for calculating confidence intervals (CI) for cv, each with particular use cases. Some of them are model-based, so their usage depends on the assumptions made about the distribution of the data. For the sake of versatility, we cover almost all of these methods in the cvcqv package. Here, we explain them along with some examples.
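
As a quick check of the two point estimates, the definitions above can be evaluated directly in base R. This is only a sketch: the helper names `cv_plain` and `cv_corrected` are illustrative and are not part of cvcqv, and the data vector is the one used in the examples that follow.

```r
# Illustrative helpers (hypothetical names, not part of cvcqv).
cv_plain <- function(x) {
  sd(x) / mean(x)  # sd() uses the n - 1 denominator
}

cv_corrected <- function(x) {
  n  <- length(x)
  cv <- cv_plain(x)
  # Sample-size correction, term by term as in the formula above.
  cv * (1 - 1 / (4 * (n - 1)) + cv^2 / n + 1 / (2 * (n - 1)^2))
}

x <- c(
  0.2, 0.5, 1.1, 1.4, 1.8, 2.3, 2.5, 2.7, 3.5, 4.4,
  4.6, 5.4, 5.4, 5.7, 5.8, 5.9, 6.0, 6.6, 7.1, 7.9
)

round(100 * cv_plain(x), 3)      # uncorrected cv, in percent
round(100 * cv_corrected(x), 3)  # corrected cv, in percent
```

Multiplying by 100 expresses the ratios as percentages, which is the scale on which the estimates in the outputs below (for example 58.058 for the corrected cv) are printed.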

Kelley Confidence Interval

Let us assume that, when the parent population of the scores is normally distributed, \(\sqrt{n}/cv\) follows a noncentral t distribution with noncentrality parameter \(\lambda\), which is estimated from the sample as:
\[ \hat{\lambda} = \frac{\sqrt{n}}{cv} \] with \(v\) degrees of freedom, where \(v = n - 1\). Let \(1 - \alpha\) be the CI coverage with \(\alpha_L + \alpha_U = \alpha\), in which \(\alpha_L\) is the proportion of times that \(CV\) will be less than the lower confidence limit and \(\alpha_U\) the proportion of times that \(CV\) will be greater than the upper confidence limit in the CI procedure (Kelley, 2007). The lower confidence limit for \(\lambda\) is the noncentrality parameter \(\lambda_L\) that results in \(t_{(1-\alpha_L,v,\lambda_L)}=\hat{\lambda}\), and the upper confidence limit for \(\lambda\) is the noncentrality parameter \(\lambda_U\) that results in \(t_{(\alpha_U,v,\lambda_U)}=\hat{\lambda}\). Here \(t_{(1-\alpha_L,v,\lambda_L)}\) is the value of the noncentral t distribution at the \(1-\alpha_L\) quantile with noncentrality parameter \(\lambda_L\), and \(t_{(\alpha_U,v,\lambda_U)}\) is its value at the \(\alpha_U\) quantile with noncentrality parameter \(\lambda_U\) (Kelley, 2007).
Afterwards, we transform the limits of the confidence interval for \(\lambda\) by dividing them by \(\sqrt{n}\) and then inverting them, which gives the CI limits for \(CV\):
\[ p\left[\biggl(\frac{\lambda_U}{\sqrt{n}}\biggr)^{-1} \le CV \le \biggl(\frac{\lambda_L}{\sqrt{n}}\biggr)^{-1}\right] = 1-\alpha \] where \(p\) stands for probability. Using the MBESS package (Kelley, 2018), which computes confidence limits for the noncentrality parameter of a t distribution (conf.limits.nct), the CI for \(CV\) is obtained as:

library(cvcqv) # provides cv_versatile()
x <- c(
  0.2, 0.5, 1.1, 1.4, 1.8, 2.3, 2.5, 2.7, 3.5, 4.4,
  4.6, 5.4, 5.4, 5.7, 5.8, 5.9, 6.0, 6.6, 7.1, 7.9
)
cv_versatile(
  x,
  na.rm = TRUE,
  digits = 3,
  method = "kelley",
  correction = TRUE,
  alpha = 0.05
)

## $method
## [1] "Corrected cv with Kelley 95% CI"
## 
## $statistics
##      est  lower upper
##   58.058 41.467 98.51
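
The inversion of the noncentral t distribution described above can also be sketched in base R, without MBESS. The helper `kelley_ci_sketch` is a hypothetical name, not a function of cvcqv or MBESS. The sketch assumes equal tails (\(\alpha_L = \alpha_U = \alpha/2\)), uses the uncorrected cv to form \(\hat{\lambda}\), and searches for the noncentrality parameters only in the range 0 to 30, which is enough for this data set but would need widening for very small cv or large n.

```r
# Hypothetical helper, not part of cvcqv or MBESS.
kelley_ci_sketch <- function(x, alpha = 0.05) {
  n  <- length(x)
  v  <- n - 1
  cv <- sd(x) / mean(x)
  lambda_hat <- sqrt(n) / cv

  # Find the noncentrality parameter for which lambda_hat is the p-th
  # quantile of the noncentral t distribution with v degrees of freedom.
  solve_ncp <- function(p) {
    uniroot(
      function(ncp) pt(lambda_hat, df = v, ncp = ncp) - p,
      interval = c(0, 30), tol = 1e-10
    )$root
  }

  lambda_L <- solve_ncp(1 - alpha / 2)  # lower limit for lambda
  lambda_U <- solve_ncp(alpha / 2)      # upper limit for lambda

  # Divide by sqrt(n) and invert: the limits swap roles.
  c(lower = sqrt(n) / lambda_U, upper = sqrt(n) / lambda_L)
}

x <- c(
  0.2, 0.5, 1.1, 1.4, 1.8, 2.3, 2.5, 2.7, 3.5, 4.4,
  4.6, 5.4, 5.4, 5.7, 5.8, 5.9, 6.0, 6.6, 7.1, 7.9
)
kelley_ci <- kelley_ci_sketch(x)
round(100 * kelley_ci, 3)  # limits in percent
```

Because a larger noncentrality parameter corresponds to a smaller \(CV\), the upper limit for \(\lambda\) yields the lower limit for \(CV\), and vice versa.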

McKay Confidence Interval

McKay (McKay, 1932) introduced the following CI for \(cv\), where \(u_1 = \chi_{v,1-\alpha/2}^2\) and \(u_2 = \chi_{v,\alpha/2}^2\) are the \(100(1-\alpha/2)\%\) and \(100(\alpha/2)\%\) percentiles of the \(\chi^2\) distribution with \(v = n-1\) degrees of freedom, respectively (Albatineh et al., 2014):
\[ \biggl(cv\left[\biggl(\frac{u_1}{v}-1\biggr)(cv)^{2}+\frac{u_1}{v}\right]^{-1/2} \le CV \le cv \left[\biggl(\frac{u_2}{v}-1\biggr)(cv)^{2}+\frac{u_2}{v}\right]^{-1/2}\biggr) \] Let us calculate the 95% CI for our variable \(x\) according to McKay’s method (McKay, 1932):

cv_versatile(
  x,
  na.rm = TRUE,
  digits = 3,
  method = "mckay",
  correction = TRUE,
  alpha = 0.05
)

## $method
## [1] "Corrected cv with McKay 95% CI"
## 
## $statistics
##      est  lower   upper
##   58.058 41.622 109.367
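
The interval as printed above can be transcribed almost literally into base R. The sketch below does only that: `mckay_ci_sketch` is a hypothetical helper, not the cvcqv implementation. Published variants of McKay's interval differ in small details (such as the denominator of the first term), so its bounds need not agree digit for digit with the package output above.

```r
# Hypothetical helper: McKay-type interval transcribed from the formula
# as written in the text; not the cvcqv implementation.
mckay_ci_sketch <- function(cv, n, alpha = 0.05) {
  v  <- n - 1
  u1 <- qchisq(1 - alpha / 2, df = v)  # upper chi-square percentile
  u2 <- qchisq(alpha / 2, df = v)      # lower chi-square percentile
  bound <- function(u) cv / sqrt((u / v - 1) * cv^2 + u / v)
  c(lower = bound(u1), upper = bound(u2))
}

x <- c(
  0.2, 0.5, 1.1, 1.4, 1.8, 2.3, 2.5, 2.7, 3.5, 4.4,
  4.6, 5.4, 5.4, 5.7, 5.8, 5.9, 6.0, 6.6, 7.1, 7.9
)
mckay_ci <- mckay_ci_sketch(cv = sd(x) / mean(x), n = length(x))
round(100 * mckay_ci, 3)  # limits in percent
```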

Miller Confidence Interval

Miller (Edward Miller, 1991) introduced the following CI for \(cv\), where \(Z_{\alpha/2}\) is the \(100(1-\alpha/2)\%\) percentile of the standard normal distribution (Albatineh et al., 2014): \[ \biggl(cv - Z_{\alpha/2}\sqrt{ \biggl(\frac{cv^2}{v}\biggr)\biggl(\frac{1}{2}+cv^2\biggr)} \le CV \le cv + Z_{\alpha/2}\sqrt{ \biggl(\frac{cv^2}{v}\biggr)\biggl(\frac{1}{2}+cv^2\biggr)} \biggr) \] where \(v = n-1\) is the degrees of freedom.
Let us calculate the 95% CI for \(x\) according to Miller’s method (Edward Miller, 1991):

cv_versatile(
  x,
  na.rm = TRUE,
  digits = 3,
  method = "miller",
  correction = TRUE,
  alpha = 0.05
)

## $method
## [1] "Corrected cv with Miller 95% CI"
## 
## $statistics
##      est  lower  upper
##   58.058 34.173 81.942
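
Miller's interval is a symmetric normal-theory interval, so it is short to write out by hand. The following base-R sketch assumes nothing beyond the formula above; `miller_ci_sketch` is a hypothetical name and not part of cvcqv. It is evaluated here with the uncorrected cv.

```r
# Hypothetical helper, not part of cvcqv.
miller_ci_sketch <- function(cv, n, alpha = 0.05) {
  v  <- n - 1
  z  <- qnorm(1 - alpha / 2)               # standard normal percentile
  se <- sqrt((cv^2 / v) * (0.5 + cv^2))    # approximate standard error
  c(lower = cv - z * se, upper = cv + z * se)
}

x <- c(
  0.2, 0.5, 1.1, 1.4, 1.8, 2.3, 2.5, 2.7, 3.5, 4.4,
  4.6, 5.4, 5.4, 5.7, 5.8, 5.9, 6.0, 6.6, 7.1, 7.9
)
miller_ci <- miller_ci_sketch(cv = sd(x) / mean(x), n = length(x))
round(100 * miller_ci, 3)  # limits in percent
```

Since the interval is centred on the estimate, passing the corrected cv instead of the uncorrected one shifts both the centre and the width.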

Vangel Confidence Interval

Vangel (Vangel, 1996) proposed the following CI for \(cv\), which is a modification of McKay’s CI: \[ \biggl(cv\left[\biggl(\frac{u_1+1}{v}-1\biggr)(cv)^{2}+\frac{u_1}{v}\right]^{-1/2} \le CV \le cv \left[\biggl(\frac{u_2+1}{v}-1\biggr)(cv)^{2}+\frac{u_2}{v}\right]^{-1/2}\biggr) \] Let us calculate the 95% CI for \(x\) according to Vangel’s method (Vangel, 1996):

cv_versatile(
  x,
  na.rm = TRUE,
  digits = 3,
  method = "vangel",
  correction = TRUE,
  alpha = 0.05
)

## $method
## [1] "Corrected cv with Vangel 95% CI"
## 
## $statistics
##      est  lower   upper
##   58.058 41.443 106.237

Mahmoudvand-Hassani Confidence Interval

Mahmoudvand and Hassani (Mahmoudvand & Hassani, 2009) proposed the following CI for \(cv\), which is obtained using ranked set sampling (RSS):
\[ \biggl(\frac{cv}{2-C_n+Z_{1-\alpha/2}\sqrt{1-C_n^2}} \le CV \le \frac{cv}{2-C_n-Z_{1-\alpha/2}\sqrt{1-C_n^2}} \biggr) \] where \(Z_{1-\alpha/2}\) is the \(100(1-\alpha/2)\%\) percentile of the standard normal distribution, \[ C_n=\sqrt{\frac{2}{n-1}}\frac{\Gamma{(n/2)}}{\Gamma{((n-1)/2)}}, \] and \(\Gamma\) is the gamma function, with \(\Gamma(k)=(k-1)!\) for positive integers \(k\). Let us now calculate the 95% CI for \(x\) according to Mahmoudvand-Hassani’s method (Mahmoudvand & Hassani, 2009):

cv_versatile(
  x,
  na.rm = TRUE,
  digits = 3,
  method = "mahmoudvand_hassani",
  correction = TRUE,
  alpha = 0.05
)

## $method
## [1] "Corrected cv with Mahmoudvand-Hassani 95% CI"
## 
## $statistics
##      est lower  upper
##   58.058 43.69 83.264
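
The constant \(C_n\) involves gamma functions at half-integer arguments, so the factorial shortcut does not apply to both terms at once. A minimal base-R sketch of the interval, with the hypothetical helper name `mh_ci_sketch` (not part of cvcqv), evaluates the gamma ratio on the log scale via `lgamma()` to stay stable for large n. It is applied here to the uncorrected cv.

```r
# Hypothetical helper, not part of cvcqv.
mh_ci_sketch <- function(cv, n, alpha = 0.05) {
  z   <- qnorm(1 - alpha / 2)
  # C_n = sqrt(2 / (n - 1)) * Gamma(n / 2) / Gamma((n - 1) / 2)
  c_n <- sqrt(2 / (n - 1)) * exp(lgamma(n / 2) - lgamma((n - 1) / 2))
  half_width <- z * sqrt(1 - c_n^2)
  c(
    lower = cv / (2 - c_n + half_width),
    upper = cv / (2 - c_n - half_width)
  )
}

x <- c(
  0.2, 0.5, 1.1, 1.4, 1.8, 2.3, 2.5, 2.7, 3.5, 4.4,
  4.6, 5.4, 5.4, 5.7, 5.8, 5.9, 6.0, 6.6, 7.1, 7.9
)
mh_ci <- mh_ci_sketch(cv = sd(x) / mean(x), n = length(x))
round(100 * mh_ci, 3)  # limits in percent
```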

Normal Approximation Confidence Interval

Wararit Panichkitkosolkul (Panichkitkosolkul, 2013) proposed the following CI for \(cv\), which is a normal approximation: \[ \biggl(\frac{cv}{C_{n+1}+Z_{1-\alpha/2}\sqrt{1-C_{n+1}^2}} \le CV \le \frac{cv}{C_{n+1}-Z_{1-\alpha/2}\sqrt{1-C_{n+1}^2}} \biggr) \] where \(C_{n+1}=\sqrt{1-\frac{1}{2n}}\).
Now we calculate the normal approximation 95% CI for \(x\) according to Panichkitkosolkul (Panichkitkosolkul, 2013):

cv_versatile(
  x,
  na.rm = TRUE,
  digits = 3,
  method = "normal_approximation",
  correction = TRUE,
  alpha = 0.05
)

## $method
## [1] "Corrected cv with Normal Approximation 95% CI"
## 
## $statistics
##      est  lower  upper
##   58.058 44.752 85.691
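
With \(C_{n+1}\) available in closed form, the normal approximation interval needs only `qnorm()`. The sketch below uses the hypothetical helper name `normal_approx_ci_sketch`, which is not part of cvcqv, and is evaluated with the uncorrected cv.

```r
# Hypothetical helper, not part of cvcqv.
normal_approx_ci_sketch <- function(cv, n, alpha = 0.05) {
  z    <- qnorm(1 - alpha / 2)
  c_n1 <- sqrt(1 - 1 / (2 * n))     # C_{n+1} as defined above
  half_width <- z * sqrt(1 - c_n1^2)
  c(
    lower = cv / (c_n1 + half_width),
    upper = cv / (c_n1 - half_width)
  )
}

x <- c(
  0.2, 0.5, 1.1, 1.4, 1.8, 2.3, 2.5, 2.7, 3.5, 4.4,
  4.6, 5.4, 5.4, 5.7, 5.8, 5.9, 6.0, 6.6, 7.1, 7.9
)
na_ci <- normal_approx_ci_sketch(cv = sd(x) / mean(x), n = length(x))
round(100 * na_ci, 3)  # limits in percent
```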

Shortest-Length Confidence Interval

Panichkitkosolkul (Panichkitkosolkul, 2013) has also introduced the following CI for \(cv\):
\[ \biggl(\frac{cv\sqrt{v}}{\sqrt{b}} \le CV \le \frac{cv\sqrt{v}}{\sqrt{a}} \biggr) \] with \(v = n-1\) degrees of freedom, where \(a\) and \(b\) are the constants that define the shortest-length interval (Panichkitkosolkul, 2013). The shortest-length 95% CI for \(x\) is then:

cv_versatile(
  x,
  na.rm = TRUE,
  digits = 3,
  method = "shortest_length",
  correction = TRUE,
  alpha = 0.05
)

## $method
## [1] "Corrected cv with Shortest-Length 95% CI"
## 
## $statistics
##      est  lower  upper
##   58.058 42.221 81.411

Equal-Tailed Confidence Interval

The \(100(1-\alpha)\%\) equal-tailed CI for \(cv\) can be obtained as: \[ \biggl(\frac{cv\sqrt{v}}{\sqrt{\chi_{v,1-\alpha/2}^2}} \le CV \le \frac{cv\sqrt{v}}{\sqrt{\chi_{v,\alpha/2}^2}} \biggr) \] where \(\chi_{v,\alpha/2}^2\) and \(\chi_{v,1-\alpha/2}^2\) are the \(100(\alpha/2)\) and \(100(1-\alpha/2)\) percentiles of the central \(\chi^2\) distribution with \(v\) degrees of freedom, respectively (Panichkitkosolkul, 2013).
The equal-tailed 95% CI for \(x\) is then:

cv_versatile(
  x,
  na.rm = TRUE,
  digits = 3,
  method = "equal_tailed",
  correction = TRUE,
  alpha = 0.05
)

## $method
## [1] "Corrected cv with Equal-Tailed 95% CI"
## 
## $statistics
##      est  lower  upper
##   58.058 44.152 84.797
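
The equal-tailed interval only rescales cv by two \(\chi^2\) percentiles, which base R provides through `qchisq()`. The sketch below uses the hypothetical helper name `equal_tailed_ci_sketch` (not part of cvcqv) and is evaluated with the uncorrected cv.

```r
# Hypothetical helper, not part of cvcqv.
equal_tailed_ci_sketch <- function(cv, n, alpha = 0.05) {
  v <- n - 1
  c(
    lower = cv * sqrt(v) / sqrt(qchisq(1 - alpha / 2, df = v)),
    upper = cv * sqrt(v) / sqrt(qchisq(alpha / 2, df = v))
  )
}

x <- c(
  0.2, 0.5, 1.1, 1.4, 1.8, 2.3, 2.5, 2.7, 3.5, 4.4,
  4.6, 5.4, 5.4, 5.7, 5.8, 5.9, 6.0, 6.6, 7.1, 7.9
)
et_ci <- equal_tailed_ci_sketch(cv = sd(x) / mean(x), n = length(x))
round(100 * et_ci, 3)  # limits in percent
```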

Bootstrap Confidence Intervals

Using the boot package (Canty & Ripley, 2017), we can obtain bootstrap CIs around \(cv\):

cv_versatile(
  x,
  na.rm = TRUE,
  digits = 3,
  method = "basic",
  correction = TRUE,
  alpha = 0.05
)

## $method
## [1] "Corrected cv with Basic Bootstrap 95% CI"
## 
## $statistics
##      est lower  upper
##   58.058 37.66 76.888
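
To show what a basic bootstrap interval does, here is a minimal base-R sketch that does not call boot at all. It is an illustration under stated assumptions (2000 resamples, an arbitrary seed, the uncorrected cv, and plain `quantile()` interpolation), so its limits will differ somewhat from the package output above and from run to run with other seeds.

```r
x <- c(
  0.2, 0.5, 1.1, 1.4, 1.8, 2.3, 2.5, 2.7, 3.5, 4.4,
  4.6, 5.4, 5.4, 5.7, 5.8, 5.9, 6.0, 6.6, 7.1, 7.9
)

set.seed(2019)  # arbitrary seed, only to make the sketch reproducible
cv_fun <- function(v) sd(v) / mean(v)
cv_hat <- cv_fun(x)
alpha  <- 0.05
B      <- 2000  # number of bootstrap resamples (arbitrary choice)

# Resample with replacement and recompute the statistic each time.
boot_cv <- replicate(B, cv_fun(sample(x, replace = TRUE)))

# Basic bootstrap: reflect the bootstrap quantiles around the estimate.
q <- quantile(boot_cv, probs = c(alpha / 2, 1 - alpha / 2), names = FALSE)
basic_ci <- c(lower = 2 * cv_hat - q[2], upper = 2 * cv_hat - q[1])
round(100 * basic_ci, 3)  # limits in percent
```

The reflection step is what distinguishes the basic interval from the percentile interval, which would use the two quantiles in `q` directly.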

All Available Methods

Finally, we can compare the CIs calculated by all available methods (here with correction = FALSE, so the estimate shown is the uncorrected cv):

cv_versatile(
  x,
  na.rm = TRUE,
  digits = 3,
  method = "all",
  correction = FALSE,
  alpha = 0.05
)

## $method
## [1] "All methods"
## 
## $statistics
##                         est  lower   upper
## kelley               57.774 41.467  98.510
## mckay                57.774 41.441 108.482
## miller               57.774 34.053  81.494
## vangel               57.774 41.264 105.424
## mahmoudvand_hassani  57.774 43.476  82.857
## equal_tailed         57.774 43.936  84.382
## shortest_length      57.774 42.014  81.012
## normal_approximation 57.774 44.533  85.272
## norm                 57.774 39.102  78.184
## basic                57.774 37.797  78.229
##                                                        description
## kelley                                       cv with Kelley 95% CI
## mckay                                         cv with McKay 95% CI
## miller                                       cv with Miller 95% CI
## vangel                                       cv with Vangel 95% CI
## mahmoudvand_hassani             cv with Mahmoudvand-Hassani 95% CI
## equal_tailed                           cv with Equal-Tailed 95% CI
## shortest_length                     cv with Shortest-Length 95% CI
## normal_approximation           cv with Normal Approximation 95% CI
## norm                 cv with Normal Approximation Bootstrap 95% CI
## basic                               cv with Basic Bootstrap 95% CI

Coefficient of Variation: cv_versatile

Maani Beigy

February 18, 2019