Introduction to Analitica

Carlos Jiménez-Gallardo

2025-06-12

1 Overview

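The Analitica package provides tools for descriptive analysis, homogeneity of variance testing, univariate outlier detection, post hoc multiple comparisons, and non-parametric two-sample tests.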

It is suitable for researchers, educators, and analysts seeking quick and interpretable workflows.

2 Descriptive Analysis

Use descripYG() to summarize a numeric variable (vd), optionally grouped by a categorical variable (vi):

data(d_e, package = "Analitica")
descripYG(d_e, vd = Sueldo_actual)

#>     n     Mean Median       SD Kurtosis Skewness        CV   Min    Max   P25
#> 1 474 34419.57  28875 17075.66  8.30863 2.117877 0.4961033 15750 135000 24000
#>       P75     IQR Fence_Low Fence_High
#> 1 36937.5 12937.5   4593.75   56343.75
descripYG(d_e, vd = Sueldo_actual, vi = labor)
#> Picking joint bandwidth of 2460

#>   Group   n     Mean Median        SD  Kurtosis   Skewness         CV   Min
#> 1     1 363 27838.54  26550  7567.995 10.850828  1.8973062 0.27185316 15750
#> 2     2  27 30938.89  30750  2114.616  5.795226 -0.3472238 0.06834817 24300
#> 3     3  84 63977.80  60500 18244.776  4.913269  1.1597365 0.28517355 34410
#>      Max      P25      P75   IQR
#> 1  80000 22800.00 31200.00  8400
#> 2  35250 30150.00 30975.00   825
#> 3 135000 51956.25 71281.25 19325
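
Several columns in the ungrouped output follow from the others: CV is SD / Mean (17075.66 / 34419.57 ≈ 0.496), and the fences match Tukey's rule, P25 - 1.5 × IQR = 24000 - 19406.25 = 4593.75 and P75 + 1.5 × IQR = 36937.5 + 19406.25 = 56343.75. A minimal base R sketch of the same quantities follows. The data vector is hypothetical, and the use of R's default quantile type is an assumption; descripYG() may compute percentiles differently.

```r
# Hypothetical salary-like data (not taken from d_e), for illustration only.
x <- c(15750, 21000, 24000, 26500, 28875, 31000, 36900, 42000, 60000, 135000)

# P25 and P75 with R's default quantile algorithm (type 7, an assumption).
q <- quantile(x, probs = c(0.25, 0.75), names = FALSE)
iqr <- q[2] - q[1]

summary_stats <- data.frame(
  n          = length(x),
  Mean       = mean(x),
  Median     = median(x),
  SD         = sd(x),
  CV         = sd(x) / mean(x),   # coefficient of variation
  P25        = q[1],
  P75        = q[2],
  IQR        = iqr,
  Fence_Low  = q[1] - 1.5 * iqr,  # Tukey lower fence
  Fence_High = q[2] + 1.5 * iqr   # Tukey upper fence
)
print(summary_stats)

# Values outside the fences are candidate outliers.
outliers <- x[x < summary_stats$Fence_Low | x > summary_stats$Fence_High]
print(outliers)
```

In this toy vector only the largest value, 135000, falls outside the fences.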

3 Homogeneity of Variance Tests

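Assess the equal-variance assumption across groups with the package's own implementations of the Levene (median-centered), Bartlett, and Fligner-Killeen tests. Each returns the test statistic, degrees of freedom, p-value, and a decision: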

Levene.Test(Sueldo_actual ~ labor, data = d_e)
#> $Statistic
#> [1] 36.089
#> 
#> $df
#> df_between  df_within 
#>          2        471 
#> 
#> $p_value
#> [1] 0
#> 
#> $Significance
#> [1] "***"
#> 
#> $Decision
#> [1] "Heteroscedastic"
#> 
#> $Method
#> [1] "Levene (median)"
#> 
#> attr(,"class")
#> [1] "homocedasticidad"
BartlettTest(Sueldo_actual ~ labor, data = d_e)
#> $Statistic
#> [1] 194.6489
#> 
#> $df
#> [1] 2
#> 
#> $p_value
#> [1] 0
#> 
#> $Significance
#> [1] "***"
#> 
#> $Decision
#> [1] "Heterocedastic"
#> 
#> $Method
#> [1] "Bartlett"
#> 
#> attr(,"class")
#> [1] "homocedasticidad"
FKTest(Sueldo_actual ~ labor, data = d_e)
#> $Statistic
#> [1] 88.2881
#> 
#> $df
#> [1] 2
#> 
#> $p_value
#> [1] 0
#> 
#> $Significance
#> [1] "***"
#> 
#> $Decision
#> [1] "Heteroscedastic"
#> 
#> $Method
#> [1] "Fligner-Killeen"
#> 
#> attr(,"class")
#> [1] "homocedasticidad"
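
For a cross-check, the same three tests can be run with base R. The sketch below uses the built-in InsectSprays dataset rather than d_e, so the numbers are not comparable to the output above. The median-centered Levene test is written out by hand as a one-way ANOVA on absolute deviations from each group median; whether Levene.Test() uses exactly this construction is an assumption.

```r
# Built-in dataset: insect counts under six sprays (not the d_e data).
data(InsectSprays)

# Bartlett and Fligner-Killeen tests from the stats package.
bt <- bartlett.test(count ~ spray, data = InsectSprays)
fk <- fligner.test(count ~ spray, data = InsectSprays)

# Median-centered Levene test (Brown-Forsythe variant):
# one-way ANOVA on absolute deviations from each group's median.
z <- with(InsectSprays, abs(count - ave(count, spray, FUN = median)))
lev <- anova(lm(z ~ spray, data = InsectSprays))

# Collect the three p-values side by side.
p_values <- c(
  Bartlett = bt$p.value,
  Fligner  = fk$p.value,
  Levene   = lev[["Pr(>F)"]][1]
)
print(p_values)
```

Bartlett's test is sensitive to non-normality, so when the three p-values disagree the median-centered Levene and Fligner-Killeen results are usually the safer guide.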

4 Outlier Detection

Detect univariate outliers with Grubbs’ test. grubbs_outliers() returns the data with a logical column outL that flags the detected rows:

res <- grubbs_outliers(d_e, Sueldo_actual)
head(res[res$outL == TRUE, ])
#>      ID Sexo   FechaNAc educacion labor Sueldo_actual Sueldo_inicial antigüedad
#> 18   18    h 20/03/1986        16     3        103750          27510         97
#> 29   29    h 28/01/1964        19     3        135000          79980         96
#> 32   32    h 28/01/1984        19     3        110625          45000         96
#> 34   34    h 02/02/1969        19     3         92000          39990         96
#> 103 103    h 17/03/1989        19     3         97000          35010         91
#> 106 106    h 04/08/1962        19     3         91250          29490         91
#>     experiencia minoria outL
#> 18           70       0 TRUE
#> 29          199       0 TRUE
#> 32          120       0 TRUE
#> 34          175       0 TRUE
#> 103          68       0 TRUE
#> 106          23       0 TRUE
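
To make the flagging rule concrete, here is a minimal sketch of one step of the two-sided Grubbs test in base R. The helper grubbs_once() and the data are hypothetical and are not part of Analitica. Because grubbs_outliers() flags several rows, it presumably repeats or extends this step, but that is an assumption.

```r
# One step of the two-sided Grubbs test (illustrative helper, not from Analitica).
grubbs_once <- function(x, alpha = 0.05) {
  n <- length(x)
  dev <- abs(x - mean(x))
  G <- max(dev) / sd(x)  # largest absolute deviation in SD units
  # Critical value based on the t distribution with n - 2 degrees of freedom.
  t_crit <- qt(1 - alpha / (2 * n), df = n - 2)
  G_crit <- (n - 1) / sqrt(n) * sqrt(t_crit^2 / (n - 2 + t_crit^2))
  list(G = G, G_crit = G_crit, index = which.max(dev), outlier = G > G_crit)
}

x <- c(10, 11, 9, 10, 12, 10, 11, 50)  # hypothetical data with one extreme value
res_grubbs <- grubbs_once(x)
str(res_grubbs)
```

The test assumes approximately normal data, so for a right-skewed variable such as salary the flagged rows are best read as candidates for inspection.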

5 Multiple Comparisons (Post Hoc Tests)

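Fit a one-way ANOVA model with aov() and pass it to a post hoc function such as GHTest() (Games-Howell). The result has summary() and plot() methods: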

mod <- aov(Sueldo_actual ~ as.factor(labor), data = d_e)
resultado <- GHTest(mod)
summary(resultado)
#> =====================================
#>   Multiple Comparison Method Summary
#> =====================================
#> Method used: Games-Howell 
#> 
#> >> Group means:
#>        1        2        3 
#> 27838.54 30938.89 63977.80 
#> 
#> >> Order of means (from highest to lowest):
#> [1] "3" "2" "1"
#> 
#> >> Pairwise comparisons:
#>    Comparacion Diferencia t_value    gl p_value Significancia
#> 1        1 - 2   3100.349  5.4518 93.07       0           ***
#> 11       1 - 3  36139.258 17.8034 89.71       0           ***
#> 2        2 - 3  33038.909 16.2606 89.58       0           ***
plot(resultado)

Other methods include TukeyTest(), ScheffeTest(), DuncanTest(), SNKTest(), T2Test(), and T3Test().
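
The t_value and gl columns of the Games-Howell table can be reproduced from the group summaries reported by descripYG(). The sketch below does this for the 1 - 2 comparison using the standard Games-Howell formulas: a Welch-type t statistic, Welch-Satterthwaite degrees of freedom, and a p-value from the studentized range distribution. That GHTest() computes its p-value exactly this way is an assumption.

```r
# Group summaries for labor = 1 and labor = 2, copied from the descripYG() output.
n1 <- 363; m1 <- 27838.54; s1 <- 7567.995
n2 <- 27;  m2 <- 30938.89; s2 <- 2114.616
k  <- 3    # number of groups in the ANOVA

v1 <- s1^2 / n1  # squared standard error of mean 1
v2 <- s2^2 / n2  # squared standard error of mean 2
se <- sqrt(v1 + v2)

t_value <- abs(m1 - m2) / se                             # Welch-type t statistic
df <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))  # Welch-Satterthwaite df

# Games-Howell p-value from the studentized range distribution.
p_value <- ptukey(t_value * sqrt(2), nmeans = k, df = df, lower.tail = FALSE)

print(round(c(t_value = t_value, df = df), 2))
print(p_value)
```

Up to rounding of the inputs, t_value and df agree with the 5.4518 and 93.07 shown in the 1 - 2 row.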

6 Non-Parametric Tests

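When the normality or equal-variance assumptions are violated, compare two groups with MWTest() (Mann-Whitney U), BMTest() (Brunner-Munzel), or BMpTest() (permutation Brunner-Munzel):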

g1 <- d_e$Sueldo_actual[d_e$labor == 1]
g2 <- d_e$Sueldo_actual[d_e$labor == 2]
MWTest(g1, g2)
#> $Resultados
#>            Comparacion Diferencia Valor_Critico p_value Significancia
#> Grupo2 Grupo1 - Grupo2   3100.349            NA   1e-04           ***
#> 
#> $Promedios
#>   Grupo1   Grupo2 
#> 27838.54 30938.89 
#> 
#> $Orden_Medias
#> [1] "Grupo2" "Grupo1"
#> 
#> $Metodo
#> [1] "Mann-Whitney U (two.sided, manual)"
#> 
#> attr(,"class")
#> [1] "comparacion" "mannwhitney"
BMTest(g1, g2)
#> $Resultados
#>            Comparacion Diferencia    df     SE t_critical p_value  p_hat
#> Grupo1 Grupo1 - Grupo2  -3100.349 64.98 9.7586     1.9971       0 0.7297
#>        Significancia
#> Grupo1           ***
#> 
#> $Promedios
#>   Grupo1   Grupo2 
#> 27838.54 30938.89 
#> 
#> $df
#> [1] 64.98189
#> 
#> $Orden_Medias
#> [1] "Grupo2" "Grupo1"
#> 
#> $Metodo
#> [1] "Brunner-Munzel (two.sided)"
#> 
#> $p_hat
#> [1] 0.7296704
#> 
#> attr(,"class")
#> [1] "comparacion"   "brunnermunzel"
BMpTest(g1, g2)
#> $Resultados
#>            Comparacion Diferencia Valor_Critico p_value  p_hat Significancia
#> Grupo2 Grupo1 - Grupo2   3100.349            NA       0 0.7297             *
#> 
#> $Promedios
#>   Grupo1   Grupo2 
#> 27838.54 30938.89 
#> 
#> $Orden_Medias
#> [1] "Grupo2" "Grupo1"
#> 
#> $Metodo
#> [1] "Brunner-Munzel (perm, two.sided)"
#> 
#> attr(,"class")
#> [1] "comparacion"        "brunnermunzel_perm"
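
The p_hat value reported by the Brunner-Munzel functions is an estimate of the relative effect, commonly defined as P(X < Y) + 0.5 × P(X = Y). The sketch below computes it three equivalent ways on small hypothetical samples. Which group the package treats as X, and therefore the direction of its p_hat, is an assumption here.

```r
# Hypothetical samples (not from d_e).
x <- c(1, 2, 3, 4)
y <- c(3, 5, 6, 7)
n1 <- length(x)
n2 <- length(y)

# Direct definition: P(X < Y) + 0.5 * P(X = Y), averaged over all pairs.
p_hat_pairs <- mean(outer(x, y, "<") + 0.5 * outer(x, y, "=="))

# Rank-based form: mean midrank of y in the pooled sample, rescaled.
r <- rank(c(x, y))  # midranks handle ties
p_hat_ranks <- (mean(r[n1 + seq_len(n2)]) - (n2 + 1) / 2) / n1

# Link to the Mann-Whitney statistic from stats::wilcox.test(),
# whose W counts pairs with x > y (ties count one half).
W <- unname(wilcox.test(x, y, exact = FALSE)$statistic)
p_hat_W <- 1 - W / (n1 * n2)

print(c(pairs = p_hat_pairs, ranks = p_hat_ranks, from_W = p_hat_W))
```

All three expressions give 0.90625 for these samples. A value of 0.5 would indicate no tendency for one group to exceed the other.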

7 Conclusion

Analitica integrates descriptive analysis with robust comparison methods for applied data exploration.

For detailed documentation, see ?Analitica or function-specific help pages like ?GHTest or ?descripYG.