crosstable provides a high level of customization. While
the available options may not be immediately intuitive, they
allow fine control over how summaries, effects, and tests are computed and
displayed.
Before exploring these options, we start by loading the package and setting a few convenient defaults.
Customization in crosstable mainly happens at three
levels:
- summary functions (funs)
- effect calculation (effect_args)
- statistical tests (test_args)

Summary functions (funs) describe each group separately,
whereas effect_args and test_args control how
groups are compared. effect_args controls the estimated
effect size and confidence interval, whereas test_args
controls the hypothesis test and p-value.
funs argument

Numeric variables are summarized using a set of summary functions. By
default, crosstable reports: min/max,
median/IQR, mean/sd and
number of observations/missing. These summaries are
generated by the internal function cross_summary().
These summaries can be customized depending on how you want numeric variables to be reported.
In practice, you will often want all numeric variables to be
summarized in the same way.
For this reason, it is convenient to define funs globally
with crosstable_options(), although you can also pass it
directly to crosstable().
The first possibility is to use a named list of functions. If a
function returns multiple values (as with quantile()), the
names of the returned statistics are automatically combined.
crosstable_options(funs=c("mean"=mean, "std dev"=sd, qtl=~quantile(.x, probs=c(0.25, 0.75))))
crosstable(mtcars2, mpg) %>% as_flextable()

| | value |
|---|---|
| Miles/(US) gallon (mpg) | |
| mean | 20.1 |
| std dev | 6.0 |
| qtl 25% | 15.4 |
| qtl 75% | 22.8 |
Another option is to provide a custom summary function that returns
several statistics at once. By default, the name of the function is
prepended to each label. To avoid this, give the function
a blank name (" ") so that its internal labels are used
directly.
f = function(x) c("Mean (SD)"=meansd(x), "Med [IQR]"=mediqr(x))
crosstable(mtcars2, wt, funs=f) %>% as_flextable()

| | value |
|---|---|
| Weight (1000 lbs) (wt) | |
| f Mean (SD) | 3.22 (0.98) |
| f Med [IQR] | 3.33 [2.58;3.61] |

With a blank name, the f prefix disappears:

crosstable(mtcars2, wt, funs=c(" "=f)) %>% as_flextable()

| | value |
|---|---|
| Weight (1000 lbs) (wt) | |
| Mean (SD) | 3.22 (0.98) |
| Med [IQR] | 3.33 [2.58;3.61] |
To help write such functions, crosstable exports convenience functions:
meansd(), meanCI(), mediqr(),
minmax(), and nna().
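A custom summary function does not have to be built from these helpers. The sketch below uses base R only and assumes that any function returning a named vector of formatted statistics is acceptable, as in the example above. The name geo_summary and its digits parameter are hypothetical and not part of crosstable.

```r
# Hypothetical summary function (not part of crosstable): geometric mean and
# number of non-missing observations, returned as a named character vector.
geo_summary = function(x, digits = 2) {
  x = x[!is.na(x)]                      # drop missing values
  gm = exp(mean(log(x)))                # geometric mean, assumes x > 0
  c("Geometric mean" = formatC(gm, format = "f", digits = digits),
    "N" = as.character(length(x)))
}

geo_summary(c(1, 10, 100, NA))

# Assumed usage, with a blank name so that the internal labels are used:
# crosstable(mtcars2, wt, funs = c(" " = geo_summary)) %>% as_flextable()
```

Returning pre-formatted strings keeps the rounding under the control of the function itself, at the cost of ignoring any global digits settings.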
When effect = TRUE, crosstable computes an
effect comparing the levels of the by variable.
Effect calculation is controlled by the effect_args
argument, which defaults to the result of
crosstable_effect_args().
The function used for actual calculation depends on the type of variable being analyzed:
- effect_summarize for numeric variables
- effect_tabular for categorical variables
- effect_survival for survival outcomes

By default, effect_tabular is set to
effect_odds_ratio(), which computes an odds ratio for
categorical variables.
| | Engine: straight | Engine: vshaped | effect |
|---|---|---|---|
| Transmission (am) | | | Odds ratio [95% Wald CI], ref='vshaped vs straight' |
| auto | 7 (36.84%) | 12 (63.16%) | |
| manual | 7 (53.85%) | 6 (46.15%) | |
Suppose that instead of an odds ratio, you want to compute a difference in proportions.
To define a custom categorical effect, you need to write a function that takes:
- x: the variable being summarized
- by: the grouping variable
- conf.level: the confidence level

and returns a list with the following elements:
- summary: a data frame containing the effect label,
estimate, and confidence interval. It may contain several rows if x has more than two
levels.
- effect.type: the name of the effect being computed
- ref: the reference level or comparison label

The following example computes a difference in proportions and uses
prop.test() to derive the confidence interval.
ct_effect_prop_diff = function(x, by, conf.level){
  tb = table(x, by)
  # prop.test() treats the rows (levels of x) as groups and the first column as successes
  test = prop.test(tb, conf.level=conf.level)
  nms = dimnames(tb)[["x"]]
  # estimate[1] - estimate[2]: same direction as test$conf.int and as the "ref" label
  effect = unname(test$estimate[1] - test$estimate[2])
  effect.type = "Difference of proportions"
  reference = glue::glue(", {nms[1]} vs {nms[2]}")
  summary = data.frame(name = "Proportion difference", effect = effect,
                       ci_inf = test$conf.int[1],
                       ci_sup = test$conf.int[2])
  list(summary = summary, ref = reference, effect.type = effect.type)
}
my_effect_args = crosstable_effect_args(effect_tabular=ct_effect_prop_diff)
# crosstable_options(effect_args=my_effect_args) #set globally if desired
mtcars2 %>%
  crosstable(am, by=vs, effect=TRUE, effect_args=my_effect_args) %>%
  as_flextable()

| | Engine: straight | Engine: vshaped | effect |
|---|---|---|---|
| Transmission (am) | | | Difference of proportions, auto vs manual |
| auto | 7 (36.84%) | 12 (63.16%) | |
| manual | 7 (53.85%) | 6 (46.15%) | |
The same general approach can be used to define custom effects for numeric and survival variables.
Several alternative effect functions are already implemented in
crosstable.
See ?effect_summary, ?effect_tabular, and
?effect_survival for available options.
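As an illustration of the numeric case, here is a minimal sketch of a difference in means with a Welch confidence interval taken from t.test(). It assumes that effect_summarize accepts a function with the same signature (x, by, conf.level) and the same return structure as the categorical example above. The name ct_effect_mean_diff is hypothetical, and the sketch only handles a by variable with exactly two levels.

```r
# Hypothetical custom numeric effect (not part of crosstable):
# difference in means, second level of `by` minus first level.
ct_effect_mean_diff = function(x, by, conf.level = 0.95) {
  by = droplevels(as.factor(by))
  stopifnot(nlevels(by) == 2)           # this sketch only handles two groups
  lv = levels(by)
  # Welch t-test; the confidence interval is for mean(level 2) - mean(level 1)
  test = t.test(x[by == lv[2]], x[by == lv[1]], conf.level = conf.level)
  effect = unname(test$estimate[1] - test$estimate[2])
  summary = data.frame(name = "Mean difference", effect = effect,
                       ci_inf = test$conf.int[1],
                       ci_sup = test$conf.int[2])
  list(summary = summary,
       ref = paste0(", ", lv[2], " vs ", lv[1]),
       effect.type = "Difference in means (Welch CI)")
}

res = ct_effect_mean_diff(mtcars$mpg, mtcars$am)
res$summary

# Assumed usage, mirroring the categorical example:
# my_effect_args = crosstable_effect_args(effect_summarize=ct_effect_mean_diff)
# crosstable(mtcars2, mpg, by=am, effect=TRUE, effect_args=my_effect_args)
```

Taking the estimate and the interval from the same t.test() call keeps their direction consistent with the ref label.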
Customizing statistical tests is even simpler.
A custom test function only needs to return a list with two
elements:
- p.value: the p-value
- method: the label displayed for the test

For example, the following function replaces the default numeric test
with a linear model.
With two groups, the F-test of this model is equivalent to a
Student t-test with equal variances, but the example illustrates how custom testing logic can be
integrated into crosstable.
ct_test_lm = function(x, by){
  fit = lm(x ~ by)
  # p-value of the F-test for the `by` term (first row of the ANOVA table)
  pval = anova(fit)$`Pr(>F)`[1]
  list(p.value = pval, method = "Linear model ANOVA")
}
my_test_args = crosstable_test_args(test_summarize=ct_test_lm)
# crosstable_options(test_args=my_test_args) #set globally if desired
mtcars2 %>%
  crosstable(mpg, by=vs, test=TRUE, test_args=my_test_args) %>%
  as_flextable()

| | Engine: straight | Engine: vshaped | test |
|---|---|---|---|
| Miles/(US) gallon (mpg) | | | p value: <0.0001 |
| mean | 24.6 | 16.6 | |
| std dev | 5.4 | 3.9 | |
| qtl 25% | 21.4 | 14.8 | |
| qtl 75% | 29.6 | 19.1 | |
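The same pattern should carry over to categorical variables. The sketch below always applies Fisher's exact test, whatever the cell counts. It assumes that crosstable_test_args() has a test_tabular argument that accepts a function of x and by, by analogy with test_summarize above. The name ct_test_fisher is hypothetical.

```r
# Hypothetical custom categorical test (not part of crosstable):
# Fisher's exact test on the contingency table of x and by.
ct_test_fisher = function(x, by) {
  test = fisher.test(table(x, by))
  list(p.value = test$p.value, method = "Fisher's exact test")
}

res = ct_test_fisher(mtcars$am, mtcars$vs)
res$p.value

# Assumed usage, mirroring the numeric example:
# my_test_args = crosstable_test_args(test_tabular=ct_test_fisher)
# crosstable(mtcars2, am, by=vs, test=TRUE, test_args=my_test_args)
```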