A common challenge in translating evidence from randomized controlled trials (RCTs) to real-world practice is that trial participants may not reflect the broader target population. As defined by Parikh et al. (2025), “underrepresented” or “insufficiently represented” subgroups often occupy regions of the covariate space where treatment effects are heterogeneous and the trial contains few observations. When such subgroups are underrepresented in the trial, estimates of the Target Average Treatment Effect (TATE) can be imprecise or misleading when transported to that population. The Sample Average Treatment Effect (SATE) is the finite-sample counterpart of the TATE.
The resulting estimand from ROOT is the Weighted Target Average Treatment Effect (WTATE): the average treatment effect restricted to the sufficiently represented subpopulation, estimated with lower variance than the unweighted TATE.
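To make the contrast between an unweighted and a restricted estimand concrete, the sketch below simulates a small trial with a heterogeneous effect and compares a plain difference in means with the same contrast restricted to units with w = 1. The data-generating process, the covariate `x`, and the helper `diff_in_means()` are illustrative assumptions. This is not ROOT's estimator, which also accounts for the target population.

```r
# Toy illustration only: simulated data and hypothetical helper names.
set.seed(1)
n  <- 4000
x  <- rbinom(n, 1, 0.2)             # indicator for a rare subgroup
tr <- rbinom(n, 1, 0.5)             # randomized treatment assignment
# Effect is 2 when x = 0 and 8 (with much noisier outcomes) when x = 1
y  <- tr * ifelse(x == 1, 8, 2) + rnorm(n, sd = ifelse(x == 1, 6, 1))

# Difference in means and its standard error among units with w == 1
diff_in_means <- function(y, tr, w = rep(1, length(y))) {
  keep <- w == 1
  y1 <- y[keep & tr == 1]
  y0 <- y[keep & tr == 0]
  c(estimate = mean(y1) - mean(y0),
    se = sqrt(var(y1) / length(y1) + var(y0) / length(y0)))
}

unweighted <- diff_in_means(y, tr)              # all trial units
restricted <- diff_in_means(y, tr, w = 1 - x)   # drop the rare, noisy subgroup
rbind(unweighted, restricted)
```

In this toy setting the restricted estimate targets a different quantity (the effect among units with x = 0) but is estimated far more precisely, which is the trade-off the WTATE formalizes.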
This vignette walks through a complete generalizability analysis
using the built-in diabetes_data dataset.
diabetes_data Dataset

diabetes_data is a simulated dataset that mimics a
diabetes intervention study. The randomized controlled trial (RCT)
sample contains 2,000 individuals, and the simulated target population
to which we make inferences contains 8,000 individuals.
library(ROOT)
data(diabetes_data, package = "ROOT")
str(diabetes_data)
#> 'data.frame': 10000 obs. of 7 variables:
#> $ Race_Black: int 0 1 1 0 0 0 0 0 0 0 ...
#> $ Sex_Male : int 1 0 1 1 1 1 1 0 1 1 ...
#> $ DietYes : int 0 0 0 0 0 0 0 0 0 0 ...
#> $ Age45 : int 0 1 0 0 1 0 0 0 1 0 ...
#> $ S : int 1 0 0 0 0 0 0 0 0 0 ...
#> $ Tr : int 0 NA NA NA NA NA NA NA NA NA ...
#> $ Y : num 0.818 NA NA NA NA ...

The key columns are:
| Column | Description |
|---|---|
| Y | Observed outcome (numeric) |
| Tr | Treatment assignment (0 = control, 1 = treated) |
| S | Sample indicator (1 = RCT, 0 = target population) |
| Age45 | Age ≥ 45 (binary indicator) |
| DietYes | Currently on a diet programme (binary indicator) |
| Race_Black | Race: Black (binary indicator) |
| Sex_Male | Sex: Male (binary indicator) |
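For readers preparing their own data, the layout described above can be mimicked as follows. This is a minimal sketch with simulated values. The object names `trial`, `target`, and `stacked` and the helper `make_covariates()` are hypothetical. The only structural facts taken from the vignette are the column names and that Tr and Y are NA for target-population rows.

```r
# Hypothetical example: stack a trial sample and a target sample into one
# data frame with the S / Tr / Y layout used by diabetes_data.
set.seed(42)
make_covariates <- function(n) {
  data.frame(
    Race_Black = rbinom(n, 1, 0.30),
    Sex_Male   = rbinom(n, 1, 0.50),
    DietYes    = rbinom(n, 1, 0.10),
    Age45      = rbinom(n, 1, 0.15)
  )
}

trial  <- make_covariates(200)
target <- make_covariates(800)

trial$S  <- 1L
trial$Tr <- rbinom(nrow(trial), 1, 0.5)   # randomized assignment
trial$Y  <- rnorm(nrow(trial))            # placeholder outcome

target$S  <- 0L
target$Tr <- NA_integer_                  # treatment not observed
target$Y  <- NA_real_                     # outcome not observed

stacked <- rbind(trial, target)
table(S = stacked$S)
```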
# How many trial vs target population units?
table(S = diabetes_data$S)
#> S
#> 0 1
#> 8000 2000
# Treatment breakdown within the trial
table(Tr = diabetes_data$Tr[diabetes_data$S == 1])
#> Tr
#> 0 1
#> 977 1023

Before running ROOT, it is good practice to check whether trial participants differ from the target population on key covariates. Systematic differences signal which subgroups may be underrepresented.
# Mean of each covariate by S
covariate_cols <- c("Age45", "DietYes", "Race_Black", "Sex_Male")
overlap <- sapply(covariate_cols, function(v) {
  tapply(diabetes_data[[v]], diabetes_data$S, mean, na.rm = TRUE)
})
knitr::kable(
  t(overlap),
  digits = 3,
  caption = "Covariate means by sample membership (S = 1: trial, S = 0: target)"
)

| | 0 | 1 |
|---|---|---|
| Age45 | 0.153 | 0.154 |
| DietYes | 0.099 | 0.096 |
| Race_Black | 0.315 | 0.172 |
| Sex_Male | 0.460 | 0.557 |
Rows in which the two columns differ noticeably (here Race_Black and Sex_Male) flag potential sources of underrepresentation that ROOT will attempt to characterize.
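Raw differences in means are hard to compare across covariates with different prevalences. One common complement, not part of the output above, is the standardized mean difference (SMD). The helper name `smd_by_sample()` is hypothetical, and the pooled standard deviation used here is only one of several conventions.

```r
# Hypothetical helper: standardized mean difference between S = 1 and S = 0.
smd_by_sample <- function(data, covariates, s_col = "S") {
  in_trial <- data[[s_col]] == 1
  sapply(covariates, function(v) {
    x1 <- data[[v]][in_trial]
    x0 <- data[[v]][!in_trial]
    pooled_sd <- sqrt((var(x1, na.rm = TRUE) + var(x0, na.rm = TRUE)) / 2)
    (mean(x1, na.rm = TRUE) - mean(x0, na.rm = TRUE)) / pooled_sd
  })
}

# Toy data: one balanced covariate and one that is rarer in the trial
set.seed(7)
toy <- data.frame(S = rep(c(1, 0), times = c(500, 2000)))
toy$balanced <- rbinom(nrow(toy), 1, 0.15)
toy$shifted  <- rbinom(nrow(toy), 1, ifelse(toy$S == 1, 0.17, 0.32))

round(smd_by_sample(toy, c("balanced", "shifted")), 2)
```

With the vignette data, the same helper could be called as `smd_by_sample(diabetes_data, covariate_cols)`. A negative value means the covariate is less prevalent in the trial than in the target population.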
We use characterizing_underrep(), which is the
high-level wrapper around ROOT() for
generalizability/transportability analyses. It expects data
to contain Y, Tr, and S, and
internally:
gen_fit <- characterizing_underrep(
  data = diabetes_data,
  generalizability_path = TRUE,
  num_trees = 20,
  top_k_trees = TRUE,
  k = 10,
  seed = 123
)
print(gen_fit)
#> characterizing_underrep object
#> --- ROOT brief summary ---
#> ROOT object
#> Generalizability mode: TRUE
#>
#> Summary classifier (f):
#> n= 2000
#>
#> node), split, n, loss, yval, (yprob)
#> * denotes terminal node
#>
#> 1) root 2000 735 1 (0.36750000 0.63250000)
#> 2) Race_Black>=0.5 345 0 0 (1.00000000 0.00000000) *
#> 3) Race_Black< 0.5 1655 390 1 (0.23564955 0.76435045)
#> 6) Age45>=0.5 260 0 0 (1.00000000 0.00000000) *
#> 7) Age45< 0.5 1395 130 1 (0.09318996 0.90681004)
#> 14) DietYes>=0.5 130 0 0 (1.00000000 0.00000000) *
#> 15) DietYes< 0.5 1265 0 1 (0.00000000 1.00000000) *
#>
#> Estimand summary (generalization mode):
#> Unweighted SATE = 5.48424, SE = 0.4290411
#> Weighted WTATE = 3.55263, SE = 0.3609503

In addition to what print() shows, summary() reports whether a
user-supplied objective was used, the number of trees grown, the Rashomon
set size, the percentage of observations with \(w_{\text{opt}} = 1\), and a
leaf-by-leaf summary of the final tree.
summary(gen_fit)
#> characterizing_underrep object
#> --- ROOT summary ---
#> ROOT object
#> Generalizability mode: TRUE
#>
#> Summary classifier (f):
#> n= 2000
#>
#> node), split, n, loss, yval, (yprob)
#> * denotes terminal node
#>
#> 1) root 2000 735 1 (0.36750000 0.63250000)
#> 2) Race_Black>=0.5 345 0 0 (1.00000000 0.00000000) *
#> 3) Race_Black< 0.5 1655 390 1 (0.23564955 0.76435045)
#> 6) Age45>=0.5 260 0 0 (1.00000000 0.00000000) *
#> 7) Age45< 0.5 1395 130 1 (0.09318996 0.90681004)
#> 14) DietYes>=0.5 130 0 0 (1.00000000 0.00000000) *
#> 15) DietYes< 0.5 1265 0 1 (0.00000000 1.00000000) *
#>
#> Global objective function:
#> User-supplied: No (default objective used)
#>
#> Estimand summary (generalization mode):
#> Unweighted SATE = 5.48424, SE = 0.4290411
#> Weighted WTATE = 3.55263, SE = 0.3609503
#>
#> Diagnostics:
#> Number of trees grown: 20
#> Rashomon set size: 10
#> % observations with w_opt == 1: 63.2%
#>
#> Leaf summary: 4 terminal nodes
#> leaf_id rule predicted_w n
#> 2 root & Race_Black>=0.5 0 345
#> 6 root & Race_Black< 0.5 & Age45>=0.5 0 260
#> 14 root & Race_Black< 0.5 & Age45< 0.5 & DietYes>=0.5 0 130
#> 15 root & Race_Black< 0.5 & Age45< 0.5 & DietYes< 0.5 1 1265
#> pct label
#> 17.2 Under-represented (drop, w = 0)
#> 13.0 Under-represented (drop, w = 0)
#> 6.5 Under-represented (drop, w = 0)
#> 63.2 Represented (keep, w = 1)

The SATE (unweighted) is the simple trial average treatment effect transported to the full target population. The WTATE (weighted) restricts this estimate to the well-represented subpopulation, where the trial provides more reliable evidence. A smaller standard error (SE) for the WTATE relative to the SATE (here 0.361 versus 0.429) reflects the variance reduction achieved by this restriction.
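The size of that variance reduction can be worked out directly from the reported standard errors. The snippet below copies the four numbers from the output above. The normal-approximation confidence intervals are an added assumption for illustration and are not produced by the package output shown here.

```r
# Values copied from the summary() output above
sate  <- c(estimate = 5.48424, se = 0.4290411)
wtate <- c(estimate = 3.55263, se = 0.3609503)

# Ratio of variances: values below 1 mean the WTATE is estimated more precisely
variance_ratio <- unname((wtate["se"] / sate["se"])^2)

# Normal-approximation 95% intervals (an assumption, not package output)
ci <- function(est) {
  unname(est["estimate"] + c(-1, 1) * qnorm(0.975) * est["se"])
}

round(variance_ratio, 3)
round(rbind(SATE = ci(sate), WTATE = ci(wtate)), 2)
```

The variance ratio is roughly 0.71, so the restricted estimand is estimated with about 29% less variance than the unweighted one in this run.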
The leaf_summary component of the returned object gives
an explicit human-readable rule for each terminal node of the summary
tree, along with the number and percentage of observations in each leaf
and whether they are classified as represented (\(w = 1\)) or underrepresented (\(w = 0\)).
gen_fit$leaf_summary
#> leaf_id rule predicted_w n
#> 1 2 root & Race_Black>=0.5 0 345
#> 2 6 root & Race_Black< 0.5 & Age45>=0.5 0 260
#> 3 14 root & Race_Black< 0.5 & Age45< 0.5 & DietYes>=0.5 0 130
#> 4 15 root & Race_Black< 0.5 & Age45< 0.5 & DietYes< 0.5 1 1265
#> pct label
#> 1 17.2 Under-represented (drop, w = 0)
#> 2 13.0 Under-represented (drop, w = 0)
#> 3 6.5 Under-represented (drop, w = 0)
#> 4 63.2 Represented (keep, w = 1)

plot() renders the final characterized tree from the
Rashomon set. Blue leaves (\(w = 1\))
denote well-represented subgroups; orange leaves (\(w = 0\)) denote underrepresented subgroups.
The percentage shown in each leaf is the share of trial units falling
into that node.
The tree reads top-down as a decision rule: starting from the root (all trial units), the first split separates subgroups that are wholly underrepresented from those that may be included. Follow the branches down to each leaf to read the complete inclusion/exclusion rule for that subgroup.
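Because the tree is a decision rule, it can be applied by hand. The sketch below encodes the path to leaf 15 from the leaf summary as a logical expression and applies it to a few made-up units. The helper `in_represented_leaf()` and the toy rows are hypothetical.

```r
# Inclusion rule read off leaf 15 of the summary tree above:
# Race_Black < 0.5 & Age45 < 0.5 & DietYes < 0.5  ->  w = 1
in_represented_leaf <- function(data) {
  as.integer(data$Race_Black < 0.5 & data$Age45 < 0.5 & data$DietYes < 0.5)
}

# Toy units (hypothetical values), one for each of the four leaves
toy <- data.frame(
  Race_Black = c(1, 0, 0, 0),
  Age45      = c(0, 1, 0, 0),
  DietYes    = c(0, 0, 1, 0)
)
toy$w <- in_represented_leaf(toy)
toy
```

Only the last toy unit satisfies all three conditions and receives w = 1. Applied to the trial rows of diabetes_data, the same rule would be expected to reproduce the share of units in leaf 15 reported in the leaf summary.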
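From the characterized tree and leaf summary, we can describe the underrepresented subgroups in plain language: Black participants (17.2% of trial units), non-Black participants aged 45 or older (13.0%), and non-Black participants under 45 who are currently on a diet programme (6.5%).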
The Rashomon set provides multiple near-optimal characterizations of these subgroups. The final summary tree aggregates across all trees in the set, giving a single interpretable rule.
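One way to picture the aggregation step is as a vote across trees. The sketch below is not ROOT's internal code. The vote matrix is invented, and treating a share exactly equal to the threshold as "keep" is an assumption.

```r
# Hypothetical votes: rows are units, columns are trees in the Rashomon set
votes <- matrix(
  c(1, 1, 1,
    1, 1, 0,
    1, 0, 0,
    0, 0, 0),
  ncol = 3, byrow = TRUE,
  dimnames = list(paste0("unit", 1:4), paste0("tree", 1:3))
)

vote_threshold <- 2 / 3
share_keep <- rowMeans(votes)   # fraction of trees voting w = 1 for each unit

# Assumption: keep a unit when its share meets or exceeds the threshold
# (small tolerance guards against floating-point error at exactly 2/3)
w_opt <- as.integer(share_keep >= vote_threshold - 1e-8)
data.frame(share_keep = round(share_keep, 2), w_opt = w_opt)
```

A single interpretable tree can then be fit to the aggregated labels, which is the role played by the summary classifier printed earlier.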
| Parameter | Role | Default |
|---|---|---|
| num_trees | Number of trees to grow in the forest | 10 |
| top_k_trees | If TRUE, select the top k trees by objective value | FALSE |
| k | Rashomon set size when top_k_trees = TRUE | 10 |
| cutoff | Rashomon threshold when top_k_trees = FALSE; "baseline" uses the objective at \(w \equiv 1\) | "baseline" |
| vote_threshold | Fraction of Rashomon-set trees that must vote \(w = 1\) for a unit to be included | 2/3 |
| seed | Random seed for reproducibility | NULL |
| feature_est | Feature importance method used to bias split selection ("Ridge", "GBM", or a custom function) | "Ridge" |
| leaf_proba | Controls tree depth by increasing the probability of stopping at a leaf | 0.25 |
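The two ways of forming the Rashomon set described in the table (top k versus a cutoff) can be sketched as follows. The objective values and the helper names `select_top_k()` and `select_by_cutoff()` are hypothetical. That lower objective values are better and that the cutoff comparison is strict are assumptions rather than documented behaviour.

```r
# Hypothetical objective values for 8 candidate trees (lower assumed better)
objective <- c(0.42, 0.35, 0.51, 0.38, 0.47, 0.36, 0.60, 0.40)

# top_k_trees = TRUE: keep the indices of the k best trees
select_top_k <- function(obj, k) {
  order(obj)[seq_len(min(k, length(obj)))]
}

# top_k_trees = FALSE: keep every tree whose objective beats a numeric cutoff
select_by_cutoff <- function(obj, cutoff) {
  which(obj < cutoff)
}

select_top_k(objective, k = 3)
select_by_cutoff(objective, cutoff = 0.45)
```

The top-k rule fixes the size of the set in advance, whereas a cutoff lets the set grow or shrink with the number of trees that clear the threshold.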
Parikh, H., Ross, R. K., Stuart, E., & Rudolph, K. E. (2025). Who Are We Missing?: A Principled Approach to Characterizing the Underrepresented Population. Journal of the American Statistical Association, 120(551), 1414–1423. https://doi.org/10.1080/01621459.2025.2495319