Rashomon-set Optimal Trees for interpretable functional optimization and treatment effect generalizability
ROOT (Rashomon set of Optimal Trees) is a general functional optimization framework for learning interpretable binary weight functions, represented as sparse decision trees.
At its core, ROOT searches over functions \(w(X) \in \{0,1\}\) that assign inclusion or exclusion weights to units based on their covariates \(X\). The optimization objective (or loss function) can be chosen to reflect different scientific goals.
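To make the notation concrete, the sketch below writes one candidate weight function as a depth-two tree and scores it under a simple objective. Everything in it is illustrative: the covariates `x1` and `x2`, the thresholds, the per-unit loss `loss_i`, and the helpers `w_tree()` and `objective()` are assumptions made for this example. They are not part of the ROOT API and are not ROOT's default objective.

```r
# Illustrative only: a hand-written binary weight function and a toy objective.
# None of these names come from the ROOT package.
set.seed(1)
toy <- data.frame(
  x1 = runif(50),
  x2 = runif(50)
)

# Hypothetical per-unit loss: larger when both covariates are high
toy$loss_i <- 1 + 5 * (toy$x1 >= 0.7 & toy$x2 >= 0.6)

# A depth-two tree: exclude (w = 0) only units in the "x1 high, x2 high" leaf
w_tree <- function(x1, x2) {
  ifelse(x1 >= 0.7 & x2 >= 0.6, 0L, 1L)
}

# Toy objective: average per-unit loss among included units (lower is better)
objective <- function(w, loss_i) {
  stopifnot(all(w %in% c(0L, 1L)), sum(w) > 0)
  sum(w * loss_i) / sum(w)
}

w_all  <- rep(1L, nrow(toy))        # include every unit
w_rule <- w_tree(toy$x1, toy$x2)    # apply the tree rule

objective(w_all, toy$loss_i)        # score with everyone included
objective(w_rule, toy$loss_i)       # score after dropping the costly leaf
```

Swapping in a different `objective()` changes which weight function scores best, which is the sense in which the objective can reflect different scientific goals.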
The original ROOT paper demonstrates this framework in the context of generalizing causal effects from randomized trials to a target population.
For a detailed worked example of ROOT in generalizability mode, see the generalizability_path_example vignette.
You can install the development version of ROOT from GitHub with:
# install.packages("devtools")
devtools::install_github("peterliu599/ROOT")
ROOT can be applied to any optimization problem that can be expressed as a binary inclusion/exclusion decision. In this example, we use ROOT to select a minimum-variance portfolio from a universe of 100 simulated assets, each characterized by its market beta and annualized volatility. ROOT learns an interpretable tree-structured rule describing which assets to include (\(w = 1\)) to minimize portfolio return variance.
library(ROOT)
set.seed(123)
n_assets <- 100
# Asset features
volatility <- runif(n_assets, 0.05, 0.40) # annualised volatility
beta <- runif(n_assets, 0.5, 1.8) # market beta
sector <- sample(c("Tech", "Finance", "Energy", "Health"),
                 n_assets, replace = TRUE)
# Simulate returns correlated with beta and volatility
market <- rnorm(1000, 0.0005, 0.01)
returns_mat <- sapply(seq_len(n_assets), function(i) {
  beta[i] * market + rnorm(1000, 0, volatility[i] / sqrt(252))
})
vsq <- apply(returns_mat, 2, var) # per-asset return variance (objective)
dat_portfolio <- data.frame(
  vsq    = vsq,
  vol    = volatility,
  beta   = beta,
  sector = as.integer(factor(sector))
)
portfolio_fit <- ROOT(
  data        = dat_portfolio,
  num_trees   = 20,
  top_k_trees = TRUE,
  k           = 10,
  seed        = 42
)
summary(portfolio_fit) # for a full summary output
## ROOT object
## Generalizability mode: FALSE
##
## Summary classifier (f):
## n= 100
##
## node), split, n, loss, yval, (yprob)
## * denotes terminal node
##
## 1) root 100 4 1 (0.0400000 0.9600000)
##   2) beta>=1.658134 12 4 1 (0.3333333 0.6666667)
##     4) vol>=0.3285209 4 0 0 (1.0000000 0.0000000) *
##     5) vol< 0.3285209 8 0 1 (0.0000000 1.0000000) *
##   3) beta< 1.658134 88 0 1 (0.0000000 1.0000000) *
##
## Global objective function:
## User-supplied: No (default objective used)
##
## Diagnostics:
## Number of trees grown: 20
## Rashomon set size: 10
## % observations with w_opt == 1: 96.0%
print(portfolio_fit) # for a brief summary print
## ROOT object
## Generalizability mode: FALSE
##
## Summary classifier (f):
## n= 100
##
## node), split, n, loss, yval, (yprob)
## * denotes terminal node
##
## 1) root 100 4 1 (0.0400000 0.9600000)
##   2) beta>=1.658134 12 4 1 (0.3333333 0.6666667)
##     4) vol>=0.3285209 4 0 0 (1.0000000 0.0000000) *
##     5) vol< 0.3285209 8 0 1 (0.0000000 1.0000000) *
##   3) beta< 1.658134 88 0 1 (0.0000000 1.0000000) *
plot(portfolio_fit)
The characterized tree recovers an intuitive and interpretable portfolio construction rule. Assets with beta below about 1.66 are included (88% of the universe). Among the remaining high-beta assets (beta \(\geq\) 1.66), those with volatility \(\geq\) 0.33 are excluded (\(w = 0\), 4%), while those with volatility < 0.33 are retained (8%). Overall, the majority vote across the Rashomon set of 10 near-optimal trees includes 96% of assets in the final portfolio and screens out only the most risk-concentrated subset.
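The rule described above can also be written out by hand, which is a quick way to apply it to assets outside the fitted data. The sketch below hard-codes the two split points printed in the summary (beta 1.658134, vol 0.3285209). The helper `include_asset()` and the three example assets are made up for illustration; this is a reading of the printed tree, not code from the package. The second half illustrates a majority vote over 0/1 assignments using a small hypothetical vote matrix, not the trees actually stored in `portfolio_fit`, and it makes no claim about how ROOT handles ties.

```r
# Reading of the printed summary tree; include_asset() is a hypothetical helper
include_asset <- function(beta, vol) {
  # Only assets that are both high-beta and high-volatility get w = 0
  ifelse(beta >= 1.658134 & vol >= 0.3285209, 0L, 1L)
}

# Three made-up assets: low beta, high beta / low vol, high beta / high vol
new_assets <- data.frame(
  beta = c(1.10, 1.70, 1.75),
  vol  = c(0.35, 0.20, 0.36)
)
new_assets$w <- include_asset(new_assets$beta, new_assets$vol)
new_assets

# Majority vote: rows are assets, columns are hypothetical trees (odd count,
# so no ties can occur in this toy example)
votes <- cbind(
  tree1 = c(1L, 1L, 0L),
  tree2 = c(1L, 0L, 0L),
  tree3 = c(1L, 1L, 1L)
)
w_vote <- as.integer(rowMeans(votes) > 0.5)
w_vote
```

Only the third asset falls into the excluded leaf, and in the toy vote it is also the only asset that most of the hypothetical trees drop.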
For a detailed worked example of ROOT in optimization mode, see the optimization_path_example vignette.
If you encounter any bugs or have any specific feature requests, please file an issue.
The contents of this repository are distributed under the MIT license.
Parikh, H., Ross, R. K., Stuart, E., & Rudolph, K. E. (2025). Who Are We Missing?: A Principled Approach to Characterizing the Underrepresented Population. Journal of the American Statistical Association, 120(551), 1414–1423. https://doi.org/10.1080/01621459.2025.2495319