Optimization Path Example: Portfolio Selection via Variance Minimization

Overview

ROOT is a general functional optimization framework: by supplying a custom objective function, it can be applied to any problem where the goal is to learn an interpretable binary weight function $w(X) \in \{0, 1\}$ over groups of units defined by covariates of interest.

In general optimization mode (generalizability_path = FALSE), the user provides a data.frame containing any covariates to split on and, optionally, a column vsq (a per-unit variance proxy, or more generally the outcome to minimize). ROOT searches over tree-structured weight functions to minimize the supplied objective, then returns a Rashomon set of near-optimal trees and a single summary tree characterizing the final weight assignments. By default, the final weights behind the summary tree are decided by majority vote across the Rashomon set; users can instead specify their own voting function.

This vignette demonstrates ROOT in general optimization mode using a portfolio selection problem: given a universe of 100 simulated assets characterized by their market beta and annualized volatility, ROOT learns an interpretable rule for which assets to include ($w = 1$) in order to minimize portfolio return variance.

Problem Setup

Why portfolio selection?

Constructing a minimum-variance portfolio is naturally framed as an optimization problem. The standard approach produces continuous weights, which can be hard to interpret or communicate. ROOT offers a complementary perspective: it returns a binary inclusion rule, a sparse decision tree that describes, in plain language, which types of assets belong in the low-variance portfolio.

Mapping to ROOT’s framework

The default global objective function ROOT minimizes is as follows: $L(w, D) = \sqrt{\frac{\sum_i w_i \cdot v_i^2}{\left(\sum_i w_i\right)^2}}$

where $v_i^2$ (vsq) is a pseudo-outcome, referred to in this optimization example as the per-unit variance proxy. In the portfolio context, we set vsq to the historical return variance of each asset. ROOT then finds the binary weight function $w(X) \in \{0, 1\}$, expressed as a decision tree over asset features such as beta and volatility, that minimizes this quantity. Users can also supply their own global objective function to minimize.
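To make the default objective concrete, the sketch below evaluates it by hand in base R. The helper `default_objective()` and the toy values in `vsq_toy` are illustrative assumptions, not part of ROOT's API; the code only transcribes the formula above. It also checks a consequence of that formula: starting from $w \equiv 1$ over $n$ units, dropping unit $j$ lowers the objective exactly when $v_j^2 > (2 - 1/n)\,\bar{v}^2$, where $\bar{v}^2$ is the mean of the $v_i^2$. In other words, only units with roughly twice the average variance are worth excluding on their own.

```r
# Hypothetical helper: a direct transcription of L(w, D) from the text,
# not ROOT's internal implementation.
default_objective <- function(vsq, w) {
  sqrt(sum(w * vsq) / sum(w)^2)
}

# Toy variance proxies (not the vignette's simulated assets)
vsq_toy <- c(1, 2, 3, 4, 30)
n <- length(vsq_toy)

# Objective with every unit included
baseline <- default_objective(vsq_toy, rep(1, n))

# Objective after dropping each unit on its own
drop_one <- vapply(seq_len(n), function(j) {
  w <- rep(1, n)
  w[j] <- 0
  default_objective(vsq_toy, w)
}, numeric(1))

# Units whose removal lowers the objective, and the closed-form prediction
improves  <- drop_one < baseline
predicted <- vsq_toy > (2 - 1 / n) * mean(vsq_toy)

print(data.frame(vsq = vsq_toy, drop_one = drop_one, improves, predicted))
```

In this toy example only the last unit (value 30, against a threshold of 14.4) is worth dropping; removing any other unit shrinks the denominator faster than the numerator and raises the objective.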

Simulating the Data

We simulate 100 assets with three features: annualized volatility, market beta, and a sector label. Returns are generated from a single-factor market model, and per-asset return variance is computed from 1,000 simulated return observations.

library(ROOT)
set.seed(123)

n_assets <- 100

# Asset features
volatility <- runif(n_assets, 0.05, 0.40)  # annualised volatility
beta       <- runif(n_assets, 0.5,  1.8)   # market beta
sector     <- sample(c("Tech", "Finance", "Energy", "Health"),
                     n_assets, replace = TRUE)

# Simulate returns: r_i = beta_i * r_market + epsilon_i
market      <- rnorm(1000, 0.0005, 0.01)
returns_mat <- sapply(seq_len(n_assets), function(i)
  beta[i] * market + rnorm(1000, 0, volatility[i] / sqrt(252))
)

# Per-asset return variance (the objective proxy ROOT will minimize)
vsq <- apply(returns_mat, 2, var)

dat_portfolio <- data.frame(
  vsq    = vsq,
  vol    = volatility,
  beta   = beta,
  sector = as.integer(factor(sector))
)

head(dat_portfolio)
#>            vsq        vol      beta sector
#> 1 0.0002750578 0.15065213 1.2799856      2
#> 2 0.0004951413 0.32590680 0.9326706      2
#> 3 0.0002835768 0.19314192 1.1351969      3
#> 4 0.0008172707 0.35905609 1.7408160      4
#> 5 0.0007130039 0.37916355 1.1277731      3
#> 6 0.0002929372 0.06594477 1.6574553      3

The vsq column is recognized by ROOT’s default objective function as the per-unit variance proxy. The columns vol, beta, and sector are the splitting features available to the tree.
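Under the factor model used in the simulation, each asset's return variance has a closed form: $\beta_i^2 \sigma_m^2 + \sigma_i^2 / 252$, where $\sigma_m = 0.01$ is the standard deviation of the market return and $\sigma_i$ is the asset's annualized volatility. The sketch below re-simulates a universe with its own seed (so its numbers differ from those shown above) and compares the empirical variances with this formula. All variable names here are illustrative.

```r
set.seed(1)

n_assets  <- 100
n_obs     <- 1000
market_sd <- 0.01

# Same feature ranges as the simulation above
volatility <- runif(n_assets, 0.05, 0.40)
beta       <- runif(n_assets, 0.5, 1.8)

# r_i = beta_i * r_market + epsilon_i
market <- rnorm(n_obs, 0.0005, market_sd)
returns_mat <- sapply(seq_len(n_assets), function(i) {
  beta[i] * market + rnorm(n_obs, 0, volatility[i] / sqrt(252))
})

# Empirical variance per asset vs. the closed form implied by the model
vsq_empirical   <- apply(returns_mat, 2, var)
vsq_theoretical <- beta^2 * market_sd^2 + volatility^2 / 252

# Agreement between the two, and the share of variance due to the market term
agreement    <- cor(vsq_empirical, vsq_theoretical)
market_share <- beta^2 * market_sd^2 / vsq_theoretical

print(agreement)
print(range(market_share))
```

Both beta and volatility feed into vsq, so a tree that minimizes the default objective can reasonably split on either feature.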

Distribution of asset risk

plot(
  dat_portfolio$beta, dat_portfolio$vol,
  xlab = "Market beta", ylab = "Annualised volatility",
  pch  = 16, col = "#4E79A7AA",
  main = "Asset universe: volatility vs beta"
)

Scatter plot of asset beta vs volatility

We expect ROOT to identify the high-beta, high-volatility corner of this space as the region to exclude.

Fitting ROOT

portfolio_fit <- ROOT(
  data        = dat_portfolio,
  num_trees   = 20,
  top_k_trees = TRUE,
  k           = 10,
  seed        = 42
)

ROOT grows 20 trees and selects the 10 with the lowest objective value as the Rashomon set. Final asset weights $w_{\text{opt}}$ are determined by majority vote across those 10 trees.
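The selection-and-vote step can be pictured with a few lines of base R. This is a sketch under stated assumptions rather than ROOT's internal code: the matrix `tree_weights`, the vector `tree_objective`, and the 0.5 vote cutoff are all hypothetical, chosen only to show top-k selection followed by a majority vote.

```r
# Hypothetical inputs: 5 units (rows) and 4 candidate trees (columns).
# Each column holds the 0/1 weights one tree assigns to the units.
tree_weights <- cbind(
  t1 = c(1, 1, 1, 0, 1),
  t2 = c(1, 1, 0, 0, 1),
  t3 = c(1, 0, 1, 0, 1),
  t4 = c(0, 0, 0, 1, 0)
)

# Hypothetical objective value achieved by each tree (lower is better)
tree_objective <- c(t1 = 0.10, t2 = 0.12, t3 = 0.11, t4 = 0.90)

# Top-k selection: keep the k trees with the lowest objective
k <- 3
rashomon_ids <- names(sort(tree_objective))[seq_len(k)]

# Majority vote: a unit gets w_opt = 1 if more than half of the kept trees include it
vote_share <- rowMeans(tree_weights[, rashomon_ids, drop = FALSE])
w_opt <- as.integer(vote_share > 0.5)

print(rashomon_ids)
print(w_opt)
```

The poorly performing tree `t4` never reaches the vote, so its dissenting weights have no influence on the final assignment.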

Inspecting the Results

Print summary

print(portfolio_fit)
#> ROOT object
#>   Generalizability mode: FALSE 
#> 
#> Summary classifier (f):
#> n= 100 
#> 
#> node), split, n, loss, yval, (yprob)
#>       * denotes terminal node
#> 
#> 1) root 100 4 1 (0.0400000 0.9600000)  
#>   2) beta>=1.658134 12 4 1 (0.3333333 0.6666667)  
#>     4) vol>=0.3285209 4 0 0 (1.0000000 0.0000000) *
#>     5) vol< 0.3285209 8 0 1 (0.0000000 1.0000000) *
#>   3) beta< 1.658134 88 0 1 (0.0000000 1.0000000) *

Detailed summary

summary(portfolio_fit)
#> ROOT object
#>   Generalizability mode: FALSE 
#> 
#> Summary classifier (f):
#> n= 100 
#> 
#> node), split, n, loss, yval, (yprob)
#>       * denotes terminal node
#> 
#> 1) root 100 4 1 (0.0400000 0.9600000)  
#>   2) beta>=1.658134 12 4 1 (0.3333333 0.6666667)  
#>     4) vol>=0.3285209 4 0 0 (1.0000000 0.0000000) *
#>     5) vol< 0.3285209 8 0 1 (0.0000000 1.0000000) *
#>   3) beta< 1.658134 88 0 1 (0.0000000 1.0000000) *
#> 
#> Global objective function:
#>   User-supplied: No (default objective used)
#> 
#> Diagnostics:
#>   Number of trees grown: 20
#>   Rashomon set size: 10
#>   % observations with w_opt == 1: 96.0%

The Diagnostics section confirms that 20 trees were grown, 10 were selected into the Rashomon set, and 96% of assets received $w_{\text{opt}} = 1$ (included in the portfolio). Only the 4 most risk-concentrated assets are screened out.

Visualizing the Characterized Tree

plot(portfolio_fit)

Characterized tree for portfolio selection

The tree encodes a simple, actionable rule:

beta < 1.66: include the asset unconditionally (88 assets).
beta >= 1.66 and vol < 0.33: include the asset despite its high market sensitivity, because its volatility is not also extreme (8 assets).
beta >= 1.66 and vol >= 0.33: exclude the asset, because high market sensitivity combined with high volatility drives up portfolio variance (4 assets, $w = 0$).
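Because the rule is just two thresholds, it can be written as an ordinary function and applied to assets outside the original universe. The sketch below copies the split points from the printed summary tree; the function name `include_asset()` and the three example assets are hypothetical.

```r
# Split points copied from the printed summary tree
beta_cut <- 1.658134
vol_cut  <- 0.3285209

# Hypothetical helper: w = 0 only when both thresholds are exceeded
include_asset <- function(beta, vol) {
  as.integer(!(beta >= beta_cut & vol >= vol_cut))
}

# Three made-up assets, one per leaf of the tree
new_assets <- data.frame(
  beta = c(1.00, 1.70, 1.70),
  vol  = c(0.35, 0.20, 0.35)
)
new_assets$w <- include_asset(new_assets$beta, new_assets$vol)

print(new_assets)
```

Only the third asset, which is high on both beta and volatility, receives $w = 0$.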

Examining the Weights

The final weight assignments are stored in portfolio_fit$D_rash$w_opt. We can compare the included and excluded assets directly.

dat_portfolio$w_opt <- portfolio_fit$D_rash$w_opt

# Summary statistics by inclusion decision
included <- dat_portfolio[dat_portfolio$w_opt == 1, ]
excluded <- dat_portfolio[dat_portfolio$w_opt == 0, ]

cat("Included assets (w = 1):", nrow(included), "\n")
#> Included assets (w = 1): 96
cat("  Mean beta:       ", round(mean(included$beta), 3), "\n")
#>   Mean beta:        1.146
cat("  Mean volatility: ", round(mean(included$vol),  3), "\n\n")
#>   Mean volatility:  0.219

cat("Excluded assets (w = 0):", nrow(excluded), "\n")
#> Excluded assets (w = 0): 4
cat("  Mean beta:       ", round(mean(excluded$beta), 3), "\n")
#>   Mean beta:        1.701
cat("  Mean volatility: ", round(mean(excluded$vol),  3), "\n")
#>   Mean volatility:  0.368

The excluded assets have substantially higher beta and volatility than the included ones, confirming that ROOT correctly targets the risk-concentrated corner of the asset universe.

Visualizing the inclusion decision

plot(
  dat_portfolio$beta, dat_portfolio$vol,
  xlab = "Market beta", ylab = "Annualised volatility",
  pch  = ifelse(dat_portfolio$w_opt == 1, 16, 4),
  col  = ifelse(dat_portfolio$w_opt == 1, "#4E79A7", "#E15759"),
  main = "Portfolio inclusion decisions"
)
legend(
  "topleft",
  legend = c("w = 1 (included)", "w = 0 (excluded)"),
  pch    = c(16, 4),
  col    = c("#4E79A7", "#E15759"),
  bty    = "n"
)

Asset universe with inclusion decisions highlighted

Using a Custom Objective Function

ROOT’s general optimization mode is not limited to the default variance objective. You can supply any function of the form function(D) -> numeric where D is the working data frame with a column w containing the current weight assignments.

For example, suppose we want to minimize the interquartile range of the included assets' return variances rather than the default variance objective. We can define a custom objective:

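iqr_objective <- function(D) {
  w <- D$w
  # Guard against an empty portfolio (no asset included)
  if (sum(w) == 0) return(Inf)
  # IQR of the variance proxy, computed over the included assets only
  included_vsq <- D$vsq[w == 1]
  # unname() drops the "75%" label so a plain numeric scalar is returned
  unname(diff(quantile(included_vsq, probs = c(0.25, 0.75))))
}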

portfolio_fit_iqr <- ROOT(
  data                = dat_portfolio,
  global_objective_fn = iqr_objective,
  num_trees           = 20,
  top_k_trees         = TRUE,
  k                   = 10,
  seed                = 112
)

The custom objective illustrates ROOT’s flexibility: any scalar-valued function of the weighted dataset can be used as the optimization target, and the resulting characterized tree remains interpretable.
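Before handing a custom objective to ROOT, it can help to evaluate it on a small hand-built data frame and confirm that it returns a single number for a few weight patterns. The sketch below does this for an IQR objective; `toy_iqr_objective()` restates the logic above so the block runs on its own, and the `toy` data are made up.

```r
# Same logic as the IQR objective above, restated so this block is self-contained
toy_iqr_objective <- function(D) {
  if (sum(D$w) == 0) return(Inf)
  unname(diff(quantile(D$vsq[D$w == 1], probs = c(0.25, 0.75))))
}

# Made-up data: four ordinary units and one extreme unit, all included at first
toy <- data.frame(vsq = c(1, 2, 3, 4, 100), w = 1)

all_in <- toy_iqr_objective(toy)        # every unit included

toy$w[5] <- 0
drop_outlier <- toy_iqr_objective(toy)  # extreme unit excluded

toy$w <- 0
none_in <- toy_iqr_objective(toy)       # empty selection hits the guard

# Each call should yield exactly one numeric value
results <- c(all_in = all_in, drop_outlier = drop_outlier, none_in = none_in)
print(results)
```

Excluding the extreme unit lowers the objective from 2 to 1.5, while the empty selection returns `Inf`, so a search that minimizes this function is never rewarded for excluding everything.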

Key Parameters

| Parameter | Role | Default |
|---|---|---|
| `num_trees` | Number of trees to grow in the forest | `10` |
| `top_k_trees` | If `TRUE`, select the top `k` trees by objective value | `FALSE` |
| `k` | Rashomon set size when `top_k_trees = TRUE` | `10` |
| `cutoff` | Rashomon threshold when `top_k_trees = FALSE`; `"baseline"` uses objective at $w \equiv 1$ | `"baseline"` |
| `vote_threshold` | Fraction of Rashomon-set trees that must vote $w = 1$ for inclusion | `2/3` |
| `global_objective_fn` | Custom objective `function(D) -> numeric`; if `NULL`, uses default variance objective | `NULL` |
| `seed` | Random seed for reproducibility | `NULL` |
| `feature_est` | Feature importance method for split selection (`"Ridge"`, `"GBM"`, or custom) | `"Ridge"` |
| `leaf_proba` | Controls tree depth by increasing the probability of stopping at a leaf | `0.25` |