Introduction to CausalSpline: Nonlinear Causal Dose-Response Estimation

Your Name

2026-03-21

Why nonlinear causal effects?

Most causal inference software assumes the treatment effect enters linearly:

\[Y = \beta_0 + \beta_1 T + \gamma X + \varepsilon\]

But many real-world relationships are genuinely nonlinear:

CausalSpline replaces the linear \(\beta_1 T\) with a spline \(f(T)\):

\[Y = \beta_0 + f(T) + \gamma X + \varepsilon, \quad E[Y(t)] = \beta_0 + f(t)\]

under standard unconfoundedness and positivity assumptions.


Installation

# From CRAN (once published)
install.packages("CausalSpline")

# Development version from GitHub
remotes::install_github("yourgithub/CausalSpline")

Quick start

library(CausalSpline)

1. Simulate data with a threshold effect

set.seed(42)
dat <- simulate_dose_response(n = 600, dgp = "threshold", confounding = 0.6)
head(dat)
#>          Y        T         X1         X2 X3 true_effect
#> 1 4.676083 5.251499  1.3709584 -0.2484829  0    4.502997
#> 2 3.505122 4.493596 -0.5646982  0.4223204  0    2.987192
#> 3 2.676237 4.328187  0.3631284  0.9876533  1    2.656375
#> 4 5.245936 5.958644  0.6328626  0.8355682  1    5.917287
#> 5 3.958747 4.996158  0.4042683 -0.6605219  1    3.992315
#> 6 3.086915 4.992211 -0.1061245  1.5640695  1    3.984422

The true dose-response is flat below \(T = 3\), then rises linearly:

plot(dat$T, dat$Y, pch = 16, col = rgb(0, 0, 0, 0.2),
     xlab = "Treatment T", ylab = "Outcome Y",
     main = "Observed data (confounded)")
lines(sort(dat$T), dat$true_effect[order(dat$T)],
      col = "red", lwd = 2)
legend("topleft", legend = "True f(T)", col = "red", lwd = 2)
True vs observed relationship
True vs observed relationship

2. Fit with IPW

fit_ipw <- causal_spline(
  Y ~ T | X1 + X2 + X3,
  data       = dat,
  method     = "ipw",
  df_exposure = 5,
  eval_grid  = 100
)
summary(fit_ipw)
#> === CausalSpline Summary ===
#> 
#> Call:
#>   causal_spline(formula = Y ~ T | X1 + X2 + X3, data = dat, method = "ipw", 
#>     df_exposure = 5, eval_grid = 100)
#> 
#> Estimation method  : IPW 
#> Spline type        : ns 
#> df (exposure)      : 5 
#> Interior knots     : 4.278 4.887 5.349 5.965 
#> 
#> Treatment variable:
#>   n = 600 
#>   Mean = 5.133   SD = 0.994 
#>   Range = [ 1.928 , 8.586 ]
#> 
#> IPW diagnostics:
#>   ESS = 406 / n = 600 ( 67.7 %)
#>   Weight range = [ 0.119 , 4.978 ]
#> 
#> Dose-response curve (selected percentiles):
#>      t estimate     se   lower   upper
#>  1.928  -1.0895 0.6925 -2.4468  0.2678
#>  3.072   0.5492 0.1771  0.2021  0.8963
#>  4.147   2.3213 0.1231  2.0800  2.5626
#>  5.223   4.7107 0.0928  4.5289  4.8926
#>  6.366   6.7306 0.1498  6.4370  7.0243
#>  7.510   8.9276 0.5554  7.8390 10.0162
#>  8.586  11.2355 1.5384  8.2204 14.2507
# Build true curve data frame for overlay
truth_df <- data.frame(
  t           = dat$T,
  true_effect = dat$true_effect
)
plot(fit_ipw, truth = truth_df)
IPW estimated dose-response with 95% CI
IPW estimated dose-response with 95% CI

3. Fit with G-computation

fit_gc <- causal_spline(
  Y ~ T | X1 + X2 + X3,
  data        = dat,
  method      = "gcomp",
  df_exposure = 5
)
plot(fit_gc, truth = truth_df,
     title = "G-computation — Threshold DGP")

4. Check overlap (positivity)

ov <- check_overlap(dat$T, fit_ipw$weights, plot = TRUE)
cat("ESS:", round(ov$ess), "/ n =", nrow(dat), "\n")
#> ESS: 406 / n = 600
ov$plot


Comparing DGPs

dgps <- c("threshold", "diminishing", "nonmonotone", "sinusoidal")
plots <- lapply(dgps, function(d) {
  dat_d <- simulate_dose_response(500, dgp = d, seed = 1)
  fit_d <- causal_spline(Y ~ T | X1 + X2 + X3, data = dat_d,
                          method = "ipw", df_exposure = 5,
                          verbose = FALSE)
  truth_d <- data.frame(t = dat_d$T, true_effect = dat_d$true_effect)
  plot(fit_d, truth = truth_d,
       title = paste("DGP:", d),
       rug = FALSE)
})

# Combine with patchwork (if available) or print individually
for (p in plots) print(p)


Choosing degrees of freedom

The df_exposure argument controls spline flexibility. Too few df = underfitting; too many = high variance. As a guide:

Shape Recommended df
Linear / simple trend 3
One bend / threshold 4–5
Inverted-U / hump 5–6
Oscillatory 7–10

You can use AIC/BIC on the outcome model or cross-validation for selection.


Methods summary

Argument Consistent if …
method = "ipw" GPS model correctly specified
method = "gcomp" Outcome model correctly specified
method = "dr" At least one of the two models is correct

References