Sensitivity analysis answers: “How strong would unmeasured confounding need to be to change our conclusions?”
Several frameworks exist:
| Framework | Key Metric | Interpretation |
|---|---|---|
| E-value | Risk ratio needed | “Unmeasured confounder must have RR ≥ X” |
| Partial R² | Variance explained | “Confounder must explain X% of variance” |
| Le Cam Deficiency (δ) | Information loss | “Transfer penalty bounded by \(M\delta\)” |
This vignette shows how deficiency-based sensitivity analysis relates to and extends traditional approaches.
The E-value (VanderWeele & Ding, 2017) asks:
“To explain away the observed effect, an unmeasured confounder would need to have a risk ratio of at least E with both treatment and outcome.”
For an observed risk ratio RR, the E-value is:
\[E = RR + \sqrt{RR \times (RR - 1)}\]
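As a quick sanity check of the formula (plain base R, no package needed): an observed RR of 2 requires a confounder with RR of about 3.41 with both treatment and outcome to explain the effect away. The helper function below is a hypothetical illustration, not part of causaldef.

```r
# E-value for an observed risk ratio (VanderWeele & Ding, 2017):
# E = RR + sqrt(RR * (RR - 1)); protective effects are inverted first.
e_value <- function(rr) {
  if (rr < 1) rr <- 1 / rr
  rr + sqrt(rr * (rr - 1))
}

e_value(2)    # 2 + sqrt(2) ~= 3.41
e_value(0.5)  # treated as RR = 2, so the same E-value
```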
Deficiency (δ) takes a decision-theoretic view:
“Given the information gap between observational and interventional data, the worst-case regret inflation term is bounded by \(M\delta\) (and there is a minimax floor \((M/2)\delta\)).”
Key insight: δ directly quantifies policy consequences, not just statistical associations.
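The arithmetic behind the bound is elementary. Using the utility range and δ that `policy_regret_bound()` reports later in this vignette (M = 10, δ ≈ 0.0162), the transfer penalty and minimax floor can be reproduced by hand:

```r
# Regret bounds implied by deficiency:
#   transfer penalty = M * delta   (additive regret upper bound)
#   minimax floor    = (M / 2) * delta
M     <- 10      # width of the utility range [0, 10]
delta <- 0.0162  # deficiency proxy (rounded; the printed 0.1619 uses unrounded delta)

penalty <- M * delta        # 0.162
floor_  <- (M / 2) * delta  # 0.081

c(transfer_penalty = penalty, minimax_floor = floor_)
```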
| E-value Concept | Deficiency Equivalent |
|---|---|
| “Effect explained away” | δ → 1 (maximal deficiency) |
| “Effect robust” | δ → 0 (zero deficiency) |
| E-value = 2 | Moderate unmeasured confounding |
| E-value = 5 | Strong unmeasured confounding |
library(causaldef)
set.seed(42)
n <- 500
U <- rnorm(n) # Unmeasured confounder
W <- 0.7 * U + rnorm(n, sd = 0.5) # Observed covariate
A <- rbinom(n, 1, plogis(0.5 * U)) # Treatment
Y <- 2 * A + 1.5 * U + rnorm(n) # Outcome (true effect = 2)
df <- data.frame(W = W, A = A, Y = Y)
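Before turning to causaldef, a plain `lm()` fit (a hypothetical side check, not part of the package workflow) shows the confounding this design induces: because U raises both the treatment probability and the outcome, the naive coefficient on A overstates the true effect of 2.

```r
# Refit the same simulated data without any adjustment.
set.seed(42)
n <- 500
U <- rnorm(n)                       # unmeasured confounder
W <- 0.7 * U + rnorm(n, sd = 0.5)   # observed covariate
A <- rbinom(n, 1, plogis(0.5 * U))  # treatment
Y <- 2 * A + 1.5 * U + rnorm(n)     # outcome (true effect = 2)

naive <- coef(lm(Y ~ A))["A"]
naive  # biased upward relative to the true effect of 2
```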
spec <- causal_spec(df, "A", "Y", "W")
#> ✔ Created causal specification: n=500, 1 covariate(s)

def_results <- estimate_deficiency(
spec,
methods = c("unadjusted", "iptw"),
n_boot = 100
)
#> ℹ Estimating deficiency: unadjusted
#> ℹ Estimating deficiency: iptw
print(def_results)
#>
#> -- Deficiency Proxy Estimates (PS-TV) ------
#>
#> Method Delta SE CI Quality
#> unadjusted 0.1115 0.0283 [0.0585, 0.1772] Insufficient (Red)
#> iptw 0.0162 0.0091 [0.0055, 0.0364] Excellent (Green)
#> Note: delta is a propensity-score TV proxy (overlap/balance diagnostic).
#>
#> Best method: iptw (delta = 0.0162 )

The confounding frontier maps deficiency as a function of confounding strength:
frontier <- confounding_frontier(
spec,
alpha_range = c(-3, 3),
gamma_range = c(-3, 3),
grid_size = 40
)
#> ℹ Computing benchmarks for observed covariates...
#> ✔ Computed confounding frontier: 40x40 grid
plot(frontier)

Reading the Plot:

- The axes are the confounding-strength parameters: α (confounder–treatment association) and γ (confounder–outcome association).
- Contours show the implied deficiency δ at each (α, γ) combination.
- Benchmark points, when computed, mark where the observed covariates fall, giving reference magnitudes for a hypothetical unmeasured confounder.
bounds <- policy_regret_bound(def_results, utility_range = c(0, 10), method = "iptw")
#> ℹ Transfer penalty: 0.1619 (delta = 0.0162)
print(bounds)
#>
#> -- Policy Regret Bounds -------------------------------------------------
#>
#> * Deficiency delta: 0.0162
#> * Delta mode: point
#> * Delta method: iptw
#> * Delta selection: pre-specified method
#> * Utility range: [0, 10]
#> * Transfer penalty: 0.1619 (additive regret upper bound)
#> * Minimax floor: 0.0809 (worst-case lower bound)
#>
#> Note: this is a plug-in bound using a deficiency proxy rather than an identified exact deficiency.
#>
#> Interpretation: Transfer penalty is 1.6 % of utility range given delta

For comparison, we can compute the E-value for our effect estimate:
# Effect estimate
effect <- estimate_effect(def_results, target_method = "iptw")
print(effect)
#>
#> -- Causal Effect Estimate ----------------------
#> Method: iptw
#> Type: ATE
#> Contrast: 1 vs 0
#> Estimate: 2.1256
# For E-value calculation, we need to convert to risk ratio scale
# This is an approximation; exact E-values require binary outcomes
# Here we use the standardized effect size
# Assuming effect is on continuous scale, we can compute
# a pseudo-risk ratio via effect size transformation
effect_se <- 1 # Approximate SE (would be from bootstrap in practice)
effect_est <- effect$estimate
t_stat <- effect_est / effect_se
# Rough conversion to an odds-ratio scale (conceptual illustration only);
# the 1.81 scaling factor is Chinn's (2000) logistic conversion constant
approx_or <- exp(effect_est / 1.81)
# E-value formula
if (approx_or > 1) {
e_value <- approx_or + sqrt(approx_or * (approx_or - 1))
} else {
e_value <- 1 / approx_or + sqrt(1 / approx_or * (1 / approx_or - 1))
}
cat(sprintf("
Approximate E-value: %.2f
Interpretation: To explain away the observed effect, an unmeasured
confounder would need a risk ratio of at least %.2f with both
treatment and outcome.
", e_value, e_value))
#>
#> Approximate E-value: 5.93
#>
#> Interpretation: To explain away the observed effect, an unmeasured
#> confounder would need a risk ratio of at least 5.93 with both
#> treatment and outcome.

| Aspect | E-value | Deficiency (δ) |
|---|---|---|
| Scale | Risk ratio | Total variation distance |
| Interpretation | Strength needed to “explain away” | Information loss for decisions |
| Decision utility | Abstract | Direct (via \(M\delta\) transfer penalty) |
| Multi-method | Single estimate | Compares strategies |
| Negative controls | Not integrated | Built-in diagnostics |
Use E-values when:

- Communicating to epidemiologists/clinicians familiar with risk ratios
- Outcomes are binary with a clear risk-ratio interpretation
- A single summary number is wanted

Use Deficiency (δ) when:

- Decision-theoretic bounds (policy regret) are needed
- Comparing multiple adjustment strategies
- Negative control outcomes are available
- Outcomes are non-binary (continuous, survival)
- Combining with sensitivity frontiers
A key advantage of the frontier approach is benchmarking: we can see where observed covariates fall on the confounding map.
# Add more covariates for benchmarking
df$W2 <- 0.3 * U + rnorm(n, sd = 0.8)
df$W3 <- 0.9 * U + rnorm(n, sd = 0.3)
spec_multi <- causal_spec(df, "A", "Y", c("W", "W2", "W3"))
#> ✔ Created causal specification: n=500, 3 covariate(s)
frontier_bench <- confounding_frontier(
spec_multi,
grid_size = 40
)
#> ℹ Computing benchmarks for observed covariates...
#> ✔ Computed confounding frontier: 40x40 grid
# Access benchmarks
if (!is.null(frontier_bench$benchmarks)) {
print(frontier_bench$benchmarks)
}
#> covariate alpha gamma delta
#> W_std W 0.082061386 1.2208028 0.03119422
#> W_std1 W2 0.001412356 0.5199305 0.00000000
#> W_std2 W3 0.109257973 1.4261294 0.04564695

Using Benchmarks:
The benchmarks show the inferred confounding strength of each observed covariate. If an unmeasured confounder would need to be “stronger than W3” (which, by construction, shares roughly 90% of its variance with U), conclusions are robust.
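That benchmark's strength can be verified against the simulation itself: from the data-generating coefficients, W3's population correlation with U, and hence the share of U's variance it captures, follow directly (plain R arithmetic, assuming the generating model above):

```r
# W3 = 0.9 * U + noise with sd 0.3, and U ~ N(0, 1)
b <- 0.9                  # loading of W3 on U
s <- 0.3                  # noise sd
r <- b / sqrt(b^2 + s^2)  # population cor(W3, U) ~= 0.95
r^2                       # share of U's variance captured by W3: 0.9
```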
# Add negative control
df$Y_nc <- U + rnorm(n, sd = 0.5) # Affected by U, not by A
spec_full <- causal_spec(
df, "A", "Y", c("W", "W2", "W3"),
negative_control = "Y_nc"
)
#> ✔ Created causal specification: n=500, 3 covariate(s)
# Complete analysis
def_full <- estimate_deficiency(
spec_full,
methods = c("unadjusted", "iptw"),
n_boot = 100
)
#> ℹ Estimating deficiency: unadjusted
#> ℹ Estimating deficiency: iptw
nc_full <- nc_diagnostic(spec_full, method = "iptw", n_boot = 100)
#> ℹ Using kappa = 1 (conservative). Consider domain-specific estimation or sensitivity analysis via kappa_range.
#> ✔ No evidence against causal assumptions (p = 0.74257 )
print(def_full)
#>
#> -- Deficiency Proxy Estimates (PS-TV) ------
#>
#> Method Delta SE CI Quality
#> unadjusted 0.1033 0.0233 [0.1016, 0.1825] Insufficient (Red)
#> iptw 0.0204 0.0089 [0.016, 0.0482] Excellent (Green)
#> Note: delta is a propensity-score TV proxy (overlap/balance diagnostic).
#>
#> Best method: iptw (delta = 0.0204 )
print(nc_full)
#>
#> -- Negative Control Diagnostic ----------------------------------------
#>
#> * screening statistic (weighted corr): 0.0173
#> * delta_NC (association proxy): 0.0173
#> * delta bound (under kappa alignment): 0.0173 (kappa = 1 )
#> * screening p-value: 0.74257
#> * screening method: weighted_permutation_correlation
#>
#> RESULT: NOT REJECTED. This is a screening result, not proof that confounding is absent.
#> NOTE: Your effect estimate must exceed the Noise Floor (delta_bound) to be meaningful.

The causaldef approach provides a unified framework:
┌─────────────────────────────────────────────────────────────────┐
│ SENSITIVITY ANALYSIS │
├─────────────────────────────────────────────────────────────────┤
│ confounding_frontier() │
│ → Maps δ as function of confounding strength (α, γ) │
│ → Benchmarks observed covariates as reference points │
├─────────────────────────────────────────────────────────────────┤
│ nc_diagnostic() │
│ → Empirical falsification test │
│ → Bounds δ using observable negative control │
├─────────────────────────────────────────────────────────────────┤
│ policy_regret_bound() │
│ → Translates δ into decision-theoretic consequences │
│ → Transfer penalty = Mδ; minimax floor = (M/2)δ │
└─────────────────────────────────────────────────────────────────┘
Key Advantages:

- Direct decision relevance: δ translates into regret bounds via the \(M\delta\) transfer penalty, not just an abstract association strength.
- Multi-method comparison: deficiency is estimated per adjustment strategy, so strategies can be ranked.
- Empirical grounding: negative controls and covariate benchmarks anchor the sensitivity analysis in observable quantities.
Akdemir, D. (2026). Constraints on Causal Inference as Experiment Comparison. DOI: 10.5281/zenodo.18367347
VanderWeele, T. J., & Ding, P. (2017). Sensitivity Analysis in Observational Research: Introducing the E-value. Annals of Internal Medicine.
Cinelli, C., & Hazlett, C. (2020). Making Sense of Sensitivity: Extending Omitted Variable Bias. Journal of the Royal Statistical Society: Series B.
Torgersen, E. (1991). Comparison of Statistical Experiments. Cambridge University Press.