Type: | Package |
Title: | Assessing Proximal, Distal, and Mediated Causal Excursion Effects for Micro-Randomized Trials |
Version: | 0.3.0 |
Description: | Provides methods to analyze micro-randomized trials (MRTs) with binary treatment options. Supports three types of analyses: (1) proximal causal excursion effects, including weighted and centered least squares (WCLS) for continuous proximal outcomes by Boruvka et al. (2018) <doi:10.1080/01621459.2017.1305274> and the estimator for marginal excursion effect (EMEE) for binary proximal outcomes by Qian et al. (2021) <doi:10.1093/biomet/asaa070>; (2) distal causal excursion effects (DCEE) for continuous distal outcomes using a two-stage estimator by Qian (2025) <doi:10.48550/arXiv.2502.13500>; and (3) mediated causal excursion effects (MCEE) for continuous distal outcomes, estimating natural direct and indirect excursion effects in the presence of time-varying mediators by Qian (2025) <doi:10.48550/arXiv.2506.20027>. |
License: | GPL-3 |
Encoding: | UTF-8 |
LazyData: | true |
RoxygenNote: | 7.3.2 |
Imports: | rootSolve, stats, geepack, sandwich, mgcv, randomForest, ranger |
Depends: | R (≥ 4.2) |
Suggests: | knitr, rmarkdown, testthat (≥ 3.0.0), SuperLearner, earth, dplyr |
VignetteBuilder: | knitr |
Config/testthat/edition: | 3 |
NeedsCompilation: | no |
Packaged: | 2025-09-09 03:55:29 UTC; tqian |
Author: | Tianchen Qian |
Maintainer: | Tianchen Qian <t.qian@uci.edu> |
Repository: | CRAN |
Date/Publication: | 2025-09-09 07:20:31 UTC |
Assert that input is a data frame
Description
Assert that input is a data frame
Usage
.mcee_assert_df(data)
Arguments
data |
Object to check |
Value
Invisibly TRUE if valid, otherwise stops with error
Build basis matrix f(t) from time-varying effect formula
Description
Build basis matrix f(t) from time-varying effect formula
Usage
.mcee_build_f_matrix(time_varying_effect_form, data)
Arguments
time_varying_effect_form |
RHS-only formula |
data |
Data frame to evaluate formula on |
Value
Model matrix with basis functions evaluated at each row
Build per-row weights omega(i,t) for MCEE estimation
Description
Build per-row weights omega(i,t) for MCEE estimation
Usage
.mcee_build_weights(
data,
id,
dp,
weight_per_row = NULL,
specific_dp_only = NULL,
verbose = TRUE
)
Arguments
data |
Data frame |
id |
Subject ID column name |
dp |
Decision point column name |
weight_per_row |
Optional user-specified per-row weights |
specific_dp_only |
Optional vector of decision points to target (others get weight 0) |
verbose |
Logical; whether to print informative messages |
Value
Numeric vector of per-row weights
Validate binary column coding
Description
Validate binary column coding
Usage
.mcee_check_binary_col(data, col, allow_all1 = TRUE, label = NULL)
Arguments
data |
Data frame containing the column |
col |
Column name (can be NULL, in which case validation is skipped) |
allow_all1 |
Logical; if FALSE, column cannot be all 1s |
label |
Optional descriptive label for error messages |
Value
Invisibly TRUE if valid, otherwise stops with error
Validate control formula excludes treatment and outcome
Description
Validate control formula excludes treatment and outcome
Usage
.mcee_check_control_formula(control_formula, treatment, outcome, dp, label)
Arguments
control_formula |
RHS-only formula for control variables |
treatment |
Treatment column name |
outcome |
Outcome column name |
dp |
Decision point column name (used in error messages) |
label |
Descriptive label for error messages |
Value
Invisibly TRUE if valid, otherwise stops with error
Check that decision points are strictly increasing within each subject
Description
Check that decision points are strictly increasing within each subject
Usage
.mcee_check_dp_strictly_increasing(data, id, dp)
Arguments
data |
Data frame |
id |
Column name for subject ID |
dp |
Column name for decision point |
Value
Invisibly TRUE if valid, otherwise stops with error
Check config formula for inclusion/exclusion of mediator
Description
Check config formula for inclusion/exclusion of mediator
Usage
.mcee_check_formula_mediator(config, target, mediator)
Arguments
config |
A nuisance config list (may or may not contain 'formula'). |
target |
Character scalar, one of "p", "q", "eta", "mu", "nu". |
mediator |
Character scalar: mediator variable name. |
Value
Invisibly TRUE. Warnings are produced if formulas look suspicious.
Check that rows for each subject appear in contiguous blocks
Description
Check that rows for each subject appear in contiguous blocks
Usage
.mcee_check_id_rows_grouped(data, id, max_show = 5)
Arguments
data |
Data frame |
id |
Column name for subject ID |
max_show |
Maximum number of offending IDs to show in error message |
Value
Invisibly TRUE if valid, otherwise stops with error
Check data frame columns for missing/infinite values
Description
Check data frame columns for missing/infinite values
Usage
.mcee_check_no_missing_vars(data, vars, where = NULL, max_show = 5)
Arguments
data |
Data frame to check |
vars |
Character vector of column names to check |
where |
Optional context description for error messages |
max_show |
Maximum number of row indices to show per variable |
Value
Invisibly TRUE if no missing data found, otherwise stops with error
Check numeric vector for missing/infinite values
Description
Check numeric vector for missing/infinite values
Usage
.mcee_check_no_missing_vec(vec, name, max_show = 5)
Arguments
vec |
Numeric vector to check |
name |
Variable name for error messages |
max_show |
Maximum number of row indices to show |
Value
Invisibly TRUE if no missing data found, otherwise stops with error
Check that outcome is constant within each subject (required for distal outcomes)
Description
Check that outcome is constant within each subject (required for distal outcomes)
Usage
.mcee_check_outcome_constant_within_id(data, id, outcome)
Arguments
data |
Data frame |
id |
Column name for subject ID |
outcome |
Column name for outcome |
Value
Invisibly TRUE if valid, otherwise stops with error
Validate time-varying effect formula structure
Description
Validate time-varying effect formula structure
Usage
.mcee_check_time_varying_effect_form(time_varying_effect_form, dp)
Arguments
time_varying_effect_form |
RHS-only formula for basis functions |
dp |
Decision point column name |
Value
Invisibly TRUE if valid, otherwise stops with error
Generate compact one-line description of nuisance model object
Description
Generate compact one-line description of nuisance model object
Usage
.mcee_compact_model_info(obj)
Arguments
obj |
Fitted model object or known value descriptor |
Value
Character string describing the object
Numerical core implementing MCEE estimation mathematics
Description
Implements the core MCEE estimating equations and sandwich variance estimation.
This function contains the mathematical heart of the MCEE method, solving
the weighted estimating equations for \alpha
(NDEE) and \beta
(NIEE).
Usage
.mcee_core_rows(
n,
f_nrows,
omega_nrows,
i_index,
phi11_vec,
phi10_vec,
phi00_vec
)
Arguments
n |
Integer. Number of unique subjects. |
f_nrows |
Matrix |
omega_nrows |
Numeric vector of length |
i_index |
Integer vector of length |
phi11_vec , phi10_vec , phi00_vec |
Numeric vectors of length |
Details
**MCEE Estimating Equations:**
**NDEE**:
\alpha = S^{-1} \times (1/n) \sum_{i,t}\omega(i,t)\{\phi_t^{10} - \phi_t^{00}\} f(t)
**NIEE**:
\beta = S^{-1} \times (1/n) \sum_{i,t}\omega(i,t)\{\phi_t^{11} - \phi_t^{10}\} f(t)
where S = (1/n) \sum_{i,t}\omega(i,t) f(t)f(t)^T
.
**Sandwich Variance Formula:**
\text{Var}((\alpha,\beta)) = \text{Bread}^{-1} \times \text{Meat} \times \text{Bread}^{-1,T} / n
, where:
**Bread** =
\text{blockdiag}(S, S)
(2p \times 2p
matrix)**Meat** =
(1/n) \sum_i U_i U_i^T
, with subject-level score vectors:U_i = \sum_t \omega(i,t) \times [\{\phi_t^{10} - \phi_t^{00} - f^T\alpha\}f ; \{\phi_t^{11} - \phi_t^{10} - f^T\beta\}f]
**Mathematical Details:** The implementation follows the theoretical framework detailed in the MCEE vignette appendix. The estimating equations are based on efficient influence functions for the causal parameters of interest in the mediation analysis setting.
Value
List containing:
alpha_hat
Vector of length
p
: NDEE parameter estimatesalpha_se
Vector of length
p
: NDEE standard errorsbeta_hat
Vector of length
p
: NIEE parameter estimatesbeta_se
Vector of length
p
: NIEE standard errorsvarcov
Matrix
2p \times 2p
: Joint variance-covariance for(\alpha,\beta)
alpha_varcov
Matrix
p \times p
: Variance-covariance for\alpha
onlybeta_varcov
Matrix
p \times p
: Variance-covariance for\beta
only
Select default GLM family based on nuisance parameter type
Description
Select default GLM family based on nuisance parameter type
Usage
.mcee_default_family(target, method)
Arguments
target |
Nuisance parameter name ("p", "q", "eta", "mu", "nu") |
method |
Learning method name |
Value
GLM family object or NULL for non-GLM methods
Remove a variable from RHS-only formula
Description
Remove a variable from RHS-only formula
Usage
.mcee_drop_var_from_rhs(rhs_only_formula, var)
Arguments
rhs_only_formula |
RHS-only formula |
var |
Variable name to remove |
Value
Modified formula with variable removed
Fit a single nuisance component with flexible learner support
Description
Internal workhorse function that fits individual nuisance parameters using various machine learning methods or known constants. Handles the complexity of different learner APIs and provides consistent predictions.
Usage
.mcee_fit_nuisance(
config,
data_for_fitting,
data_for_predicting,
lhs_var,
param_name,
data_for_fitting_name
)
Arguments
config |
Configuration list describing how to fit the nuisance parameter.
Created by
|
data_for_fitting |
Data frame subset used to train the model (e.g., available rows only). |
data_for_predicting |
Data frame on which to generate predictions (usually full data). |
lhs_var |
Character. Column name of the response/outcome variable to model. |
param_name |
Character. Descriptive name for error messages (e.g., "p_t(1|H_t)"). |
data_for_fitting_name |
Character. Description of fitting data for model call display. |
Details
**Supported Methods:**
-
"glm"
: Usesstats::glm()
with automatic family detection -
"lm"
: Usesstats::lm()
(continuous outcomes only) -
"gam"
: Usesmgcv::gam()
supporting smooth terms -
"rf"
: UsesrandomForest::randomForest()
-
"ranger"
: Usesranger::ranger()
(faster random forest) -
"sl"
: UsesSuperLearner::SuperLearner()
**Automatic Family Detection:**
When family=NULL
in GLM/GAM configs:
- Binary outcomes (0/1 only): binomial()
- Continuous outcomes: gaussian()
**Known Values:**
If any of known
, known_a1
, known_a0
is provided, no model
is fitted. Returns constant predictions and a descriptor object.
Value
List with components:
pred
Numeric vector of length
nrow(data_for_predicting)
containing predictions/fitted values.model
Fitted model object (e.g.,
glm
,gam
,randomForest
) or a list descriptor for known values.
Print informative message if no availability column provided
Description
Print informative message if no availability column provided
Usage
.mcee_message_if_no_availability_provided(availability, verbose)
Arguments
availability |
Availability column name (or NULL) |
verbose |
Logical; whether to print messages |
Value
Invisibly TRUE
Print formatted coefficient table for MCEE results
Description
Print formatted coefficient table for MCEE results
Usage
.mcee_print_coef_table(tab)
Arguments
tab |
Data frame with coefficient estimates, standard errors, etc. |
Value
Used for side effects (printing)
Check that required columns exist in data frame
Description
Check that required columns exist in data frame
Usage
.mcee_require_cols(data, cols, where = "data")
Arguments
data |
Data frame to check |
cols |
Character vector of required column names |
where |
Context description for error messages |
Value
Invisibly TRUE if all columns exist, otherwise stops with error
Resolve randomization probability from column name or scalar
Description
Resolve randomization probability from column name or scalar
Usage
.mcee_resolve_rand_prob(data, rand_prob, availability = NULL)
Arguments
data |
Data frame |
rand_prob |
Either column name or numeric scalar |
availability |
Optional availability column name for validation |
Value
Numeric vector of randomization probabilities
Validate clipping bounds for probability predictions
Description
Validate clipping bounds for probability predictions
Usage
.mcee_validate_clipping(clipping)
Arguments
clipping |
Numeric vector of length 2 with lower and upper bounds |
Value
Invisibly TRUE if valid, otherwise stops with error
Validate that learning method is supported
Description
Validate that learning method is supported
Usage
.mcee_validate_method(method)
Arguments
method |
Method name to validate |
Value
Invisibly TRUE if valid, otherwise stops with error
Extract variables from nuisance configuration formula
Description
Extract variables from nuisance configuration formula
Usage
.mcee_vars_in_config(cfg)
Arguments
cfg |
Configuration list (may contain formula element) |
Value
Character vector of variable names from config formula
Extract variable names from RHS-only formula
Description
Extract variable names from RHS-only formula
Usage
.mcee_vars_in_rhs(rhs_only_formula)
Arguments
rhs_only_formula |
RHS-only formula object |
Value
Character vector of variable names (empty if not a valid formula)
A synthetic data set of an MRT with binary proximal outcomes
Description
Baseline model:
\log E\{Y_{t+1} \mid A_t = 0, I_t = 1\} =
\alpha_0 + \alpha_1 \cdot \mathrm{time} / \mathrm{total\_T}
+ \alpha_2 \cdot \mathbf{1}(\mathrm{time} > \mathrm{total\_T}/2).
Treatment effect model:
\log RR_t = \beta_0 + \beta_1 \cdot \mathrm{time} / \mathrm{total\_T}.
Randomization probabilities p_t
cycle over 0.3, 0.5, 0.7 (with repetition).
Availability is exogenous at 0.8 for all time points.
Usage
data_binary
Format
A data frame with 3000 observations and 10 variables:
- userid
Individual id number.
- time
Decision point index.
- time_var1
Time-varying covariate 1, the \"standardized time in study\", defined as the current decision point index divided by the total number of decision points.
- time_var2
Time-varying covariate 2, indicator of \"the second half of the study\", defined as whether the current decision point index is greater than the total number of decision points divided by 2.
- Y
Binary proximal outcome.
- A
Treatment assignment: whether the intervention is randomized to be delivered (=1) or not (=0) at the current decision point.
- rand_prob
Randomization probability
P(A=1)
for the current decision point.- avail
Availability indicator (=1 available, =0 not available) at the current decision point.
A synthetic data set of an MRT with continuous distal outcome
Description
Simulated longitudinal dataset suitable for illustrating the 'dcee()' function. Each row corresponds to one decision point for one subject. The distal outcome 'Y' is constant within subject (because it is measured at the end of the study, and here we append it to the long format data as an extra column to conform with the 'dcee()' function requirement.
Usage
data_distal_continuous
Format
a data frame with 1500 observations and 11 variables
- userid
Subject identifier
- dp
Decision point (1..T)
- X
Endogenous continuous time-varying covariate
- Z
Endogenous binary time-varying covariate
- avail
Availability indicator (0/1)
- A
Treatment (0/1)
- prob_A
Randomization probability P(A=1|H_t)
- A_lag1
Lagged treatment
- Y
Distal continuous outcome (constant per subject)
A synthetic data set that mimics the HeartSteps V1 data structure to illustrate the use of [wcls()] function for continuous proximal outcomes
Description
A synthetic data set that mimics the HeartSteps V1 data structure to illustrate the use of [wcls()] function for continuous proximal outcomes
Usage
data_mimicHeartSteps
Format
a data frame with 7770 observations and 9 variables
- userid
individual id number
- time
decision point index
- day_in_study
day in the study
- logstep_30min
proximal outcome: the step count in the 30 minutes following the current decision point (log-transformed)
- logstep_30min_lag1
proximal outcome at the previous decision point (lag-1 outcome): the step count in the 30 minutes following the previous decision point (log-transformed)
- logstep_pre30min
the step count in the 30 minutes prior to the current decision point (log-transformed); used as a control variable
- is_at_home_or_work
whether the individual is at home or work (=1) or at other locations (=0) at the current decision point
- intervention
whether the intervention is randomized to be delivered (=1) or not (=0) at the current decision point
- rand_prob
the randomization probability P(A=1) for the current decision point
- availability
whether the individual is available (=1) or not (=0) at the current decision point
Example longitudinal dataset with time-varying mediator and distal outcome
Description
A simulated long-format dataset used in the vignette and tests. Each row corresponds to one subject–decision point. The distal outcome 'Y' is constant within subject (repeated on every row for that subject).
Usage
data_time_varying_mediator_distal_outcome
Format
A data frame with n * T_val
rows and the following columns:
- id
Subject identifier (integer).
- dp
Decision point index, strictly increasing within subject (integer).
- I
Availability indicator at time
dp
(0/1).- A
Treatment at time
dp
(0/1).- M
Mediator at time
dp
(numeric; could be binary or continuous).- X
Time-varying covariate at time
dp
(numeric).- A_prev
Lagged treatment at time
dp-1
(0/1).- M_prev
Lagged mediator at time
dp-1
(numeric).- X_prev
Lagged covariate at time
dp-1
(numeric).- I_prev
Lagged availability at time
dp-1
(0/1).- p_A
Randomization probability for
A
at timedp
(numeric in (0,1)).- p_I
Availability probability for
I
at timedp
(numeric in (0,1)).- mu_M
Conditional mean of
M
given history (numeric; from DGM).- mu_X
Conditional mean of
X
given history (numeric; from DGM).- mu_Y
Conditional mean component for distal outcome
Y
(numeric; from DGM).- Y
Distal outcome, constant within subject (numeric).
Details
Generated by dgm_time_varying_mediator_distal_outcome()
in the package
source. Intended for illustrating mcee
usage. No missing values.
Source
Simulated.
See Also
mcee
, mcee_general
, mcee_userfit_nuisance
Examples
data(data_time_varying_mediator_distal_outcome)
str(data_time_varying_mediator_distal_outcome)
Distal Causal Excursion Effect (DCEE) Estimation
Description
Fits distal causal excursion effects in micro-randomized trials using a
**two-stage** estimator: (i) learn nuisance outcome regressions
\mu_a(H_t)
with a specified learner (parametric/ML), optionally with
cross-fitting; (ii) solve estimating equations for the distal excursion
effect parameters (\beta
).
This wrapper standardizes inputs and delegates computation to [dcee_helper_2stage_estimation()].
Usage
dcee(
data,
id,
outcome,
treatment,
rand_prob,
moderator_formula,
control_formula,
availability = NULL,
control_reg_method = c("gam", "lm", "rf", "ranger", "sl", "sl.user-specified-library",
"set_to_zero"),
cross_fit = FALSE,
cf_fold = 10,
weighting_function = NULL,
verbose = TRUE,
...
)
Arguments
data |
A data.frame in long format. |
id |
Character scalar: column name for subject identifier. |
outcome |
Character scalar: column name for proximal/distal outcome. |
treatment |
Character scalar: column name for binary treatment {0,1}. |
rand_prob |
Character scalar: column name for randomization probability
giving |
moderator_formula |
RHS-only formula of moderators of the excursion effect (e.g., '~ 1', '~ Z', or '~ Z1 + Z2'). |
control_formula |
RHS-only formula of covariates for learning nuisance outcome regressions. When 'control_reg_method = "gam"', 's(x)' terms are allowed (e.g., '~ x1 + s(x2)'). For SuperLearner methods, variables are extracted from this formula to build the design matrix 'X'. |
availability |
Optional character scalar: column name for availability indicator (0/1). If 'NULL', availability is taken as 1 for all rows. |
control_reg_method |
One of '"gam"', '"lm"', '"rf"', '"ranger"', '"sl"', '"sl.user-specified-library"', '"set_to_zero"'. See Details. |
cross_fit |
Logical; if 'TRUE', perform K-fold cross-fitting by subject id. |
cf_fold |
Integer; number of folds if 'cross_fit = TRUE' (default 10). |
weighting_function |
Either a single numeric constant applied to all
rows, or a character column name in 'data' giving decision-point weights
|
verbose |
Logical; print minimal preprocessing messages (default 'TRUE'). |
... |
Additional arguments passed through to the chosen learner (e.g., 'num.trees', 'mtry' for random forests; 'sl.library' when 'control_reg_method = "sl.user-specified-library"'). |
Details
**Learners.**
- 'gam' uses mgcv and supports 's(.)' terms in 'control_formula'.
- 'lm' uses base stats::lm
.
- 'rf' uses randomForest; 'ranger' uses ranger.
- 'sl' / 'sl.user-specified-library' use SuperLearner. For the former,
'sl.library = c("SL.mean", "SL.glm", "SL.earth")' are used. For the latter,
please provide 'sl.library = c("SL.mean", ...)' via '...'.
**Notes.** - Treatment must be coded 0/1; 'rand_prob' must lie strictly in (0,1). - 'control_formula = ~ 1' is only valid with 'control_reg_method = "set_to_zero"'.
Value
An object of class '"dcee_fit"' with components:
- call
The matched call to
dcee()
.- fit
A list returned by the two–stage helper with elements:
beta_hat
Named numeric vector of distal causal excursion effect estimates
\beta
. Names are"Intercept"
and the moderator names (if any) frommoderator_formula
.beta_se
Named numeric vector of standard errors for
beta_hat
(same order/names).beta_varcov
Variance–covariance matrix of
beta_hat
(square matrix; row/column names matchnames(beta_hat)
).conf_int
Matrix of large-sample (normal) Wald 95% confidence intervals for
beta_hat
; columns are"2.5 %"
and"97.5 %"
.conf_int_tquantile
Matrix of small-sample (t-quantile) 95% confidence intervals for
beta_hat
; columns are"2.5 %"
and"97.5 %"
; degrees of freedom are provided in$df
of the"dcee_fit"
object.regfit_a0
Stage-1 nuisance regression fit for
\mu_0(H_t)
(outcome model amongA=0
), orNULL
whencontrol_reg_method = "set_to_zero"
. Note: whencross_fit = TRUE
, this is the learner object from the last fold and is provided for inspection only (do not use for out-of-fold prediction).regfit_a1
Stage-1 nuisance regression fit for
\mu_1(H_t)
(outcome model amongA=1
); same caveats asregfit_a0
regardingcross_fit
.
- df
Small-sample degrees of freedom used for t-based intervals: number of unique subjects minus
length(fit$beta_hat)
.
Examples
data(data_distal_continuous, package = "MRTAnalysis")
## Fast example: marginal effect with linear nuisance (CRAN-friendly)
fit_lm <- dcee(
data = data_distal_continuous,
id = "userid", outcome = "Y", treatment = "A", rand_prob = "prob_A",
moderator_formula = ~1, # marginal (no moderators)
control_formula = ~X, # simple linear nuisance
availability = "avail",
control_reg_method = "lm",
cross_fit = FALSE
)
summary(fit_lm)
summary(fit_lm, show_control_fit = TRUE) # show Stage-1 fit info
## Moderated effect with GAM nuisance (allows smooth terms); may be slower
fit_gam <- dcee(
data = data_distal_continuous,
id = "userid", outcome = "Y", treatment = "A", rand_prob = "prob_A",
moderator_formula = ~Z, # test moderation by Z
control_formula = ~ s(X) + Z, # smooth in nuisance via mgcv::gam
availability = "avail",
control_reg_method = "gam",
cross_fit = TRUE, cf_fold = 5
)
summary(fit_gam, lincomb = c(0, 1)) # linear combo: the Z coefficient
summary(fit_gam, show_control_fit = TRUE) # show Stage-1 fit info
## Optional: SuperLearner (runs only if installed)
library(SuperLearner)
fit_sl <- dcee(
data = data_distal_continuous,
id = "userid", outcome = "Y", treatment = "A", rand_prob = "prob_A",
moderator_formula = ~1,
control_formula = ~ X + Z,
availability = "avail",
control_reg_method = "sl",
cross_fit = FALSE
)
summary(fit_sl)
Estimates the causal excursion effect for binary outcome MRT
Description
Returns the estimated causal excursion effect (on log relative risk scale) and the estimated standard error. Small sample correction using the "Hat" matrix in the variance estimate is implemented.
Usage
emee(
data,
id,
outcome,
treatment,
rand_prob,
moderator_formula,
control_formula,
availability = NULL,
numerator_prob = NULL,
start = NULL,
verbose = TRUE
)
Arguments
data |
A data set in long format. |
id |
The subject id variable. |
outcome |
The outcome variable. |
treatment |
The binary treatment assignment variable. |
rand_prob |
The randomization probability variable. |
moderator_formula |
A formula for the moderator variables. This should
start with ~ followed by the moderator variables. When set to |
control_formula |
A formula for the control variables. This should
start with ~ followed by the control variables. When set to |
availability |
The availability variable. Use the default value ( |
numerator_prob |
Either a number between 0 and 1, or a variable name for
a column in data. If you are not sure what this is, use the default value ( |
start |
A vector of the initial value of the estimators used in the numerical
solver. If using default value ( |
verbose |
If default ('TRUE'), additional messages will be printed during data preprocessing. |
Value
An object of type "emee_fit"
Examples
## estimating the fully marginal excursion effect by setting
## moderator_formula = ~ 1
emee(
data = data_binary,
id = "userid",
outcome = "Y",
treatment = "A",
rand_prob = "rand_prob",
moderator_formula = ~1,
control_formula = ~ time_var1 + time_var2,
availability = "avail"
)
## estimating the causal excursion effect moderated by time_var1
## by setting moderator_formula = ~ time_var1
emee(
data = data_binary,
id = "userid",
outcome = "Y",
treatment = "A",
rand_prob = "rand_prob",
moderator_formula = ~time_var1,
control_formula = ~ time_var1 + time_var2,
availability = "avail"
)
Estimates the causal excursion effect for binary outcome MRT
Description
Returns the estimated causal excursion effect (on log relative risk scale) and the estimated standard error.
Small sample correction using the "Hat" matrix in the variance estimate is implemented.
This is a slightly altered version of emee()
, where the treatment
assignment indicator is also centered in the residual term. It would have
similar (but not exactly the same) numerical output as emee()
. This
is the estimator based on which the sample size calculator for binary outcome
MRT is developed. (See R package MRTSampleSizeBinary
.)
Usage
emee2(
data,
id,
outcome,
treatment,
rand_prob,
moderator_formula,
control_formula,
availability = NULL,
numerator_prob = NULL,
start = NULL,
verbose = TRUE
)
Arguments
data |
A data set in long format. |
id |
The subject id variable. |
outcome |
The outcome variable. |
treatment |
The binary treatment assignment variable. |
rand_prob |
The randomization probability variable. |
moderator_formula |
A formula for the moderator variables. This should
start with ~ followed by the moderator variables. When set to |
control_formula |
A formula for the control variables. This should
start with ~ followed by the control variables. When set to |
availability |
The availability variable. Use the default value ( |
numerator_prob |
Either a number between 0 and 1, or a variable name for
a column in data. If you are not sure what this is, use the default value ( |
start |
A vector of the initial value of the estimators used in the numerical
solver. If using default value ( |
verbose |
If default ('TRUE'), additional messages will be printed during data preprocessing. |
Value
An object of type "emee_fit"
Examples
## estimating the fully marginal excursion effect by setting
## moderator_formula = ~ 1
emee2(
data = data_binary,
id = "userid",
outcome = "Y",
treatment = "A",
rand_prob = "rand_prob",
moderator_formula = ~1,
control_formula = ~ time_var1 + time_var2,
availability = "avail"
)
## estimating the causal excursion effect moderated by time_var1
## by setting moderator_formula = ~ time_var1
emee2(
data = data_binary,
id = "userid",
outcome = "Y",
treatment = "A",
rand_prob = "rand_prob",
moderator_formula = ~time_var1,
control_formula = ~ time_var1 + time_var2,
availability = "avail"
)
Mediated Causal Excursion Effects for MRTs (streamlined)
Description
Estimates the Natural Direct Excursion Effect (NDEE; \alpha
) and
Natural Indirect Excursion Effect (NIEE; \beta
) for distal outcomes
in micro-randomized trials (MRTs). Assumes the randomization probability
is known (via rand_prob
) and fits all nuisance functions using the
same learner specified by control_reg_method
.
Usage
mcee(
data,
id,
dp,
outcome,
treatment,
mediator,
availability = NULL,
rand_prob,
time_varying_effect_form,
control_formula_with_mediator,
control_reg_method = c("glm", "gam", "rf", "ranger", "sl"),
weight_per_row = NULL,
specific_dp_only = NULL,
verbose = TRUE,
SL.library = NULL
)
Arguments
data |
A data.frame in long format (one row per id-by-decision point). |
id |
Character. Column name for subject identifier. |
dp |
Character. Column name for decision point index (must increase strictly within subject). |
outcome |
Character. Column name for distal outcome (constant within subject). |
treatment |
Character. Column name for treatment (coded 0/1). |
mediator |
Character. Column name for mediator. |
availability |
Optional character. Column name for availability (0/1). If |
rand_prob |
Either a column name in |
time_varying_effect_form |
RHS-only formula for the basis |
control_formula_with_mediator |
RHS-only formula for control variables used in nuisance models that may include the mediator (the wrapper will drop the mediator internally for nuisances that must exclude it). |
control_reg_method |
Learner for nuisance fits: one of |
weight_per_row |
Optional numeric vector of row weights (nonnegative, length |
specific_dp_only |
Optional numeric vector of decision points to target; internally converted to |
verbose |
Logical; print progress messages. |
SL.library |
Optional character vector of SuperLearner libraries (used when |
Details
Requirements: rows grouped by subject, strictly increasing dp
within subject,
no missing (NA
/NaN
/Inf
) in relevant variables. If availability
is supplied, the wrapper enforces at I=0
: p_1=q_1=1
in the nuisances.
Value
An object of class "mcee_fit"
with elements:
-
mcee_fit
: list withalpha_hat
,beta_hat
,alpha_se
,beta_se
,varcov
,alpha_varcov
,beta_varcov
. -
nuisance_models
: fitted Stage-1 models forp,q,eta,mu,nu
. -
nuisance_fitted
: per-row fitted values for the nuisance functions. -
meta
: list with basis dimension, number of ids, per-id lengths, weights used. -
call
: the matched call.
See Also
summary.mcee_fit
, mcee_general
, mcee_userfit_nuisance
Examples
set.seed(1)
n <- 10
T <- 4
id <- rep(1:n, each = T)
dp <- rep(1:T, times = n)
A <- rbinom(n * T, 1, 0.5)
M <- rbinom(n * T, 1, plogis(-0.2 + 0.3 * A + 0.1 * dp))
Y <- ave(0.5 * A + 0.6 * M + 0.1 * dp + rnorm(n * T), id)
dat <- data.frame(id, dp, A, M, Y)
fit <- mcee(dat, "id", "dp", "Y", "A", "M",
time_varying_effect_form = ~1,
control_formula_with_mediator = ~ dp + M,
control_reg_method = "glm",
rand_prob = 0.5, verbose = TRUE
)
summary(fit)
Configure GAM for MCEE nuisance parameters
Description
Creates a configuration to fit nuisance parameters using generalized additive models
via mgcv::gam()
. Supports smooth terms like s()
.
Usage
mcee_config_gam(target, formula, family = NULL, clipping = NULL)
Arguments
target |
Character. Nuisance parameter name ("p", "q", "eta", "mu", "nu"). |
formula |
RHS-only formula (e.g., |
family |
Optional GLM family. Defaults to |
clipping |
Optional numeric vector |
Value
A configuration list for use with mcee_general
.
Examples
# GAM with smooth time effect
cfg_eta <- mcee_config_gam("eta", ~ X1 + s(dp, k = 4))
# GAM with multiple smooths
cfg_mu <- mcee_config_gam("mu", ~ s(dp) + s(M, X1, k = 10))
Configure GLM for MCEE nuisance parameters
Description
Creates a configuration to fit nuisance parameters using generalized linear models
via stats::glm()
.
Usage
mcee_config_glm(target, formula, family = NULL, clipping = NULL)
Arguments
target |
Character. Nuisance parameter name ("p", "q", "eta", "mu", "nu"). |
formula |
RHS-only formula (e.g., |
family |
Optional GLM family. Defaults to |
clipping |
Optional numeric vector |
Value
A configuration list for use with mcee_general
.
Examples
# Binary outcome model for propensity
cfg_q <- mcee_config_glm("q", ~ dp + M, family = binomial())
# Gaussian outcome model
cfg_eta <- mcee_config_glm("eta", ~ dp + X1)
Configure known constant values for MCEE nuisance parameters
Description
Creates a configuration for nuisance parameters with known constant values, bypassing model fitting. Useful for known randomization probabilities in MRTs.
Usage
mcee_config_known(target, value = NULL, a1 = NULL, a0 = NULL)
Arguments
target |
Character. Nuisance parameter name ("p", "q", "eta", "mu", "nu"). |
value |
Numeric scalar. Single constant value for all observations. |
a1 , a0 |
Numeric scalars. Arm-specific constants for A=1 and A=0 conditions.
If provided, these override |
Value
A configuration list for use with mcee_general
.
Examples
# Known randomization probability
cfg_p <- mcee_config_known("p", 0.6)
# Arm-specific known values
cfg_eta <- mcee_config_known("eta", a1 = 0.8, a0 = 0.2)
Configure linear model for MCEE nuisance parameters
Description
Creates a configuration to fit nuisance parameters using linear models
via stats::lm()
. Only appropriate for continuous outcomes.
Usage
mcee_config_lm(target, formula)
Arguments
target |
Character. Nuisance parameter name ("p", "q", "eta", "mu", "nu"). |
formula |
RHS-only formula (e.g., |
Value
A configuration list for use with mcee_general
.
Examples
# Linear model for continuous outcome
cfg_eta <- mcee_config_lm("eta", ~ dp + X1 + X2)
Build a nuisance-configuration object for mcee_general()
Description
Creates a configuration list describing **how to obtain a nuisance function**
used by mcee_general
. You may either:
supply **known values** (bypasses learning), or
specify a **learning method** (e.g., GLM/GAM/RF/Ranger/SL) with a formula.
Usage
mcee_config_maker(
target,
method = NULL,
formula = NULL,
family = NULL,
known = NULL,
known_a1 = NULL,
known_a0 = NULL,
clipping = NULL,
SL.library = NULL,
...
)
Arguments
target |
Character; which nuisance to configure. One of
|
method |
Optional character learner name when *not* using known values.
Supported:
Ignored if any of |
formula |
RHS-only formula describing predictors for the learner
(used when |
family |
Optional GLM/GAM family. If |
known |
Optional numeric scalar/vector of **known values** for the nuisance.
Commonly used for |
known_a1 , known_a0 |
Optional numeric scalar/vector providing known values
for the treatment-specific versions of a nuisance (e.g., |
clipping |
Optional numeric vector of length 2, |
SL.library |
Character vector of SuperLearner libraries (only used
when |
... |
Reserved for future extensions; currently ignored. |
Details
If any of known
, known_a1
, or known_a0
is provided,
the returned configuration is of type “known” and **no learner will be fit**.
Otherwise, the configuration records the requested learner, formula, family,
optional clipping, and (for SL) the library.
Internally, helper validators ensure method
is supported and
clipping
(if provided) is sane. Family defaults are chosen when
family = NULL
for GLM/GAM methods.
Value
A named list
describing the configuration. For known configs:
list( nuisance_parameter = <target>, known = <numeric or NULL>, known_a1 = <numeric or NULL>, known_a0 = <numeric or NULL>, clipping = <numeric length-2 or NULL> )
For learner configs:
list( nuisance_parameter = <target>, method = <character>, formula = <formula or NULL>, family = <family or NULL>, clipping = <numeric length-2 or NULL>, SL.library = <character vector; only when method == "sl"> )
See Also
mcee_general
,
helper constructors like
mcee_config_known()
,
mcee_config_glm()
,
mcee_config_gam()
,
mcee_config_lm()
,
mcee_config_rf()
,
mcee_config_ranger()
,
mcee_config_sl()
,
mcee_config_sl_user()
.
Examples
# Known p (MRT randomization), GLM for other nuisances
cfg_p <- mcee_config_maker("p", known = 0.5)
cfg_q <- mcee_config_maker("q", method = "glm", formula = ~ dp + M)
cfg_eta <- mcee_config_maker("eta", method = "glm", formula = ~dp)
cfg_mu <- mcee_config_maker("mu", method = "glm", formula = ~ dp + M)
cfg_nu <- mcee_config_maker("nu", method = "glm", formula = ~dp)
# SuperLearner with default library (set explicitly if you prefer)
# cfg_q_sl <- mcee_config_maker("q", method = "sl", formula = ~ dp + M,
# SL.library = c("SL.mean","SL.glm","SL.ranger"))
# Known treatment-specific outcome regressions (e.g., from external source)
# cfg_eta_known <- mcee_config_maker("eta", known_a1 = rep(1, 100),
# known_a0 = rep(0, 100))
Configure Ranger Random Forest for MCEE nuisance parameters
Description
Creates a configuration to fit nuisance parameters using ranger random forests
via ranger::ranger()
. Faster alternative to randomForest
.
Usage
mcee_config_ranger(target, formula)
Arguments
target |
Character. Nuisance parameter name ("p", "q", "eta", "mu", "nu"). |
formula |
RHS-only formula (e.g., |
Value
A configuration list for use with mcee_general
.
Examples
# Ranger random forest for outcome model
cfg_eta <- mcee_config_ranger("eta", ~ dp + X1 + X2 + X3)
Configure Random Forest for MCEE nuisance parameters
Description
Creates a configuration to fit nuisance parameters using random forests
via randomForest::randomForest()
. Good for nonlinear patterns.
Usage
mcee_config_rf(target, formula)
Arguments
target |
Character. Nuisance parameter name ("p", "q", "eta", "mu", "nu"). |
formula |
RHS-only formula (e.g., |
Value
A configuration list for use with mcee_general
.
Examples
# Random forest for complex propensity model
cfg_q <- mcee_config_rf("q", ~ dp + M + X1 + X2)
Configure SuperLearner for MCEE nuisance parameters
Description
Creates a configuration to fit nuisance parameters using SuperLearner
via SuperLearner::SuperLearner()
. Automatically selects among
multiple learning algorithms.
Usage
mcee_config_sl(target, formula, SL.library = NULL, clipping = NULL)
Arguments
target |
Character. Nuisance parameter name ("p", "q", "eta", "mu", "nu"). |
formula |
RHS-only formula (e.g., |
SL.library |
Optional character vector of learner names. If |
clipping |
Optional numeric vector |
Value
A configuration list for use with mcee_general
.
Examples
# SuperLearner with default library
cfg_q <- mcee_config_sl("q", ~ dp + M + X1)
# SuperLearner with custom library
cfg_eta <- mcee_config_sl("eta", ~ dp + X1,
SL.library = c("SL.glm", "SL.rf", "SL.ranger")
)
Configure SuperLearner with user-specified library for MCEE nuisance parameters
Description
Creates a configuration to fit nuisance parameters using SuperLearner with a user-specified library (required parameter).
Usage
mcee_config_sl_user(target, formula, SL.library, clipping = NULL)
Arguments
target |
Character. Nuisance parameter name ("p", "q", "eta", "mu", "nu"). |
formula |
RHS-only formula (e.g., |
SL.library |
Character vector of learner names (required). |
clipping |
Optional numeric vector |
Value
A configuration list for use with mcee_general
.
Examples
# SuperLearner with specific library
cfg_mu <- mcee_config_sl_user("mu", ~ dp + M + X1,
SL.library = c("SL.glm", "SL.earth", "SL.nnet")
)
Mediated Causal Excursion Effects (configurable nuisance models)
Description
Like mcee
, but each nuisance function is configured explicitly
via config_*
objects (formula/method/family or known).
Usage
mcee_general(
data,
id,
dp,
outcome,
treatment,
mediator,
availability = NULL,
time_varying_effect_form,
config_p,
config_q,
config_eta,
config_mu,
config_nu,
weight_per_row = NULL,
verbose = TRUE
)
Arguments
data |
A data.frame in long format (one row per id-by-decision point). |
id |
Character. Column name for subject identifier. |
dp |
Character. Column name for decision point index (must increase strictly within subject). |
outcome |
Character. Column name for distal outcome (constant within subject). |
treatment |
Character. Column name for treatment (coded 0/1). |
mediator |
Character. Column name for mediator. |
availability |
Optional character. Column name for availability (0/1). If |
time_varying_effect_form |
RHS-only formula for the basis |
config_p , config_q , config_eta , config_mu , config_nu |
Lists created by
|
weight_per_row |
Optional numeric vector of row weights (nonnegative, length |
verbose |
Logical; print progress messages. |
Details
Use this wrapper for observational studies (estimate p
) or when you want
different learners per nuisance. The same data requirements as mcee
apply.
Value
An "mcee_fit"
object; see mcee
.
See Also
mcee
, mcee_userfit_nuisance
, mcee_config_maker
Examples
set.seed(1)
n <- 10
T <- 4
id <- rep(1:n, each = T)
dp <- rep(1:T, times = n)
A <- rbinom(n * T, 1, 0.5)
M <- rbinom(n * T, 1, plogis(-0.2 + 0.3 * A + 0.1 * dp))
Y <- ave(0.5 * A + 0.6 * M + 0.1 * dp + rnorm(n * T), id)
dat <- data.frame(id, dp, A, M, Y)
cfg <- list(
p = mcee_config_known("p", 0.5),
q = mcee_config_glm("q", ~ dp + M),
eta = mcee_config_glm("eta", ~dp),
mu = mcee_config_glm("mu", ~ dp + M),
nu = mcee_config_glm("nu", ~dp)
)
fit_gen <- mcee_general(dat, "id","dp","Y","A","M",
time_varying_effect_form = ~ dp,
config_p=cfg$p, config_q=cfg$q, config_eta=cfg$eta, config_mu=cfg$mu, config_nu=cfg$nu)
Two-stage helper for mediated causal excursion effects (MCEE)
Description
Fits all nuisance components (Stage 1) and then computes the MCEE parameters (Stage 2) and their sandwich variance. This is a low-level driver used by the high-level wrapper; it assumes 'omega_nrows' and 'f_nrows' are already aligned to the rows of 'data'.
Usage
mcee_helper_2stage_estimation(
data,
id_var,
dp_var,
outcome_var,
treatment_var,
mediator_var,
avail_var = NULL,
config_p,
config_q,
config_eta,
config_mu,
config_nu,
omega_nrows,
f_nrows
)
Arguments
data |
A long-format 'data.frame' (one row per subject-by-decision point). |
id_var |
Character scalar. Name of the subject ID column. |
dp_var |
Character scalar. Name of the decision point column (values need not be consecutive; they may vary in count across subjects). |
outcome_var |
Character scalar. Name of the distal outcome column. |
treatment_var |
Character scalar. Name of the binary treatment column (coded 0/1). |
mediator_var |
Character scalar. Name of the mediator column. |
avail_var |
Character scalar or 'NULL'. Name of the availability column (1 = available, 0 = unavailable). If 'NULL', availability is treated as all 1. |
config_p |
Configuration for
|
config_q |
Configuration for |
config_eta |
Configuration for |
config_mu |
Configuration for |
config_nu |
Configuration for |
omega_nrows |
Numeric vector of length |
f_nrows |
Numeric matrix with |
Details
Availability handling:
When avail_var
exists and equals 0, Stage 1 sets the working probabilities
to 1 for that row (e.g., \hat{p}_t(1\mid H_t)=1
, \hat{p}_t(0\mid H_t)=1
, similarly
for \hat q_t
). This prevents division-by-zero in the estimating equations.
Auto-family rules:
If family
is omitted in a GLM/GAM config, it defaults to binomial()
for config_p
and config_q
, and to gaussian()
for
config_eta
, config_mu
, and config_nu
.
Learners:
-
"glm"
: usesstats::glm()
. -
"gam"
: usesmgcv::gam()
(supportss()
smooths). -
"rf"
: usesrandomForest::randomForest()
. -
"ranger"
: usesranger::ranger()
. -
"sl"
: usesSuperLearner::SuperLearner()
. IfSL.library
is not given, a simple default library is used: c("SL.mean", "SL.glm", "SL.gam").
Value
A list with components:
fit
A list with entries
alpha_hat
,alpha_se
,beta_hat
,beta_se
, andvarcov
(the2p\times 2p
sandwich variance for(\alpha^\top,\beta^\top)^\top
).nuisance_models
A list of fitted Stage-1 objects:
p
,q
,eta1
,eta0
,mu1
,mu0
,nu1
,nu0
. (Forknown
/known_a0
/known_a1
, a small descriptor list is returned.)
See Also
mcee_general
for a high-level wrapper that constructs
omega_nrows
and f_nrows
from user-friendly arguments.
Examples
## Not run:
# Minimal sketch (assuming `df` has columns id, t, A, M, Y, I):
fit <- mcee_helper_2stage_estimation(
data = df,
id_var = "id", dp_var = "t", outcome_var = "Y",
treatment_var = "A", mediator_var = "M", avail_var = "I",
config_p = list(formula = ~ t + M, method = "glm"), # binomial auto
config_q = list(formula = ~ t + M + A, method = "glm"), # binomial auto
config_eta = list(formula = ~t, method = "gam"), # gaussian auto
config_mu = list(formula = ~ t + s(M), method = "gam"), # gaussian auto
config_nu = list(formula = ~t, method = "glm"), # gaussian auto
omega_nrows = rep(1, nrow(df)),
f_nrows = cbind(1) # marginal (p = 1)
)
fit$fit$alpha_hat
fit$fit$beta_hat
## End(Not run)
Fit all nuisance models for MCEE Stage 1
Description
Fits all five nuisance components required for MCEE estimation and returns both per-row predictions and fitted model objects. This is Stage 1 of the two-stage MCEE procedure.
Usage
mcee_helper_stage1_fit_nuisance(
data,
id_var,
dp_var,
outcome_var,
treatment_var,
mediator_var,
avail_var,
config_p,
config_q,
config_eta,
config_mu,
config_nu
)
Arguments
data |
Data frame in long format. |
id_var , dp_var , outcome_var , treatment_var , mediator_var , avail_var |
Character column names (same as in |
config_p , config_q , config_eta , config_mu , config_nu |
Configuration lists
for each nuisance parameter (see |
Details
**Nuisance Parameters Fitted:**
-
p
: Propensity scoreP(A_t=1\mid H_t)
- fitted on available rows. (Technically, this isP(A_t=I_t\mid H_t)
, but the user is allowed to inputP(A_t=1\mid H_t)
and the function will automatically correct it by settingp1 = 1
whenI_t = 0
.) -
q
: Conditional propensityP(A_t=1\mid H_t, M_t)
- fitted on available rows. (Technically, this isP(A_t=I_t\mid H_t, M_t)
, but the user is allowed to inputP(A_t=1\mid H_t, M_t)
and the function will automatically correct it by settingq1 = 1
whenI_t = 0
.) -
eta1
,eta0
: Outcome regressionE(Y\mid A_t=a, H_t)
without mediator -
mu1
,mu0
: Outcome regressionE(Y\mid A_t=a, H_t, M_t)
with mediator -
nu1
,nu0
: Cross-world regressions for counterfactual outcomes
**Data Subsets Used for Fitting:**
- p
, q
: Only rows where availability==1
- eta1
, mu1
: Rows where A==1
OR availability==0
- eta0
, mu0
: Rows where A==0
- nu1
: Fitted on A==0
rows using mu1
predictions as outcome
- nu0
: Fitted on A==1
or unavailable rows using mu0
predictions as outcome
**Availability Handling:**
When availability==0
, predictions are forced to p1=p0=q1=q0=1
to
prevent division by zero in Stage 2.
Value
List with two components:
nuisance_fitted
List of numeric vectors (length
nrow(data)
) containing per-row predictions:p1
,p0
,q1
,q0
,eta1
,eta0
,mu1
,mu0
,nu1
,nu0
.nuisance_models
List of fitted model objects or "known" descriptors:
p
,q
,eta1
,eta0
,mu1
,mu0
,nu1
,nu0
.
Stage-2 MCEE parameter estimation given nuisance predictions
Description
Computes the Natural Direct Excursion Effect (NDEE; \alpha
) and Natural Indirect
Excursion Effect (NIEE; \beta
) parameters using Stage-1 nuisance predictions.
This is Stage 2 of the two-stage MCEE procedure.
Usage
mcee_helper_stage2_estimate_mcee(
data,
id_var,
dp_var,
outcome_var,
treatment_var,
avail_var = NULL,
p1,
p0,
q1,
q0,
eta1,
eta0,
mu1,
mu0,
nu1,
nu0,
omega_nrows,
f_nrows
)
Arguments
data |
Data frame in long format. |
id_var , dp_var , outcome_var , treatment_var , avail_var |
Character column names. |
p1 , p0 , q1 , q0 , eta1 , eta0 , mu1 , mu0 , nu1 , nu0 |
Numeric vectors of length |
omega_nrows |
Numeric vector of length |
f_nrows |
Numeric matrix with |
Details
**MCEE Estimating Equations:**
The function constructs influence functions \phi_t^{11}
, \phi_t^{10}
, \phi_t^{00}
for each row and
solves the estimating equations:
**NDEE (
\alpha
)**:\sum_{i,t}\omega(i,t) [\phi_t^{10} - \phi_t^{00}] f(t) = 0
**NIEE (
\beta
)**:\sum_{i,t}\omega(i,t) [\phi_t^{11} - \phi_t^{10}] f(t) = 0
**Influence Functions:**
- \phi_t^{11}
: Direct effect pathway influence function
- \phi_t^{10}
: Mediated effect pathway influence function
- \phi_t^{00}
: Control/reference pathway influence function
**Variance Estimation:** Uses sandwich variance estimation with subject-level clustering. The variance accounts for the two-stage estimation uncertainty.
Value
List containing MCEE parameter estimates and inference:
alpha_hat
Vector of length
p
: NDEE parameter estimatesalpha_se
Vector of length
p
: NDEE standard errorsbeta_hat
Vector of length
p
: NIEE parameter estimatesbeta_se
Vector of length
p
: NIEE standard errorsvarcov
Matrix
2p \times 2p
: Joint variance-covariance for(\alpha,\beta)
alpha_varcov
Matrix
p \times p
: Variance-covariance for\alpha
beta_varcov
Matrix
p \times p
: Variance-covariance for\beta
Mediated Causal Excursion Effects with user-supplied nuisance predictions
Description
Skips Stage-1 model fitting and uses user-provided nuisance predictions.
Usage
mcee_userfit_nuisance(
data,
id,
dp,
outcome,
treatment,
mediator,
availability = NULL,
time_varying_effect_form,
p1,
q1,
eta1,
eta0,
mu1,
mu0,
nu1,
nu0,
weight_per_row = NULL,
verbose = TRUE
)
Arguments
data |
A data.frame in long format (one row per id-by-decision point). |
id |
Character. Column name for subject identifier. |
dp |
Character. Column name for decision point index (must increase strictly within subject). |
outcome |
Character. Column name for distal outcome (constant within subject). |
treatment |
Character. Column name for treatment (coded 0/1). |
mediator |
Character. Column name for mediator. |
availability |
Optional character. Column name for availability (0/1). If |
time_varying_effect_form |
RHS-only formula for the basis |
p1 , q1 , eta1 , eta0 , mu1 , mu0 , nu1 , nu0 |
Numeric vectors (or column names) of
per-row predictions aligned with |
weight_per_row |
Optional numeric vector of row weights (nonnegative, length |
verbose |
Logical; print progress messages. |
Details
Nuisance definitions:
-
p1
:P(A_t=1\mid H_t)
(known in MRTs). (Technically, this isP(A_t=I_t\mid H_t)
, but the user is allowed to inputP(A_t=1\mid H_t)
and the function will automatically correct it by settingp1 = 1
whenI_t = 0
.) -
q1
:P(A_t=1\mid H_t, M_t)
. (Technically, this isP(A_t=I_t\mid H_t, M_t)
, but the user is allowed to inputP(A_t=1\mid H_t, M_t)
and the function will automatically correct it by settingq1 = 1
whenI_t = 0
.) -
eta1
,eta0
:E(Y\mid H_t, A_t=1)
andE(Y\mid H_t, A_t=0)
. -
mu1
,mu0
:E(Y\mid H_t, A_t=1, M_t)
andE(Y\mid H_t, A_t=0, M_t)
. -
nu1
,nu0
: cross-world regressions; see vignette and paper for definitions.
If availability
is provided, rows with I=0
are coerced to p1=q1=1
(and hence p0=q0=1
); a warning is emitted if overrides occur.
Value
An "mcee_fit"
object; see mcee
.
See Also
Examples
set.seed(1)
n <- 10
T <- 4
id <- rep(1:n, each = T)
dp <- rep(1:T, times = n)
A <- rbinom(n * T, 1, 0.5)
M <- rbinom(n * T, 1, plogis(-0.2 + 0.3 * A + 0.1 * dp))
Y <- ave(0.5 * A + 0.6 * M + 0.1 * dp + rnorm(n * T), id)
dat <- data.frame(id, dp, A, M, Y)
fit_usr <- mcee_userfit_nuisance(dat, "id","dp","Y","A","M",
time_varying_effect_form = ~ dp,
p1 = rep(0.5, nrow(dat)),
q1 = runif(nrow(dat),.3,.7),
eta1 = rnorm(nrow(dat)), eta0 = rnorm(nrow(dat)),
mu1 = rnorm(nrow(dat)), mu0 = rnorm(nrow(dat)),
nu1 = rnorm(nrow(dat)), nu0 = rnorm(nrow(dat)))
Print method for summary of MCEE fits
Description
Prints formatted coefficient tables and inference results for mediated causal excursion effects, including alpha (Natural Direct Excursion Effect) and beta (Natural Indirect Excursion Effect) parameters.
Usage
## S3 method for class 'summary.mcee_fit'
print(x, ...)
Arguments
x |
An object of class |
... |
Currently unused. |
Value
Invisibly returns the input object x
. Called for side effects.
See Also
Summary for DCEE fits
Description
Produce inference tables for distal causal excursion effects from a
[dcee()] model. By default uses small-sample t
-tests with
df = object$df
(subjects minus number of betas). If df
is missing or nonpositive, falls back to large-sample normal (z) inference.
Usage
## S3 method for class 'dcee_fit'
summary(
object,
lincomb = NULL,
conf_level = 0.95,
show_control_fit = FALSE,
...
)
Arguments
object |
An object of class |
lincomb |
Optional numeric vector or matrix specifying linear
combinations |
conf_level |
Confidence level for intervals (default |
show_control_fit |
Logical; if |
... |
Currently ignored. |
Value
A list of class "summary.dcee_fit"
with components:
-
call
— the original call -
df
— degrees of freedom used for t-tests (may beNA
) -
conf_level
— the confidence level -
excursion_effect
— data frame with coefficient table for\beta
-
lincomb
— optional data frame with linear-combination results -
control_fit
— optional list describing Stage-1 fits (only ifshow_control_fit
)
Summarize Causal Excursion Effect Fits for MRT with Binary Outcomes
Description
summary
method for class "emee_fit".
Usage
## S3 method for class 'emee_fit'
summary(
object,
lincomb = NULL,
conf_level = 0.95,
show_control_fit = FALSE,
...
)
Arguments
object |
An object of class "emee_fit". |
lincomb |
A vector of length p (p is the number of moderators including intercept) or a matrix with p columns. When not set to 'NULL', the summary will include the specified linear combinations of the causal excursion effect coefficients and the corresponding confidence interval, standard error, and p-value. |
conf_level |
A numeric value indicating the confidence level for confidence intervals. Default to 0.95. |
show_control_fit |
A logical value of whether the fitted coefficients for the control variables will be printed in the summary. Default to FALSE. (Interpreting the fitted coefficients for control variables is not recommended.) |
... |
Further arguments passed to or from other methods. |
Value
the original function call and the estimated causal excursion effect coefficients, confidence interval with conf_level, standard error, t-statistic value, degrees of freedom, and p-value.
Examples
fit <- emee(
data = data_binary,
id = "userid",
outcome = "Y",
treatment = "A",
rand_prob = "rand_prob",
moderator_formula = ~time_var1,
control_formula = ~ time_var1 + time_var2,
availability = "avail",
numerator_prob = 0.5,
start = NULL
)
summary(fit)
Summarize an mcee fit
Description
Prints coefficient tables for the Natural Direct Excursion Effect (alpha) and Natural Indirect Excursion Effect (beta), with small-sample t inference. Optionally reports linear combinations and Stage-1 nuisance summaries.
Usage
## S3 method for class 'mcee_fit'
summary(
object,
lincomb_alpha = NULL,
lincomb_beta = NULL,
lincomb_joint = NULL,
conf_level = 0.95,
show_nuisance = FALSE,
...
)
Arguments
object |
An object of class |
lincomb_alpha , lincomb_beta |
Optional numeric vector or matrix specifying
linear combinations of |
lincomb_joint |
Optional numeric vector or matrix specifying linear
combinations of the stacked parameter |
conf_level |
Confidence level for Wald intervals (default 0.95). |
show_nuisance |
Logical; if |
... |
Unused. |
Value
A list of class "summary.mcee_fit"
with printed side effects.
Examples
# s <- summary(fit, lincomb_alpha = c(1, 9), lincomb_beta = c(1, 9))
Summarize Causal Excursion Effect Fits for MRT with Continuous Outcomes
Description
summary
method for class "wcls_fit".
Usage
## S3 method for class 'wcls_fit'
summary(
object,
lincomb = NULL,
conf_level = 0.95,
show_control_fit = FALSE,
...
)
Arguments
object |
An object of class "wcls_fit". |
lincomb |
A vector of length p (p is the number of moderators including intercept) or a matrix with p columns. When not set to 'NULL', the summary will include the specified linear combinations of the causal excursion effect coefficients and the corresponding confidence interval, standard error, and p-value. |
conf_level |
A numeric value indicating the confidence level for confidence intervals. Default to 0.95. |
show_control_fit |
A logical value of whether the fitted coefficients for the control variables will be printed in the summary. Default to FALSE. (Interpreting the fitted coefficients for control variables is not recommended.) |
... |
Further arguments passed to or from other methods. |
Value
the original function call and the estimated causal excursion effect coefficients, 95 value or Wald-statistic value (depending on whether sample size is < 50), degrees of freedom, and p-value.
Examples
fit <- wcls(
data = data_mimicHeartSteps,
id = "userid",
outcome = "logstep_30min",
treatment = "intervention",
rand_prob = 0.6,
moderator_formula = ~1,
control_formula = ~logstep_pre30min,
availability = "avail",
numerator_prob = 0.6
)
summary(fit)
Estimates the causal excursion effect for continuous outcome MRT
Description
Returns the estimated causal excursion effect (on additive scale) and the estimated standard error. Small sample correction using the "Hat" matrix in the variance estimate is implemented.
Usage
wcls(
data,
id,
outcome,
treatment,
rand_prob,
moderator_formula,
control_formula,
availability = NULL,
numerator_prob = NULL,
verbose = TRUE
)
Arguments
data |
A data set in long format. |
id |
The subject id variable. |
outcome |
The outcome variable. |
treatment |
The binary treatment assignment variable. |
rand_prob |
The randomization probability variable. |
moderator_formula |
A formula for the moderator variables. This should
start with ~ followed by the moderator variables. When set to |
control_formula |
A formula for the control variables. This should
start with ~ followed by the control variables. When set to |
availability |
The availability variable. Use the default value ( |
numerator_prob |
Either a number between 0 and 1, or a variable name for
a column in data. If you are not sure what this is, use the default value ( |
verbose |
If default ('TRUE'), additional messages will be printed during data preprocessing. |
Value
An object of type "wcls_fit"
Examples
wcls(
data = data_mimicHeartSteps,
id = "userid",
outcome = "logstep_30min",
treatment = "intervention",
rand_prob = 0.6,
moderator_formula = ~1,
control_formula = ~logstep_pre30min,
availability = "avail",
numerator_prob = 0.6
)