Introduction to PMLE4SCR

Overview

The PMLE4SCR package implements two-stage pseudo maximum likelihood estimation (PMLE) for copula-based regression models with semi-competing risks data. In semi-competing risks data, a non-terminal event (e.g., disease relapse) is subject to dependent censoring by a terminal event (e.g., death), but not vice versa.

The package provides two main functions:

Data

We illustrate the package using the Bone Marrow Transplant (BMT) dataset from the SemiCompRisks package. This dataset contains 137 patients with acute leukemia aged 7 to 52. The non-terminal event is leukemia relapse and the terminal event is death. Three disease groups are considered: AML-low risk, AML-high risk, and ALL (acute lymphoblastic leukemia), with AML-low as the baseline group.

library(PMLE4SCR)
data(BMT, package = "SemiCompRisks")
BMT$g <- factor(BMT$g, levels = c(2, 3, 1),
                labels = c("AML-low", "AML-high", "ALL"))

The key variables in the dataset are described below:

Table 1: Key variables in the BMT dataset
Variable Description
T1 Time to death (terminal event)
T2 Time to leukemia relapse (non-terminal event)
delta1 Death indicator (1 = observed, 0 = censored)
delta2 Relapse indicator (1 = observed, 0 = censored)
g Disease group (AML-low, AML-high, ALL)

A preview of the first six observations:

head(BMT[, c("T1", "T2", "delta1", "delta2", "g")])
#>     T1   T2 delta1 delta2   g
#> 1 2081 2081      0      0 ALL
#> 2 1602 1602      0      0 ALL
#> 3 1496 1496      0      0 ALL
#> 4 1462 1462      0      0 ALL
#> 5 1433 1433      0      0 ALL
#> 6 1377 1377      0      0 ALL

Fitting the Model with PMLE

We fit a copula-based model using the Clayton copula, with proportional hazards (PH) marginals for both relapse and death. The disease group g is included as a covariate for both marginals and the copula parameter.

myfit <- PMLE4SCR(BMT, time = "T2", death = "T1",
                  status_time = "delta2", status_death = "delta1",
                  T.fmla = ~ g, D.fmla = ~ g,
                  copula.family = "Clayton",
                  copula.control = list(formula = ~ g))

Results

Dependence between relapse and death

The estimated copula parameters and corresponding Kendall’s \(\tau\) values measure the dependence between relapse and death times for each disease group. A larger \(\tau\) indicates stronger dependence.

knitr::kable(myfit$gamma,
             digits = 3,
             caption = "Table 2: Estimated copula parameters (PMLE)")
Table 2: Estimated copula parameters (PMLE)
para est se
(Intercept) 2.169 0.553
gAML-high -0.271 0.812
gALL -0.282 0.791

Regression coefficients for relapse (non-terminal event)

The estimated regression coefficients for the marginal distribution of relapse time. Positive values indicate higher risk of relapse compared to the AML-low baseline group.

Table 3: Regression coefficients for relapse time (PMLE)
Group Estimate SE
AML-high 1.168 0.311
ALL 0.710 0.324

Regression coefficients for death (terminal event)

Table 4: Regression coefficients for death time (PMLE)
Group Estimate SE
AML-high 1.022 0.276
ALL 0.611 0.285

Comparison: PMLE vs Simultaneous MLE

For comparison, we also fit the simultaneous MLE and compare the estimates from both methods side by side.

myfit_mle <- MLE4SCR(BMT, time = "T2", death = "T1",
                     status_time = "delta2", status_death = "delta1",
                     T.fmla = ~ g, D.fmla = ~ g,
                     copula.family = "Clayton",
                     copula.control = list(formula = ~ g))
Table 5: Comparison of PMLE and simultaneous MLE estimates
PMLE
Simultaneous MLE
Parameter Estimate SE Estimate SE
Copula (Intercept) 2.169 0.553 2.213 0.537
Copula (AML-high) -0.271 0.812 -0.270 0.803
Copula (ALL) -0.282 0.791 -0.293 0.770
Relapse: AML-high 1.168 0.311 1.116 0.306
Relapse: ALL 0.710 0.324 0.669 0.320
Death: AML-high 1.022 0.276 1.022 0.276
Death: ALL 0.611 0.285 0.611 0.285

The two methods produce similar estimates. The PMLE is computationally more efficient and robust against copula misspecification for the marginal parameters.

Reference

Arachchige, S. J., Chen, X., and Zhou, Q. M. (2025). Two-stage pseudo maximum likelihood estimation of semiparametric copula-based regression models for semi-competing risks data. Lifetime Data Analysis, 31, 52-75.