---
title: "uddbart: Dynamic Interval-Censored Risk Prediction"
author: "Xulin Pan"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{uddbart: Dynamic Interval-Censored Risk Prediction}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include=FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 7,
  fig.height = 5
)
```

# Overview

The `uddbart` package provides tools for dynamic risk prediction from irregular longitudinal biomarker data with interval-censored outcomes.

The package is designed for studies in which patients are followed over time, biomarker measurements are collected at irregular visit times, and the clinical event is known only to have occurred between two observation times.

A motivating example is chronic myeloid leukemia (CML), where patients are monitored using repeated BCR--ABL measurements and the event of interest is deep molecular response.

# Installation

```{r, eval=FALSE}
install.packages("uddbart")
```

The development version can be installed from GitHub:

```{r, eval=FALSE}
# install.packages("pak")
pak::pak("xulinpan/uddbart")
```

# Load the package

```{r setup}
library(uddbart)
```

# Example data

The package includes two example datasets:

```{r}
data("cml_long", package = "uddbart")
data("cml_event", package = "uddbart")
```

The longitudinal dataset contains repeated biomarker measurements:

```{r}
head(cml_long)
```

The event dataset contains interval-censored outcome information:

```{r}
head(cml_event)
```

# Data structure

The longitudinal biomarker data should contain one row per patient visit, as in `cml_long` above. The expected columns are:

- `patient_id`: patient identifier
- `t_months`: visit time
- `log_mrd`: longitudinal biomarker value

The event data should contain one row per patient, as in `cml_event` above. The expected columns are:

- `patient_id`: patient identifier
- `L`: left endpoint of the event interval
- `R`: right endpoint of the event interval
- `C`: censoring time
- `delta`: event indicator
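
As a minimal sketch of these two layouts, the chunk below builds toy data frames with the column names listed above and runs a few consistency checks. All values are invented, and the coding of `L`, `R`, `C`, and `delta` used here (`delta = 1` for an event observed between `L` and `R`; `delta = 0` for a patient censored at `C`, with `R` set to `Inf`) is an assumption for illustration, not taken from the package documentation.

```{r}
# Toy longitudinal data: one row per patient visit (invented values).
toy_long <- data.frame(
  patient_id = c(1, 1, 1, 2, 2),
  t_months   = c(0, 3.2, 6.5, 0, 4.1),
  log_mrd    = c(-0.5, -1.8, -3.1, -0.2, -0.9)
)

# Toy event data: one row per patient.
# ASSUMED coding: delta = 1 -> event observed between L and R;
#                 delta = 0 -> censored at C, with R = Inf.
toy_event <- data.frame(
  patient_id = c(1, 2),
  L          = c(3.2, 4.1),
  R          = c(6.5, Inf),
  C          = c(6.5, 4.1),
  delta      = c(1, 0)
)

# Visit times should be strictly increasing within each patient.
visits_increasing <- tapply(
  toy_long$t_months, toy_long$patient_id,
  function(t) all(diff(t) > 0)
)

stopifnot(
  !anyDuplicated(toy_event$patient_id),                 # one row per patient
  all(toy_long$patient_id %in% toy_event$patient_id),   # every patient has an event row
  all(toy_event$L <= toy_event$R),                      # valid intervals
  all(visits_increasing)
)
```

Checks of this kind can catch duplicated patients or unsorted visit times before a model is fitted.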

# Fitting a model

The following example demonstrates the basic workflow.

The model fit is not evaluated when this vignette is built, because Bayesian tree fitting is too slow for CRAN checks.

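```{r, eval=FALSE}
# Small ntree/ndpost/nskip values keep the example fast;
# use larger values for a real analysis.
fit <- uddbart(
  long_data = cml_long,    # one row per patient visit
  event_data = cml_event,  # one row per patient
  landmark = c(6, 12),     # landmark times, on the t_months scale
  horizon = 12,            # prediction horizon after each landmark
  ntree = 20,
  ndpost = 50,
  nskip = 25,
  seed = 1                 # for reproducibility
)
```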
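
The `landmark` and `horizon` arguments define when predictions are made and how far ahead they look. As a rough illustration of the landmarking idea, and not necessarily how `uddbart` prepares its data internally, the sketch below extracts each patient's biomarker history up to a landmark time from toy data. The helper `history_at()` and its output columns are hypothetical.

```{r}
# Toy visit data (invented values), using the column names from this vignette.
visits <- data.frame(
  patient_id = c(1, 1, 1, 2, 2, 3),
  t_months   = c(0, 3.2, 6.5, 0, 4.1, 7.0),
  log_mrd    = c(-0.5, -1.8, -3.1, -0.2, -0.9, -1.0)
)

# Hypothetical helper: summarise each patient's history up to landmark s.
history_at <- function(long, s) {
  # Keep only measurements taken at or before the landmark.
  hist <- long[long$t_months <= s, ]
  hist <- hist[order(hist$patient_id, hist$t_months), ]
  # Last available measurement per patient.
  last <- hist[!duplicated(hist$patient_id, fromLast = TRUE), ]
  counts <- table(hist$patient_id)
  data.frame(
    patient_id = last$patient_id,
    landmark   = s,
    last_time  = last$t_months,
    last_value = last$log_mrd,
    n_visits   = as.vector(counts[as.character(last$patient_id)])
  )
}

h6 <- history_at(visits, s = 6)
h6
```

Patient 3 has no measurement at or before month 6 and therefore does not appear in the landmark data set for `s = 6`.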

# Prediction

After fitting a model, predicted risks can be obtained using `predict()`.

```{r, eval=FALSE}
pred <- predict(fit)

head(pred)
```

The predicted values represent individualized probabilities of experiencing the event within the specified prediction horizon after each landmark time.
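
Because the model is Bayesian, each predicted risk is usually reported as a posterior summary. The sketch below shows one common way to do this in base R, using simulated draws as a stand-in. The matrix `draws` and its layout (posterior draws in rows, patients in columns) are assumptions for illustration; the actual format returned by `predict()` may differ.

```{r}
set.seed(1)

# Simulated stand-in for posterior risk draws: 50 draws (rows) for
# 3 patients (columns). These are NOT output from uddbart.
draws <- matrix(rbeta(50 * 3, shape1 = 2, shape2 = 5), nrow = 50, ncol = 3)
colnames(draws) <- c("patient_1", "patient_2", "patient_3")

# Posterior mean and 95% credible interval for each patient.
risk_summary <- data.frame(
  patient = colnames(draws),
  mean    = colMeans(draws),
  lower   = apply(draws, 2, quantile, probs = 0.025),
  upper   = apply(draws, 2, quantile, probs = 0.975),
  row.names = NULL
)
risk_summary
```

Reporting the interval alongside the mean conveys how uncertain each individual prediction is.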

# Model output

The structure of a fitted `uddbart` object can be inspected with `str()`:

```{r, eval=FALSE}
str(fit)
```

Its components typically include:

- landmark-specific prediction data
- posterior risk estimates
- fitted Bayesian tree model
- model settings
- prediction horizon

# Practical interpretation

For a landmark time \(s\) and prediction horizon \(\Delta\), `uddbart` estimates:

\[
P(T \le s + \Delta \mid T > s, \mathcal{H}(s)),
\]

where \(T\) is the event time and \(\mathcal{H}(s)\) is the longitudinal biomarker history observed up to and including time \(s\).

In the CML example, this can be interpreted as:

> the probability that a patient will achieve deep molecular response within the next prediction window, given their observed BCR--ABL monitoring history up to the landmark time.
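
By the definition of conditional probability, the conditional risk \(P(T \le s + \Delta \mid T > s, \mathcal{H}(s))\) can also be written in terms of survival probabilities given the history. This is a standard identity, shown here as an aid to interpretation rather than as a description of how `uddbart` computes the estimate:

\[
P(T \le s + \Delta \mid T > s, \mathcal{H}(s))
= 1 - \frac{P(T > s + \Delta \mid \mathcal{H}(s))}{P(T > s \mid \mathcal{H}(s))}.
\]

For instance, with the purely illustrative values \(P(T > s \mid \mathcal{H}(s)) = 0.8\) and \(P(T > s + \Delta \mid \mathcal{H}(s)) = 0.6\), the predicted risk is \(1 - 0.6 / 0.8 = 0.25\).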

# Notes for CRAN

The computationally intensive chunks are set to `eval=FALSE` so that the vignette builds quickly during CRAN checks.

To run them, copy the code into an interactive session after installing the package and its dependencies.
