Functional Data Analysis (FDA) is a branch of statistics that deals with data where each observation is a function, curve, or surface rather than a single number or vector. Examples include: - Temperature curves recorded over a day - Growth curves of children over time - Spectrometric measurements across wavelengths - Stock prices throughout trading hours
In FDA, we treat each curve as a single observation and develop methods to analyze collections of such curves.
fdars (Functional Data Analysis in Rust) provides a comprehensive toolkit for FDA with a high-performance Rust backend. Key features include:
The core data structure is the fdata class. Create
functional data from a matrix where rows are observations (curves) and
columns are evaluation points:
# Generate example data: 20 curves evaluated at 100 points
set.seed(42)
n <- 20
m <- 100
t_grid <- seq(0, 1, length.out = m)
# Create curves: sine waves with random phase and noise
X <- matrix(0, n, m)
for (i in 1:n) {
phase <- runif(1, 0, pi)
X[i, ] <- sin(2 * pi * t_grid + phase) + rnorm(m, sd = 0.1)
}
# Create fdata object
fd <- fdata(X, argvals = t_grid)
fd
#> Functional data object
#> Type: 1D (curve)
#> Number of observations: 20
#> Number of points: 100
#> Range: 0 - 1You can attach identifiers and metadata (covariates) to functional data:
# Create metadata with covariates
meta <- data.frame(
group = factor(rep(c("control", "treatment"), each = 10)),
age = sample(20:60, n, replace = TRUE),
response = rnorm(n)
)
# Create fdata with IDs and metadata
fd_meta <- fdata(X, argvals = t_grid,
id = paste0("patient_", 1:n),
metadata = meta)
fd_meta
#> Functional data object
#> Type: 1D (curve)
#> Number of observations: 20
#> Number of points: 100
#> Range: 0 - 1
#> Metadata columns: group, age, response
# Access metadata
fd_meta$id[1:5]
#> [1] "patient_1" "patient_2" "patient_3" "patient_4" "patient_5"
head(fd_meta$metadata)
#> group age response
#> 1 control 54 0.3533851
#> 2 control 43 -0.2975149
#> 3 control 55 0.5553262
#> 4 control 56 -0.3193581
#> 5 control 28 -0.7752047
#> 6 control 38 0.4711363Metadata is preserved when subsetting:
Depth measures how “central” a curve is within a sample. Higher depth indicates a more typical curve:
Compute distances between curves using various metrics:
Predict a scalar response from functional predictors:
Group curves into clusters:
Explore the other vignettes for detailed coverage of specific topics:
The Rust backend provides significant speedups for computationally intensive operations. For example, computing depth for 1000 curves: