Introduction 🚀

Random Machines is an ensemble method that combines multiple Support Vector Machines (SVMs) using Bagging (Bootstrap Aggregating). Unlike a standard Random Forest (which uses decision trees), Random Machines uses SVMs as base learners.
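Conceptually, the scheme looks like the sketch below: each of B machines is fit on a bootstrap sample with a kernel drawn according to its weight, and the predictions are aggregated. Note that `fit_svm()` and `predict_svm()` here are hypothetical stand-ins, not the package API:

```r
# Illustrative sketch of bagging with weighted kernel selection.
# fit_svm() and predict_svm() are hypothetical stand-ins, NOT the package API.
bagging_sketch <- function(data, newdata, B, kernels, weights,
                           fit_svm, predict_svm) {
  preds <- matrix(NA_real_, nrow = nrow(newdata), ncol = B)
  for (b in seq_len(B)) {
    boot <- data[sample(nrow(data), replace = TRUE), , drop = FALSE]
    k    <- kernels[[sample(length(kernels), 1, prob = weights)]]
    fit  <- fit_svm(boot, k)
    preds[, b] <- predict_svm(fit, newdata)
  }
  rowMeans(preds)   # aggregate the B machines by averaging
}

# Toy demo with trivial stand-ins, just to exercise the loop:
set.seed(1)
d   <- data.frame(x = rnorm(20), y = rnorm(20))
nd  <- data.frame(x = rnorm(5))
out <- bagging_sketch(d, nd, B = 10,
                      kernels = list("linear", "rbf"), weights = c(0.5, 0.5),
                      fit_svm     = function(data, k) mean(data$y),
                      predict_svm = function(fit, nd) rep(fit, nrow(nd)))
length(out)  # 5 (one aggregated prediction per new observation)
```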

A powerful feature of the FastSurvivalSVM package is the ability to mix standard kernels (like Linear and RBF) with custom user-defined kernels in the same ensemble.

This guide demonstrates:

  • How to define simple custom kernel functions in R ✍️
  • How to instantiate them for the ensemble using grid_kernel() 🧩
  • How to train a Random Machines model that automatically selects the best kernels 🎯

1. Data Preparation 📦

We start by generating a synthetic survival dataset with right censoring.

library(FastSurvivalSVM)

set.seed(42)

# Generate synthetic survival data (n = 300)
# This function creates nonlinear relationships suitable for kernel methods.
df <- data_generation(n = 300, prop_cen = 0.25)

# Split into Training (200) and Testing (100) sets
train_idx <- sample(seq_len(nrow(df)), 200)
train_df  <- df[train_idx, ]
test_df   <- df[-train_idx, ]

head(train_df)
#>          tempo cens          x1        x2        x3
#> 243  0.1189498    1  2.25848166 3.0642572 3.8047184
#> 282  0.2520948    0 -0.66262940 1.1302032 0.3690098
#> 34   2.1360394    1  0.39107362 2.3055282 2.8014030
#> 49   0.1223509    1  0.56855380 2.1746382 0.9993991
#> 111 14.1078406    1  0.97490745 0.5951211 0.7997705
#> 195  0.5372712    1  0.04082956 0.7702062 5.0694055

2. Defining Custom Kernels 🧠

In previous versions of this package, defining custom kernels required advanced R concepts like Function Factories and force(). This is no longer necessary.

Now, you simply define the mathematical logic of your kernel as a standard R function. The function must accept at least x and z (the data vectors) and any other parameters you need.
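For instance, a minimal kernel following this convention is a plain dot product. The name `my_linear` is ours, shown only to illustrate the required signature:

```r
# Minimal illustration of the expected signature: x and z come first,
# followed by any hyperparameters (none needed here).
my_linear <- function(x, z) {
  sum(as.numeric(x) * as.numeric(z))
}

my_linear(c(1, 2), c(3, 4))  # 1*3 + 2*4 = 11
```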

2.1 Wavelet Kernel 🌊

The Wavelet kernel (Ará et al., 2016) can be written as:

$$K(x,z) = \prod \cos\left(1.75\frac{x - z}{A}\right)\exp\left(-\frac{1}{2}\left(\frac{x - z}{A}\right)^2\right)$$

my_wavelet <- function(x, z, A = 1) {
  # Calculate scaled distance vector
  u <- (as.numeric(x) - as.numeric(z)) / A
  
  # Product of the mother wavelet function
  prod(cos(1.75 * u) * exp(-0.5 * u^2))
}
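A quick sanity check of the formula: when x = z, every component of the scaled distance u is zero, so each factor is cos(0) · exp(0) = 1 and the kernel evaluates to exactly 1 (computed here in plain base R):

```r
# With x = z, every component of u is 0, so the product is 1.
u <- rep(0, 3)
prod(cos(1.75 * u) * exp(-0.5 * u^2))  # 1
```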

2.2 Polynomial Kernel (Custom Implementation) 🧮

A custom polynomial kernel can be written as:

$$K(x,z) = (\gamma \langle x,z \rangle + \text{coef0})^{\text{degree}}$$

Here we implement a simplified version with gamma = 1 (absorbed into scaling), so:

$$K(x,z) = (\langle x,z \rangle + \text{coef0})^{\text{degree}}$$

my_poly <- function(x, z, degree, coef0) {
  val <- sum(as.numeric(x) * as.numeric(z)) + coef0
  val ^ degree
}
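A quick hand check of the formula in plain base R (equivalent to calling my_poly(x, z, degree = 2, coef0 = 1)): with x = (1, 2) and z = (3, 4), the inner product is 11, so K = (11 + 1)² = 144.

```r
# Hand check of the simplified polynomial kernel formula.
x <- c(1, 2)
z <- c(3, 4)
(sum(x * z) + 1) ^ 2  # (11 + 1)^2 = 144
</imports>
```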

3. Configuring the Ensemble 🧩

To build a Random Machines ensemble, we provide a list of candidate kernels.

For custom kernels, we need to fix their hyperparameters before passing them to the ensemble. We use the helper function grid_kernel() for this.

Note: grid_kernel() is smart. If you pass a single value for a parameter (e.g., A = 1), it returns a single function ready to use. If you pass a vector (e.g., A = 1:3), it returns a list for tuning. Here, we want single instances ✅.

# Instantiate specific versions of our custom kernels
# We don't need lists or [[1]]. The function returns the object directly.

k_wav_1 <- grid_kernel(my_wavelet, A = 1)
k_pol_2 <- grid_kernel(my_poly, degree = 2, coef0 = 1)

Now we define the Kernel Mix. We will combine:

  • Linear (Standard scikit-learn)
  • RBF (Standard scikit-learn)
  • Wavelet (Custom R)
  • Polynomial (Custom R)

We set rank_ratio = 0 for all of them, meaning we are solving a regression problem (predicting the survival time/risk score directly), rather than a ranking problem.

kernel_mix <- list(
  # --- Standard Kernels (processed in Python) ---
  linear_base = list(kernel = "linear", alpha = 1,   rank_ratio = 0),
  rbf_std     = list(kernel = "rbf",    alpha = 0.5, gamma = 0.1, rank_ratio = 0),

  # --- Custom Kernels (processed in R) ---
  wavelet_A1  = list(kernel = k_wav_1,  alpha = 1,   rank_ratio = 0),
  poly_deg2   = list(kernel = k_pol_2,  alpha = 1,   rank_ratio = 0)
)

4. Training Random Machines 🏗️

We fit the model using random_machines().

Key Parameters:

  • B: Number of bootstrap samples (machines).
  • mtry: Number of variables randomly sampled at each split (Random Subspace).
  • prop_holdout: Fraction of training data kept aside inside the function to calculate kernel weights.
  • crop: Pruning threshold. If a kernel's performance weight is below this value (e.g., 0.10), it is removed from that bootstrap iteration.

# Use all available cores for parallel processing
cores <- parallel::detectCores()

set.seed(123)

rm_model <- random_machines(
  data         = train_df,
  newdata      = test_df,   # Predict on test set immediately
  time_col     = "tempo",
  delta_col    = "cens",
  kernels      = kernel_mix,
  B            = 50,        # Number of machines
  mtry         = NULL,      # Use all features (or set an integer)
  crop         = 0.10,      # Drop weak kernels (weight <= 10%)
  prop_holdout = 0.20,      # 20% internal validation for weighting
  cores        = cores,
  .progress    = TRUE
)
#> 
#> ── 🚀 Random Machines (Kernel Survival SVM) ────────────────────────────────────
#> ℹ Starting Random Machines (B=50, mtry=All) on 10 cores.
#> ℹ Kernel weights via Holdout: 160 training / 40 validation.
#> ℹ Computing kernel weights...
#> ℹ Executing parallel bootstrap...
#> ■■                                 2% | ETA:  3m
#> ■■                                 4% | ETA:  2m
#> ■■■■■■■■■                         26% | ETA: 28s
#> ■■■■■■■■■■■■■■■■■■■■■■             70% | ETA:  5s
#> ■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■   100% | ETA:  0s
#> ✔ Done. Valid Models: 50/50. Mean OOB: 0.6874

5. Evaluation and Interpretation 🔎

5.1 Model Summary 📋

Printing the model gives us insights into which kernels were actually selected and how much weight they received.

print(rm_model)
#> 
#> ── 📦 Random Machines (FastKernelSurvivalSVM) ──────────────────────────────────
#> • Models Trained: 50
#> • Features (mtry): All
#> • Mean OOB C-index: 0.6874
#> • Crop Threshold: 0.1
#> 
#> ── 📊 Kernel Usage (Bootstrap Selection) ──
#> 
#> ──────────────────────────────────────
#> Kernel      | Count | Probability
#> ──────────────────────────────────────
#> wavelet_A1  |    20 |      0.4000
#> rbf_std     |    11 |      0.2200
#> linear_base |    10 |      0.2000
#> poly_deg2   |     9 |      0.1800
#> ──────────────────────────────────────
#> 
#> ── ⚖️ Kernel Weights (Holdout Probabilities) ──
#> 
#> ──────────────────────────────────────
#> Kernel      | Probability | Status
#> ──────────────────────────────────────
#> wavelet_A1  |      0.3278 | ✅ Selected
#> rbf_std     |      0.2684 | ✅ Selected
#> poly_deg2   |      0.2077 | ✅ Selected
#> linear_base |      0.1961 | ✅ Selected
#> ──────────────────────────────────────

  • Kernel Usage: Shows how often each kernel type was picked during the bagging process.
  • Kernel Weights: Shows the probabilistic weight assigned to each kernel family based on the internal holdout performance.
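For intuition only, one plausible way to turn per-kernel holdout scores into such probabilities is simple normalization. The C-index values below are made up, and the package's exact weighting formula may differ:

```r
# Illustrative only: hypothetical holdout C-indexes per kernel,
# normalized so the selection weights sum to 1.
cindex  <- c(wavelet_A1 = 0.72, rbf_std = 0.69, poly_deg2 = 0.66, linear_base = 0.65)
weights <- cindex / sum(cindex)
round(weights, 4)
sum(weights)  # 1
```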

5.2 Performance (C-index) 🏁

Finally, we evaluate the ensembleโ€™s performance on the independent test set using the Concordance Index.

c_index <- score(rm_model, test_df)
cat(sprintf("Final C-Index on Test Data: %.4f\n", c_index))
#> Final C-Index on Test Data: 0.7952

Values closer to 1.0 indicate better predictive discrimination ✅.
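For intuition, a naive O(n²) concordance computation for fully observed (uncensored) times might look like the sketch below. The package's score() handles censoring properly, so this is illustrative only:

```r
# Naive C-index sketch, ignoring censoring: the fraction of comparable
# pairs in which the observation with the shorter survival time also
# has the higher predicted risk (risk ties count as 0.5).
naive_cindex <- function(time, risk) {
  concordant <- 0
  comparable <- 0
  n <- length(time)
  for (i in seq_len(n - 1)) {
    for (j in (i + 1):n) {
      if (time[i] == time[j]) next   # skip tied times
      comparable <- comparable + 1
      shorter <- if (time[i] < time[j]) i else j
      longer  <- if (shorter == i) j else i
      if (risk[shorter] > risk[longer]) {
        concordant <- concordant + 1
      } else if (risk[shorter] == risk[longer]) {
        concordant <- concordant + 0.5
      }
    }
  }
  concordant / comparable
}

naive_cindex(time = c(1, 2, 3, 4), risk = c(4, 3, 2, 1))  # perfect ranking: 1
```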

Conclusion 🎉

The FastSurvivalSVM package makes it easy to build powerful survival ensembles. By using grid_kernel(), you can seamlessly integrate complex, domain-specific kernels (defined in R) alongside standard efficient kernels (from Python), allowing the Random Machines algorithm to automatically learn the best representation for your data.