Introduction 🚀

Random Machines is an ensemble method that combines multiple Support Vector Machines (SVMs) using Bagging (Bootstrap Aggregating). Unlike a standard Random Forest (which uses decision trees), Random Machines uses SVMs as base learners.
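Conceptually, the scheme looks like the sketch below: each of B machines is fit on a bootstrap sample with a kernel drawn according to its weight, and the predictions are aggregated. Note that `fit_svm()` and `predict_svm()` here are hypothetical stand-ins, not the package API:

```r
# Illustrative sketch of bagging with weighted kernel selection.
# fit_svm() and predict_svm() are hypothetical stand-ins, NOT the package API.
bagging_sketch <- function(data, newdata, B, kernels, weights,
                           fit_svm, predict_svm) {
  preds <- matrix(NA_real_, nrow = nrow(newdata), ncol = B)
  for (b in seq_len(B)) {
    boot <- data[sample(nrow(data), replace = TRUE), , drop = FALSE]
    k    <- kernels[[sample(length(kernels), 1, prob = weights)]]
    fit  <- fit_svm(boot, k)
    preds[, b] <- predict_svm(fit, newdata)
  }
  rowMeans(preds)   # aggregate the B machines by averaging
}

# Toy demo with trivial stand-ins, just to exercise the loop:
set.seed(1)
d   <- data.frame(x = rnorm(20), y = rnorm(20))
nd  <- data.frame(x = rnorm(5))
out <- bagging_sketch(d, nd, B = 10,
                      kernels = list("linear", "rbf"), weights = c(0.5, 0.5),
                      fit_svm     = function(data, k) mean(data$y),
                      predict_svm = function(fit, nd) rep(fit, nrow(nd)))
length(out)  # 5 (one aggregated prediction per new observation)
```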

A powerful feature of the FastSurvivalSVM package is the ability to mix standard kernels (like Linear and RBF) with custom user-defined kernels in the same ensemble.

This guide demonstrates:

  • How to define simple custom kernel functions in R ✍️
  • How to instantiate them for the ensemble using grid_kernel() 🧩
  • How to train a Random Machines model that automatically selects the best kernels 🎯

1. Data Preparation 📦

We start by generating a synthetic survival dataset with right censoring.

library(FastSurvivalSVM)

set.seed(42)

# Generate synthetic survival data (n = 300)
# This function creates nonlinear relationships suitable for kernel methods.
df <- data_generation(n = 300, prop_cen = 0.25)

# Split into Training (200) and Testing (100) sets
train_idx <- sample(seq_len(nrow(df)), 200)
train_df  <- df[train_idx, ]
test_df   <- df[-train_idx, ]

head(train_df)
#>          tempo cens          x1        x2        x3
#> 243  0.1189498    1  2.25848166 3.0642572 3.8047184
#> 282  0.2520948    0 -0.66262940 1.1302032 0.3690098
#> 34   2.1360394    1  0.39107362 2.3055282 2.8014030
#> 49   0.1223509    1  0.56855380 2.1746382 0.9993991
#> 111 14.1078406    1  0.97490745 0.5951211 0.7997705
#> 195  0.5372712    1  0.04082956 0.7702062 5.0694055

2. Defining Custom Kernels 🧠

In previous versions of this package, defining custom kernels required advanced R concepts like Function Factories and force(). This is no longer necessary.

Now, you simply define the mathematical logic of your kernel as a standard R function. The function must accept at least x and z (the data vectors) and any other parameters you need.
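For instance, a minimal kernel following this convention is a plain dot product. The name `my_linear` is ours, shown only to illustrate the required signature:

```r
# Minimal illustration of the expected signature: x and z come first,
# followed by any hyperparameters (none needed here).
my_linear <- function(x, z) {
  sum(as.numeric(x) * as.numeric(z))
}

my_linear(c(1, 2), c(3, 4))  # 1*3 + 2*4 = 11
```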

2.1 Wavelet Kernel 🌊

The Wavelet kernel (Ará et al., 2016) can be written as:

$$K(x,z) = \prod \cos\left(1.75\frac{x - z}{A}\right)\exp\left(-\frac{1}{2}\left(\frac{x - z}{A}\right)^2\right)$$

my_wavelet <- function(x, z, A = 1) {
  # Calculate scaled distance vector
  u <- (as.numeric(x) - as.numeric(z)) / A
  
  # Product of the mother wavelet function
  prod(cos(1.75 * u) * exp(-0.5 * u^2))
}
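A quick sanity check of the formula: when x = z, every component of the scaled distance u is zero, so each factor is cos(0) · exp(0) = 1 and the kernel evaluates to exactly 1 (computed here in plain base R):

```r
# With x = z, every component of u is 0, so the product is 1.
u <- rep(0, 3)
prod(cos(1.75 * u) * exp(-0.5 * u^2))  # 1
```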

2.2 Polynomial Kernel (Custom Implementation) 🧮

A custom polynomial kernel can be written as:

$$K(x,z) = (\gamma \langle x,z \rangle + \text{coef0})^{\text{degree}}$$

Here we implement a simplified version with gamma = 1 (absorbed into scaling), so:

$$K(x,z) = (\langle x,z \rangle + \text{coef0})^{\text{degree}}$$

my_poly <- function(x, z, degree, coef0) {
  val <- sum(as.numeric(x) * as.numeric(z)) + coef0
  val ^ degree
}
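A quick hand check of the formula in plain base R (equivalent to calling my_poly(x, z, degree = 2, coef0 = 1)): with x = (1, 2) and z = (3, 4), the inner product is 11, so K = (11 + 1)² = 144.

```r
# Hand check of the simplified polynomial kernel formula.
x <- c(1, 2)
z <- c(3, 4)
(sum(x * z) + 1) ^ 2  # (11 + 1)^2 = 144
</imports>
```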

3. Configuring the Ensemble 🧩

To build a Random Machines ensemble, we provide a list of candidate kernels.

For custom kernels, we need to fix their hyperparameters before passing them to the ensemble. We use the helper function grid_kernel() for this.

Note: grid_kernel() is smart. If you pass a single value for a parameter (e.g., A = 1), it returns a single function ready to use. If you pass a vector (e.g., A = 1:3), it returns a list for tuning. Here, we want single instances ✅.

# Instantiate specific versions of our custom kernels
# We don't need lists or [[1]]. The function returns the object directly.

k_wav_1 <- grid_kernel(my_wavelet, A = 1)
k_pol_2 <- grid_kernel(my_poly, degree = 2, coef0 = 1)

Now we define the Kernel Mix. We will combine:

  • Linear (Standard scikit-learn)
  • RBF (Standard scikit-learn)
  • Wavelet (Custom R)
  • Polynomial (Custom R)

We set rank_ratio = 0 for all of them, meaning we are solving a regression problem (predicting the survival time/risk score directly), rather than a ranking problem.

kernel_mix <- list(
  # --- Standard Kernels (processed in Python) ---
  linear_base = list(kernel = "linear", alpha = 1,   rank_ratio = 0),
  rbf_std     = list(kernel = "rbf",    alpha = 0.5, gamma = 0.1, rank_ratio = 0),

  # --- Custom Kernels (processed in R) ---
  wavelet_A1  = list(kernel = k_wav_1,  alpha = 1,   rank_ratio = 0),
  poly_deg2   = list(kernel = k_pol_2,  alpha = 1,   rank_ratio = 0)
)

4. Training Random Machines 🏗️

We fit the model using random_machines().

Key Parameters:

  • B: Number of bootstrap samples (machines).
  • mtry: Number of variables randomly sampled at each split (Random Subspace).
  • prop_holdout: Fraction of training data kept aside inside the function to calculate kernel weights.
  • crop: Pruning threshold. If a kernel's performance weight is below this value (e.g., 0.10), it is removed from that bootstrap iteration.

# Use all available cores for parallel processing
cores <- parallel::detectCores()

set.seed(123)

rm_model <- random_machines(
  data         = train_df,
  newdata      = test_df,   # Predict on test set immediately
  time_col     = "tempo",
  delta_col    = "cens",
  kernels      = kernel_mix,
  B            = 50,        # Number of machines
  mtry         = NULL,      # Use all features (or set an integer)
  crop         = 0.10,      # Drop weak kernels (weight <= 10%)
  prop_holdout = 0.20,      # 20% internal validation for weighting
  cores        = cores,
  .progress    = TRUE
)
#> 
#> ── 🚀 Random Machines (Kernel Survival SVM) ────────────────────────────────────
#> ℹ Starting Random Machines (B=50, mtry=All) on 10 cores.
#> ℹ Kernel weights via Holdout: 160 training / 40 validation.
#> ℹ Computing kernel weights...
#> ℹ Executing parallel bootstrap...
#> ■■                                 2% | ETA:  3m
#> ■■                                 4% | ETA:  2m
#> ■■■■■■■■■                         26% | ETA: 28s
#> ■■■■■■■■■■■■■■■■■■■■■■             70% | ETA:  5s
#> ■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■   100% | ETA:  0s
#> ✔ Done. Valid Models: 50/50. Mean OOB: 0.6874

5. Evaluation and Interpretation 🔎

5.1 Model Summary 📋

Printing the model gives us insights into which kernels were actually selected and how much weight they received.

print(rm_model)
#> 
#> ── 📦 Random Machines (FastKernelSurvivalSVM) ──────────────────────────────────
#> • Models Trained: 50
#> • Features (mtry): All
#> • Mean OOB C-index: 0.6874
#> • Crop Threshold: 0.1
#> 
#> ── 📊 Kernel Usage (Bootstrap Selection) ──
#> 
#> ──────────────────────────────────────
#> Kernel      | Count | Probability
#> ──────────────────────────────────────
#> wavelet_A1  |    20 |      0.4000
#> rbf_std     |    11 |      0.2200
#> linear_base |    10 |      0.2000
#> poly_deg2   |     9 |      0.1800
#> ──────────────────────────────────────
#> 
#> ── ⚖️ Kernel Weights (Holdout Probabilities) ──
#> 
#> ──────────────────────────────────────
#> Kernel      | Probability | Status
#> ──────────────────────────────────────
#> wavelet_A1  |      0.3278 | ✅ Selected
#> rbf_std     |      0.2684 | ✅ Selected
#> poly_deg2   |      0.2077 | ✅ Selected
#> linear_base |      0.1961 | ✅ Selected
#> ──────────────────────────────────────

  • Kernel Usage: Shows how often each kernel type was picked during the bagging process.
  • Kernel Weights: Shows the probabilistic weight assigned to each kernel family based on the internal holdout performance.
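For intuition only, one plausible way to turn per-kernel holdout scores into such probabilities is simple normalization. The C-index values below are made up, and the package's exact weighting formula may differ:

```r
# Illustrative only: hypothetical holdout C-indexes per kernel,
# normalized so the selection weights sum to 1.
cindex  <- c(wavelet_A1 = 0.72, rbf_std = 0.69, poly_deg2 = 0.66, linear_base = 0.65)
weights <- cindex / sum(cindex)
round(weights, 4)
sum(weights)  # 1
```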

5.2 Performance (C-index) 🏁

Finally, we evaluate the ensembleโ€™s performance on the independent test set using the Concordance Index.

c_index <- score(rm_model, test_df)
cat(sprintf("Final C-Index on Test Data: %.4f\n", c_index))
#> Final C-Index on Test Data: 0.7952

Values closer to 1.0 indicate better predictive discrimination ✅.
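For intuition, a naive O(n²) concordance computation for fully observed (uncensored) times might look like the sketch below. The package's score() handles censoring properly, so this is illustrative only:

```r
# Naive C-index sketch, ignoring censoring: the fraction of comparable
# pairs in which the observation with the shorter survival time also
# has the higher predicted risk (risk ties count as 0.5).
naive_cindex <- function(time, risk) {
  concordant <- 0
  comparable <- 0
  n <- length(time)
  for (i in seq_len(n - 1)) {
    for (j in (i + 1):n) {
      if (time[i] == time[j]) next   # skip tied times
      comparable <- comparable + 1
      shorter <- if (time[i] < time[j]) i else j
      longer  <- if (shorter == i) j else i
      if (risk[shorter] > risk[longer]) {
        concordant <- concordant + 1
      } else if (risk[shorter] == risk[longer]) {
        concordant <- concordant + 0.5
      }
    }
  }
  concordant / comparable
}

naive_cindex(time = c(1, 2, 3, 4), risk = c(4, 3, 2, 1))  # perfect ranking: 1
```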

Conclusion 🎉

The FastSurvivalSVM package makes it easy to build powerful survival ensembles. By using grid_kernel(), you can seamlessly integrate complex, domain-specific kernels (defined in R) alongside standard efficient kernels (from Python), allowing the Random Machines algorithm to automatically learn the best representation for your data.