Fit FastKernelSurvivalSVM (scikit-survival) from an R data frame

Wraps the Python estimator sksurv.svm.FastKernelSurvivalSVM and provides an R interface for fitting kernel survival SVMs to right-censored data.

fastsvm(
  data,
  time_col = "t",
  delta_col = "delta",
  kernel = "rbf",
  alpha = 1,
  rank_ratio = 0,
  fit_intercept = TRUE,
  ...
)

Arguments

data: A data.frame containing survival times, an event indicator, and one or more covariates.
time_col: Name of the column in data containing survival times.
delta_col: Name of the column in data containing the event indicator (1 = event, 0 = right censoring).
kernel: Either a character string specifying a kernel supported by scikit-learn (for example "rbf", "poly", "sigmoid"), or an R function of the form function(x, z) ... that takes two numeric vectors and returns a scalar kernel value.
alpha: Positive regularization parameter giving the weight of the squared hinge loss in the objective function (see the scikit-survival documentation).
rank_ratio: Mixing parameter between regression and ranking objectives, with 0 <= rank_ratio <= 1. Use 0 for pure regression and 1 for pure ranking.
fit_intercept: Logical; if TRUE, an intercept is included in the regression objective (only relevant when rank_ratio < 1).
...: Additional arguments passed directly to sksurv.svm.FastKernelSurvivalSVM().
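
As a rough illustration of the custom-kernel contract described for kernel, the sketch below shows how a scalar function of two numeric vectors can be expanded into a Gram matrix in base R. The helper gram_matrix() is hypothetical and not part of this package; how fastsvm() actually passes the function to Python is not shown here.

```r
# Hypothetical helper (not part of the package): evaluate a scalar kernel
# function(x, z) on every pair of rows of X and Z.
gram_matrix <- function(X, Z, kernel) {
  X <- as.matrix(X)
  Z <- as.matrix(Z)
  K <- matrix(0, nrow = nrow(X), ncol = nrow(Z))
  for (i in seq_len(nrow(X))) {
    for (j in seq_len(nrow(Z))) {
      K[i, j] <- kernel(X[i, ], Z[j, ])
    }
  }
  K
}

# Gaussian (RBF) kernel written in the function(x, z) form
rbf_r <- function(x, z, sigma = 1) {
  exp(-sum((x - z)^2) / (2 * sigma^2))
}

set.seed(1)
X <- matrix(rnorm(10), nrow = 5)   # 5 observations, 2 covariates
K <- gram_matrix(X, X, rbf_r)      # 5 x 5 symmetric matrix, unit diagonal
```

For comparison, scikit-learn's "rbf" kernel is parameterised as exp(-gamma * ||x - z||^2), so the sigma used here corresponds to gamma = 1 / (2 * sigma^2).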

Value

An object of class "fastsvm" that wraps the underlying Python model and stores metadata about the fit.

Details

The input data must contain a time column, an event indicator column, and one or more covariate columns. Internally, the function constructs the survival outcome in the format required by scikit-survival and calls the Python estimator via reticulate.
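
The preparation step described above might look roughly like the following base R sketch. The helper split_survival_data() is hypothetical, and treating every column other than the time and event columns as a covariate is an assumption, not a documented behaviour of fastsvm(). The logical event vector reflects that scikit-survival expects a boolean event field paired with a numeric time field.

```r
# Hypothetical helper (not part of the package): split a data frame into the
# pieces needed to build a survival outcome and a covariate matrix.
split_survival_data <- function(data, time_col = "t", delta_col = "delta") {
  stopifnot(is.data.frame(data),
            all(c(time_col, delta_col) %in% names(data)))
  # Assumption: all remaining columns are covariates
  covariates <- setdiff(names(data), c(time_col, delta_col))
  stopifnot(length(covariates) >= 1)
  list(
    time       = as.numeric(data[[time_col]]),
    event      = data[[delta_col]] == 1,   # TRUE = event, FALSE = censored
    X          = as.matrix(data[, covariates, drop = FALSE]),
    covariates = covariates
  )
}

df <- data.frame(
  time   = c(5.2, 3.1, 8.0),
  status = c(1, 0, 1),
  x1     = c(0.1, -0.4, 1.3),
  x2     = c(2, 0, 1)
)
parts <- split_survival_data(df, time_col = "time", delta_col = "status")
str(parts)
```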

Examples

if (reticulate::py_module_available("sksurv")) {
  set.seed(1)
  n <- 100
  df <- data.frame(
    time   = rexp(n, rate = 0.1),
    status = rbinom(n, 1, 0.7),  # 1 = event, 0 = censoring
    x1     = rnorm(n),
    x2     = rnorm(n)
  )

  # Example 1: using a built-in RBF kernel from scikit-learn
  fit_rbf <- fastsvm(
    data        = df,
    time_col    = "time",
    delta_col   = "status",
    kernel      = "rbf",
    alpha       = 1,
    rank_ratio  = 0   # pure regression
  )

  # Predictions (transformed survival times / risk scores)
  y_hat <- predict(fit_rbf, df)
  head(y_hat)

  # Concordance index on the training data
  cidx <- score(fit_rbf, df)
  cidx

  # Example 2: using a custom RBF kernel defined in R
  rbf_r <- function(x, z, sigma = 1) {
    d2 <- sum((x - z)^2)
    exp(-d2 / (2 * sigma^2))
  }

  fit_custom <- fastsvm(
    data        = df,
    time_col    = "time",
    delta_col   = "status",
    kernel      = function(x, z) rbf_r(x, z, sigma = 0.5),
    alpha       = 1,
    rank_ratio  = 0
  )

  summary(fit_custom)
}
#> Summary of FastKernelSurvivalSVM model (kernel survival SVM)
#> ======================================================================
#> 
#> == Data ==
#> - n (observations) : 100
#> - p (covariates)   : 2
#> - Covariates       :  x1, x2 
#> 
#> == Hyperparameters ==
#> - kernel           : custom callable function
#> - alpha            : 1
#> - rank_ratio       : 0 (0 = pure regression)
#> - fit_intercept    : TRUE
#> 
#> == Estimated parameters (coef_ = sample-wise weights alpha_i) ==
#> - Number of support-like vectors (|alpha_i| > 1e-8): 100
#> - Summary of alpha_i (coef_):
#>       Min.    1st Qu.     Median       Mean    3rd Qu.       Max. 
#> -2.261e+00 -2.964e-01  4.666e-03  6.800e-07  3.650e-01  1.768e+00 
#> 
#> - Number of optimization iterations: 11
#> ======================================================================
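
The concordance index returned by score() in the example can be understood through a simplified base R sketch of Harrell's C. The function c_index() is hypothetical and is not the implementation used by the package or by scikit-survival. It assumes predictions are on a time-like scale (larger means longer predicted survival) and ignores pairs with tied observed times.

```r
# Hypothetical helper (not part of the package): simplified Harrell's C.
# A pair (i, j) is comparable when subject i had an event and time[i] < time[j].
# It is concordant when the prediction for i is also smaller than for j;
# tied predictions count as half.
c_index <- function(time, event, pred_time) {
  concordant <- 0
  comparable <- 0
  n <- length(time)
  for (i in seq_len(n)) {
    if (!event[i]) next
    for (j in seq_len(n)) {
      if (time[i] < time[j]) {
        comparable <- comparable + 1
        if (pred_time[i] < pred_time[j]) {
          concordant <- concordant + 1
        } else if (pred_time[i] == pred_time[j]) {
          concordant <- concordant + 0.5
        }
      }
    }
  }
  concordant / comparable
}

time  <- c(1, 2, 3, 4)
event <- c(TRUE, TRUE, FALSE, TRUE)
c_perfect  <- c_index(time, event, pred_time = c(1, 2, 3, 4))  # same ordering
c_reversed <- c_index(time, event, pred_time = c(4, 3, 2, 1))  # opposite ordering
```

A value near 1 indicates that predicted and observed orderings agree, 0.5 corresponds to random ordering, and values near 0 indicate reversed ordering.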