ace() returns the empirical unweighted mean absolute calibration gap over
non-empty equal-width bins. Unlike ece(), each non-empty bin contributes
equally. For multiclass inputs the "classwise" form averages the binary ACE
over the one-vs-rest columns and the "confidence" form uses the top-label
confidence.
Usage
ace(p, y, bins = 10, type = c("classwise", "confidence"))Arguments
- p
Predicted probabilities. A numeric vector in
[0, 1]for binary problems, or a numeric matrix with one column per class for multiclass problems. Matrix inputs must have finite entries in[0, 1], at least two columns, and rows summing to one within absolute tolerance1e-6.- y
Outcome labels. A vector coded as
0and1for binary problems, or a factor or vector of integer class codes in1:Kfor multiclass problems.- bins
Number of equal-width bins on
[0, 1]. Must be a single positive integer.- type
Multiclass aggregation, either
"classwise"or"confidence". Ignored for binary inputs.
Details
Using the same bin notation and endpoint convention as ece(), let \(M\)
be the number of non-empty bins. The binary empirical average calibration
error is
$$\operatorname{ACE} = \frac{1}{M}\sum_{b: n_b > 0} |\operatorname{acc}(b) - \operatorname{conf}(b)|.$$
Unlike ECE, ACE does not weight bins by their sample sizes. Sparse bins and
dense bins therefore contribute equally once they are non-empty. This
implementation uses equal-width bins on [0, 1]; it does not construct
adaptive or equal-frequency bins. For a multiclass probability matrix,
type = "classwise" returns the arithmetic mean of the one-vs-rest binary
ACE values,
$$\operatorname{ACE}_{\mathrm{cw}} = \frac{1}{K}\sum_{k = 1}^K \operatorname{ACE}(p_{\cdot k}, \mathbf{1}\{y_i = k\}).$$
type = "confidence" returns \(\operatorname{ACE}(r, c)\)
using top-label confidence and correctness.
References
Niculescu-Mizil, A., & Caruana, R. (2005). Predicting good probabilities with supervised learning. Proceedings of the 22nd International Conference on Machine Learning.
Examples
predictions <- data.frame(
p = c(0.10, 0.20, 0.80, 0.90),
y = c(0, 0, 1, 1)
)
predictions |>
dplyr::summarise(ace = ace(p, y, bins = 2))
#> ace
#> 1 0.15
