Conformalization Strategies¶
Calibration strategies trade off data efficiency, runtime, memory, and the strength of the validity guarantee. Start with the simplest strategy that fits your data size and operational constraints.
Quick Decision Guide¶
| Situation | Recommended starting point | Why |
|---|---|---|
| Large data, speed matters | Split(n_calib=0.2) | Cleanest validity story and low runtime |
| Large data, speed is secondary | JackknifeBootstrap(n_bootstraps=100) | More resampling stability |
| Medium data | JackknifeBootstrap(n_bootstraps=100) | Avoids spending a large fixed holdout |
| Small data | CrossValidation.jackknife() | Uses nearly all observations for fitting/calibration |
Split conformal is the cleanest strict finite-sample baseline because the model
is fitted on data that is disjoint from the data used to compute calibration
scores. Resampling strategies (CrossValidation, Jackknife, Jackknife+, and JaB+)
use data more efficiently and often work well in practice, but their guarantees
are not interchangeable with the split-conformal guarantee. For example,
J+aB-style bounds can allow a larger failure rate than the nominal split target.
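To make the role of the separated calibration scores concrete, here is a minimal sketch of the textbook conformal p-value computed from held-out calibration scores. It assumes higher scores mean "more anomalous"; the helper name `conformal_p_value` and the toy scores are illustrative and are not taken from the nonconform API.

```python
def conformal_p_value(calibration_scores, test_score):
    """Conservative conformal p-value for one test score.

    Assumption: higher scores mean "more anomalous".
    """
    n = len(calibration_scores)
    # Count calibration scores at least as extreme as the test score.
    n_at_least = sum(1 for s in calibration_scores if s >= test_score)
    # The +1 terms treat the test point as one more exchangeable draw.
    return (1 + n_at_least) / (n + 1)


# Toy calibration scores from a hypothetical held-out set.
calibration_scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.55, 0.3, 0.6, 0.15]

p_typical = conformal_p_value(calibration_scores, 0.3)
p_extreme = conformal_p_value(calibration_scores, 0.95)
print(p_typical, p_extreme)
```

With n calibration scores the smallest attainable p-value is 1 / (n + 1), which is one reason the calibration set size matters for split conformal.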
If you need smoother p-values, consider Probabilistic() (KDE-based), but treat
that as model-based/asymptotic behavior rather than exact split-conformal
finite-sample validity.
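As an illustration of what "smoother" means here, the sketch below estimates the upper-tail probability of a test score with a Gaussian kernel density over the calibration scores. This is only an assumption about the general idea: the kernel, the fixed `bandwidth` argument, and the function name `smoothed_p_value` are hypothetical, and Probabilistic() may work differently.

```python
import math


def smoothed_p_value(calibration_scores, test_score, bandwidth):
    """Gaussian-KDE estimate of P(score >= test_score).

    Illustrative only: no finite-sample conformal guarantee is implied.
    """

    def upper_tail(z):
        # Survival function of the standard normal distribution.
        return 0.5 * math.erfc(z / math.sqrt(2.0))

    # Each calibration score contributes one Gaussian bump of width `bandwidth`.
    tails = [upper_tail((test_score - s) / bandwidth) for s in calibration_scores]
    return sum(tails) / len(tails)


calibration_scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.55, 0.3, 0.6, 0.15]
p_low = smoothed_p_value(calibration_scores, 0.2, bandwidth=0.1)
p_high = smoothed_p_value(calibration_scores, 0.9, bandwidth=0.1)
print(p_low, p_high)
```

Unlike the rank-based p-value, this estimate varies continuously with the test score, but its quality depends on the bandwidth and on how well the kernel model fits the score distribution.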
For CV/Jackknife/Bootstrap-style conformalization, mode="plus" is the more
defensible validity-oriented default. mode="single_model" can perform
similarly in practice and is lighter at inference time, but it weakens the
validity story further.
For detailed guidance, see Choosing Strategies.
Available Strategies¶
Split Strategy¶
Simple train/calibration split. Fast and straightforward.
from nonconform import Split
# Use 30% of data for calibration
strategy = Split(n_calib=0.3)
# Use fixed number of samples for calibration
strategy = Split(n_calib=1000)
Characteristics:
- Fastest computation
- Simplest implementation
- Least robust for small datasets
- Memory efficient
- Best validity story when calibration and test points are exchangeable
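The two forms of n_calib shown above (a fraction or an absolute count) can be pictured with the following sketch. The helper `split_indices`, its rounding rule, and its seed handling are assumptions made for illustration and are not nonconform's implementation.

```python
import random


def split_indices(n_samples, n_calib, seed=0):
    """Return (train_indices, calibration_indices) for a random split.

    Assumption: a float in (0, 1) is a fraction, an int is an absolute count.
    """
    if isinstance(n_calib, float):
        if not 0.0 < n_calib < 1.0:
            raise ValueError("fractional n_calib must lie in (0, 1)")
        calib_size = int(round(n_samples * n_calib))
    else:
        calib_size = int(n_calib)
    if not 0 < calib_size < n_samples:
        raise ValueError("calibration set must be non-empty and leave training data")

    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # The two index sets are disjoint by construction.
    return indices[calib_size:], indices[:calib_size]


train_idx, calib_idx = split_indices(1000, 0.3)
_, fixed_calib = split_indices(5000, 1000)
print(len(train_idx), len(calib_idx), len(fixed_calib))
```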
Cross-Validation Strategy¶
K-fold cross-validation for robust calibration using all data.
from nonconform import CrossValidation
# 5-fold cross-validation with one final model kept for inference
strategy = CrossValidation(k=5, mode="single_model")
# Plus mode keeps fold models for plus-style inference (recommended)
strategy = CrossValidation(k=5, mode="plus")
mode semantics
For CrossValidation (including CrossValidation.jackknife(...)) and JackknifeBootstrap:
- Default when omitted: mode="plus"
- Valid values: "plus" and "single_model" (or ConformalMode.PLUS / ConformalMode.SINGLE_MODEL)
- mode="plus": keeps per-fold/per-bootstrap models for plus-style inference
- mode="single_model": still calibrates via folds/bootstraps, then fits one final model on all training data for inference
- mode="single_model" can weaken conformal validity; use mode="plus" when validity is the priority
Characteristics:
- Uses data efficiently through folds
- Higher computational cost than Split
- Useful alternative when deterministic fold-based calibration is preferred
- Validity story depends on mode; prefer mode="plus" when validity is the priority
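The difference between the two modes can be sketched with a toy one-dimensional "detector" whose model is just the training mean. Everything here is hypothetical: `fit_center`, `anomaly_score`, `cross_conformal`, and the mean aggregation in `inference_score` are illustrative stand-ins, and the library may fold and aggregate differently.

```python
from statistics import fmean


def fit_center(train):
    """Toy 'detector': the model is just the training mean."""
    return fmean(train)


def anomaly_score(center, x):
    """Distance from the fitted center; higher means more anomalous."""
    return abs(x - center)


def cross_conformal(data, k, mode="plus"):
    """Return (calibration_scores, models_kept_for_inference)."""
    folds = [data[i::k] for i in range(k)]  # simple interleaved folds
    calibration_scores, fold_models = [], []
    for i, held_out in enumerate(folds):
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        model = fit_center(train)
        fold_models.append(model)
        # Each point is scored by a model that never saw it.
        calibration_scores.extend(anomaly_score(model, x) for x in held_out)
    if mode == "plus":
        return calibration_scores, fold_models
    if mode == "single_model":
        # Calibration is unchanged; inference uses one refit on all data.
        return calibration_scores, [fit_center(data)]
    raise ValueError(f"unknown mode: {mode!r}")


def inference_score(models, x):
    """Assumed aggregation: mean score across the kept models."""
    return fmean(anomaly_score(m, x) for m in models)


data = [float(v) for v in range(10)]
plus_scores, plus_models = cross_conformal(data, k=5, mode="plus")
single_scores, single_models = cross_conformal(data, k=5, mode="single_model")
print(len(plus_models), len(single_models))
```

In this sketch the calibration scores are identical in both modes. What changes is which models score a new point: in "single_model" mode the inference model was trained on the very points it was calibrated against, which is the source of the weaker validity story.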
JaB+ Strategy (Jackknife+-after-Bootstrap)¶
Bootstrap resampling with Jackknife+ for robust calibration [Kim et al., 2020].
from nonconform import JackknifeBootstrap
# Typical JaB+ starting point (100+ bootstraps recommended)
strategy = JackknifeBootstrap(n_bootstraps=100)
# Higher precision with more bootstraps
strategy = JackknifeBootstrap(n_bootstraps=200)
Characteristics:
- Flexible ensemble size
- Useful stability check for noisy/small data
- Configurable computational cost
- Typically recommended: 100+ bootstraps for stable behavior
- Looser guarantee than split conformal in the J+aB theory
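A rough sketch of the out-of-bag idea behind Jackknife+-after-Bootstrap follows: each bootstrap model is fitted once, and every training point is then scored only by the models whose bootstrap sample did not contain it. The toy mean-based model, the mean aggregation, and the name `jab_calibration` are assumptions for illustration, not the library's implementation.

```python
import random
from statistics import fmean


def jab_calibration(data, n_bootstraps, seed=0):
    """Return ({index: out-of-bag score}, fitted bootstrap centers)."""
    rng = random.Random(seed)
    n = len(data)
    centers, bags = [], []
    for _ in range(n_bootstraps):
        bag = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        centers.append(fmean([data[i] for i in bag]))  # toy model: the mean
        bags.append(set(bag))

    scores = {}
    for i, x in enumerate(data):
        # Only models that never saw point i may score it.
        oob_centers = [c for c, bag in zip(centers, bags) if i not in bag]
        if oob_centers:  # skip the rare point that landed in every bag
            scores[i] = fmean([abs(x - c) for c in oob_centers])
    return scores, centers


data = [0.1 * v for v in range(20)]
scores, centers = jab_calibration(data, n_bootstraps=100)
print(len(scores), len(centers))
```

The number of model fits equals n_bootstraps regardless of dataset size, which is what makes the cost configurable compared with leave-one-out.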
Jackknife Strategy¶
Leave-one-out cross-validation for maximum data utilization [Barber et al., 2021].
from nonconform import CrossValidation
# Standard jackknife with one final inference model
strategy = CrossValidation.jackknife(mode="single_model")
# Jackknife+ keeps leave-one-out models for plus-style inference
strategy = CrossValidation.jackknife(mode="plus")
Characteristics:
- Maximum data utilization
- Computationally intensive
- Best for very small datasets
- Can be infeasible when one model fit per observation is too expensive
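The cost noted above comes from fitting one model per observation. A minimal leave-one-out sketch with a toy mean-based model (the helper `leave_one_out_scores` is hypothetical) makes the fit count explicit:

```python
def leave_one_out_scores(data):
    """Return (calibration_scores, number_of_model_fits)."""
    scores = []
    n_fits = 0
    for i, x in enumerate(data):
        train = data[:i] + data[i + 1:]  # every point except the i-th
        center = sum(train) / len(train)  # toy model: the training mean
        n_fits += 1
        scores.append(abs(x - center))
    return scores, n_fits


scores, n_fits = leave_one_out_scores([1.0, 2.0, 3.0, 10.0])
print(scores, n_fits)
```

Here four observations require four fits; with a real detector the same one-fit-per-point pattern is what can make jackknife infeasible on larger datasets.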
Strategy Selection Guide¶
| Dataset Size | Computational Budget | Recommendation |
|---|---|---|
| Large (>5,000) | Low | Split |
| Large (>5,000) | High | JackknifeBootstrap |
| Medium (500-5,000) | Any | JackknifeBootstrap |
| Small (<500) | Any | Jackknife |
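The table can also be encoded as a small helper, for example to pick a default in a pipeline. The function `recommend_strategy` is a hypothetical convenience, not part of nonconform; it returns strategy names as strings and uses the table's thresholds as rules of thumb.

```python
def recommend_strategy(n_samples, high_budget=False):
    """Map dataset size and compute budget to the table's recommendation."""
    if n_samples > 5000:  # large
        return "JackknifeBootstrap" if high_budget else "Split"
    if n_samples >= 500:  # medium
        return "JackknifeBootstrap"
    return "Jackknife"  # small


for size in (200, 2000, 20000):
    print(size, recommend_strategy(size), recommend_strategy(size, high_budget=True))
```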
Mode Semantics¶
CrossValidation and JackknifeBootstrap strategies support "plus" mode for stronger conformal validity behavior in anomaly detection workflows [Barber et al., 2021; Kim et al., 2020]:
# Enable plus mode for CV strategies
strategy = CrossValidation(k=5, mode="plus")
strategy = CrossValidation.jackknife(mode="plus")
strategy = JackknifeBootstrap(n_bootstraps=100, mode="plus")
mode="plus" provides:
- Higher statistical efficiency in theory [Barber et al., 2021]
- Better resampling-based validity behavior than single_model
- Slightly higher computational cost
- A more defensible approximation or looser guarantee, depending on the method
The "plus" suffix (e.g., Jackknife+, CV+) indicates a refined variant that is typically preferred when validity is the priority, but it should not be read as equivalent to the strict split-conformal guarantee.
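For orientation, the worst-case bounds in the cited papers are stated for prediction intervals at nominal miscoverage level $\alpha$ under exchangeability. The notation below ($\hat{C}$ for the interval, $(X_{n+1}, Y_{n+1})$ for the test point) is introduced here for illustration; how these bounds carry over to anomaly-detection p-values in this library is not covered by this sketch.

```latex
% Split conformal (Vovk et al., 2005; Lei et al., 2018)
\mathbb{P}\left\{ Y_{n+1} \in \hat{C}^{\mathrm{split}}_{\alpha}(X_{n+1}) \right\} \ge 1 - \alpha

% Jackknife+ (Barber et al., 2021)
\mathbb{P}\left\{ Y_{n+1} \in \hat{C}^{\mathrm{jackknife+}}_{\alpha}(X_{n+1}) \right\} \ge 1 - 2\alpha
```

Kim et al. (2020) report a bound of the same $1 - 2\alpha$ form for J+aB under the conditions of their paper. Empirical coverage is often close to $1 - \alpha$, but the distribution-free guarantee is the weaker one.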
mode="single_model" provides:
- Lower inference-time memory footprint
- One final detector trained on full data for inference
- Can be close to mode="plus" in practice for some datasets
- No validity guarantee comparable to mode="plus"
Performance Comparison¶
| Strategy | Training Time | Memory Usage | Practical Calibration Stability |
|---|---|---|---|
| Split | Fast | Low | Good with enough calibration data |
| CrossValidation | Medium | Medium | Good for limited data |
| JackknifeBootstrap | Medium-High | Medium-High | Good for noisy or scarce data |
| Jackknife (LOO) | Slow | High | Good for very small data |
Integration with Detectors¶
All strategies work with any conformal detector:
from nonconform import ConformalDetector, CrossValidation, JackknifeBootstrap, logistic_weight_estimator
from pyod.models.lof import LOF

# Standard conformal with cross-validation
detector = ConformalDetector(
    detector=LOF(),
    strategy=CrossValidation(k=5),
)

# Weighted conformal with JaB+
detector = ConformalDetector(
    detector=LOF(),
    strategy=JackknifeBootstrap(n_bootstraps=100),
    weight_estimator=logistic_weight_estimator(),
    seed=42,
)
References¶
- Vovk, V. (2015). Cross-conformal predictors. Annals of Mathematics and Artificial Intelligence, 74(1-2), 9-28. [Cross-conformal prediction and empirical validity]
- Kim, B., Xu, C., & Barber, R. F. (2020). Predictive Inference Is Free with the Jackknife+-after-Bootstrap. Advances in Neural Information Processing Systems (NeurIPS), 33. [JaB+ method with looser coverage guarantees]
- Barber, R. F., Candes, E. J., Ramdas, A., & Tibshirani, R. J. (2021). Predictive Inference with the Jackknife+. The Annals of Statistics, 49(1), 486-507. [Jackknife+ method with improved finite-sample efficiency]
- Vovk, V., Gammerman, A., & Shafer, G. (2005). Algorithmic Learning in a Random World. Springer. [Foundational work on conformal prediction]
- Lei, J., G'Sell, M., Rinaldo, A., Tibshirani, R. J., & Wasserman, L. (2018). Distribution-Free Predictive Inference for Regression. Journal of the American Statistical Association, 113(523), 1094-1111. [Split conformal prediction with theoretical guarantees]
Next Steps¶
- See Choosing Strategies for a detailed decision framework
- Learn about conformal inference for the theoretical foundations
- Check input validation for parameter constraints