Package: mlr3resampling 2025.3.30

mlr3resampling: Resampling Algorithms for 'mlr3' Framework

A supervised learning algorithm takes a train set as input and outputs a prediction function, which can then be applied to a test set. If each data point belongs to a subset (such as a geographic region or a year), how do we know whether subsets are similar enough that we can get accurate predictions on one subset after training on Other subsets? And how do we know whether training on All subsets would improve prediction accuracy, relative to training on the Same subset? SOAK, Same/Other/All K-fold cross-validation <doi:10.48550/arXiv.2410.08643>, can be used to answer these questions by fixing a test subset, training models on Same/Other/All subsets, and then comparing test error rates (Same versus Other and Same versus All). The package also provides code for estimating how many train samples are required to get accurate predictions on a test set.
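The Same/Other/All split described above can be illustrated with a minimal base R sketch. It does not use the package's own classes or reproduce its implementation. The toy vectors subset_vec and fold_vec and the helper soak_split are hypothetical names introduced only for this illustration, and the sketch assumes that the Other and All train sets are drawn from the train folds only.

```r
# Hypothetical toy data: 12 rows, 3 subsets (e.g. regions), 2 folds.
subset_vec <- rep(c("A", "B", "C"), each = 4)
fold_vec <- rep(1:2, times = 6)

# Illustrative helper (not part of mlr3resampling): for a fixed test
# subset and test fold, return row indices of the test set and of the
# Same, Other, and All train sets.
soak_split <- function(subset_vec, fold_vec, test_subset, test_fold) {
  is_train_fold <- fold_vec != test_fold
  is_same <- subset_vec == test_subset
  list(
    test  = which(is_same & !is_train_fold),  # held-out rows of the test subset
    same  = which(is_same & is_train_fold),   # train on the Same subset
    other = which(!is_same & is_train_fold),  # train on Other subsets
    all   = which(is_train_fold)              # train on All subsets
  )
}

split <- soak_split(subset_vec, fold_vec, test_subset = "A", test_fold = 1)
str(split)
```

A model would then be fit on each of the three train index sets and evaluated on the same test rows, so that the Same versus Other and Same versus All comparisons share one test set.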

Authors: Toby Hocking [aut, cre], Michel Lang [ctb], Bernd Bischl [ctb], Jakob Richter [ctb], Patrick Schratz [ctb], Giuseppe Casalicchio [ctb], Stefan Coors [ctb], Quay Au [ctb], Martin Binder [ctb], Florian Pfisterer [ctb], Raphael Sonabend [ctb], Lennart Schneider [ctb], Marc Becker [ctb], Sebastian Fischer [ctb]

# Install 'mlr3resampling' in R:

install.packages('mlr3resampling', repos = c('https://tdhock.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/tdhock/mlr3resampling/issues

Datasets:

AZtrees - Arizona Trees

On CRAN:

4.73 score, 3 stars, 289 downloads, 5 exports, 22 dependencies

Last updated 2 days ago from: 3f34a56eb4. Checks: 9 OK. Indexed: yes.

Target	Result	Latest binary
Doc / Vignettes	OK	Mar 30 2025
R-4.5-win	OK	Mar 30 2025
R-4.5-mac	OK	Mar 30 2025
R-4.5-linux	OK	Mar 30 2025
R-4.4-win	OK	Mar 30 2025
R-4.4-mac	OK	Mar 30 2025
R-4.4-linux	OK	Mar 30 2025
R-4.3-win	OK	Mar 30 2025
R-4.3-mac	OK	Mar 30 2025

Exports: pvalue, ResamplingSameOtherCV, ResamplingSameOtherSizesCV, ResamplingVariableSizeTrainCV, score

Dependencies: backports, checkmate, codetools, data.table, digest, evaluate, future, future.apply, globals, lgr, listenv, mlbench, mlr3, mlr3measures, mlr3misc, palmerpenguins, paradox, parallelly, PRROC, R6, rlang, uuid

Comparing sizes when training on same or other groups

Rendered from Newer_resamplers.Rmd using knitr::knitr on Mar 30 2025.

Last update: 2024-09-06
Started: 2024-05-14

Older resamplers

Rendered from Older_resamplers.Rmd using knitr::knitr on Mar 30 2025.

Last update: 2025-02-04
Started: 2024-05-02

Help page	Topics
Arizona Trees	AZtrees
P-values for comparing Same/Other/All training	pvalue
Resampling for comparing training on same or other subsets	ResamplingSameOtherCV
Resampling for comparing train subsets and sizes	ResamplingSameOtherSizesCV
Resampling for comparing training on same or other groups	ResamplingVariableSizeTrainCV
Score benchmark results	score
