crossinvalidation
Crossinvalidation is a resampling validation strategy used in statistics and machine learning to assess how well a model generalizes across different data sources or conditions. Unlike standard cross-validation, which randomizes and partitions samples within a single dataset, crossinvalidation explicitly tests transferability across sources, batches, studies, or domains. It can help reveal model performance under distribution shifts and detect overfitting to idiosyncrasies of a particular dataset.
In its simplest form, leave-one-source-out crossinvalidation, data are grouped into sources; in each fold, the model is trained on all sources but one and evaluated on the held-out source.
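The leave-one-source-out procedure can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function names `leave_one_source_out` and `cross_source_scores`, and the `fit` and `score` callables, are hypothetical and chosen only for this example. Source labels are assumed to be sortable (for instance, strings).

```python
from collections import defaultdict


def leave_one_source_out(sources):
    """Yield (held_out_source, train_idx, test_idx) once per distinct source.

    `sources` is a sequence giving the source label of each sample.
    """
    by_source = defaultdict(list)
    for i, label in enumerate(sources):
        by_source[label].append(i)

    for held_out in sorted(by_source):
        test_idx = by_source[held_out]
        # Train on every sample that does not come from the held-out source.
        train_idx = sorted(
            i for label, idx in by_source.items() if label != held_out for i in idx
        )
        yield held_out, train_idx, test_idx


def cross_source_scores(X, y, sources, fit, score):
    """Return {held_out_source: score} for each fold.

    `fit(X_train, y_train)` returns a model and `score(model, X_test, y_test)`
    returns a number; both are supplied by the caller (hypothetical interface).
    """
    results = {}
    for held_out, train_idx, test_idx in leave_one_source_out(sources):
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        results[held_out] = score(
            model, [X[i] for i in test_idx], [y[i] for i in test_idx]
        )
    return results


if __name__ == "__main__":
    # Toy data: three sources, a mean predictor, and mean squared error.
    sources = ["a", "b", "a", "c", "b"]
    X = [[0.0]] * 5
    y = [1.0, 2.0, 1.0, 3.0, 2.0]
    mean_fit = lambda X_tr, y_tr: sum(y_tr) / len(y_tr)
    mse = lambda m, X_te, y_te: sum((v - m) ** 2 for v in y_te) / len(y_te)
    print(cross_source_scores(X, y, sources, mean_fit, mse))
```

A large spread between the per-source scores suggests that the model depends on characteristics of particular sources. Where scikit-learn is available, its `LeaveOneGroupOut` splitter produces the same kind of folds when the source labels are passed as `groups`.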
Use cases include multi-study clinical research, genomics with batch effects, federated learning setups, and domain generalization
Advantages include improved estimation of cross-domain generalization and reduced risk of optimistic bias due to dataset-specific