15/01/2024
10 simple rules for predictive modeling presented in the slides:
Use out-of-sample prediction to generate more accurate and generalizable models. This helps avoid overfitting.
Keep training and testing data independent. Do not contaminate or mix the data sets.
Use cross-validation techniques such as k-fold and leave-one-out cross-validation to validate models.
Share data, code, and models to allow external validation and open science.
Choose performance metrics suited to the prediction task, depending on whether the outcome is continuous or categorical. Assess statistical significance properly.
Be mindful of sample characteristics like distribution, sample size, and balance between groups.
Use nested cross-validation or multiple comparisons correction when testing multiple models and parameters to avoid false discoveries.
Check that predictions match the intended variable and are not driven by confounds.
Don't expect one model to generalize across traits, states, and populations.
Balance predictive performance and interpretability. Simpler models may trade some predictive accuracy for interpretability.
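
The rules on out-of-sample prediction, independent training and testing data, and k-fold cross-validation can be sketched in a few lines. This is a minimal illustration, not something taken from the slides: the helper names (`k_fold_indices`, `fit_line`, `cross_validated_mse`), the synthetic data, and the choice of a one-variable least-squares model are all assumptions made for the example. It uses only the Python standard library.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs; every sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def cross_validated_mse(xs, ys, k=5, seed=0):
    """Mean squared error computed only on held-out samples."""
    squared_errors = []
    for train_idx, test_idx in k_fold_indices(len(xs), k, seed):
        # Fit on the training folds only; the test fold is never seen here.
        slope, intercept = fit_line([xs[i] for i in train_idx],
                                    [ys[i] for i in train_idx])
        for i in test_idx:
            squared_errors.append((ys[i] - (slope * xs[i] + intercept)) ** 2)
    return mean(squared_errors)


# Synthetic data (made up for illustration): y = 2x + 1 plus Gaussian noise.
rng = random.Random(42)
xs = [rng.uniform(0, 10) for _ in range(100)]
ys = [2 * x + 1 + rng.gauss(0, 1) for x in xs]

cv_mse = cross_validated_mse(xs, ys, k=5)
print(f"5-fold out-of-sample MSE: {cv_mse:.3f}")
```

Calling `cross_validated_mse(xs, ys, k=len(xs))` turns the same sketch into leave-one-out cross-validation. Any preprocessing or parameter tuning would also have to happen inside the loop, on the training folds only, to keep the test fold independent.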