Philosophy of phenomic prediction and its incompatibility with causal inference
Mitchell J. Feldmann, Fangyi Wang, Daniel E. Runcie
Abstract
Breeding programs must make frequent decisions to improve populations and develop varieties efficiently. This need led to the development of genomic prediction in the early 2000s and phenomic prediction in the mid‐2010s. In practice, phenomic prediction relies on the same statistical tools and computational frameworks as other analyses (e.g., genomic prediction, variance component estimation, and heritability analysis), but the similarities end there. Phenomic prediction should excel at predicting phenotypic values, whereas genomic prediction should excel at predicting breeding values, or total genetic values if nonadditive variance is incorporated. Phenomic prediction, as commonly implemented, should not be interpreted within a causal inference framework. Phenomic features often share causal determinants with the focal trait (e.g., genotype and environment) and may also be statistically related to the trait itself. Consequently, including phenomic features as predictors alongside those shared determinants in linear (mixed) models can induce confounding and/or collider bias. Estimated effects in phenomic prediction models should therefore be interpreted as associations that support prediction, not as evidence of causal relationships. We discuss these issues and propose the multivariate best linear unbiased predictor model as a way to perform phenomic prediction within a framework more amenable to causal interpretation.
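The point about collider bias can be illustrated with a small simulation. The sketch below is not taken from the paper: it assumes a hypothetical toy structure in which a genotype score `g` has a true effect of 1 on the trait `y`, and a phenomic feature `x` is influenced by both the genotype and the trait. All variable names, effect sizes, and the sample size are illustrative assumptions. Under these assumptions, adding `x` as a predictor alongside `g` drives the estimated genotype effect toward zero, even though `x` is useful for predicting `y`.

```python
import random


def ols(columns, y):
    """Least-squares coefficients (no intercept) from the normal equations."""
    k = len(columns)
    # Build X'X and X'y from the predictor columns.
    xtx = [[sum(a * b for a, b in zip(columns[i], columns[j])) for j in range(k)]
           for i in range(k)]
    xty = [sum(a * b for a, b in zip(columns[i], y)) for i in range(k)]
    # Solve the k x k system by Gauss-Jordan elimination.
    aug = [row[:] + [rhs] for row, rhs in zip(xtx, xty)]
    for i in range(k):
        pivot = aug[i][i]
        aug[i] = [v / pivot for v in aug[i]]
        for r in range(k):
            if r != i:
                factor = aug[r][i]
                aug[r] = [vr - factor * vi for vr, vi in zip(aug[r], aug[i])]
    return [aug[i][k] for i in range(k)]


def simulate(n=20000, seed=1):
    """Hypothetical structure (an assumption): g -> y, g -> x, and y -> x."""
    rng = random.Random(seed)
    g = [rng.gauss(0, 1) for _ in range(n)]            # genotype score
    y = [gi + rng.gauss(0, 1) for gi in g]             # trait, true effect of g is 1
    x = [gi + yi + rng.gauss(0, 1) for gi, yi in zip(g, y)]  # phenomic feature
    return g, y, x


g, y, x = simulate()

# Model 1: trait regressed on genotype only.
beta_g_only = ols([g], y)[0]

# Model 2: trait regressed on genotype and the phenomic feature.
beta_g_adj, beta_x = ols([g, x], y)

print(f"genotype effect, g only:     {beta_g_only:.3f}")
print(f"genotype effect, g and x:    {beta_g_adj:.3f}")
print(f"feature coefficient, g and x: {beta_x:.3f}")
```

In this toy setup the genotype-only estimate should land near the true value of 1, while the estimate from the model that also includes `x` should land near 0, with a feature coefficient near 0.5. The coefficients in the second model still support prediction, but reading the near-zero genotype coefficient as "no genetic effect" would be wrong, which is the distinction the abstract draws between association and causation.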