Impact of Estimating Genetic Variance in the Target Group on Reliability Metrics of the Linear Regression Validation Method Under Selection
Alan M. Pardo, Daniel O. Maizón, Sebastián Munilla, Andrés Legarra

ABSTRACT
The Linear Regression (LR) method is a validation approach based on comparing predictions from partial and whole data for a predefined target group (or focal group, in the LR sense). Reliability metrics derived from this comparison depend critically on the additive genetic variance of that group. However, this variance often differs from that of the base population because of selection, drift, and changes in relatedness, raising practical questions about how its estimation affects LR‐based reliability metrics. We simulated a beef cattle population under two contrasting selection schemes (phenotypic and EBV‐based), defined two focal groups with contrasting selection status (selected sires and unselected females), and evaluated three strategies for estimating focal‐group variance: (i) an empirical MCMC estimator (VarEst‐eMCMC), (ii) a closed‐form estimator (VarEst‐CF), and (iii) an inbreeding‐adjusted base variance (VarF). Both VarEst‐eMCMC and VarEst‐CF recovered the true focal‐group additive genetic variance in both focal groups, with estimates close to simulated values in sires (phenotypic: 0.202; EBV: 0.157–0.158 vs. 0.153) and small deviations in unselected females (0.262 vs. 0.256 under phenotypic selection; 0.247 vs. 0.237 under EBV‐based selection). In selected sires, VarF consistently overestimated the focal‐group genetic variance, by about 40% under phenotypic selection (0.289 vs. 0.202) and by more than 80% under EBV‐based selection (0.282 vs. 0.153). In unselected females, the overestimation was more moderate (20%). These differences carry over directly to LR metrics. In sires under EBV‐based selection, LR reliabilities computed using VarEst‐CF and VarEst‐eMCMC remained close to simulated values for selected and unselected individuals (0.079 vs. 0.091; 0.508 vs. 0.538), whereas VarF‐based reliabilities were markedly lower (0.044 and 0.084, respectively). A similar pattern was observed under phenotypic selection. In females, LR reliabilities calculated with VarEst‐CF and VarEst‐eMCMC also remained close to the simulated values, whereas those calculated with VarF were slightly underestimated. Overall, these findings confirm that properly estimating focal‐group additive genetic variance under selection is essential for meaningful LR metrics. We also provide the first implementation of the closed‐form estimator as a practical alternative for routine LR‐based validation.
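The dependence of LR reliability on the focal‐group variance can be illustrated with a minimal Python sketch. It assumes the usual LR form in which reliability is the covariance between whole‐data and partial‐data EBVs divided by the additive genetic variance of the focal group; the abstract does not state the formula, so this form is an assumption. The function names `lr_reliability` and `rescale_reliability` and the toy EBVs are hypothetical and are not the authors' implementation.

```python
import numpy as np


def lr_reliability(ebv_partial, ebv_whole, focal_variance):
    """Assumed LR form: cov(whole, partial) / focal-group additive variance."""
    cov_wp = np.cov(ebv_whole, ebv_partial, ddof=1)[0, 1]
    return cov_wp / focal_variance


def rescale_reliability(reliability, variance_used, variance_new):
    """Re-express a reliability computed with one variance under another.

    Under the assumed form the covariance is fixed, so reliability is
    inversely proportional to the variance placed in the denominator.
    """
    return reliability * variance_used / variance_new


# Toy EBVs (illustrative only, not the simulated cattle data).
rng = np.random.default_rng(1)
ebv_whole = rng.normal(0.0, 0.3, size=500)
ebv_partial = 0.5 * ebv_whole + rng.normal(0.0, 0.1, size=500)

# Same predictions, two candidate denominators (values from the abstract:
# a focal-group estimate of 0.158 and the VarF value of 0.282).
rel_focal = lr_reliability(ebv_partial, ebv_whole, 0.158)
rel_varf_toy = lr_reliability(ebv_partial, ebv_whole, 0.282)

# Rescaling the reported reliability of 0.079 from 0.158 to 0.282.
rel_varf = rescale_reliability(0.079, 0.158, 0.282)
print(round(rel_varf, 3))
```

Under this assumption, replacing a focal‐group variance of 0.158 with the VarF value of 0.282 scales a reliability of 0.079 down to about 0.044, in line with the first VarF‐based value reported for sires under EBV‐based selection.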