Prediction-Based Family Selection in Early Stage Sugarcane Breeding: Comparing BLUP, BLUE, Phenotypic Indices, and Machine Learning
Farrag F. B. Abu-Ellail, Liping Zhao, Siqi Tang, Jiayong Liu, Li Yao, Peifang Zhao, Fenggang Zan

Selecting superior families at the seedling stage is crucial for accelerating genetic gain in sugarcane, yet systematic comparisons of selection methods remain limited. This study evaluated seven selection strategies: phenotypic check-based selection (Pheno), a three-trait combined index (CI3), Best Linear Unbiased Prediction (BLUP), Best Linear Unbiased Estimation (BLUE), tiered family selection (Tiered), LASSO-penalized logistic regression (LASSO), and the Multi-Trait Family Ideotype Distance Index (MFIDI). The experiment followed an augmented block design with four blocks and two check varieties, and included 125 test families comprising 10,955 seedlings. Using a combined index of standardized cane and sugar yields, families were classified as elite (top 20%), moderate (middle 60%), or weak (bottom 20%). BLUP and BLUE rankings were consistent (Spearman’s ρ > 0.95, TCI = 88%, Jaccard = 0.79). Elite families showed median index values of 0.90 (BLUP) and 0.88 (BLUE) with wide interquartile ranges, whereas weak families had medians of −0.70 with narrow ranges. LASSO achieved high predictive performance (AUC = 0.95, accuracy = 0.92, sensitivity = 0.90, specificity = 0.94) and identified cane yield, sugar yield, and millable cane as key drivers. Agreement across methods was lower for inferior families (BCI ≤ 68%). BLUP with a multi-trait index proved most effective for discriminating elite families. Families F31 and F71 consistently ranked at the top. Combining selection approaches with agreement indices improves early-stage decisions for family selection in sugarcane breeding.
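
The classification and agreement steps summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the combined index is the equal-weight mean of z-scored cane and sugar yields (the abstract does not state the weighting), and the function names `combined_index`, `classify`, and `jaccard` are hypothetical.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize a list of trait values to mean 0 and unit (population) SD."""
    m = mean(values)
    s = pstdev(values)
    return [(v - m) / s for v in values]


def combined_index(cane_yield, sugar_yield):
    """Equal-weight mean of standardized cane and sugar yield.

    Equal weighting is an assumption made for illustration only.
    """
    zc = zscores(cane_yield)
    zs = zscores(sugar_yield)
    return [(c + s) / 2 for c, s in zip(zc, zs)]


def classify(index_by_family, tail=0.20):
    """Label the top `tail` fraction elite, the bottom `tail` weak, the rest moderate."""
    ranked = sorted(index_by_family, key=index_by_family.get, reverse=True)
    k = max(1, round(tail * len(ranked)))
    classes = {}
    for pos, family in enumerate(ranked):
        if pos < k:
            classes[family] = "elite"
        elif pos >= len(ranked) - k:
            classes[family] = "weak"
        else:
            classes[family] = "moderate"
    return classes


def jaccard(selected_a, selected_b):
    """Jaccard similarity between the family sets selected by two methods."""
    a, b = set(selected_a), set(selected_b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0
```

With 125 families and `tail=0.20`, `classify` would label 25 families elite and 25 weak. Applying `jaccard` to the elite sets from two methods (for example, BLUP-based and BLUE-based indices) gives an overlap measure of the kind reported in the abstract.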