DOI: 10.3390/agronomy15081978 ISSN: 2073-4395

Machine-Learning-Based Multi-Site Corn Yield Prediction Integrating Agronomic and Meteorological Data

Chenyu Ma, Zhilan Ye, Qingyan Zi, Chaorui Liu

Accurate maize yield forecasting under climate uncertainty remains a critical challenge for global food security, yet existing studies predominantly rely on single-model frameworks, limiting generalizability and actionable insights. This study selected three regions, specifically Dali, Lijiang, and Zhaotong, and collected data on 12 agronomic traits of 114 varieties, along with eight sets of meteorological data, covering the period from 2019 to 2023. We employed three machine learning models: Random Forest (RF), Support Vector Machine (SVM), and XGBoost. The results revealed a strong correlation between yield and multiple agronomic traits, particularly grain weight per spike (GWPS) and hundred-kernel weight (HKW). Notably, the XGBoost model emerged as the top performer across all three regions. The model achieved the lowest RMSE (0.22–191.13) and a high R² (0.98–0.99), demonstrating exceptional predictive accuracy for yield-related traits. The comparative analysis revealed that XGBoost exhibited superior accuracy and stability compared to RF and SVM. Through feature importance analysis, four critical determinants of yield were identified: GWPS, shelling percentage (SP), growth period (GP), and plant height (PH). Furthermore, partial dependence plots (PDPs) provided deeper insights into the nonlinear interactive effects between GWPS, SP, GP, PH, and yield, offering a more comprehensive understanding of their complex relationships. This study presents an innovative, data-driven methodology designed to accurately forecast corn yield across diverse locations. This approach offers valuable scientific insights that can significantly enhance precision agricultural practices by enabling the precise tailoring of fertilizer usage and irrigation strategies.
The results highlight the importance of integrating agronomic and meteorological data in yield forecasting, paving the way for development of agricultural decision-support systems in the context of future climate change scenarios.
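
The abstract reports a wide RMSE range (0.22–191.13) alongside a narrow R² range (0.98–0.99). One plausible reading, assuming the standard definitions of both metrics, is that RMSE is expressed in the units of each predicted trait while R² is unitless. The minimal Python sketch below illustrates that property with made-up numbers; the values and the function names rmse and r_squared are illustrative assumptions, not data or code from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error, in the same units as the observations."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(observed))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (unitless)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted values for one trait (not study data).
observed = [10.0, 12.0, 14.0, 16.0]
predicted = [10.5, 11.5, 14.5, 15.5]

# The same values expressed in a unit 100 times smaller.
scaled_observed = [100 * v for v in observed]
scaled_predicted = [100 * v for v in predicted]

print(rmse(observed, predicted), r_squared(observed, predicted))
print(rmse(scaled_observed, scaled_predicted),
      r_squared(scaled_observed, scaled_predicted))
```

In this sketch, rescaling the trait multiplies RMSE by the same factor and leaves R² unchanged, which is why RMSE values are only comparable between models evaluated on the same trait and unit.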
