DOI: 10.3390/foods15132358 ISSN: 2304-8158

Integrating Machine Learning and Expert Sensory Evaluation to Identify Key Drivers of Tomato Fruit Quality: A Multi-Model and Age-Stratified Analysis

Yihang Zhu, Chenxu Liu, Zhuping Yao, Rongqing Wang, Baoliang Xie, Yuan Cheng, Xiaobin Zhang

Individual biochemical indicators are insufficient for a comprehensive assessment of tomato flavor quality, so multi-parameter models of the core soluble taste matrix are needed. We hypothesized that stratifying trained sensory assessors by age would reveal different biochemical variable-importance profiles in flavor quality prediction. Accordingly, this study aimed to: (1) construct and compare multiple regression models linking eight biochemical indicators to sensory scores, (2) identify key quality drivers via feature selection, and (3) examine whether age stratification alters the identified sensory drivers. Eight baseline taste indicators were measured across 62 tomato cultivars, which were scored by 30 trained sensory panelists stratified by age (<40 and ≥40 years). Cross-validation was used to keep the models robust given the small sample size. Partial least squares regression (PLSR), support vector regression (SVR), and random forest (RF) were applied as regression models, with Boruta used for feature selection. Random forest achieved the best performance (R² = 0.82). In the full-panel model, the key variables were fructose, total free amino acids, and vitamin C. After age stratification, the under-40 group retained these variables, whereas in the ≥40 group soluble solids replaced vitamin C. Fructose and total free amino acids were consistently robust drivers, while total acidity remained the least important variable. Applying the RF–Boruta approach within an age-stratified design provides a structured analytical framework for investigating flavor perception from biochemical data. These findings suggest that fructose and total free amino acids are highly robust candidate indicators for flavor quality prediction, and the differences between age groups suggest that demographic-specific metrics could be usefully integrated into precision breeding frameworks.
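The cross-validation step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic, the fold count (5) is an assumption, and a one-variable least-squares line stands in for the PLSR, SVR, and RF models so that the example needs only the Python standard library. The helper names (`make_folds`, `fit_line`, `cross_validated_r2`) are hypothetical. The sketch only shows how an out-of-fold R² is obtained when every cultivar is predicted by a model that never saw it during fitting.

```python
import random
from statistics import mean


def make_folds(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return slope, my - slope * mx


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    my = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - my) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def cross_validated_r2(xs, ys, k=5, seed=0):
    """Pooled out-of-fold R²: each sample is predicted by a model
    fitted without that sample's fold."""
    preds = [0.0] * len(xs)
    for fold in make_folds(len(xs), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(xs)) if i not in held_out]
        slope, intercept = fit_line([xs[i] for i in train],
                                    [ys[i] for i in train])
        for i in fold:
            preds[i] = slope * xs[i] + intercept
    return r_squared(ys, preds)


# Synthetic stand-in data (NOT from the study): 62 "cultivars", one
# informative predictor and one predictor unrelated to the score.
rng = random.Random(42)
n_cultivars = 62
informative = [rng.uniform(1.0, 3.0) for _ in range(n_cultivars)]
unrelated = [rng.uniform(0.3, 0.6) for _ in range(n_cultivars)]
score = [2.0 * x + rng.gauss(0.0, 0.3) for x in informative]

r2_informative = cross_validated_r2(informative, score)
r2_unrelated = cross_validated_r2(unrelated, score)
print(f"informative predictor: out-of-fold R2 = {r2_informative:.2f}")
print(f"unrelated predictor:   out-of-fold R2 = {r2_unrelated:.2f}")
```

Because the R² is computed on held-out predictions, a predictor with no real relationship to the score yields a value near or below zero rather than a spuriously positive fit, which is the property that matters when only 62 cultivars are available.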
