DOI: 10.3390/clinbioenerg2030011 ISSN: 3042-5158

Interpretable Machine Learning for Mitochondrial Toxicity Prediction: Cross-Assay Generalization, Descriptor Transferability, and Context-Dependent Structural Effects

Yu-Heng Lin, Yu-Te Lin, An-Chi Wei

Mitochondrial toxicity is a major concern in drug development and safety assessment. Machine learning models trained on mitochondrial membrane potential (MMP) assay data offer promising toxicity predictions, but their cross-scaffold and cross-assay transferability remains uncharacterized. We evaluated classical machine learning and deep learning architectures across six molecular representations and both random and scaffold-based splitting strategies, assessing performance on an internal MMP test set and an independent external set comprising flux and glucose–galactose assay data. Across all model configurations, we observed a consistent cross-assay generalization gap, independent of model type, feature choice, or augmentation strategy. Mordred descriptors provided the most transferable predictive signal, outperforming fingerprint-based representations on the external set. SHAP analysis of the best CatBoost and Random Forest models identified autocorrelation-family descriptors as dominant predictors. C3SP3 contributed through feature interactions, whereas ATS5i showed a positive association with toxicity across assays. Motif-level and descriptor-level associations proved to be strongly assay-dependent, supporting a mechanism in which mitochondrial toxicity arises from multivariate physicochemical interactions rather than single structural alerts.
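To make the distinction between random and scaffold-based splitting concrete, the following is a minimal sketch of a scaffold-grouped split. It is not the authors' procedure: the function name `scaffold_split`, the `test_fraction` parameter, and the rule of assigning the largest scaffold groups to training first are all assumptions chosen for illustration. The sketch also assumes that a scaffold key (for example, a Bemis-Murcko scaffold SMILES computed with a cheminformatics toolkit) is already available for each compound.

```python
from collections import defaultdict


def scaffold_split(scaffolds, test_fraction=0.2):
    """Split compound indices so that no scaffold is shared across sets.

    `scaffolds` holds one precomputed scaffold key per compound
    (hypothetical input; how the keys are derived is left open).
    Returns (train_indices, test_indices), both sorted.
    """
    # Group compound indices by their scaffold key.
    groups = defaultdict(list)
    for idx, key in enumerate(scaffolds):
        groups[key].append(idx)

    # Largest groups first; ties broken by first index for determinism.
    ordered = sorted(groups.values(), key=lambda g: (-len(g), g[0]))

    n_test_target = int(round(len(scaffolds) * test_fraction))
    n_train_target = len(scaffolds) - n_test_target

    train, test = [], []
    for group in ordered:
        # A whole group goes to one side only, never both.
        if len(train) + len(group) <= n_train_target:
            train.extend(group)
        else:
            test.extend(group)
    return sorted(train), sorted(test)


if __name__ == "__main__":
    keys = ["A", "A", "A", "B", "B", "C", "D", "A", "B", "E"]
    train_idx, test_idx = scaffold_split(keys, test_fraction=0.2)
    print("train:", train_idx)
    print("test:", test_idx)
```

Because whole scaffold groups are kept together, the held-out compounds are structurally distinct from the training compounds, which is what makes a scaffold split a harder test than a random one. The realized test fraction can deviate from `test_fraction` when group sizes do not divide evenly.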
