DOI: 10.3390/s26134093 ISSN: 1424-8220

Understanding the Performance of Deep Computer Vision Models: A Symbolic Regression Approach to Accuracy and Latency Prediction

Divyesh Rameshbhai Dhanani, Faraz Kayani, Saif U Din, Alice Arslanian, Dmitry Ignatov, Radu Timofte

Deploying deep vision models on edge hardware requires understanding how architecture and training hyperparameters jointly determine accuracy and inference latency, yet these relationships remain poorly characterized in a systematic, data-driven manner. This paper presents a two-stage statistical framework providing interpretable, closed-form insights into both. In the first stage, we apply distance correlation (dCor) and the maximal information coefficient (MIC) across seven image-classification datasets, revealing that batch size and total layer count are the strongest universal accuracy predictors (mean dCor: 0.228 and 0.174), while learning rate achieves the highest MIC (0.226), reflecting a non-monotonic relationship with accuracy. In the second stage, PySR symbolic regression (representing, to our knowledge, the first application to cross-dataset vision model accuracy prediction) derives compact, interpretable formulas. Dataset-specific models achieve R² from 0.20 to 0.45; a universal model achieves a mean leave-one-dataset-out R² of 0.23, remaining strictly positive on all held-out datasets, whereas ordinary linear regression collapses to R² = −0.71. We further derive device-specific inference latency formulas for CPU, GPU, and NPU, outperforming classical baselines by 6.7×–14.8× in R² and confirming fundamental device heterogeneity. Together, these results offer interpretable surrogate models for screening deep vision architectures under accuracy and latency constraints in edge deployment.
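
For readers unfamiliar with distance correlation, the following is a minimal sketch of the standard biased (V-statistic) sample estimator, written with NumPy only. It is an illustration under stated assumptions, not the authors' implementation: the function name `distance_correlation` and the toy data are hypothetical, and the abstract does not say which estimator or library was used. The example shows why dCor can flag a dependence that a linear measure misses.

```python
import numpy as np


def distance_correlation(x, y):
    """Biased sample distance correlation between two samples of equal length.

    Assumption: this is the textbook V-statistic estimator, not necessarily
    the variant used in the paper.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Treat 1-D inputs as n observations of a single feature.
    x = x.reshape(x.shape[0], -1)
    y = y.reshape(y.shape[0], -1)
    if x.shape[0] != y.shape[0]:
        raise ValueError("x and y must have the same number of observations")

    # Pairwise Euclidean distance matrices.
    a = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    b = np.linalg.norm(y[:, None, :] - y[None, :, :], axis=-1)

    # Double centering: subtract row and column means, add back the grand mean.
    A = a - a.mean(axis=0, keepdims=True) - a.mean(axis=1, keepdims=True) + a.mean()
    B = b - b.mean(axis=0, keepdims=True) - b.mean(axis=1, keepdims=True) + b.mean()

    dcov2 = (A * B).mean()   # squared sample distance covariance
    dvar_x = (A * A).mean()  # squared sample distance variance of x
    dvar_y = (B * B).mean()  # squared sample distance variance of y

    denom = np.sqrt(dvar_x * dvar_y)
    if denom == 0.0:
        return 0.0  # one of the samples is constant
    return float(np.sqrt(max(dcov2, 0.0) / denom))


# Toy data (hypothetical): a symmetric, purely nonlinear dependence.
x_demo = np.linspace(-1.0, 1.0, 201)
y_demo = x_demo ** 2
pearson_demo = float(np.corrcoef(x_demo, y_demo)[0, 1])
dcor_demo = distance_correlation(x_demo, y_demo)
print(f"Pearson r = {pearson_demo:.3f}, dCor = {dcor_demo:.3f}")
```

On this toy input the Pearson coefficient is essentially zero while dCor is clearly positive, which is the property that makes dCor suitable for screening hyperparameters whose effect on accuracy is not linear.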
