DOI: 10.17776/csj.1861928 ISSN: 2587-2680

Unlocking the Non-Linear Frontier: A New Era of Robust Biosensing by Data Augmentation

Emine Sezer, Sinan Akgöl
Developing robust calibration models for electrochemical biosensors is frequently constrained by the high cost of data collection and the non-linear nature of biological signals. This small-data limitation leads to overfitting in regression models and restricts the sensor's sensitivity. This study introduces an Artificial Intelligence (AI)-driven framework to enhance prediction accuracy in low-concentration ranges and stabilize sensor response without hardware modifications. Three independent sensor datasets (C-Reactive Protein, Vitamin C, and Vitamin E) were analysed using Machine Learning (ML) algorithms benchmarked against Linear Regression. To mitigate data scarcity, four augmentation strategies were evaluated: Gaussian noise to emulate instrumental variability, the statistical method SMOGN, the Deep Learning-based CTGAN, and interpolation-based Mixup to reinforce signal continuity. Linear models failed to capture the complex kinetics (R² < 0.10). While baseline ML models were unstable due to overfitting, data augmentation substantially improved generalization. Notably, physically motivated methods such as Mixup and Gaussian noise outperformed the more complex Deep Learning approaches, increasing Test R² values from 0.40 to 0.65. These findings demonstrate that AI-based augmentation can effectively stabilize non-linear models and extend the quantifiable range of biosensors.
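
As an illustration of the two physically motivated strategies named above, the following minimal sketch shows how Gaussian-noise and Mixup augmentation might be applied to a small calibration set. It is not the authors' implementation: the function names, the noise level (`noise_frac`), the Beta parameter (`alpha`), and the number of synthetic samples are all assumptions chosen for illustration, and the toy signal/concentration data are synthetic. In practice, augmentation of this kind would normally be applied to the training split only, so that test scores are computed on unmodified measurements.

```python
import numpy as np


def gaussian_noise_augment(X, y, n_copies=3, noise_frac=0.02, rng=None):
    """Append noisy copies of X to emulate instrumental variability.

    noise_frac (assumed value) scales the noise relative to each
    feature's standard deviation; targets are copied unchanged.
    """
    rng = np.random.default_rng(rng)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    scale = noise_frac * X.std(axis=0, keepdims=True)
    X_rep = np.tile(X, (n_copies, 1))
    y_rep = np.tile(y, n_copies)
    X_noisy = X_rep + rng.normal(0.0, 1.0, size=X_rep.shape) * scale
    return np.vstack([X, X_noisy]), np.concatenate([y, y_rep])


def mixup_augment(X, y, n_new=50, alpha=0.4, rng=None):
    """Append convex combinations of random sample pairs (Mixup).

    The mixing weight lam is drawn from Beta(alpha, alpha); alpha is
    an assumed hyperparameter, not a value taken from the study.
    """
    rng = np.random.default_rng(rng)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    i = rng.integers(0, len(X), size=n_new)
    j = rng.integers(0, len(X), size=n_new)
    lam = rng.beta(alpha, alpha, size=n_new)
    X_new = lam[:, None] * X[i] + (1.0 - lam[:, None]) * X[j]
    y_new = lam * y[i] + (1.0 - lam) * y[j]
    return np.vstack([X, X_new]), np.concatenate([y, y_new])


# Toy example: 8 synthetic (signal, concentration) pairs, non-linear response.
X_train = np.linspace(0.1, 1.0, 8).reshape(-1, 1)
y_train = np.sqrt(X_train[:, 0])

X_gauss, y_gauss = gaussian_noise_augment(X_train, y_train, rng=0)
X_mix, y_mix = mixup_augment(X_train, y_train, n_new=20, rng=0)
```

Because Mixup only interpolates between measured points, the synthetic samples stay inside the observed signal range, which is one plausible reason such a method behaves well on very small datasets. Gaussian noise instead perturbs each measurement around its recorded value while keeping the target fixed.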
