Observability- and Identifiability-Guided Sensor-Set Design for Digital-Twin-Assisted Consolidated Bioprocessing
Mark Korang Yeboah, Nana Yaw Asiedu, Ahmad Addo
Consolidated bioprocessing (CBP) is difficult to monitor because enzyme production, lignocellulose degradation, sugar release, and fermentation occur simultaneously under sparse measurement, feedstock variability, and plant–model mismatch. This study proposes a computational sensor-set design framework for digital-twin-assisted CBP monitoring. A five-state virtual plant, consisting of active biomass, cellulolytic enzyme activity, residual insoluble substrate, soluble sugar, and ethanol, was used to evaluate all 16 ethanol-mandatory measurement packages formed from ethanol, sugar, biomass, enzyme, and residual-substrate proxy channels. Candidate sensor sets were assessed using finite-difference output sensitivities, Fisher-information-based state-observability and parameter-identifiability analyses, eigenvalue and parameter-correlation diagnostics, and paired Monte Carlo unscented Kalman filter (UKF) soft-sensing reconstruction. Within the tested five-state virtual-plant benchmark, and under the specified excitation schedule, noise assumptions, burden indices, and scoring objective, ethanol-only sensing provided the weakest support for state-aware CBP digital-twin reconstruction. At a 6 h sampling interval, the state-observability log-pseudodeterminant increased from 4.18 with ethanol-only sensing to 8.56 after adding soluble sugar and to 16.42 with full-proxy monitoring. The ethanol–sugar–biomass–substrate package also gave strong reduced state-observability performance, with log-pseudodeterminants of 15.12, 13.76, and 12.51 at 6, 12, and 24 h, respectively. Biomass and enzyme proxies contributed strongly to parameter learning, and the ethanol–sugar–biomass–enzyme package gave the strongest active parameter-identifiability performance, with log-pseudodeterminants of 10.82, 9.06, and 6.67 at 6, 12, and 24 h, respectively.
In the paired soft-sensing analysis, full-proxy monitoring reduced the mean latent-state RMSE from 1.1899 to 0.3756, followed by ethanol–biomass–enzyme–substrate at 0.3843 and ethanol–sugar–biomass–substrate at 0.4121. The primary aggregate ranking identified ethanol–sugar–biomass–substrate as the best overall package, with a sensor-value score of 0.8432 and a burden index of 7.0, followed by full-proxy monitoring with a score of 0.8173 and a burden index of 10.0. Robustness tests showed that ethanol–sugar–biomass–substrate remained top-ranked under uniform noise scaling, full UKF missingness, delay, and bias stress tests, most scoring-weight scenarios, and all tested sensor-specific burden workflows. Full-proxy monitoring remained a close competitor under independent sensor-specific noise variation and became top-ranked for some alternative operating trajectories. The proposed framework provides a simulation-based method for prioritizing informative measurement packages before CBP digital twins are implemented in laboratory and pilot-plant settings.
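As a minimal illustrative sketch (not the authors' implementation), the Python below enumerates the 16 ethanol-mandatory packages and shows one common way to compute a Fisher-information log-pseudodeterminant from output sensitivities. The function names, the array layout of the sensitivities, the natural-log base, and the relative eigenvalue cutoff `rel_tol` are all assumptions made for illustration; the study's exact definitions may differ.

```python
from itertools import combinations

import numpy as np

# Optional proxy channels; ethanol is always included (names are illustrative).
OPTIONAL = ("sugar", "biomass", "enzyme", "substrate")


def ethanol_mandatory_packages():
    """Return every sensor package that contains ethanol (2**4 = 16 in total)."""
    packages = []
    for k in range(len(OPTIONAL) + 1):
        for combo in combinations(OPTIONAL, k):
            packages.append(("ethanol",) + combo)
    return packages


def fisher_information(sensitivities, noise_std):
    """Assemble a Fisher information matrix from output sensitivities.

    sensitivities: array of shape (n_samples, n_outputs, n_unknowns), e.g.
                   finite-difference derivatives of each measured output with
                   respect to each state or parameter (assumed layout).
    noise_std:     array of shape (n_outputs,), assumed independent Gaussian noise.
    """
    weighted = sensitivities / noise_std[None, :, None]
    flat = weighted.reshape(-1, sensitivities.shape[-1])
    return flat.T @ flat


def log_pseudodeterminant(fim, rel_tol=1e-10):
    """Sum of natural logs of the eigenvalues above a relative cutoff.

    The cutoff rel_tol is an assumed choice, not taken from the study.
    """
    eigvals = np.linalg.eigvalsh(fim)
    kept = eigvals[eigvals > rel_tol * eigvals.max()]
    return float(np.sum(np.log(kept)))
```

Under these assumptions, a package would be scored by keeping only the sensitivity rows for its measured channels, building the information matrix, and comparing log-pseudodeterminants across packages and sampling intervals.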