An AI-driven multimodal framework for automated waste segregation and energy recovery in smart cities: a case study for Saudi Vision 2030
Shadi Alshraah, Loubna Alajmi
Purpose
This study develops an AI-driven multimodal framework integrating automated waste segregation with energy recovery prediction to support Saudi Vision 2030 sustainability goals.
Design/methodology/approach
A two-stage vision pipeline (YOLOv9 + Swin Transformer) performs real-time waste detection and classification. Multimodal physicochemical features are modeled using XGBoost, deep neural networks (DNNs), and graph neural networks (GNNs) to predict energy recovery potential. Reinforcement learning (RL) optimizes the routing of waste streams to appropriate facilities. Experiments use a Saudi-specific dataset of 50,000 annotated images and 5,000 physicochemical records.
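The detect-then-classify structure of the vision pipeline can be illustrated with a minimal Python sketch. It shows only the control flow: a detector proposes bounding boxes, each box is cropped, and a classifier labels the crop. All names here (`two_stage`, `toy_detect`, `toy_classify`, the `Detection` record) are hypothetical, and the toy functions are stand-ins for the YOLOv9 and Swin Transformer models, whose actual interfaces are not described in this abstract.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]      # (x0, y0, x1, y1) in pixels
Image = Sequence[Sequence[int]]      # toy grayscale image: rows of pixel values


@dataclass
class Detection:
    box: Box
    label: str


def crop(image: Image, box: Box) -> List[List[int]]:
    """Cut the region described by box out of the image."""
    x0, y0, x1, y1 = box
    return [list(row[x0:x1]) for row in image[y0:y1]]


def two_stage(
    image: Image,
    detect: Callable[[Image], List[Box]],
    classify: Callable[[List[List[int]]], str],
) -> List[Detection]:
    """Stage 1 proposes boxes; stage 2 labels each cropped region."""
    return [Detection(box, classify(crop(image, box))) for box in detect(image)]


# Hypothetical stand-ins for the real models (assumptions, not the authors' code).
def toy_detect(image: Image) -> List[Box]:
    """Pretend detector: split the image into a left and a right half."""
    height, width = len(image), len(image[0])
    return [(0, 0, width // 2, height), (width // 2, 0, width, height)]


def toy_classify(patch: List[List[int]]) -> str:
    """Pretend classifier: bright patches are 'metal', dark ones 'organic'."""
    mean = sum(sum(row) for row in patch) / (len(patch) * len(patch[0]))
    return "metal" if mean > 127 else "organic"


image = [[0, 0, 255, 255],
         [0, 0, 255, 255]]
results = two_stage(image, toy_detect, toy_classify)
labels = [d.label for d in results]
```

Passing the detector and classifier as callables keeps the two stages independent, so either model could be swapped without touching the pipeline logic.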
Findings
The integrated framework achieved mAP = 94.3% and R² = 0.96, improving landfill diversion and renewable energy contribution by 12.1% and 15.2%, respectively, compared with baseline models.
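For readers less familiar with the regression metric reported above, the coefficient of determination is defined as R² = 1 − SS_res / SS_tot. The short Python sketch below computes it from scratch. The function name `r_squared` and the sample values are illustrative assumptions and are not taken from the study's data.

```python
from typing import Sequence


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error left after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are identical")
    return 1.0 - ss_res / ss_tot


# Illustrative values only (hypothetical energy yields, not study data).
measured = [10.0, 12.0, 14.0, 16.0]
perfect = r_squared(measured, measured)                    # exact predictions
baseline = r_squared(measured, [13.0, 13.0, 13.0, 13.0])   # always predict the mean
```

A perfect predictor scores 1 and a predictor that always outputs the mean scores 0, so a value of 0.96 means the model explains 96% of the variance in the measured target.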
Research limitations/implications
Hazardous waste remains underrepresented in the dataset. Future work will address this via targeted data collection and active learning.
Practical implications
The framework provides a deployable solution for real-time waste classification and energy estimation in smart-city contexts, with a projected payback period of approximately 1.5 years for a 1,000-bin deployment.
Originality/value
This study introduces the first Saudi-specific multimodal waste dataset and a unified AI framework bridging waste segregation, energy prediction, and smart-city optimization, an end-to-end solution absent from prior literature.