Hybrid Simulation–Machine Learning Surrogates for Coordinate‐Based Solar and Wind Energy Yield Assessment in Iraq: A Streamlit Decision‐Support Tool
Bassam Musheer Kareem, Abdolsalam Ebrahimpour, Jafar Ghafouri, Roghayyeh Motallebzadeh
ABSTRACT
Driven by the global shift away from fossil fuels, solar and wind resources are increasingly important for sustainable power planning, especially in data‐scarce regions. This study proposes a hybrid simulation–machine learning framework to estimate renewable energy yields across Iraq using minimal geographic inputs. Hourly meteorological variables for 2015–2024 (solar radiation, wind speed, temperature, and humidity) were retrieved from the NASA POWER database for ten representative Iraqi cities at a gridded resolution of approximately 0.1°–0.2°. These data were converted to EPW format and used in EnergyPlus to simulate electricity generation from a standardized PV system (20% efficiency, 1 m² area) at three fixed tilt angles (0°, 30°, 60°) and from a 1 kW wind turbine. The EnergyPlus outputs were then used to train two surrogate predictors: a random forest (RF) model for solar yield and a feedforward neural network (FNN) for wind yield. Using an 80/20 train/test split and cross‐validation within the simulated dataset, the RF closely reproduced the EnergyPlus‐simulated solar energy (R² ≈ 0.98, MSE ≈ 1.45), and the FNN closely reproduced the simulated wind‐energy outputs (R² ≈ 0.97, MSE ≈ 2.36). Feature analysis indicated that PV tilt and seasonal cycles dominate solar yield variability, whereas elevation and seasonality are the primary drivers of wind yield. For practical decision support, the trained models were deployed in a Streamlit web interface that returns monthly and annual kWh estimates from latitude/longitude, elevation, and configuration inputs. Because validation is performed against EnergyPlus simulation outputs (simulation‐derived ground truth), the reported accuracy should be interpreted as simulation‐level fidelity rather than verified predictive performance against measured field generation.
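For readers less familiar with the agreement metrics quoted above, the following is a minimal sketch of how R² and MSE can be computed on the held-out portion of an 80/20 split, using only the Python standard library. The function names (`mse`, `r2`, `split_80_20`) and the toy values are hypothetical and chosen for illustration. They are not taken from the study's code or data, and the study's actual evaluation pipeline may differ.

```python
import random


def mse(y_true, y_pred):
    """Mean squared error between simulated and surrogate-predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def split_80_20(rows, seed=0):
    """Shuffle rows reproducibly and return (train, test) in an 80/20 ratio."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = int(0.8 * len(rows))
    return rows[:cut], rows[cut:]


# Hypothetical toy values (assumed units: kWh); NOT results from the study.
simulated = [10.0, 12.0, 15.0, 11.0, 14.0]   # stand-in for EnergyPlus outputs
predicted = [10.5, 11.5, 15.0, 11.0, 13.0]   # stand-in for surrogate predictions

print("MSE:", mse(simulated, predicted))
print("R2: ", r2(simulated, predicted))
```

Because both metrics here compare surrogate predictions with simulated values, a high R² indicates only that the surrogate tracks the simulator, which is consistent with the abstract's caveat about simulation‐level fidelity.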