DOI: 10.1093/ce/zkag036 ISSN: 2515-4230

Automated Reaction Path Discovery for Coal Pyrolysis: Scalable Dataset Construction and Mechanistic Insights for Clean Energy Applications

Dao Li, Xu Li, Jie Yun, Qizhao Liu, Shansong Gao, Haisheng Li, Tong Zhu, Rongheng Gou

Abstract

Elucidating the molecular mechanisms of high-temperature coal pyrolysis is of fundamental importance for advancing clean coal technologies, optimizing coal gasification, and assessing pollutant precursor formation tendencies. Here, we present an automated and scalable workflow for reaction-path discovery in complex pyrolytic systems, integrating ReaxFF-derived reactant generation, pyrolysis-tailored reaction enumeration based on the bond–electron matrix formalism, GFN2-xTB screening, and CI-NEB transition-state searches. The framework was applied to the secondary pyrolysis of light tar fragments derived from three representative coals, Yanzhou, Shendong, and Chahaquan, yielding large and chemically diverse datasets of elementary reactions with validated reactant and product structures, CI-NEB-derived transition-state candidates, and associated activation barriers. Across all three coal types, C–C and C–H bond scissions dominate the initial decomposition chemistry, while coal-specific differences remain pronounced: Yanzhou exhibits a higher propensity for C–N bond cleavage, Shendong contains the largest fraction of low-barrier pathways, and Chahaquan tends to produce more single-bond-rich and lower-molecular-weight product ensembles. In addition, bond-type statistics, molecular-weight distributions, and empirical cumulative distribution function (ECDF) analysis of normalized interatomic distances reveal distinct variations in saturation, conjugation, and structural complexity among the three datasets, providing microscopic clues to pollutant precursor release, kinetic accessibility, and downstream upgrading behavior of coal tar products. By balancing chemical-space coverage, computational efficiency, and reproducibility, the present workflow delivers a scalable reaction-data resource for coal pyrolysis and provides a robust foundation for data-driven kinetic modeling of complex reactive energy systems.
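To illustrate the bond–electron matrix formalism mentioned above, the following is a minimal sketch and not the authors' implementation. It assumes the generic Dugundji–Ugi convention: off-diagonal entries hold formal bond orders, diagonal entries hold free valence electrons, and a reaction is a symmetric matrix R added to the reactant matrix. The function names `homolytic_scission` and `enumerate_scissions` are hypothetical, and the abstract does not specify the pyrolysis-tailored enumeration rules actually used.

```python
import numpy as np

def homolytic_scission(be, i, j):
    """Break one bond order between atoms i and j homolytically.

    Returns the product BE matrix and the reaction matrix R, where
    product = be + R. Assumed convention: be[i, j] is the formal bond
    order and be[i, i] the number of free valence electrons on atom i.
    """
    if i == j:
        raise ValueError("i and j must be different atoms")
    if be[i, j] < 1:
        raise ValueError("atoms i and j are not bonded")
    r = np.zeros_like(be)
    r[i, j] = r[j, i] = -1  # bond order drops by one
    r[i, i] = r[j, j] = 1   # each atom keeps one electron as a radical
    return be + r, r

def enumerate_scissions(be):
    """List every bonded atom pair (i, j) with i < j as a candidate scission."""
    n = be.shape[0]
    return [(i, j) for i in range(n) for j in range(i + 1, n) if be[i, j] > 0]

# Illustrative reactant: ethane. Atoms 0 and 1 are carbon, 2-7 are hydrogen.
be = np.zeros((8, 8), dtype=int)
be[0, 1] = be[1, 0] = 1
for h in (2, 3, 4):
    be[0, h] = be[h, 0] = 1
for h in (5, 6, 7):
    be[1, h] = be[h, 1] = 1

candidates = enumerate_scissions(be)             # one C-C and six C-H bonds
products, r = homolytic_scission(be, 0, 1)       # C-C scission: two methyl radicals
```

Because every entry of R is balanced within its row, the per-atom row sums of the matrix are unchanged by the reaction, which gives a cheap electron-conservation check before candidates are passed on to more expensive screening steps.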
