RNALig: An ML‐Driven Structure‐Based Scoring Function for Estimating Binding Affinities of RNA‐Ligand Complexes
Priyanka Sharma, N. Latha, Leena Aggarwal, Pradeep Pant
Ribonucleic acid (RNA) has emerged as a crucial therapeutic target owing to its central roles in gene regulation, catalysis, and disease progression. Small molecules can modulate RNA function by binding specific structural motifs; therefore, accurately predicting RNA–ligand binding free energies (ΔG) is vital for structure‐guided RNA drug discovery. However, experimental determination of RNA–ligand affinities remains costly and low‐throughput. We present RNALig, a machine learning (ML)‐driven, structure‐informed scoring function that quantitatively predicts RNA–ligand binding affinities using three‐dimensional (3D) structural and physicochemical descriptors. The model employs a Random Forest Regressor trained on 164 experimentally resolved RNA–ligand complexes and validated on an independent test set of 70 complexes. RNALig integrates detailed RNA‐specific, ligand‐specific, and complex‐level interaction features, achieving R² = 0.81 and RMSE = 0.64 kcal/mol, thereby outperforming state‐of‐the‐art methods such as RSAPred (R² = 0.52) and DeepRSMA (R² = 0.67). By combining ML interpretability with structure‐based descriptors, RNALig offers a transparent, quantitative, and generalizable framework for modeling RNA–ligand binding thermodynamics, advancing ML‐driven RNA‐targeted drug discovery. The complete pipeline and dataset are available at
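
The two reported metrics, R² and RMSE, can be illustrated with a minimal sketch. It assumes R² denotes the coefficient of determination (1 − SS_res/SS_tot); the abstract does not say whether this or the squared Pearson correlation is meant. The function names and the ΔG values below are hypothetical and are not taken from the RNALig pipeline or dataset.

```python
import math


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root-mean-square error, in the same units as the inputs (here kcal/mol)."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical experimental and predicted binding free energies (kcal/mol);
# illustrative numbers only, not results from the paper.
dg_experimental = [-6.0, -7.5, -8.2, -9.1, -10.4]
dg_predicted = [-6.3, -7.1, -8.6, -9.0, -10.0]

print("R^2  =", round(r_squared(dg_experimental, dg_predicted), 3))
print("RMSE =", round(rmse(dg_experimental, dg_predicted), 3), "kcal/mol")
```

Under this definition, a model that always predicts the mean experimental ΔG scores R² = 0, so an R² of 0.81 means the model's squared prediction error on the test set is 19% of that baseline's.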