DOI: 10.3390/plants15131997 ISSN: 2223-7747

Genetic Diversity, Population Structure, Integration of Genome-Wide Association Studies and Machine Learning for Antibacterial Trait Analysis in the Mediterranean Spice Laurel (Laurus nobilis)

Gülşah Karataş, Amjad Ali, Ünal Karık, Muhammad Azhar Nadeem, Muhammad Aasim, Mehmet Bedir, Muhammad Tanveer Altaf, Waqas Liaqat, Sarmad Ali Qureshi, Fawad Ali, Ruziyev Farid, Pablo Federico Cavagnaro, Muhammad Qasim Shahid, Syeid Amjad Ali, Ahmad Alsaleh, Faheem Shehzad Baloch

Laurel (Laurus nobilis) is a Mediterranean plant with reported antibacterial properties, yet the genetic basis of its antibacterial efficacy remains largely unexplored. This study evaluated the antibacterial activity of L. nobilis methanolic extracts against Escherichia coli, Staphylococcus aureus, and Bacillus cereus, and combined these assays with genome-wide association studies (GWAS) and machine learning (ML) approaches to identify genetic markers and predict antibacterial efficacy in 92 plant samples. Antibacterial tests revealed significant variability in inhibition zones, with E. coli showing the highest inhibition (Canakkale2: 24.5 mm), followed by S. aureus (Aydin2: 26.0 mm). Minimum inhibitory concentration (MIC) analysis demonstrated notable regional differences: extracts from Mersin3 showed the highest efficacy (MIC = 6.25 mg/mL), while Aydin1 exhibited the lowest activity (MIC = 100 mg/mL). Population structure and neighbor-joining tree analyses split the germplasm into two groups. GWAS identified significant genetic markers associated with antibacterial traits, including marker 26557159 for EC-MEAN (Escherichia coli mean) (p = 1.10 × 10⁻⁴, marker R² = 0.1799, genetic variance = 9.41792) and marker 26584774 for BC-MEAN (Bacillus cereus mean) (p = 8.89 × 10⁻⁵, marker R² = 0.18512, genetic variance = 12.48948). Protein–protein interaction network analysis of the locus associated with the marker-trait association (MTA) marker 26557159 indicated involvement in high-affinity secondary active ammonium transmembrane transporter activity, providing insights into genetic regions influencing antibacterial properties. ML models predicted antibacterial activity with high accuracy. XGBoost achieved the best performance for MIC predictions (R² = 0.999, RMSE = 0.434), while random forest (R² = 0.984) demonstrated robust performance for both MIC and disc diffusion assays. LightGBM performed well for MIC prediction (R² = 0.988) but showed limited accuracy for disc diffusion outcomes (R² = 0.695).
This study is the first to combine GWAS and ML for predicting antibacterial efficacy in L. nobilis, identifying specific genetic markers (e.g., 26557159, 26584774) and demonstrating that XGBoost achieves near-perfect MIC prediction (R² = 0.999). These findings provide a genomic and computational foundation for marker-assisted breeding of laurel with enhanced antibacterial properties and support the sustainable use of plant-derived antimicrobials.
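
For readers unfamiliar with the two model-evaluation metrics reported above, the following minimal Python sketch shows how the coefficient of determination (R²) and the root mean squared error (RMSE) are conventionally computed from observed and predicted values. The function names and the MIC values are hypothetical illustrations, not the authors' code or data from the study.

```python
import math


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the target."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(y_true))


# Hypothetical MIC values in mg/mL (illustrative only, not study data).
observed = [6.25, 12.5, 25.0, 50.0, 100.0]
predicted = [6.5, 12.0, 25.5, 49.0, 100.5]

print(f"R2 = {r_squared(observed, predicted):.4f}")
print(f"RMSE = {rmse(observed, predicted):.4f}")
```

Because RMSE is expressed in the units of the predicted quantity, it is only comparable between models evaluated on the same target, whereas R² is unitless.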
