Data Base similarity (DBsimilarity) of natural products to aid compound identification on MS and NMR pipelines, similarity networking, and more
Ricardo M. Borges, Gabriela de Assis Ferreira, Mariana Martins Campos, Andrew Magno Teixeira, Fernanda das Neves Costa, Fernanda Oliveira Chagas
- Complementary and alternative medicine
- Drug Discovery
- Plant Science
- Molecular Medicine
- General Medicine
- Biochemistry
- Food Science
- Analytical Chemistry
Abstract
Introduction
We developed Data Base similarity (DBsimilarity), a user‐friendly tool designed to organize structure databases into similarity networks, with the goal of facilitating the visualization of information primarily for natural product chemists who may not have coding experience.
Method
DBsimilarity, written in Jupyter Notebooks, converts Structure Data Files (SDF) into Comma‐Separated Values (CSV) files, adds chemoinformatics data, constructs an MZmine custom database file and an NMRfilter candidate list of compounds for rapid dereplication of MS and 2D NMR data, calculates similarities between compounds, and writes CSV files formatted as similarity networks for Cytoscape.
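The similarity and network-export steps can be illustrated with a minimal sketch. It is not the DBsimilarity code. It assumes that each compound is already represented by a binary fingerprint stored as a set of on-bit indices (in practice a chemoinformatics toolkit would generate these from the SDF structures) and that similarity is scored with the Tanimoto coefficient, which the text above does not specify. The function names, the threshold, the column headers, and the toy compounds are hypothetical.

```python
import csv
import io
from itertools import combinations


def tanimoto(a: set, b: set) -> float:
    """Tanimoto (Jaccard) coefficient between two sets of on-bit indices."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(a | b)


def similarity_edges(fingerprints: dict, threshold: float) -> list:
    """Return (source, target, similarity) for each pair at or above threshold."""
    edges = []
    for id_a, id_b in combinations(sorted(fingerprints), 2):
        score = tanimoto(fingerprints[id_a], fingerprints[id_b])
        if score >= threshold:
            edges.append((id_a, id_b, round(score, 3)))
    return edges


def edges_to_csv(edges: list) -> str:
    """Serialize the edges as a CSV edge list (one row per network edge)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["source", "target", "similarity"])  # hypothetical headers
    writer.writerows(edges)
    return buffer.getvalue()


# Toy fingerprints with made-up bit indices; these are not real compounds.
toy_fingerprints = {
    "cmpd_A": {1, 2, 3, 4},
    "cmpd_B": {2, 3, 4, 5},
    "cmpd_C": {10, 11},
}
toy_edges = similarity_edges(toy_fingerprints, threshold=0.5)
print(edges_to_csv(toy_edges))
```

A table of this shape, with one source column, one target column, and an edge attribute, can be imported into Cytoscape as a network. Raising the threshold yields a sparser network with tighter structural clusters.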
Results
The Lotus database was used as a source for
Conclusion
Chemical and biological properties are determined by molecular structures. DBsimilarity enables the creation of interactive similarity networks using Cytoscape. The approach is also consistent with a recent review that identifies poor biological plausibility and unrealistic chromatographic behavior as significant sources of error in compound identification.