DOI: 10.1093/bioinformatics/btag447 ISSN: 1367-4811

SegJointGene: joint cell segmentation and spatial gene prioritization by information entropy guided convolutional neural networks

Haotian Ma, Daifeng Wang

Abstract

Motivation

Spatial sequencing technologies enable the study of molecular organization in tissues at single-cell resolution. Revealing such spatial patterns relies on accurate cell segmentation. In complex tissues with dense cell packing, segmentation based solely on nuclear staining is insufficient to detect cell boundaries reliably. This limitation arises because segmentation requires delineating cell morphology, which is driven by molecular activities such as cytoskeletal dynamics, cell-cell adhesion, and intercellular signaling. Thus, integrating molecular information, including gene or protein expression, has the potential to improve segmentation, but remains computationally challenging.

Results

To address this, we developed SegJointGene, a deep learning framework that jointly performs cell segmentation and spatial gene prioritization by integrating nuclei-based images with spatial gene or protein expression data. SegJointGene combines an information-entropy–guided convolutional neural network with a computational information discarding score to identify genes that are important for cell-type–specific segmentation. The model iteratively refines gene prioritization and cell boundaries, producing convergent segmentation results along with prioritized spatial genes or proteins across cell types. We applied and benchmarked SegJointGene on both simulated and real spatial datasets, including spatial transcriptomics from the mouse hippocampus and distinct regions of the whole mouse brain, as well as spatial proteomics data from human tonsil. Across datasets, SegJointGene outperformed existing methods by 5–20% in accurately assigning molecular signals to cell boundaries. Robustness analyses further demonstrated stable performance across varying gene numbers and imaging resolutions. In addition, the genes prioritized by SegJointGene were enriched for structural, developmental, and synaptic signaling pathways, supporting their relevance to spatial tissue organization.
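
The abstract does not define the entropy measure or the information discarding score, so the following is only an illustrative sketch of the general idea that information entropy can rank genes by how cell-type-specific their signal is. It assumes, hypothetically, that each gene has a vector of transcript counts summed per cell type, and it ranks genes by the Shannon entropy of that vector (lower entropy meaning expression concentrated in fewer cell types). The function names `cell_type_entropy` and `rank_genes` and the toy counts are made up for illustration and are not taken from the SegJointGene code.

```python
import math


def cell_type_entropy(counts):
    """Shannon entropy (in bits) of one gene's counts across cell types.

    `counts` is a sequence of non-negative totals, one per cell type.
    Assumption for this sketch: a gene with no counts at all is treated
    as uninformative and given the maximum entropy, log2(n_cell_types).
    """
    total = sum(counts)
    if total <= 0:
        return math.log2(len(counts))
    entropy = 0.0
    for c in counts:
        if c > 0:  # terms with zero probability contribute nothing
            p = c / total
            entropy -= p * math.log2(p)
    return entropy


def rank_genes(expression):
    """Return (gene, entropy) pairs sorted from most to least cell-type-specific."""
    scored = [(gene, cell_type_entropy(counts)) for gene, counts in expression.items()]
    return sorted(scored, key=lambda item: item[1])


# Toy counts per gene across three hypothetical cell types.
toy_expression = {
    "geneA": [10, 0, 0],  # seen in a single cell type
    "geneB": [5, 5, 0],   # split evenly between two cell types
    "geneC": [4, 4, 4],   # spread evenly over all three
}

ranking = rank_genes(toy_expression)
for gene, entropy in ranking:
    print(f"{gene}: {entropy:.3f} bits")
```

In this toy ranking, the gene confined to one cell type comes first and the uniformly expressed gene comes last. A real pipeline would recompute such scores as the segmentation changes, in line with the iterative refinement described above, but how SegJointGene does this is specified in the paper and code, not here.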

Availability

The source code and data are available at https://github.com/daifengwanglab/segjointgene.

Supplementary information

Supplementary figures, notes and data descriptions are available in Supplementalmaterials.pdf.
