CycleMix: Gaussian mixture modeling of the cell cycle
Jack Edward Peplinski, Tallulah S Andrews
Abstract
Motivation
The cell cycle is a crucial component of many biological processes, including cancer, tissue repair, and inflammation. However, due to the heterogeneity of this cycle, it has been difficult to assess the extent of proliferation in clinical tissues. Single-cell RNAseq (scRNAseq) and spatial transcriptomics enable high-resolution measurement of gene expression, allowing individual cells to be classified by their cycling state. However, current methods are limited to classifying cells into only three states (G1, S, and G2M) and have unproven accuracy on modern datasets.
Results
We show that Seurat and cyclone, the most widely used methods for cell cycle assignment, perform poorly on modern droplet-based datasets. In particular, Seurat frequently labels mature non-cycling cells (e.g. neurons) as actively cycling. We present CycleMix, an alternative cell cycle assignment algorithm that can flexibly assign cells to any number of states, provided sufficient marker genes are available, and can identify when cells are not cycling. We demonstrate its superior performance for cell cycle assignment and regression of cell cycle expression patterns on six diverse droplet-based scRNAseq datasets.
Availability and implementation
CycleMix is available as an R package on Bioconductor and on GitHub: https://github.com/tallulandrews/CycleMix