Polar Gini Curve: A Technique to Discover Gene Expression Spatial Patterns from Single-cell RNA-seq Data.
10.1016/j.gpb.2020.09.006
- Author:
Thanh Minh NGUYEN 1; Jacob John JEEVAN 1; Nuo XU 2; Jake Y CHEN 3
Author Information
1. Informatics Institute, the University of Alabama at Birmingham, Birmingham, AL 35294, USA.
2. Collat School of Business, the University of Alabama at Birmingham, Birmingham, AL 35294, USA.
3. Informatics Institute, the University of Alabama at Birmingham, Birmingham, AL 35294, USA. Electronic address: jakechen@uab.edu.
- Publication Type:Journal Article
- Keywords:
Biomarker discovery;
Polar Gini curve;
Single-cell gene expression;
Spatial pattern
- From:
Genomics, Proteomics & Bioinformatics
2021;19(3):493-503
- Country:China
- Language:English
- Abstract:
In this work, we describe the development of Polar Gini Curve, a method for characterizing cluster markers by analyzing single-cell RNA sequencing (scRNA-seq) data. Polar Gini Curve combines gene expression with the 2D coordinates ("spatial" information) of cells to detect patterns of uniformity in any cluster of cells from scRNA-seq data. We demonstrate that Polar Gini Curve can help users characterize the shape and density distribution of cells in a particular cluster, which can be generated during routine scRNA-seq data analysis. To quantify the extent to which a gene is uniformly distributed in a cell cluster space, we combine two polar Gini curves (PGCs): one drawn upon the cell-points expressing the gene (the "foreground curve") and the other drawn upon all cell-points in the cluster (the "background curve"). We show that genes with highly dissimilar foreground and background curves tend not to be uniformly distributed in the cell cluster and thus have spatially divergent gene expression patterns within the cluster. Genes with similar foreground and background curves tend to be uniformly distributed in the cell cluster and thus have uniform gene expression patterns within the cluster. Such quantitative attributes of PGCs can be applied to sensitively discover biomarkers across clusters from scRNA-seq data. We demonstrate the performance of the Polar Gini Curve framework in several simulation case studies. Using this framework to analyze a real-world neonatal mouse heart cell dataset, we detected biomarkers that may characterize novel subtypes of cardiac muscle cells. The source code and data for Polar Gini Curve can be found at http://discovery.informatics.uab.edu/PGC/ or https://figshare.com/projects/Polar_Gini_Curve/76749.
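
The foreground/background comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation (that is available at the URLs above), and the abstract does not specify how the curves are computed or compared. The sketch assumes that, for each angle, the cell-points are projected onto an axis at that angle, the projections are shifted so that the minimum is zero, and a Gini coefficient is taken; it also assumes that the two curves are compared by root-mean-square deviation (RMSD). The function names, the 36-angle resolution, and the toy grid data are all hypothetical.

```python
import math


def gini(values):
    """Gini coefficient of non-negative values (0 means perfectly even)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Standard formula on sorted data: G = 2*sum(i*x_i)/(n*sum(x)) - (n+1)/n
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


def polar_gini_curve(points, n_angles=36):
    """One Gini coefficient per angle, computed on projected 2D points.

    Assumption: projections are shifted so their minimum is zero before
    the Gini coefficient is taken; the abstract does not state this.
    """
    curve = []
    for k in range(n_angles):
        theta = 2.0 * math.pi * k / n_angles
        proj = [x * math.cos(theta) + y * math.sin(theta) for x, y in points]
        low = min(proj)
        curve.append(gini([p - low for p in proj]))
    return curve


def curve_rmsd(a, b):
    """Root-mean-square deviation between two curves (assumed metric)."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)) / len(a))


# Toy cluster: a 20 x 20 grid of cell-points (hypothetical data).
cluster = [(i, j) for i in range(20) for j in range(20)]
# Gene expressed evenly across the cluster (every other cell).
spread_gene = [(i, j) for i, j in cluster if (i + j) % 2 == 0]
# Gene expressed only along a narrow diagonal band of the cluster.
localized_gene = [(i, j) for i, j in cluster if abs(i - j) <= 1]

background = polar_gini_curve(cluster)
rmsd_spread = curve_rmsd(polar_gini_curve(spread_gene), background)
rmsd_localized = curve_rmsd(polar_gini_curve(localized_gene), background)
print(f"spread gene RMSD:    {rmsd_spread:.4f}")
print(f"localized gene RMSD: {rmsd_localized:.4f}")
```

In this toy setting, the gene confined to the diagonal band yields a foreground curve that departs further from the background curve than the evenly spread gene does, which mirrors the qualitative behavior the abstract describes. A deterministic grid is used instead of random points so that the comparison does not depend on sampling noise.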