Single-cell RNA sequencing (scRNA-seq) has yielded vast datasets covering diverse organisms, tissues, and disease states. Numerous public databases aggregate and curate these datasets to facilitate reuse by researchers. Below we compile an extensive list of credible, active scRNA-seq databases (global and regional), and discuss their features, advantages, limitations, metadata standards, accessibility, and scientific relevance. This report is organized into structured sections for clarity.
The table below summarizes key public databases and portals for single-cell transcriptomic data. Each entry gives the database name, a brief description (including, where available, the number of datasets or cells, species coverage, and data types), and its scope. All listed resources are backed by peer-reviewed publications or major consortia, and most remain actively maintained.
Database Name | Description | Scope |
---|---|---|
Arc Virtual Cell Atlas: scBaseCamp | AI-curated repository integrating >230 million single-cell profiles across multiple species and tissues, with standardized metadata. | General, multi-species |
Arc Virtual Cell Atlas: Tahoe-100M | Massive perturbation atlas containing 100 million transcriptomic profiles from ~60,000 drug perturbation experiments across 50 cancer cell lines. | Perturbation-focused |
Cancer Single-cell Expression Map (CancerSCEM) | Integrates and visualizes scRNA-seq data from human cancers, providing extensive multidimensional analyses, including metabolic profiling. | Cancer-focused |
Cell Types Database: RNA-Seq Data (Allen Institute) | Offers extensive single-cell and single-nucleus transcriptomic data, linked to morphological and electrophysiological characterizations of neurons from human and mouse brain regions. | Tissue-specific (brain) |
CellxGene Census | Provides programmatic access to the standardized single-cell and single-nucleus RNA-seq data hosted on CZ CELLxGENE Discover, with harmonized cell and gene metadata for human and mouse. | General (human, mouse) |
DISCO | Aggregates over 100 million cells from publicly available single-cell datasets, harmonized to facilitate consistent analysis across studies. | General |
Human Cell Atlas (HCA) | A global effort to build comprehensive reference maps of all human cells, facilitating insights into human health, disease, and development. | General (human) |
Human Protein Atlas: Single Cell Type | Integrates scRNA-seq data from 31 human tissues with protein-level immunohistochemical staining, linking transcriptomic and proteomic profiles. | General (human) |
PanglaoDB | Database of mouse and human scRNA-seq experiments with pre-annotated cell-type markers, facilitating gene and cell-type exploration. | General |
Perturbation Atlas (Perturb-seq) | Resource containing single-cell RNA-seq data systematically capturing cellular responses to genetic and chemical perturbations for functional genomic insights. | Perturbation-focused |
scRNASeqDB | Curated database of 36 human single-cell gene expression datasets from GEO, involving 8,910 cells categorized into 174 cell groups. | General |
Single Cell Expression Atlas (SCEA) | Cross-species repository providing uniformly processed single-cell RNA-seq data to facilitate cross-study comparisons and gene-expression searches. | General, multi-species |
Single Cell Portal (Broad Institute) | Broad Institute's portal hosting single-cell datasets including contributions from consortia like Human Cell Atlas, with user-friendly visual exploration tools. | General |
Tumor Immune Single-cell Hub (TISCH2) | Dedicated to tumor microenvironment analysis, providing detailed single-cell annotations of immune and stromal populations across cancer datasets. | Cancer-focused |
UCSC Genome Browser: Single Cell RNA-Seq | Single-cell RNA expression datasets from various human tissues (e.g., kidney, colon, heart, muscle, placenta, peripheral blood mononuclear cells), accessible via UCSC Genome Browser tracks. | General (human tissues) |
Public scRNA-seq databases provide a rich, diverse, and well-curated resource for researchers. Metadata inconsistencies and computational demands remain challenges, but continuous improvements in metadata standardization, data curation, and accessibility, particularly through large-scale platforms like the Arc Virtual Cell Atlas and Human Cell Atlas, are significantly enhancing usability.
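To make the metadata-inconsistency challenge concrete, here is a minimal Python sketch of mapping per-dataset metadata onto one shared vocabulary before merging. The field names, the synonym tables, and the `harmonize` helper are all hypothetical illustrations; they are not the schema or API of any database listed above.

```python
# Hypothetical synonym tables: illustrative only, not any database's real schema.
FIELD_SYNONYMS = {
    "organism": "species",
    "species": "species",
    "organ": "tissue",
    "tissue": "tissue",
    "celltype": "cell_type",
    "cell_type": "cell_type",
}

VALUE_SYNONYMS = {
    "species": {
        "homo sapiens": "human",
        "human": "human",
        "mus musculus": "mouse",
        "mouse": "mouse",
    },
}


def harmonize(record):
    """Map one metadata record onto the shared field names and values."""
    clean = {}
    for key, value in record.items():
        # Normalize the field name: trim, lowercase, spaces to underscores.
        field = FIELD_SYNONYMS.get(key.strip().lower().replace(" ", "_"))
        if field is None:
            continue  # drop fields outside the shared vocabulary
        text = str(value).strip().lower()
        # Translate known value synonyms; otherwise keep the normalized text.
        clean[field] = VALUE_SYNONYMS.get(field, {}).get(text, text)
    return clean


# Two records describing the same kind of sample with different conventions.
records = [
    {"Organism": "Homo sapiens", "Organ": "Kidney", "CellType": "Podocyte"},
    {"species": "human", "tissue": "kidney", "cell type": "podocyte", "batch": "A1"},
]
harmonized = [harmonize(r) for r in records]
```

Both records reduce to the same three fields, so they can be compared or merged directly. A real pipeline would more likely map values to controlled ontology terms than to free-text labels.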
However, these databases typically do not offer comprehensive analytics solutions, leaving researchers to handle bioinformatics workflows independently. Companies such as Nygen Analytics address critical gaps by providing structured metadata tracking, version control, and reproducibility for single-cell data analysis. Platforms like LaminDB offer bioinformatics solutions, enabling seamless analytics for single-cell datasets by simplifying metadata tracking and analysis pipelines.
Thus, while public databases form the foundational resource for single-cell research, analytics solutions from companies such as Nygen remain essential for efficiently extracting actionable biological insights from complex single-cell datasets.
Find, merge and analyse hundreds of curated datasets on Nygen Database. Explore millions of cells in a single click. Easily publish and share your research, regardless of where your analysis was performed, with no platform restrictions.