In Part 1 of this three-part series, we explored how to maximize biological insights from single-cell RNA sequencing (scRNA-seq) data alone. We discussed the power of scRNA-seq in uncovering cellular heterogeneity and touched on advanced analysis strategies. However, one critical element was left in the wings: the spatial context of cells. Spatial transcriptomics adds this missing piece, mapping gene expression onto tissue architecture. In Part 2, we dive deeper into spatial transcriptomics methodologies and how integrating spatial data can elevate your single-cell analysis to a new dimension. We’ll cover when and how spatial context can enhance scRNA-seq, discuss best-practice integration methods (like Seurat label transfer, SpaGE, Tangram, pciSeq, and BayesSpace), and provide real-world examples where even minimal spatial data yields rich insights. Along the way, we highlight practical tips, including how platforms like Nygen.io support these workflows. Let’s unlock spatial context for your single-cell data!
Single-cell RNA-seq excels at profiling gene expression in individual cells, but it inherently loses information about where those cells were in the tissue. Tissues aren’t random collections of cells – they have structure: layers in a brain cortex, niches in bone marrow, tumor-immune cell neighborhoods, developmental gradients in an embryo, and so on. Cellular identity and function are often intimately linked to position and neighbors. Without spatial context, we might miss patterns of organization or interaction that are biologically important.
Consider an analogy: scRNA-seq gives us a list of characters (cell types) and their lines (gene expressions) in a play, but spatial transcriptomics is the stage direction – it tells us where each character is standing and whom they’re interacting with. By integrating the two, we start to see the full performance. Specifically, adding spatial data can:
In short, spatial context can turn a list of cell types into a map of cellular geography. For researchers, this means we can ask not only “what cell types are present and what are they expressing?” but also “where are these cells and how might location influence their gene expression?” Let’s briefly recap what spatial transcriptomics technologies entail, then explore how to integrate their data with scRNA-seq.
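Spatial transcriptomics (ST) refers to techniques that measure gene expression while preserving the spatial location of those measurements in the tissue. There are two broad classes of ST methods: sequencing-based methods (such as Visium), which capture transcriptome-wide expression at spots that can each contain several cells, and imaging-based methods (such as MERFISH, seqFISH, and STARmap), which measure a targeted panel of genes at single-cell or subcellular resolution.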
There are also emerging platforms combining approaches, but the key point is that no single current technology gives you both the full transcriptome and exact single-cell resolution across a whole tissue. This is why integration with scRNA-seq is so powerful: the two data types compensate for each other’s weaknesses. scRNA-seq provides high gene counts and distinct cell identities; spatial data provides location but may cover fewer genes or contain mixed-cell signals. Integrative computational methods are the bridge between them.
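To make the idea of "mixed-cell signals" concrete, here is a minimal sketch of how a spot's expression can be modeled as a weighted mix of cell-type signatures and then unmixed. The signatures, gene counts, and the `deconvolve` helper are invented for illustration; the solver is a plain non-negative least-squares update, not the algorithm of any specific published tool.

```python
# Toy deconvolution: estimate cell-type proportions in one mixed spot.
# All signatures and numbers are made up for illustration.

def deconvolve(spot, signatures, n_iter=500):
    """Non-negative least squares via multiplicative updates.

    spot:       expression values for one spot, one entry per gene
    signatures: dict mapping cell type -> reference expression per gene
    Returns a dict of estimated proportions that sum to 1.
    """
    types = list(signatures)
    n_genes = len(spot)
    weights = [1.0 / len(types)] * len(types)  # strictly positive start

    for _ in range(n_iter):
        # Reconstruct the spot from the current weights.
        recon = [
            sum(weights[k] * signatures[t][g] for k, t in enumerate(types))
            for g in range(n_genes)
        ]
        # Scale each weight by how well its signature explains the residual.
        for k, t in enumerate(types):
            numer = sum(signatures[t][g] * spot[g] for g in range(n_genes))
            denom = sum(signatures[t][g] * recon[g] for g in range(n_genes))
            if denom > 0:
                weights[k] *= numer / denom

    total = sum(weights)
    return {t: w / total for t, w in zip(types, weights)}


# Hypothetical scRNA-seq-derived signatures over four genes.
signatures = {
    "T cell": [10.0, 1.0, 0.0, 2.0],
    "Tumor": [0.0, 8.0, 1.0, 2.0],
    "Stroma": [1.0, 0.0, 9.0, 2.0],
}
true_mix = {"T cell": 0.5, "Tumor": 0.3, "Stroma": 0.2}

# Simulate a spot that contains the three cell types in known proportions.
spot = [sum(true_mix[t] * signatures[t][g] for t in signatures) for g in range(4)]

proportions = deconvolve(spot, signatures)
print({t: round(p, 3) for t, p in proportions.items()})
```

On this noise-free toy spot the estimates should land very close to `true_mix`; real spots are noisy and sparse, which is why dedicated deconvolution tools use more elaborate statistical models.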
Not every single-cell study will require spatial data, but certain scenarios gain enormous value from spatial integration:
In summary, if your biological question involves how cells are arranged, who they neighbor, or how location might influence cell behavior, that’s when layering on spatial data is most rewarding. The good news is you don’t always need a massive spatial experiment to get these benefits; even minimal spatial data (like a single Visium slide or a targeted gene panel imaging) can complement a large scRNA-seq dataset. The key is smart integration, which we’ll discuss next.
There has been an explosion of computational methods to integrate scRNA-seq with spatial data. These methods fall into a few categories: label transfer/mapping, imputation of missing genes, deconvolution of mixed signals, and spatial domain detection. Below is a summary of some best-practice tools and what they are used for:
Platform / Tool | Integration Approach | Use Case / Strengths |
---|---|---|
Seurat label transfer | Anchor-based label mapping (Canonical Correlation Analysis under the hood) | Maps cell identities from scRNA-seq onto spatial spots. Ideal for annotating spatial data: e.g., predict the cell type composition of each Visium spot by transferring labels from a well-annotated scRNA-seq reference. Fast and user-friendly; part of the popular Seurat pipeline. |
SpaGE | Machine learning imputation of gene expression | Predicts whole-transcriptome expression at spatial locations by integrating a high-gene-coverage scRNA-seq dataset with a limited-gene-coverage spatial dataset. In other words, SpaGE “fills in” the genes that weren’t measured in the spatial experiment. Great for increasing gene coverage: it even uncovered new spatial gene patterns in the mouse brain that were later confirmed by independent in situ hybridization (Abdelaal et al., 2020). |
Tangram | Deep learning-based cell mapping (probabilistic alignment) | Aligns single-cell data to spatial data by finding the best match of cell transcriptomes to spatial gene expression patterns. Tangram can map individual cells or cell types into a tissue, optimizing so that the density of mapped cells matches the observed spatial expression. It’s versatile (it supports mapping to various spatial data types, even MERFISH or histology) and can handle differences in resolution or throughput between datasets. |
pciSeq | Probabilistic cell typing of spatial data using scRNA-seq reference | “pciSeq” stands for probabilistic cell typing by in situ sequencing (from Qian et al. 2019). It uses a Bayesian model to assign cell types to cells in spatial data based on a reference scRNA-seq atlas. Useful when your spatial method captures limited genes per spot/cell: e.g., if a spatial experiment only has 100 genes, pciSeq leverages the scRNA-seq reference to tell you which cell type those gene combinations most likely represent. It was demonstrated on in situ sequencing data to finely map closely related neuron types in situ. |
BayesSpace | Bayesian spatial clustering and resolution enhancement | Not an integration method per se (it doesn’t require scRNA-seq input), but often used alongside integration. BayesSpace improves the analysis of Visium-style data by modeling neighborhood information: it can increase effective resolution to “subspots” and detect finer spatial domains (Zhao et al., 2021). For instance, in a breast cancer Visium dataset, BayesSpace could delineate tumor substructures that were blurred at spot-level resolution. If you have scRNA-seq too, you might cluster with BayesSpace and then label those clusters with scRNA-seq-derived identities. |
These are just a selection of tools. Others include Cell2Location, STdeconvolve, Stereoscope, novoSpaRc, Harmony (for multi-modal integration), and more – each with their own algorithmic twists (regression, optimal transport, topic modeling, etc.). In fact, a 2023 review identified 19 different methods for integrating scRNA-seq with spatial data! The five above have become quite popular for their performance and usability, covering the common needs of single-cell researchers venturing into spatial analysis.
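To give a feel for what label transfer does, here is a deliberately simplified sketch: it averages the reference cells of each type into a centroid and assigns every spatial spot the label of its best-correlated centroid. This is a stand-in for illustration only and is not Seurat's anchor-based procedure; the gene set, expression values, and function names are all assumptions.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation between two equal-length expression vectors."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def centroids(reference):
    """Mean expression per gene for each annotated cell type."""
    return {
        label: [sum(col) / len(cells) for col in zip(*cells)]
        for label, cells in reference.items()
    }


def transfer_labels(spots, reference):
    """Assign each spot the reference label with the highest correlation."""
    cents = centroids(reference)
    labels = []
    for spot in spots:
        scores = {label: pearson(spot, c) for label, c in cents.items()}
        labels.append(max(scores, key=scores.get))
    return labels


# Hypothetical annotated scRNA-seq reference over three shared genes
# (think Cd3e, Epcam, Col1a1); values are invented.
reference = {
    "T cell": [[9, 0, 1], [8, 1, 0]],
    "Epithelial": [[0, 10, 1], [1, 9, 0]],
    "Fibroblast": [[0, 1, 8], [1, 0, 9]],
}

# Hypothetical spatial spots measured on the same three genes.
spots = [[7, 1, 1], [0, 8, 2], [1, 1, 7]]

predicted = transfer_labels(spots, reference)
print(predicted)
```

The important design point carries over to the real tools: only genes shared between the two datasets can drive the mapping, so the overlap between your spatial panel and your scRNA-seq reference matters.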
So, how do we put integration into practice? Let’s walk through a practical mindset with an example scenario:
From there, you could refine further. For example, use BayesSpace to subdivide spots in high-T-cell regions, or examine gene expression of interaction molecules (checkpoint ligands, etc.) in spatial context. Integration also works in the other direction: if your spatial data had unique expression patterns, you might adjust your scRNA-seq clustering to align better with spatial domains (some researchers iterate between datasets).
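As a rough illustration of why neighborhood information helps when looking for regions such as high-T-cell areas, the sketch below averages a per-spot score with its adjacent spots and then thresholds the result. It assumes a square grid and made-up T-cell fractions, and it is only a smoothing heuristic, not the statistical model BayesSpace uses.

```python
def smooth_scores(scores):
    """Average each spot's score with its existing 4-connected neighbours.

    scores: dict mapping (row, col) grid position -> per-spot score
    """
    smoothed = {}
    for (row, col), value in scores.items():
        neighbours = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
        values = [value] + [scores[n] for n in neighbours if n in scores]
        smoothed[(row, col)] = sum(values) / len(values)
    return smoothed


# Hypothetical predicted T-cell fraction per spot on a 3 x 3 grid.
t_cell_fraction = {
    (0, 0): 0.8, (0, 1): 0.6, (0, 2): 0.1,
    (1, 0): 0.7, (1, 1): 0.0, (1, 2): 0.1,
    (2, 0): 0.1, (2, 1): 0.1, (2, 2): 0.0,
}

smoothed = smooth_scores(t_cell_fraction)

# Spots whose neighbourhood-averaged score passes an arbitrary cutoff.
hotspots = {pos for pos, score in smoothed.items() if score >= 0.35}
print(sorted(hotspots))
```

Note that the center spot has a raw score of zero but a smoothed score well above zero, because it sits next to T-cell-rich spots; that is the kind of context a spot-by-spot analysis would miss.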
Best Practices: When integrating, always ensure that the datasets truly correspond (ideally the same tissue or condition). If you’re mapping a scRNA-seq atlas to a spatial section, any batch differences should be corrected or accounted for. Methods like Harmony or Seurat’s CCA can handle batch effects to some degree during integration. Also, validation is key: for instance, Lohoff et al. (2022) validated their imputed gene expression by comparing it to actual in situ staining.
You might not always have ground truth, but sanity checks (do known markers localize correctly? do integrated clusters match histology if available?) increase confidence.
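One simple sanity check of the "do known markers localize correctly?" kind can be sketched as follows: correlate the predicted abundance of a cell type across spots with the measured expression of one of its canonical markers. The values and the `marker_agreement` helper are hypothetical; in practice you would pull both vectors from your own integration output.

```python
from math import sqrt


def marker_agreement(predicted_fraction, marker_expression):
    """Pearson correlation across spots between a predicted cell-type
    fraction and the measured expression of a marker for that type."""
    n = len(predicted_fraction)
    mean_p = sum(predicted_fraction) / n
    mean_m = sum(marker_expression) / n
    cov = sum(
        (p - mean_p) * (m - mean_m)
        for p, m in zip(predicted_fraction, marker_expression)
    )
    var_p = sum((p - mean_p) ** 2 for p in predicted_fraction)
    var_m = sum((m - mean_m) ** 2 for m in marker_expression)
    return cov / sqrt(var_p * var_m)


# Hypothetical per-spot values: predicted T-cell fraction from integration,
# and measured counts of a T-cell marker in the same four spots.
predicted_t_cell = [0.80, 0.60, 0.10, 0.05]
marker_counts = [12, 9, 1, 0]

r = marker_agreement(predicted_t_cell, marker_counts)
print(f"marker agreement: r = {r:.2f}")
```

A strong positive correlation does not prove the mapping is right, but a weak or negative one for a well-established marker is a clear signal to revisit the integration before interpreting it.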
Let’s look at a few practical biological scenarios where integrating spatial data with single-cell transcriptomics provided powerful insights:
In this example from the mouse brain cortex, a massive dissociated single-cell dataset (red points) was integrated with a much smaller spatially-resolved dataset (blue points). Panel (A) shows a t-SNE of 71k scRNA-seq cells and 2.5k spatial cells (STARmap), colored by dataset (Welch et al., 2019).
After integration, the cells were jointly clustered; panel (B) shows the same t-SNE colored by the identified cell clusters, which include excitatory neurons of layers 2/3, 5, and 6 (L2/3, L5, L6), interneurons, oligodendrocytes, etc. Critically, panel (D) (bottom) plots the spatial coordinates of the STARmap cells colored by those cluster identities, effectively mapping the scRNA-seq clusters onto the tissue. This recapitulated known cortex anatomy: for instance, “Astrocyte_Gfap” cells (purple) localized to the meninges (outer surface) and white matter, matching patterns from the Allen Brain Atlas Gfap staining. Such integration shows where each cell type resides and validates that the scRNA-seq clusters correspond to real spatially segregated populations (Welch et al., 2019).
In the figure above, panel (b) shows an E8.5 mouse embryo section with each dot representing a cell, colored by its predicted cell type (focusing on gut tube and nearby cells). By integrating the seqFISH data with a scRNA-seq reference, Lohoff et al. (2022) could assign each cell a type and even infer the expression of genes not measured by seqFISH. This uncovered subtle spatial patterning: for instance, progenitor cells of the future trachea (ventral lung, teal) and esophagus (dorsal lung, orange) were found segregated to the ventral vs. dorsal sides of the gut tube, respectively, even at this early stage. Such dorsal-ventral separation was confirmed by follow-up in situ hybridization (see panels (h) and (j), where markers like Tbx1 and Shh are expressed in complementary patterns) and was not evident from the dissociated data alone. This example highlights how minimal spatial data (a few hundred genes) can be leveraged with a rich scRNA-seq reference to strategically fill in the blanks, providing insights into tissue patterning and developmental biology.
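The core intuition behind inferring unmeasured genes can be sketched with a nearest-neighbor average: find the reference cells that look most like a spatial cell on the shared genes, then borrow their values for a gene the spatial assay did not measure. This is a simplified stand-in with invented numbers, not the method used in the study above; published tools typically align the two datasets first, a step this sketch skips.

```python
from math import dist


def impute_gene(spatial_cells, reference, k=2):
    """Impute one unmeasured gene for each spatial cell.

    spatial_cells: list of expression vectors over the shared genes
    reference:     list of (shared_gene_vector, unmeasured_gene_value)
                   pairs taken from scRNA-seq cells
    Returns the mean unmeasured-gene value of the k nearest reference cells.
    """
    imputed = []
    for cell in spatial_cells:
        # Rank reference cells by Euclidean distance on the shared genes.
        nearest = sorted(reference, key=lambda ref: dist(cell, ref[0]))[:k]
        imputed.append(sum(value for _, value in nearest) / len(nearest))
    return imputed


# Hypothetical reference: two shared genes plus one gene that only
# the scRNA-seq experiment measured.
reference = [
    ([9.0, 1.0], 5.0),
    ([8.0, 2.0], 4.0),
    ([1.0, 9.0], 0.0),
    ([2.0, 8.0], 1.0),
]

# Hypothetical spatial cells measured on the two shared genes only.
spatial_cells = [[8.5, 1.5], [1.0, 8.0]]

imputed = impute_gene(spatial_cells, reference)
print(imputed)
```

Because the imputed values are borrowed rather than measured, they are best treated as hypotheses and checked against independent data where possible.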
For those eager to implement these integrations, it’s worth noting that many analysis platforms (including Nygen Analytics) support such multi-modal data layering. You don’t have to build everything from scratch in R or Python if that’s not your preference. For example, Nygen’s cloud platform enables no-code analysis of scRNA-seq data (from quality control and clustering to differential expression), and it also provides ways to incorporate spatial information. Users can import spatial coordinates of cells or spots into Nygen and visualize gene expression or clusters on tissue layouts. This means after you identify clusters in your single-cell data, you could map them onto an actual tissue image or coordinate system if you have it. Nygen’s knowledge base has an article titled “Importing Spatial and Clonotype Data” that guides researchers to upload spatial metadata (like x,y coordinates for each cell/spot) post-analysis, effectively marrying the expression data with spatial organization. The result can be interactive plots where you see your clustered cells scattered according to their original tissue positions, revealing spatial patterns without writing custom code.
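If you do prefer to script this step yourself, attaching coordinates to cluster labels is essentially a join on the cell or spot barcode. The sketch below assumes two small CSV tables with hypothetical column names (`barcode`, `cluster`, `x`, `y`); it is not Nygen's import format, just an illustration of the bookkeeping involved.

```python
import csv
import io

# Hypothetical exports: cluster labels from the single-cell analysis and
# spatial coordinates from the spatial assay, both keyed by barcode.
clusters_csv = """barcode,cluster
AAAC-1,T cell
AAAG-1,Tumor
AACT-1,Stroma
"""

coords_csv = """barcode,x,y
AAAC-1,10.5,20.0
AAAG-1,11.0,20.5
TTTT-1,99.0,99.0
"""


def attach_coordinates(clusters_text, coords_text):
    """Join cluster labels to x,y coordinates by barcode.

    Returns (merged_rows, barcodes_without_coordinates).
    """
    coords = {
        row["barcode"]: (float(row["x"]), float(row["y"]))
        for row in csv.DictReader(io.StringIO(coords_text))
    }
    merged, missing = [], []
    for row in csv.DictReader(io.StringIO(clusters_text)):
        barcode = row["barcode"]
        if barcode in coords:
            x, y = coords[barcode]
            merged.append(
                {"barcode": barcode, "cluster": row["cluster"], "x": x, "y": y}
            )
        else:
            # Keep track of cells that cannot be placed on the tissue.
            missing.append(barcode)
    return merged, missing


merged, missing = attach_coordinates(clusters_csv, coords_csv)
print(merged)
print("no coordinates for:", missing)
```

Reporting the barcodes that fail to match is worth keeping in any real version, since mismatched barcode suffixes are a common reason for cells silently disappearing from a spatial plot.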
Internally, such platforms may leverage the aforementioned algorithms. For instance, behind the scenes, label transfer in Seurat or cell2location’s Bayesian mapping could be part of a workflow – the user might simply see an option to “annotate spatial dataset with single-cell references” and get results with a few clicks. Nygen also emphasizes reproducibility and ease of use, so an academic researcher can focus on interpreting the spatial biology rather than wrangling complex pipelines. (For more on making single-cell analysis accessible, see our earlier post on addressing bioinformatics skill gaps and intuitive tools.)
Tip: If you’re curating a single-cell atlas with Nygen, consider also publishing any spatial data you have in the Nygen Database or linking to it in your project. It creates a more comprehensive resource (think of it like adding a map layer to your atlas). And if you lack spatial data, Nygen’s network might help you find sequencing core facilities that offer spatial transcriptomics services – a useful pointer if you decide to generate spatial data to complement your single-cell study.
By integrating spatial transcriptomics strategically with single-cell RNA-seq, researchers can maximize insights into how cells function together in tissues. We’ve seen that even limited spatial information, when combined with robust single-cell data, can elucidate tissue architecture, pinpoint cell interactions, and highlight patterns (like developmental axes) invisible to dissociated-cell analysis. The key is choosing the right integration approach for your question – whether it’s simple label transfer to annotate tissue regions or advanced probabilistic mapping to predict unseen genes. Tools like Seurat, SpaGE, Tangram, pciSeq, and BayesSpace have become invaluable in this endeavor, each addressing different aspects of the integration challenge.
As single-cell and spatial techniques continue to evolve, the line between scRNA-seq and spatial transcriptomics is blurring. In Part 3 of this series, we will directly compare and contrast single-cell RNA-seq and spatial transcriptomics as complementary technologies. We’ll discuss their respective strengths and limitations and how, together, they form a more complete toolkit for understanding biology, much like how combining a microscope with a cell sorter gives you a fuller picture than either alone.
References
Liu, Y., Wu, Z., Feng, Y., Gao, J., Wang, B., Lian, C., & Diao, B. (2023). Integration analysis of single-cell and spatial transcriptomics reveal the cellular heterogeneity landscape in glioblastoma and establish a polygenic risk model. Frontiers in Oncology, 13, 1109037. https://doi.org/10.3389/fonc.2023.1109037
Lohoff, T., Ghazanfar, S., Missarova, A., Koulena, N., Pierson, N., Griffiths, J. A., Bardot, E. S., Eng, C.-H. L., Tyser, R. C. V., Argelaguet, R., Guibentif, C., Srinivas, S., Briscoe, J., Simons, B. D., Hadjantonakis, A.-K., Göttgens, B., Reik, W., Nichols, J., Cai, L., & Marioni, J. C. (2022). Integration of spatial and single-cell transcriptomic data elucidates mouse organogenesis. Nature Biotechnology, 40(1), 74-85. https://doi.org/10.1038/s41587-021-01006-2
Abdelaal, T., Mourragui, S., Mahfouz, A., & Reinders, M. J. T. (2020). SpaGE: Spatial Gene Enhancement using scRNA-seq. Nucleic Acids Research, 48(18), e107. https://doi.org/10.1093/nar/gkaa740