FFPE Genomics: a Portal Into the World of Transcriptomic Research?
The emergence of spatial transcriptomics, with its ability to generate highly detailed maps of gene expression with spatial context, is revolutionizing biomedical research, particularly in developmental biology, cancer, immunology, and neuroscience.
The development of several spatial transcriptomics platforms has made it possible to bridge the gap between gene expression and tissue/cell function. In parallel, due to technological innovations, formalin-fixed, paraffin-embedded (FFPE) preserved human tissues from biorepositories are now accessible for genomics analysis. This offers an exciting opportunity for translational research as it is now possible to work directly on human tissue.
FFPE: the gold standard for tissue preservation
For decades, preserved tissue samples have been a rich source of study material for medical, forensic, and translational research. Several tissue preservation techniques have evolved, including alcohol preservation, formalin treatment, and cryopreservation, that conserve cell morphology along with nucleic acid and protein content for further analysis. Translational research commonly used ultralow-temperature frozen tissue (at −80 to −190°C) for diagnostics and research. However, ultralow-temperature storage is costly because it requires dedicated facilities, a reliable power supply, and specialized equipment. Formalin treatment gained popularity as an alternative, not only because it conserves tissue architecture and cellular morphology well but also because of its low cost. Consequently, FFPE tissue sections became the gold standard for the preservation of human tissue. FFPE tissue samples are extremely durable and can be stored at ambient temperature for decades.
FFPE preservation involves fixing the tissue in formalin and then embedding it in a block of paraffin wax. Thin sections are then easily cut from the paraffin block and mounted on a microscope slide for examination. As a result, FFPE has become the routine form of preservation for biopsies in clinical practice, and FFPE tissue can be used for examination, experimental research, and diagnostic and drug development. It is estimated that there are 400 million to more than a billion FFPE samples in hospitals and tissue banks worldwide.
Challenges with FFPE-preserved tissue
Although FFPE specimens preserve morphological information, the preservation method degrades DNA and RNA, significantly impacting genomic analysis. Purifying nucleic acids from FFPE tissue sections is hampered by biomolecule crosslinking, nucleic acid fragmentation, and low yield. Several ready-to-use kits dedicated to nucleic acid extraction from FFPE tissues have been developed, but the quality and yield obtained with these methods are highly variable, making analysis difficult. As genomic technologies have continued to improve, so has the prospect of genomic analysis from FFPE tissue, in other words, FFPE genomics. The development of direct, digital counting of nucleic acids with the nCounter® Analysis System has overcome the issues with nucleic acid degradation to enable transcriptomic profiling of 800+ transcripts from an FFPE tissue section, unlocking valuable insights from archival FFPE tissue from healthy and diseased individuals.
The spatial genomics family
Humans have approximately 20,000 to 25,000 genes, and linking gene expression to function is a herculean task. Biology is inherently spatial: the human body is a three-dimensional structure in which several complex systems work together to maintain normal physiology. The building block of all multicellular organisms is the cell. Understanding how cells interact, organize themselves into organs, and drive various biological functions is therefore crucial. Tissues are where cells proliferate, differentiate, and function while interacting constantly with each other, and studying spatial gene expression in different tissues could link genotype to phenotype and to disease mechanisms. Thus, spatial genomics offers unprecedented insights into the spatial organization of tissues and how fundamental cellular processes are orchestrated in multicellular organisms.
The knowledge gained in this way has revolutionized both basic research and medicine and is being applied to several areas of translational research, such as identifying and evaluating biomarkers for disease prognosis or developing novel therapeutics. Spatial genomics includes the study of genomics (total DNA and RNA), transcriptomics (RNA transcripts), and epigenomics (DNA modification) with spatial context, using gene profiling methods performed on intact tissue.
Evolution of spatial genomics technologies
Early genomics analysis involved bulk genomic sequencing that replaced microarray technology. Bulk RNA-sequencing involves sequencing of cDNAs, with the abundance of each transcript derived from its read count. Although this technique is high-throughput and can quantify the expression levels of numerous genes across a large population of cells, it lacks spatial context. Bulk RNA-sequencing was further refined with the development of single-cell RNA-sequencing (scRNA-seq), which provides transcriptional profiling of individual cells instead of a population of cells. In addition, it can reveal gene fusions, splicing variants, and mutations, providing a more complete genetic picture than DNA sequencing alone. However, scRNA-seq still fails to preserve the spatial information of transcripts because cells are dissociated from the tissue before detection.
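The idea that transcript abundance is derived from read counts can be sketched in a few lines. The example below is a minimal illustration, assuming each sequenced read has already been assigned to a gene; the gene names, the read list, and the counts-per-million (CPM) scaling are illustrative choices and are not taken from any specific pipeline described here.

```python
from collections import Counter


def counts_per_million(read_assignments):
    """Turn a list of per-read gene assignments into counts per million (CPM)."""
    counts = Counter(read_assignments)   # raw count for each gene
    total = sum(counts.values())         # library size (total assigned reads)
    if total == 0:
        return {}
    # Scale each raw count by library size so samples of different depth are comparable.
    return {gene: n * 1_000_000 / total for gene, n in counts.items()}


# Hypothetical read-to-gene assignments for one bulk sample.
reads = ["GAPDH", "GAPDH", "ACTB", "CD3E", "GAPDH", "ACTB", "GAPDH", "CD8A"]
cpm = counts_per_million(reads)
print(cpm)
```

Note that nothing in this table records where in the tissue a read came from, which is the limitation the spatial methods discussed in this article are meant to address.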
On the other hand, in situ techniques that have become essential tools for FFPE analysis, such as antibody-based immunohistochemistry and fluorescence in situ hybridization (FISH), do provide the spatial resolution needed to study gene or protein expression in tissues, but quantification is often difficult, and analysis is limited to a handful of targets in a single experiment. The advent of next-generation sequencing (NGS) technology has enabled parallel, multiplexed analysis of DNA sequences on a massive scale, with tens of thousands of sequences from individual single strands of DNA analyzed simultaneously. Combining immunohistochemistry, FISH, and NGS has led to the birth of spatial transcriptomics, which allows researchers to quantify a much larger number of transcripts in a short time. Today, spatial transcriptomics technologies can be partially automated and offer a multiplex capability that measures the expression of hundreds to thousands of RNA targets across large numbers of FFPE tissue sections.
Spatial transcriptomics
Spatial transcriptomics looks at the transcriptome fingerprint, or net gene expression, within intact tissue while maintaining spatial context, and can be used to study the biology of development, health, and disease. Today, several commercially available spatial transcriptomics platforms are compatible with FFPE tissue. These platforms offer varying spatial resolution (the minimum geographic area that can be profiled), scan area (breadth of tissue covered), scale, and throughput (number of samples and the amount of time needed to get results). They fall into two main categories: (1) sequencing-based approaches that encode positional information onto transcripts before sequencing and (2) imaging-based approaches that rely either on in situ sequencing (ISS), where transcripts are amplified and sequenced in the tissue block, or on FISH, where imaging probes are sequentially hybridized onto the tissue. Sequencing-based and imaging-based technologies are both powerful, but each has its limitations.
NGS-based spatial transcriptomic platforms capture all polyadenylated (polyA) transcripts and can be used to profile an entire tissue section in cases where researchers do not necessarily know what they are looking for. These sequencing-based methods are therefore unbiased and most suitable for identifying previously unreported gene expression or generating tissue atlases that provide maps of the whole transcriptome or proteome. NanoString’s probe-based spatial multi-omics platform, the GeoMx® Digital Spatial Profiler (DSP), relies on the nCounter Analysis System or NGS to read out the probes and enables profiling of the whole transcriptome from a single FFPE slide. The GeoMx DSP was one of the first available spatial biology platforms that allowed for quantitative, spatial analysis of both mRNA transcripts and proteins.
Imaging-based in situ spatial multi-omic methods directly detect transcripts from tissue sections and offer high sensitivity down to single-cell or sub-cellular resolution. Imaging-based platforms such as NanoString’s CosMx™ Spatial Molecular Imager (SMI) can simultaneously profile up to thousands of transcripts or 64+ proteins from FFPE or fresh frozen tissue via cyclic in situ hybridization of nucleic acid-based probes. The CosMx SMI assay expands on the technology of the GeoMx DSP assay and provides the highest plex with the greatest level of cellular resolution.
Spatial proteomics
Proteins are the basic functional units of the cell, and predicting protein expression levels based solely on mRNA levels has proven to be unreliable. Most proteins accumulate at a defined cellular location, with approximately a third residing in multiple compartments. Knowledge of the spatial localization of proteins within cells is therefore critical for understanding how cells regulate protein function, for example through trafficking to and between organelles. Moreover, many biochemical processes are regulated at the level of protein post-translational modification, localization, and physical association, which determine many context-specific cellular functions. Further, protein expression and localization can vary even between genetically identical healthy cells, a fact that may underlie differences in cellular phenotypes such as growth and differentiation. From a clinical perspective as well as for translational research, multiple diseases including cancer have been linked to abnormal protein localization caused by mutations or dysregulated post-translational modification.
Spatial proteomics directly captures the location of proteins along with their expression levels in intact tissue sections. Assessing the entire repertoire of proteins present in cells or tissues is more challenging than profiling the transcriptome because proteins, unlike nucleic acids, cannot be amplified and are easily denatured and degraded, which increases sample losses and detection bias. Nevertheless, several tissue-based spatial proteomics platforms compatible with FFPE tissues have been developed. These include antibody-based profiling and imaging (GeoMx DSP/CosMx SMI), which can visualize and quantify tens to hundreds of proteins simultaneously within tissue compartments or single cells, and quantitative mass spectrometry (MALDI, ESI).
What is the spatial multiomics revolution?
Spatial omics integrates the methodologies used to study the whole genome (spatial genomics), RNA expression (spatial transcriptomics), protein abundance (spatial proteomics), metabolites (metabolomics), and the epigenome with spatial context to provide a comprehensive understanding of various complex cellular and molecular events within intact tissues. Data generated by spatial multiomics make it possible to molecularly link genetic information to the proteome, as gene expression is regulated at multiple levels from transcription to translation to protein degradation.
Additionally, spatial multiomics integrates different length scales from tissue compartments to single cells. This is particularly useful as single-cell proteome profiling is limited: proteins cannot be amplified, unlike DNA and RNA. Spatial multiomics therefore offers unique ways to track molecular changes in tissue that occur in response to external stimuli and biological variation, such as environmental changes and gene variants, and can provide a useful toolkit for identifying novel disease signatures. For example, the CosMx SMI system is ideal for exploring protein as well as RNA expression at cellular and sub-cellular resolution, whereas the GeoMx DSP can carry out spatial analysis at the level of the whole transcriptome and 150+ proteins from different tissue compartments and entire cell populations. Together, the CosMx SMI and GeoMx DSP systems enable spatial biology at multiple levels of plex and resolution for complementary experiments on FFPE or fresh frozen tissue sections.
Spatial multiomics platforms
All spatial multiomics techniques require instrumentation. NanoString’s GeoMx® DSP and CosMx SMI are multiomics-based instruments that allow the mapping of many different targets (RNA or protein) spatially within intact tissue samples such as FFPE. The assays carried out on GeoMx DSP combine standard immunofluorescence (IF) staining techniques with oligonucleotide barcoding technology to perform highly reproducible, non-destructive, and spatially resolved multiplexed expression analysis of RNA and protein.
CosMx SMI uses fluorescent molecular barcodes and sensitive cyclic in-situ hybridization chemistry to measure gene and protein expression at the single cell and subcellular level within an intact tissue section. Both GeoMx DSP and CosMx SMI instruments make spatial multiomics more accessible to researchers. Both instruments feature automation, and the data is simple to collect and interpret with the interactive and collaborative cloud-based AtoMx™ Spatial Informatics Platform.
The GeoMx® Digital Spatial Profiler
The GeoMx DSP uses RNA in situ hybridization probes or antibodies linked to indexing oligonucleotide barcodes via a photocleavable linker. These probes are hybridized to RNA or protein targets on a slide-mounted tissue sample that is also stained with morphological markers to identify regions of interest (ROIs) in the sample. The tissue is then imaged using fluorescence microscopy and UV light is projected onto the selected ROIs, releasing the oligonucleotide tags. These tags are collected and subsequently counted using either the nCounter® Analysis System or via NGS using an Illumina sequencer. The data generated by GeoMx DSP are not images, but spatially registered counts of the probes that can be mapped back to the fluorescent microscopy image.
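To make "spatially registered counts" concrete, the sketch below models each ROI as a record holding its position on the slide image and a table of probe counts, then looks up one target across ROIs. This is only an illustration under stated assumptions: the `RoiRecord` class, its field names, the coordinates, and the count values are all hypothetical and do not reflect the instrument's actual file format or software.

```python
from dataclasses import dataclass, field


@dataclass
class RoiRecord:
    """One region of interest: where it sits on the image and what was counted there."""
    roi_id: str
    x_um: float                  # hypothetical ROI centre, microns from image origin
    y_um: float
    counts: dict = field(default_factory=dict)  # target name -> digital count


def target_map(rois, target):
    """Return (x, y, count) for one target so counts can be overlaid on the image."""
    return [(r.x_um, r.y_um, r.counts.get(target, 0)) for r in rois]


# Hypothetical example data: two ROIs with counts for two targets.
rois = [
    RoiRecord("ROI_001", 1200.0, 850.0, {"CD3E": 150, "KRT8": 900}),
    RoiRecord("ROI_002", 3400.0, 2100.0, {"CD3E": 600, "KRT8": 40}),
]
cd3e_map = target_map(rois, "CD3E")
print(cd3e_map)
```

Keeping the coordinates next to the counts is what lets a count table, rather than an image, still be interpreted in the context of tissue morphology.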
There are several advantages of using the GeoMx DSP platform for spatial profiling. First, it requires very little tissue and is compatible with a variety of samples, including FFPE tissue, fresh frozen tissue, core-needle biopsies, and tissue microarrays. Second, the GeoMx DSP does not destroy the tissue while profiling, so the tissue can be reused for further examination on other platforms, preserving precious samples. Third, to perform spatial profiling on regions as small as 10 microns in diameter, the GeoMx DSP instrument uses a digital micromirror device (DMD) that can shine UV light onto the surface of the tissue in any number of shapes and patterns, including discontinuous ROIs that can be used to profile an entire cell population across a given field of view. ROIs can be different shapes and sizes and can follow the complex contours of tissue morphology using the IF staining as a guide. Finally, the GeoMx DSP system comes with built-in data analysis software that simplifies data visualization and interpretation. The software can identify statistically significant changes in gene expression between experimental groups (e.g., t-tests, linear mixed models), visualize large data matrices using dendrograms and clustering analyses to discover biomarkers, and perform pathway analyses to understand disease mechanisms of action and identify novel targets.
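As a rough illustration of the kind of group comparison mentioned above, the sketch below computes a Welch's t statistic and its approximate degrees of freedom for one target measured in two groups of ROIs. It is a minimal standard-library example and not the built-in software's implementation: the Welch variant, the log2 transform, the function name `welch_t`, and the count values are all assumptions made for illustration, and the p-value lookup is deliberately omitted.

```python
import math
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    na, nb = len(group_a), len(group_b)
    va, vb = variance(group_a), variance(group_b)   # sample variances
    se2 = va / na + vb / nb                         # squared standard error
    t = (mean(group_a) - mean(group_b)) / math.sqrt(se2)
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df


# Hypothetical counts for one target in four ROIs per group, log2-transformed.
tumor = [math.log2(c) for c in (64, 128, 128, 256)]
stroma = [math.log2(c) for c in (512, 1024, 1024, 2048)]

t_stat, dof = welch_t(tumor, stroma)
print(f"t = {t_stat:.3f}, df = {dof:.1f}")
```

When several ROIs come from the same slide or patient they are not independent, which is one reason a linear mixed model may be preferred over a simple t-test in practice.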
The CosMx™ Spatial Molecular Imager
CosMx SMI complements GeoMx DSP and provides single-cell and sub-cellular spatial resolution with quantification of up to 1000 RNAs and 50+ proteins. It helps to better understand biological processes controlled by ligand-receptor interactions, identify single-cell biomarkers, and characterize gene expression networks in cellular niches. The applications for CosMx are vast and versatile and include discovering and mapping cell types, generating cell atlases, analyzing cell-cell interactions, and discovering single-cell biomarkers. The CosMx SMI is ideal for answering questions that arise downstream of whole transcriptome spatial profiling carried out with GeoMx as these two systems can be used for complementary experiments on serial tissue sections from the same biopsy.
Applications for the GeoMx DSP
Studies of tumor biology with the GeoMx DSP platform have resulted in multiple publications. For instance, the GeoMx DSP system was used to identify differential protein expression of immune markers within tumor cells of early-stage triple-negative breast cancer [1]. The ROI selection strategy of GeoMx DSP, guided by the marker pancytokeratin (PanCK), was used to assess protein expression in separate tissue compartments: tumor cells and the surrounding microenvironment, such as stroma and immune cells. The PanCK-positive tumor compartment and the PanCK-negative stromal compartment revealed two distinct expression profiles, with a strong enrichment for immune markers in the stromal compartment. This finding can potentially be applied to the development of more effective immunotherapies. GeoMx DSP could be further used to identify biomarkers for therapeutic response.
Polverino et al. performed transcriptomic profiling with GeoMx DSP on lung tissue from patients with chronic obstructive pulmonary disease and non-small cell lung cancer [2]. They showed that the expression of programmed death ligand 1 (PD-L1), an immune checkpoint protein, was associated with the upregulation of genes involved in tumor progression and the downregulation of onco-suppressive genes, enhancing the understanding of the innate immune mechanisms underlying the link between chronic obstructive pulmonary disease and lung cancer onset and progression.
In the field of neuroscience, the GeoMx DSP has been used to understand the underlying neurodegenerative changes in resilient individuals who demonstrated neuropathologic changes consistent with Alzheimer’s disease yet remained cognitively normal [3]. Spatial expression analysis of proteins associated with CNS cell typing or known neurodegenerative changes was performed on hippocampal neurofibrillary tangle (NFT)-bearing neurons, non-NFT-bearing neurons, and their immediate neuronal microenvironments in FFPE brain sections. The study identified 11 proteins that were differentially expressed in NFT-bearing neurons from resilient individuals compared with affected individuals, suggestive of an environment with less energetic and oxidative stress, which in turn results in the maintenance of neurons and their synaptic connections.
References
- 1. Carter JM, Polley M-YC, Leon-Ferre RA, Sinnwell J, Thompson KJ, Wang X, et al. Characteristics and Spatially Defined Immune (Micro)Landscapes of Early-Stage PD-L1–positive Triple-Negative Breast Cancer. Clin Cancer Res. 2021;27(20):5628. doi: 10.1158/1078-0432.CCR-21-0343.
- 2. Polverino F, Mirra D, Yang CX, Esposito R, Spaziano G, Rojas-Quintero J, Sgambato M, Piegari E, Cozzolino A, Cione E, Gallelli L, Capuozzo A, Santoriello C, Berrino L, de-Torres JP, Hackett TL, Polverino M, D’Agostino B. Similar programmed death ligand 1 (PD-L1) expression profile in patients with mild COPD and lung cancer. Sci Rep. 2022 Dec 27;12(1):22402. doi: 10.1038/s41598-022-26650-9.
- 3. Walker JM, Kazempour Dehkordi S, Fracassi A, Vanschoiack A, Pavenko A, Taglialatela G, Woltjer R, Richardson TE, Zare H, Orr ME. Differential protein expression in the hippocampi of resilient individuals identified by digital spatial profiling. Acta Neuropathol Commun. 2022 Feb 14;10(1):23. doi: 10.1186/s40478-022-01324-9.