BC 360 Data Analysis Report

Summary

    • Customer: Patricia Smith
    • Institute: Cancer Institute
    • Number of Samples: 79
    • Site of nCounter Run: NanoString Technologies
    • 360 Analyst: NanoString Scientist
    • Scientific Reviewer: NanoString BC Lead
    • Date: October 09, 2020

    This is a demo report used to showcase the functionality of the BC360 Standard Report.

Differential Expression Analysis

Response

HR Status

Survival Analysis

Recurrence Free Survival

PAM50

Quality Control

QC Summary for Response Grouping Variable

QC Summary for HR Status Grouping Variable

QC Summary for Recurrence Free Survival

Heatmap Overview

Heatmaps display gene expression values (or signature scores) across samples. These values are generally centered and scaled within each gene or signature for better graphical representation. Each tile within the heatmap displays a color that indicates the relative value of gene expression or signature score for the corresponding sample. If the samples can be sorted into groups, and those groups are provided to the analyst, they are indicated by the colored tiles across the top of the heatmap and described in the key to the right. The samples are sorted in order to place tiles with similar expression near each other to more easily identify patterns. The levels of relatedness are organized by unsupervised hierarchical clustering.

All Signatures

The 'All Signatures' heatmap uses unsupervised hierarchical clustering to show relatedness among signature scores for each sample. Scores are scaled by signature to have a mean of zero and a standard deviation of one. The standardized signature scores are truncated at ± 3 standard deviations to preserve greater clarity in color change within the largest proportion of data (99% of the data should fall within ± 3 standard deviations of the mean). Sample annotations are listed at the top of the heatmap. The signatures are displayed in rows and listed to the right of the heatmap. Each column is a unique sample, with a sample label displayed below the heatmap if there are 36 or fewer samples in the analysis.
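The scaling and truncation described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the report's implementation: the function name scale_and_truncate is hypothetical, and the population standard deviation is used only because the report does not say which estimator applies.

```python
from statistics import mean, pstdev


def scale_and_truncate(scores, limit=3.0):
    """Center scores to mean 0, scale to SD 1, then clip to [-limit, +limit].

    `scores` holds one signature (or gene) across all samples.
    Assumption: population SD; the report does not specify the estimator.
    """
    mu = mean(scores)
    sd = pstdev(scores)
    if sd == 0:
        # A constant signature carries no contrast; map it to zeros.
        return [0.0 for _ in scores]
    return [max(-limit, min(limit, (s - mu) / sd)) for s in scores]


# One extreme sample is clipped so it does not dominate the color scale.
example = scale_and_truncate([0.0] * 20 + [100.0])
print(max(example))
```

Clipping changes only the displayed color range; the clustering input and the ordering of samples in the heatmap are a separate step that this sketch does not cover.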

All Genes

The 'All Genes' heatmap uses unsupervised hierarchical clustering to group normalized gene expression by sample. Expression values are scaled by gene to have a mean of zero and a standard deviation of one and then truncated at ± 3 standard deviations to preserve greater clarity in color change within the largest proportion of data (99% of the data should fall within ± 3 standard deviations of the mean). Sample annotations are listed at the top of the heatmap. The genes are displayed in rows. Each column is a unique sample, with a sample label displayed below the heatmap if there are 36 or fewer samples in the analysis.

Genes Within Signatures

The 'Genes Within a Signature' heatmap uses unsupervised hierarchical clustering to group normalized gene expression for genes within the signature. The signature is chosen from the dropdown menu. Expression values are scaled by gene to have a mean of zero and a standard deviation of one and then truncated at ± 3 standard deviations to preserve greater clarity in color change within the largest proportion of data (99% of the data should fall within ± 3 standard deviations of the mean). Sample annotations are listed at the top of the heatmap and included genes are listed to the right of the heatmap. Only signatures composed of 5 or more genes are present in this section.

Response

Differential Expression - Response Analysis

Differential expression analysis evaluates how different two groups are based on their gene/signature expression profiles. In Response Analysis, differential response to therapy, disease progression state, or binary category is used to classify each group, as defined for each sample in the annotation file. Comparison of the mean and range of expression between groups is used to understand if there are statistical similarities or differences.

The first plot on the page is a volcano plot that displays the fold-change and significance of the difference in signature scores between the two groups. Below the volcano plot, a forest plot sorts and summarizes the data for all the signatures. Next to the forest plot, a box plot displays the individual scores of samples in each group for a given signature that can be selected using the drop-down menu. Below the forest and box plot, a table is provided for the signature data. Beneath the signature section, a volcano plot and data table are provided to display the gene level analysis.

Following the Differential Expression Analysis, a Receiver Operating Characteristic (ROC) plot presents the capacity of each signature to predict response. This plot may be stratified by a variable to investigate possible differences between the ROC curves for the different strata.

All Signatures

Volcano Plot

The 'All Signatures' volcano plot displays each signature's difference between the response categories along the x-axis and its significance (p-value) along the y-axis. Signatures that have greater statistical significance will produce points that are both larger and darker in hue, in addition to appearing higher on the plot. Signatures that have greater differential expression versus the baseline group appear further from the center of the plot. Signatures further to the right indicate an increase in expression and signatures further to the left indicate a decrease in expression relative to the baseline group. Horizontal lines indicate 0.01 and 0.05 adjusted p-values; when the adjusted p-values range higher than 0.05, the thresholds are not shown in the plot.

Forest and Box Plots

The 'All Signatures' forest plot shows the mean differential expression and 95% confidence interval between response groups for each signature, on an unadjusted scale. The vertical axis is drawn at a Log(2) fold change of zero, indicating equivalent expression between the groups. As the marker shifts away from this center line, the differential expression of that signature increases (shift to the right) or decreases (shift to the left) relative to the baseline group. The shape of each marker indicates whether the signature differs significantly between groups as assessed by univariate analysis. A signature is considered significant if its 95% confidence interval (the horizontal line through the marker) does not cross the vertical axis, which represents no difference from the baseline group. This significance is not adjusted for multiple comparisons. If there are no significant findings, no legend accompanies the plot.

Test Results

The 'All Signatures' results table provides the fold-change values for each signature used to generate the volcano plot and forest plot. This table reports the signature name, variable, group/response, Log(2) transformed fold change, lower 95% confidence limit of the mean, upper 95% confidence limit of the mean, Student's t-statistic, unadjusted significance (p-value), and significance adjusted for multiple tests (FDR).
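As a rough illustration of how two of the table columns relate, the sketch below computes a Log(2) difference and a pooled-variance Student's t-statistic for one signature. It assumes the scores are already on the Log(2) scale and that the classic equal-variance form of the test is used; the report does not state the exact model, and the function name is hypothetical. The p-value is omitted because it needs the t distribution, which the Python standard library does not provide.

```python
from math import sqrt
from statistics import mean, variance


def log2_difference_and_t(baseline, comparison):
    """Return (log2 difference, fold change, Student's t) for two groups.

    Both inputs are lists of Log(2)-scale scores for one signature.
    Assumption: pooled-variance (equal variance) Student's t-test.
    """
    n1, n2 = len(baseline), len(comparison)
    diff = mean(comparison) - mean(baseline)
    pooled_var = ((n1 - 1) * variance(baseline)
                  + (n2 - 1) * variance(comparison)) / (n1 + n2 - 2)
    std_err = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return diff, 2 ** diff, diff / std_err


diff, fold, t_stat = log2_difference_and_t([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
print(diff, fold, round(t_stat, 3))
```

A Log(2) difference of 1 corresponds to a two-fold change, which is why the forest plot's reference line sits at zero rather than at one.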

All Genes

Volcano Plot

The 'All Genes' volcano plot displays each gene's fold change (or difference on the Log(2) scale) and significance (p-value). Genes that have greater statistical significance appear higher on the plot with larger, darker points, while genes that have greater differential expression appear further from the center of the plot. Genes further to the right indicate an increase in expression and genes further to the left indicate a decrease in expression relative to the baseline group. Horizontal lines indicate 0.01 and 0.05 adjusted p-values; when the adjusted p-values range higher than 0.05, the thresholds are not shown in the plot.

Test Results

The 'All Genes' results table provides the fold-change values for each gene used to generate the 'All Genes' volcano plot. This table reports the gene name, variable, group/response, Log(2) transformed fold change, lower 95% confidence limit of the mean, upper 95% confidence limit of the mean, Student's t-statistic, unadjusted significance (p-value), and significance adjusted for multiple tests (FDR).
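The FDR column adjusts the unadjusted p-values for the number of tests. The report does not name the adjustment method; the sketch below assumes the Benjamini-Hochberg procedure, a common choice, and uses a hypothetical function name.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: BH is the FDR method; the report does not specify one.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Because every p-value is scaled by the number of tests, a gene-level table with many rows needs much smaller raw p-values to reach the 0.05 line than a signature-level table does.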

Receiver Operating Characteristic

All Signatures

Receiver Operating Characteristic (ROC) plots display the capacity of each signature to predict response. This plot may be stratified by a variable to investigate possible differences between the ROC curves for the different strata. ROC plots are generated for the signature chosen from the dropdown menu and stratified by dichotomizing the signature using its median signature score as a threshold. The true positive rate (sensitivity) and false positive rate (1 - specificity) are graphed and used to calculate the Area Under the Curve (AUC), a metric of predictive power. Signatures that are more predictive of response have curves bent toward the upper left corner of the plot, yielding an AUC that approaches an upper bound of 1.0, at which point prediction is perfect. Signatures that are not predictive appear as diagonal lines from the lower left to the upper right corner of the plot; these curves have an AUC of approximately 0.5. The AUC value for each group in the analysis is displayed to the right of the plot.
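One way to see why 0.5 means "not predictive" and 1.0 means "perfect" is that the AUC equals the probability that a randomly chosen responder has a higher signature score than a randomly chosen non-responder. The sketch below computes the AUC that way; it is an illustration with a hypothetical function name, not the routine used to draw the report's curves.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (responder, non-responder) pairs ranked correctly.

    `labels` uses 1 for responders and 0 for non-responders; ties count half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute an AUC")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


print(roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))
```

An AUC below 0.5 indicates that lower scores, rather than higher ones, are associated with response.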

HR Status

Differential Expression - Group Analysis

Differential expression analysis evaluates whether we can detect two or more groups as different based on their gene/signature expression profiles. In Grouping Analysis, intrinsic differences in the samples such as patient characteristics, or treatment arms of an investigational study, are used to define each group as specified in the annotation file. Comparison of the mean and range of expression between groups is used to understand if there are statistical differences.

The first plot on the page is a volcano plot that displays the fold-change and significance of difference between groups for the signature scores. Below the volcano plot, a forest plot summarizes mean differences between groups with their confidence interval for all the signatures. Next to the forest plot, a box plot displays the individual scores of samples in each group for a given signature that can be selected using the drop-down menu. Below the forest and box plot, a table is provided for the signature data. Beneath the signature section, a volcano plot and data table are provided to display the gene level analysis.

All Signatures

Volcano Plot

The 'All Signatures' volcano plot displays each signature's difference between the groups along the x-axis and its significance (p-value) along the y-axis. Signatures that have greater statistical significance will produce points that are both larger and darker in hue, in addition to appearing higher on the plot. Signatures that have greater differential expression versus the baseline group appear further from the center of the plot. Signatures further to the right indicate an increase in expression and signatures further to the left indicate a decrease in expression relative to the baseline group. Horizontal lines indicate 0.01 and 0.05 adjusted p-values; when the adjusted p-values range higher than 0.05, the thresholds are not shown in the plot.

Forest and Box Plots

The 'All Signatures' forest plot shows the mean differential expression and 95% confidence interval between groups for each signature, on an unadjusted scale. The vertical axis is drawn at a Log(2) fold change of zero, indicating equivalent expression between the groups. As the marker shifts away from this center line, the differential expression of that signature increases (shift to the right) or decreases (shift to the left) relative to the baseline group. The shape of each marker indicates whether the signature differs significantly between groups as assessed by univariate analysis. A signature is considered significant if its 95% confidence interval (the horizontal line through the marker) does not cross the vertical axis, which represents no difference from the baseline group. This significance is not adjusted for multiple comparisons. If there are no significant findings, no legend accompanies the plot.

Test Results

The 'All Signatures' results table provides the fold-change values for each signature used to generate the volcano plot and forest plot. This table reports the signature name, variable, group/response, Log(2) transformed fold change, lower 95% confidence limit of the mean, upper 95% confidence limit of the mean, Student's t-statistic, unadjusted significance (p-value), and significance adjusted for multiple tests (FDR).

All Genes

Volcano Plot

The 'All Genes' volcano plot displays each gene's fold change (or difference on the Log(2) scale) and significance (p-value). Genes that have greater statistical significance appear higher on the plot with larger, darker points, while genes that have greater differential expression appear further from the center of the plot. Genes further to the right indicate an increase in expression and genes further to the left indicate a decrease in expression relative to the baseline group. Horizontal lines indicate 0.01 and 0.05 adjusted p-values; when the adjusted p-values range higher than 0.05, the thresholds are not shown in the plot.

Test Results

The 'All Genes' results table provides the fold-change values for each gene used to generate the 'All Genes' volcano plot. This table reports the gene name, variable, group/response, Log(2) transformed fold change, lower 95% confidence limit of the mean, upper 95% confidence limit of the mean, Student's t-statistic, unadjusted significance (p-value), and significance adjusted for multiple tests (FDR).

Recurrence Free Survival

Survival Analysis - Progression-Free

Survival analysis evaluates whether the change in disease state over time, as observed in two or more groups, is associated with an expression profile or gene signature. For Progression-Free Survival Analysis, the survival of a patient is observed over time until a censoring event such as patient departure from the study, disease recovery, or death occurs. Signature score expression is stratified amongst each sample grouping to determine if there is a positive or negative relationship between a given signature and a change in disease state of the patient.

The first plot on the page is a volcano plot that displays the hazard ratio and significance (p-value), both estimated from a univariate Cox regression model. Below the volcano plot, a forest plot displays the hazard ratio for the difference in survival associated with each signature's expression. Following the forest plot is a table that displays the expression data and hazard ratios used to generate the forest plot. Below the table, a Kaplan-Meier curve for each signature is rendered. For every Kaplan-Meier curve, a risk table displays the number of patients remaining at each observation time point. Below the Kaplan-Meier curve, a survival table provides all data points in the above analysis.

Hazard Ratio

Volcano Plot

The survival volcano plot displays each signature's hazard ratio and significance (p-value). Signatures that have greater statistical significance appear higher on the plot with darker, larger points, while signatures that have more extreme hazard ratios appear further from the center of the plot. Signatures further to the right are associated with decreased risk of an event relative to the baseline and signatures further to the left are associated with greater risk of an event relative to the baseline. Horizontal lines indicate 0.01 and 0.05 adjusted p-values; when the adjusted p-values range higher than 0.05, the thresholds are not shown in the plot.

Forest Plot

The survival forest plot shows the median log-hazard ratio and unadjusted 95% confidence interval for each signature across all samples. The confidence intervals are depicted along the x-axis. Each signature name is listed on the y-axis, sorted by median value from highest to lowest. A vertical axis is shown at a hazard ratio of one (log-hazard ratio of zero), which indicates no difference from the baseline group. If there are no significant findings, no legend accompanies the plot.

Model Table

The Model Table provides the hazard ratio values for each signature used to generate the forest plot. This table reports the signature name, log-hazard ratio (coef), lower 95% confidence limit, upper 95% confidence limit, raw hazard ratio (exp(coef)), standard error of the log-hazard ratio (se(coef)), z score, unadjusted significance (p-value), and significance adjusted for multiple tests (FDR).
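The columns of the Model Table are arithmetically related, which the following sketch makes explicit. It assumes a Wald test with a normal approximation and reports the confidence limits on the hazard-ratio scale; both are assumptions about how the table is built, and the function name is hypothetical.

```python
from math import exp
from statistics import NormalDist


def hazard_ratio_summary(coef, se, level=0.95):
    """Derive hazard ratio, Wald z, p-value, and confidence limits.

    `coef` is the log-hazard ratio and `se` its standard error.
    Assumption: normal-approximation (Wald) interval and test.
    """
    normal = NormalDist()
    crit = normal.inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    z = coef / se
    return {
        "coef": coef,
        "exp(coef)": exp(coef),
        "lower": exp(coef - crit * se),
        "upper": exp(coef + crit * se),
        "z": z,
        "p": 2 * (1 - normal.cdf(abs(z))),
    }


print(hazard_ratio_summary(coef=0.5, se=0.25))
```

A confidence interval on the hazard-ratio scale that excludes one corresponds to an interval on the log scale that excludes zero, which matches the reference line drawn on the forest plot.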

Survival Probabilities

Survival Curves

The Survival Plot is a Kaplan-Meier curve that shows the association between a signature score and survival. Samples are categorized into groups based on the distribution of expression of a signature score within the cohort, then the survival probability of each group is presented over time.
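For readers unfamiliar with the estimator behind these curves, the sketch below computes Kaplan-Meier survival probabilities for a single group from follow-up times and event indicators. It is a minimal illustration with a hypothetical function name; how the report splits samples into expression groups (for example, at the median score) is not shown and is not specified here.

```python
def kaplan_meier(times, events):
    """Return (time, n_risk, n_event, survival) at each distinct event time.

    `events` uses 1 for an observed event and 0 for a censored observation.
    """
    data = sorted(zip(times, events))
    n_risk = len(data)
    survival = 1.0
    rows = []
    i = 0
    while i < len(data):
        t = data[i][0]
        at_t = [e for tt, e in data if tt == t]
        n_event = sum(at_t)
        if n_event > 0:
            # The curve steps down only at times with at least one event.
            survival *= 1 - n_event / n_risk
            rows.append((t, n_risk, n_event, survival))
        # Events and censored patients both leave the risk set.
        n_risk -= len(at_t)
        i += len(at_t)
    return rows


for row in kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]):
    print(row)
```

Censored patients do not lower the curve themselves, but they shrink the number at risk, so later events produce larger steps.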

Survival Table

The Survival Table contains all the data used to generate the Kaplan-Meier plot and Risk Table. For each signature, and its expression level sub-strata, the table displays the time of an event, the event type (n.event or n.censor), the number of patients at risk (n.risk), the estimated survival probability for the stratum (surv), its standard error, and the lower and upper 95% confidence limits of the survival probability.

PAM50 Subtyping Analysis

PAM50 gene expression signatures characterize intrinsic subtypes of breast cancer that are biologically and clinically distinct. Each sample is subtyped into one of the following: Luminal A (Lum A), Luminal B (Lum B), HER2-enriched (HER2-E), or Basal-like (Basal). The subtypes presented in this section represent unique biological characteristics and have prognostic value when used clinically; however, the results from this report should be limited to Research Use Only (RUO) and should not be used in any medical decision making where results will go back to patients or physicians, or as inclusion/exclusion criteria or stratification in a prospective clinical trial.

Genes Within PAM50

The 'Genes Within PAM50' heatmap uses unsupervised hierarchical clustering to group normalized gene expression for genes within the PAM50 signature. Scores are scaled by gene to have mean zero and standard deviation of one, and then truncated at ± 3 standard deviations to preserve greater clarity in color change within the largest proportion of data (99% of the data should fall within ± 3 standard deviations of the mean). Sample annotations are listed at the top of the heatmap and included genes are listed to the right of the heatmap.

Differential Expression

Differential expression analysis evaluates whether we can detect differences between two or more groups based on their gene/signature expression profiles. PAM50 subtype is used to define comparison groups. Comparison of the mean and range of expression between groups is used to understand if there are statistical differences between intrinsic subtype pairs.

Volcano Plot

The 'All Signatures' volcano plot displays each signature's difference between PAM50 subtypes, represented along the x-axis, with the significance (p-value) along the y-axis. Signatures that have greater statistical significance will produce points that are both larger and darker in hue, in addition to appearing higher on the plot. Signatures that have greater differential expression versus the baseline group appear further from the center of the plot. Signatures further to the right indicate an increase in expression and signatures further to the left indicate a decrease in expression relative to the baseline group. Horizontal lines indicate 0.01 and 0.05 adjusted p-values; when the adjusted p-values range higher than 0.05, the thresholds are not shown in the plot.

All Signatures

Forest Plot

The 'All Signatures' forest plot shows the mean differential expression and 95% confidence interval between PAM50 subtypes for each signature, on an unadjusted scale. The vertical axis is drawn at a Log(2) fold change of zero, indicating equivalent expression between the groups. As the marker shifts away from this center line, the differential expression of that signature increases (shift to the right) or decreases (shift to the left) relative to the baseline subtype. The shape of each marker indicates whether the signature differs significantly between subtypes as assessed by univariate analysis. A signature is considered significant if its 95% confidence interval (the horizontal line through the marker) does not cross the vertical axis, which represents no difference from the baseline subtype. This significance is not adjusted for multiple comparisons. If there are no significant findings, no legend accompanies the plot.

Test Results

The 'All Signatures' results table provides the fold-change values for each signature used to generate the volcano plot and forest plot. This table reports the signature name, variable, group/response, Log(2) transformed fold change, lower 95% confidence limit of the mean, upper 95% confidence limit of the mean, Student's t-statistic, unadjusted significance (p-value), and significance adjusted for multiple tests (FDR).

Survival Analysis

Progression-Free Survival

Survival Curve

Progression-Free Survival analysis evaluates whether observed patient survival over time differs between the four PAM50 subtypes. A Kaplan-Meier curve is rendered below with patients stratified by PAM50 subtype. The risk table below the plot displays the number of patients remaining at each observation time point.

Survival Table

The Survival Table contains all the data used to generate the Kaplan-Meier plot and Risk Table. For each PAM50 Subtype, the table displays the time of an event, the event type (n.event or n.censor), the number of patients at risk (n.risk), the estimated survival probability for the stratum (surv), its standard error, and the lower and upper 95% confidence limits of the survival probability.

Subtype Wheelplots

The subtype Wheel Plot depicts the average signature score values for each subtype. Signatures are grouped along the perimeter of the wheel based on the biological process in which they belong, and the Lum A, Lum B, HER2-E, and Basal subtype correlation scores are shown as a radial arc.

Subtype Wheel Plots

Subtype Table

The PAM50 Subtyping data table contains all of the outputs of the PAM50 algorithm. For each sample it provides the correlation values to each of the subtype calls, the clinical data for each sample if provided, ROR score or genomic risk, and risk categorization.

Single Sample Analysis

The single-sample analyses present a way of looking at the expression data that focuses on the expression levels within an individual sample, rather than groups or the entire cohort. Within this tab, the wheel plots summarize the overall expression of all signatures within a sample and compare its expression to that of all the samples provided. The scatter plots show the sample relative to the expression of all the samples in the report. Each scatter plot shows the signatures versus the TIS signature. The individual sample is highlighted as a larger dot among the points for all the samples. The signature scores table provides the numerical score of each signature for each individual sample.

Wheel Plots

The Wheel Plot depicts the relative expression of each signature for an individual sample. Signatures are grouped based on the biological process in which they belong. The Lum A, Lum B, HER2-E, and Basal subtype correlation scores are shown as a radial arc. Signature scores are represented as radial projections, with negative score values highlighted by a grey outline. An individual sample can be selected from the drop-down menu.

Scatter Plots

The Subtype Scatter Plots depict the association between each of the PAM50 subtype correlation values and TIS score. An individual sample can be selected from the drop-down menu and is represented on the graphs as a larger point.

Signature Scores

The signature scores for each sample are listed in the table below. Typically, the signatures are computed as a weighted linear combination of Log(2) gene expression values, with weights that sum to one. Thus, each unit increase in score corresponds to a doubling of the biological process that is measured. Notable exceptions to this general method are the calculations for TIS, PAM50, and TNBC; these are described in more detail on the Methods page.
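The doubling interpretation follows directly from the arithmetic, as the sketch below shows. The gene names, counts, and weights are hypothetical placeholders, not the genes or weights of any BC360 signature, and the function name is illustrative.

```python
from math import log2


def signature_score(counts, weights):
    """Weighted linear combination of Log(2) expression values.

    `counts` maps gene name to normalized count; `weights` maps gene name
    to its weight. The weights must sum to one, as described in the text.
    """
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("signature weights must sum to one")
    return sum(w * log2(counts[gene]) for gene, w in weights.items())


# Hypothetical two-gene signature with equal weights.
weights = {"GENE_A": 0.5, "GENE_B": 0.5}
base = signature_score({"GENE_A": 8, "GENE_B": 32}, weights)
doubled = signature_score({"GENE_A": 16, "GENE_B": 64}, weights)
print(base, doubled)
```

Because the weights sum to one, doubling every gene in the signature raises the score by exactly one unit, regardless of how the weight is distributed across genes.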

Quality Control Details

Summary of QC Results

This page provides a summary of the quality control metrics used to assess the technical performance of the nCounter profiling assay in this study. First, housekeeper genes assess sample integrity by comparing the observed value versus a predetermined threshold for suitability for data analysis. The machine performance is assessed using percentage of fields of view that were attempted versus those successfully analyzed. The binding density of the probes within the imaging area, ERCC linearity, and limit of detection are used as readouts of the efficiency and specificity of the chemistry of the assay. Any sample deemed as failing any one of these QC checkpoints will be removed from the analysis.

Housekeeping Genes QC

This plot shows the geometric mean of housekeeper genes in each sample. Samples with low housekeeper signal suffer from either low sample input or low reaction efficiency. Ideally the geometric mean of counts will be above 404 for all samples, and a minimum geometric mean of 202 counts is required for analysis. Samples in-between these two thresholds are considered in the analysis, but results from these "borderline" samples should be treated with caution.
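The two thresholds define three outcomes, which can be sketched as follows. The function name and status labels are hypothetical, and the handling of values exactly equal to a threshold is an assumption, since the text does not say which side they fall on.

```python
from statistics import geometric_mean


def housekeeper_qc(housekeeper_counts, ideal=404, minimum=202):
    """Classify one sample by the geometric mean of its housekeeper counts.

    Thresholds come from the text; boundary handling is an assumption
    (a value equal to a threshold is treated as meeting it).
    """
    gm = geometric_mean(housekeeper_counts)
    if gm < minimum:
        return "fail"
    if gm < ideal:
        return "borderline"
    return "pass"


print(housekeeper_qc([150, 300, 600]))
```

The geometric mean is used rather than the arithmetic mean so that a single highly expressed housekeeper does not mask generally low signal.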

Imaging QC

This metric reports the percentage of fields of view (FOVs) the Digital Analyzer or SPRINT was able to capture. At least 75% of FOVs should be successful to obtain robust data.

Binding Density QC

The binding density represents the concentration of barcodes measured by the instrument in barcodes per square micron. The Digital Analyzer may not be able to distinguish each probe from the others if too many are present. The ideal range for assays run on an nCounter MAX or FLEX system is 0.1 - 2.25 spots per square micron and assays run on the nCounter SPRINT system should have a range of 0.1 - 1.8 spots per square micron.
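Since the acceptable range depends on the instrument, the check can be expressed as a small lookup. The ranges are those stated above; the dictionary keys, function name, and inclusive bounds are illustrative assumptions.

```python
# Acceptable binding density ranges (spots per square micron) from the text.
BINDING_DENSITY_RANGE = {
    "MAX/FLEX": (0.1, 2.25),
    "SPRINT": (0.1, 1.8),
}


def binding_density_ok(density, instrument):
    """Return True if the density lies within the instrument's ideal range.

    Assumption: the range limits are inclusive.
    """
    low, high = BINDING_DENSITY_RANGE[instrument]
    return low <= density <= high


print(binding_density_ok(2.0, "MAX/FLEX"), binding_density_ok(2.0, "SPRINT"))
```

The same reading can therefore pass on one system and be flagged on another, so the instrument must be known before the flag is interpreted.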

Positive Control Linearity QC

This metric performs a correlation analysis after Log(2) transformation of the expression values. The correlation is tested between the known concentrations of positive control target molecules added by NanoString and the resulting Log(2) counts. Correlation values lower than 0.95 may indicate an issue with the hybridization reaction and/or assay performance.
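A minimal sketch of this check is shown below. It assumes a Pearson correlation and uses an illustrative four-fold dilution series of control concentrations; the only concentration confirmed elsewhere in this report is 0.5 fM for Pos_E, so the other values and the function names are assumptions.

```python
from math import log2, sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def positive_control_linearity(concentrations_fm, counts, threshold=0.95):
    """Correlate Log(2) concentration with Log(2) counts for one sample."""
    r = pearson([log2(c) for c in concentrations_fm],
                [log2(k) for k in counts])
    return r, r >= threshold


# Illustrative concentrations in fM (assumed); counts are made-up values.
conc = [128, 32, 8, 2, 0.5, 0.125]
r, ok = positive_control_linearity(conc, [25000, 6500, 1500, 420, 95, 30])
print(round(r, 4), ok)
```

Working on the Log(2) scale keeps the highest-concentration control from dominating the correlation, so departures at the low end of the series remain visible.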

Limit of Detection QC

The limit of detection of the assay compares the positive control probes and the negative control probes. Specifically, it is expected that the 0.5 fM positive control probe (Pos_E) will produce raw counts that are at least two standard deviations higher than the mean of the negative control probes (represented by the boxplot). The critical value is drawn as a red horizontal line for each sample.
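The critical value for a sample can be computed directly from its negative control counts, as in the sketch below. The function name is hypothetical, and the use of the sample standard deviation is an assumption, since the text does not state which estimator is used.

```python
from statistics import mean, stdev


def limit_of_detection(pos_e_count, negative_counts, n_sd=2.0):
    """Return (critical value, passed) for one sample.

    The critical value is the mean of the negative controls plus `n_sd`
    standard deviations. Assumption: sample standard deviation.
    """
    critical = mean(negative_counts) + n_sd * stdev(negative_counts)
    return critical, pos_e_count >= critical


critical, passed = limit_of_detection(20, [4, 6, 8, 10, 12])
print(round(critical, 2), passed)
```

A sample whose Pos_E count falls below this value has a low-concentration signal that cannot be separated from background.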

Table of QC Flags

Table of Sample Annotations

BC360 Biological Signatures

Signature Introduction

Signatures are organized here in alphabetical order. They are color-coded by biology, similar to the color-coding in this image. Tumor signatures are listed in orange, Immune signatures in blue, and Microenvironment signatures in green. Breast cancer specific signatures are listed in pink. Below each signature name is the signature category with which it is associated.

Biological Signatures

Signature Scales

Most scores can be interpreted on a log2 scale, with a unit increase in score corresponding to a doubling of its gene expression levels.

APM

Tumor Immunogenicity

Antigen presenting (or processing) machinery. This signature measures the abundance of genes in the MHC Class I antigen presentation pathway and some key genes involved in processing the antigens prior to presentation. Typically, antigens from the cell cytoplasm are presented on Class I and recognized by the TCR on cytolytic CD8+ T cells. MHC Class I is expressed by all nucleated cells in the body, but downregulation of Class I MHC pathways is an evasion strategy that can be employed by tumor cells. An effective anti-tumor immune response depends on cytolytic T cells encountering neoantigens presented on the tumor cell surface. Strong anti-tumor immune responses are typically accompanied by high expression of antigen presentation genes.

Apoptosis

Tumor Regulation

This signature captures genes associated with apoptotic processes, specifically with genes involved in mitochondrial membrane integrity. It includes both pro- and anti-apoptotic genes.

AR

Breast Cancer Receptors

AR (androgen receptor) gene expression. AR is a nuclear receptor that is activated by binding any of the androgenic hormones. AR is widely expressed in breast cancer, has been shown to characterize a distinct molecular subset of triple negative breast cancer (TNBC), and has been suggested as a potential target in this form of breast cancer.

B7-H3

Inhibitory Immune Mechanisms

B7-H3 (CD276) gene expression. B7-H3 is a negative regulator of T cell activity that is expressed on both tumor and immune cells.

Basal-Like

Breast Cancer Subtyping

Basal-like tumors are typically characterized as having low expression of ER, PR, and HER2. Most clinically triple negative tumors are Basal-like subtype by molecular profiling. These tumors are poorly differentiated, invasive, high-grade ductal carcinomas that have metastatic properties.

BC p53

Tumor Mutational Response

This signature categorizes p53 status by mutant-like vs wild-type-like in breast cancer and the signature is significantly associated with overall survival in breast cancer, identifying a group with high unmet need.

BC Proliferation

Tumor Regulation

This signature outputs the PAM50 proliferation score by measuring key genes involved in breast tumor proliferation. In some cases, a highly proliferative breast tumor may correlate with an increase in disease progression or metastasis.

BRCAness

Tumor Mutational Response

This signature captures breast cancer biology that is informative as to defects in the DNA damage repair genes BRCA1 and BRCA2. Like the Homologous Recombination Deficiency signature, it captures breakdown in DNA damage repair; however, it is specific to BRCA-related mutations and more heavily weighted to BRCA1 mutants.

CD8 T-Cells

Immune Cell Abundance

This signature measures the abundance of CD8+ T cells in the tumor microenvironment.

CDK4 Expression

Breast Cancer Signaling Pathways

Cyclin-dependent kinases 4 and 6 (CDK4/6) play a key role in the regulation of proliferation in normal breast tissue and breast tumors. CDK4/6 inhibitors have been indicated in hormone receptor (HR) positive metastatic breast cancer. Cyclin-dependent kinase 4 (CDK4) is an enzyme encoded by the CDK4 gene; mutations in this gene, as well as in its related proteins, have been shown to be associated with tumorigenesis.

CDK6 Expression

Breast Cancer Signaling Pathways

Cyclin-dependent kinases 4 and 6 (CDK4/6) play a key role in the regulation of proliferation in normal breast tissue and breast tumors. CDK4/6 inhibitors have been indicated in hormone receptor (HR) positive metastatic breast cancer. CDK6, like CDK4, has been shown to phosphorylate and regulate the activity of the tumor suppressor protein Retinoblastoma, indicating a role in cancer development.

Cell Adhesion

Tumor Regulation

Epithelial cells use tight junction complexes to adhere to each other. Some breast cancers have greatly down-regulated expression of one or more of the genes coding for tight junction proteins. This phenomenon is common in claudin-low breast cancers, but it is not confined to that subtype. This signature scores samples for down-regulation in any of these tight junction genes.

Claudin-Low

Breast Cancer Subtyping

This molecular subtype is characterized by low levels of luminal differentiation markers, high enrichment for epithelial-to-mesenchymal transition markers, immune response and cancer stem cell-like genes.

Cytotoxic Cells

Immune Cell Abundance

This signature measures the abundance of cytotoxic cells in the tumor microenvironment. Cytotoxic cells such as natural killer (NK) and CD8+ T cells use a number of molecules, including perforin, granzymes and killer cell lectin-like receptor (KLR) family members, to recognize, penetrate and kill infected cells. Cytotoxic activity is the mechanism by which the immune system most effectively kills tumor cells.

Cytotoxicity

Anti-Tumor Immune Activity

This signature measures the molecules used by natural killer (NK) and CD8+ T cells to mount a cytolytic attack on tumor cells. Cytotoxic cells such as NK and CD8+ T cells use a number of molecules, including perforin, granzymes and granulysin, to penetrate and kill infected cells and tumor cells. Cytotoxic activity is the mechanism by which the immune system most effectively kills tumor cells.

Differentiation

Tumor Regulation

This signature assigns a differentiation score to the sample. Well-differentiated tumors, which are phenotypically more similar to normal cells or tissue, grow and spread at a slow rate compared with poorly differentiated tumors, which present with abnormal cells that often grow rapidly.

Endothelial Cells

Stromal Factors

This signature measures genes associated with vascular tissue and angiogenesis. Angiogenesis is important for nutrient trafficking to the tumor and proper oxygenation for tumor growth. Tumor angiogenesis forms leaky, inefficient vessels that can reduce the efficiency of lymphocyte trafficking to tumors.

ER Signaling

Breast Cancer Signaling Pathways

Estrogen-binding systems associate with various proteins that direct cell cycle signaling, proliferation and survival. This signature captures ER-mediated signaling pathways to elucidate how ER modulates the activity of key transcription factors by stabilizing DNA-protein complexes and recruiting co-activators. It also captures the impact on other signaling pathways when estrogens bind in the nucleus and cause conformational changes in the receptors.

ERBB2

Breast Cancer Receptors

This gene encodes a member of the EGF receptor family of receptor tyrosine kinases. This protein has no ligand binding domain of its own and therefore cannot bind growth factors. However, it does bind tightly to other ligand-bound EGF receptor family members to form a heterodimer, stabilizing ligand binding and enhancing kinase-mediated activation of downstream signaling pathways. Amplification and overexpression are well established in breast cancer and the associated protein is a key pathological marker.

ESR1

Breast Cancer Receptors

This gene encodes an estrogen receptor, a ligand-activated transcription factor composed of several domains important for hormone binding, DNA binding, and activation of transcription. The associated ER protein is a key pathological marker of breast cancer.

FOXA1

Tumor Regulation

This transcription factor is involved in the regulation of gene expression in differentiated tissues. It is sometimes associated with BRCA1 through cell cycle regulation. It is also involved in ESR1-mediated transcription and is required for ESR1 binding to the NKX2-1 promoter in breast cancer.

Genomic Risk

Breast Cancer Prognosis

The Genomic Risk of Recurrence score (Genomic Risk) is calculated by comparing the expression profiles of 46 genes in the sample with the four PAM50 centroids to obtain four correlation values. These correlation values are then combined with the PAM50 proliferation score to estimate the genomic risk of distant recurrence. The results are reported on a scale of 0 to 100, with 0 being lowest risk and 100 being highest risk. This score is distinct from the Risk of Recurrence (ROR) score because it does not include tumor size in the calculation; it is based solely on the genomic data.

HER2-E

Breast Cancer Subtyping

HER2-Enriched tumors are typically characterized as clinically HER2 positive breast cancer as defined by traditional IHC/FISH criteria. Some studies have indicated that the HER2-Enriched molecular subtype may be a better predictor of response to HER2-targeted therapies when compared with IHC and FISH.

HRD

Tumor Mutational Response

This signature is used to functionally assess Homologous Recombination Repair status, with potential to predict sensitivity to DNA-damage repair inhibitors such as PARP inhibitors. It captures cell cycle regulation, DNA damage, DNA replication, and DNA recombination and repair pathways. This signature is also used to predict overall survival in breast cancer.

Hypoxia

Inhibitory Metabolism

This signature measures genes associated with reduced oxygenation in the tumor. Hypoxia can induce many cancer-promoting processes (e.g. invasion, motility, metabolic reprogramming), can promote resistance to immune cell-mediated cytolysis, and can reduce cytolytic activity in natural killer (NK) and CD8+ T cells.

IDO1

Inhibitory Immune Mechanisms

Indoleamine 2,3-dioxygenase 1 gene expression. IDO1 is expressed by tumor, immune, and stromal cells and is the rate-limiting enzyme of tryptophan catabolism. By catalyzing the degradation of tryptophan, which is necessary for cytolytic T cell proliferation and activity, IDO1 inhibits anti-tumor immune responses.

IFN Gamma

Anti-Tumor Immune Activity

This signature tracks the canonical response to type II interferon, including the most universal components of that response. IFNγ induces macrophage and natural killer (NK) cell activation, increases antigen presentation, and induces gene transcription patterns that can lead to immune cell recruitment to the tumor. IFNγ signaling expression is associated with response to anti-PD1/L1 therapy.

Inflammatory Chemokines

Inhibitory Immune Signaling

Inflammatory chemokines recruit both myeloid and lymphoid populations to the tumor microenvironment.

Lum A

Breast Cancer Subtyping

Luminal A tumors are typically characterized by high expression of estrogen receptor (ER), progesterone receptor (PR), and genes associated with ER activation. These tumors are low-grade, tend to grow slowly, exhibit low expression of genes associated with cell cycle activation and have the best prognosis.

Lum B

Breast Cancer Subtyping

Luminal B tumors are typically characterized by high expression of estrogen receptor (ER), progesterone receptor (PR), and genes associated with ER activation. These tumors tend to grow slightly faster than Luminal A tumors, exhibit high expression of genes associated with cell cycle activation and proliferation, and have a slightly worse prognosis than Luminal A tumors.

Macrophages

Immune Cell Abundance

This signature measures the abundance of macrophages in the tumor microenvironment. Macrophages can either augment tumor immunity (e.g. by presenting antigen) or suppress tumor immunity (e.g. by releasing immunosuppressive cytokines).

Mammary Stemness

Tumor Regulation

This signature measures a cluster of epithelial-to-mesenchymal transition (EMT) genes that are up-regulated in tumors with stem-cell-like expression profiles. Higher signature scores indicate more stem-like tumors.

Mast Cells

Immune Cell Abundance

This signature measures the abundance of mast cells in the tumor microenvironment.

MHC2

Anti-Tumor Immune Activity

This signature measures the major human leukocyte antigens (HLA) involved in MHC Class II antigen presentation. Professional antigen presenting cells (dendritic cells, macrophages and B cells) use the class II MHC to present extracellular antigens to CD4+ T cells. Activation of CD4+ T cells induces expression of cytokines that can promote cytotoxic T cell activation and effective anti-tumor adaptive immune responses. Presence of MHC Class II molecules is associated with improved patient outcome.

PAM50

Breast Cancer Subtyping

This 50-gene signature measures a gene expression profile that allows for the classification of breast cancer into four biologically distinct subtypes (Luminal A, Luminal B, HER2-Enriched, Basal-like).

PD-1

Inhibitory Immune Signaling

Programmed cell death receptor 1 gene expression. Programmed cell death receptor 1 (PD-1, PDCD1, CD279) is expressed predominantly on lymphocytes. It is upregulated upon activation and becomes a negative regulator of activation by preventing proliferation and cytokine secretion. PD-1 expression has been shown to be associated with tumor-specific T cells.

PD-L1

Inhibitory Immune Mechanisms

Programmed cell death ligand 1 gene expression. Programmed cell death ligand 1 (PD-L1, CD274) is a ligand for PD-1 and a negative regulator of T cell activity that is expressed on both tumor and immune cells.

PD-L2

Inhibitory Immune Signaling

Programmed cell death ligand 2 gene expression. Programmed cell death ligand 2 (PD-L2, PDCD1LG2, CD273) is a ligand for PD-1 and a negative regulator of T cell activity that is expressed on antigen-presenting cells.

PGR

Breast Cancer Receptors

This gene encodes a member of the steroid receptor superfamily. The encoded protein mediates the physiological effects of progesterone, which plays a central role in reproductive events. The associated protein is a key pathological marker of breast cancer.

PTEN

Breast Cancer Signaling Pathways

Phosphatase and tensin homolog gene expression. PTEN is a tumor suppressor gene that functions through the regulation of the Akt/PKB signaling pathway. Mutations or loss of PTEN expression are common across a range of cancer types, including breast cancer.

Rb1

Tumor Regulation

Retinoblastoma protein gene expression. RB is involved in cell cycle regulation and tumor progression. RB gene loss occurs predominantly in triple negative breast cancer, as a result of homozygous deletion.

SOX2

Tumor Regulation

SRY (sex determining region Y)-box 2 transcription factor gene expression. SOX2 regulates a number of critical processes in breast cancer including cell proliferation and metastasis. SOX2 expression has been shown to be associated with the prognosis of metastatic tumors and disease recurrence.

Stroma

Stromal Factors

This signature measures stromal components in the tumor microenvironment. The tumor stroma is the collection of non-cancerous and nonimmune tissue components surrounding the tumor. Stroma can act as a physical barrier that excludes immune cells from the tumor, preventing effective anti-tumor immunity even when tumor-associated antigens have induced immune cell priming and activation. These cells can also secrete important signals to the tumor, affecting tumor biology and response to the immune system.

TGF-Beta

Inhibitory Immune Mechanisms

Transforming Growth Factor Beta gene expression. TGFβ (TGFB1) is a pleiotropic cytokine that inhibits anti-tumor immune activity and promotes tumor growth and survival.

TIGIT

Inhibitory Immune Signaling

T cell immunoreceptor with Ig and ITIM domains gene expression. T cell immunoreceptor with Ig and ITIM domains (TIGIT) is an immune checkpoint molecule that suppresses anti-tumor immune activity in CD8+ T cells and NK cells.

TIS

Anti-Tumor Immune Activity

Tumor Inflammation Signature. TIS measures the abundance of a peripherally suppressed adaptive immune response within the tumor.

  • This signature is trained to predict response to anti-PD1 therapy (pembrolizumab). It consists of genes related to Interferon gamma signaling (IFNγ), antigen presentation, natural killer (NK) and T cells and inhibitory pathways. It also consists of normalization genes that have been selected to give consistent expression levels across most tissue or tumor types.
  • This signature is useful for predicting response to anti-PD1 therapy and determining hot and cold immune status across multiple cancer types.

Treg

Immune Cell Abundance

Regulatory T cell abundance. Treg is measured by gene expression of Forkhead box P3 (FOXP3). FOXP3 is the canonical transcription factor that defines the regulatory T cell (Treg) population and is used to measure Treg abundance. Regulatory T cells suppress other T cell activities through a variety of mechanisms.

Selected Publications

Wallden, Brett, et al. Development and verification of the PAM50-based Prosigna breast cancer gene signature assay. BMC Med Genomics 8:54 (2015).

Perou, Charles M., et al. Molecular portraits of human breast tumours. Nature 406.6797 (2000): 747.

Ayers, Mark, et al. IFN-γ-related mRNA profile predicts clinical response to PD-1 blockade. The Journal of Clinical Investigation 127.8 (2017).

Haddad, Robert I., et al. Genomic determinants of response to pembrolizumab in head and neck squamous cell carcinoma (HNSCC). The Journal of Clinical Investigation (2017): 6009-6009.

Prat, Aleix, et al. Phenotypic and molecular characterization of the claudin-low intrinsic subtype of breast cancer. Breast cancer research 12.5 (2010): R68.

Burstein, Matthew D., et al. Comprehensive genomic analysis identifies novel subtypes and targets of triple-negative breast cancer. Clinical Cancer Research (2014).

Troester, Melissa A., et al. Gene expression patterns associated with p53 status in breast cancer. BMC cancer 6.1 (2006): 276.

Severson, Tesa M., et al. The BRCA1ness signature is associated significantly with response to PARP inhibitor treatment versus control in the I-SPY 2 randomized neoadjuvant setting. Breast Cancer Research 19.1 (2017): 99.

Peng, Guang, et al. Genome-wide transcriptome profiling of homologous recombination DNA repair. Nature communications 5 (2014): 3361.

Methods

Normalization

Normalization in BC 360 differs from normalization in nSolver. The goal is to adjust for cartridge differences using either a panel standard or a reference sample, so that scores can be compared across batches. The panel standard is a DNA oligo blend containing all BC 360 probe target sequences; it is run on each cartridge within the experiment and is used to normalize non-PAM50 genes. The reference sample is an RNA oligo blend containing the PAM50 probe target sequences and is used to normalize PAM50 genes. Normalization takes place in two steps. The first step depends on whether a gene belongs to the PAM50 signature, the TIS signature, or neither, and is described below. Zero counts on the raw scale are converted to ones prior to normalization.

Housekeeper Normalization for non-TIS and non-PAM50 Genes

Genes are normalized using a ratio of the expression value to the geometric mean of all housekeeping genes on the panel.
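The ratio described above can be illustrated with a minimal Python sketch. The gene names, counts, and the function name housekeeper_normalize are hypothetical and chosen only for illustration; the actual BC 360 pipeline and its housekeeper gene list are not shown in this report.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive numbers, computed in log space."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def housekeeper_normalize(raw_counts, housekeeper_genes):
    """Divide each gene's count by the geometric mean of the housekeeper counts.

    raw_counts: dict mapping gene name -> raw count for one sample.
    housekeeper_genes: names of the housekeeping genes (hypothetical here).
    """
    # Zero raw counts are converted to ones before normalization.
    floored = {gene: max(count, 1) for gene, count in raw_counts.items()}
    hk_mean = geometric_mean([floored[g] for g in housekeeper_genes])
    return {gene: count / hk_mean for gene, count in floored.items()}


# Hypothetical sample; names and counts are made up for illustration.
sample = {"GENE_A": 200, "GENE_B": 0, "HK1": 100, "HK2": 400}
normalized = housekeeper_normalize(sample, ["HK1", "HK2"])
print(normalized)
```

The same function could be reused for the TIS and PAM50 genes by passing the housekeeper subset specific to each signature.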

Housekeeper Normalization for TIS Genes

Genes in the TIS signature are normalized using a ratio of the expression value to the geometric mean of the housekeeper genes used only for the TIS signature.

Housekeeper Normalization for PAM50 Genes

Genes in the PAM50 signature are normalized using a ratio of the expression value to the geometric mean of the housekeeper genes used only for the PAM50 signature.

Panel Standard Normalization for non-PAM50 Genes

Genes not in the PAM50 signature are additionally normalized using the ratio of the housekeeper-normalized data to a panel standard run on the same cartridge as the observed data. In the absence of a panel standard column, values from a panel standard run on the same codeset lot as the observed data may be substituted. If a cartridge is missing a panel standard run, the average of all panel standards present is substituted for that cartridge.
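As a rough sketch of this step, the Python below divides housekeeper-normalized values by the panel standard from the same cartridge and falls back to the average of the available panel standards when a cartridge has none. The data layout, the cartridge identifiers, and the use of a per-gene arithmetic mean for the fallback are assumptions; the report does not specify which kind of average is used.

```python
def panel_standard_normalize(sample_values, cartridge, panel_standards):
    """Divide housekeeper-normalized values by the cartridge's panel standard.

    sample_values: dict gene -> housekeeper-normalized value for one sample.
    cartridge: identifier of the cartridge the sample was run on.
    panel_standards: dict cartridge -> dict gene -> panel standard value.
    """
    if cartridge in panel_standards:
        standard = panel_standards[cartridge]
    else:
        # Fallback: average the panel standards that are present.
        # Assumption: per-gene arithmetic mean (not stated in the report).
        runs = list(panel_standards.values())
        standard = {
            gene: sum(run[gene] for run in runs) / len(runs)
            for gene in sample_values
        }
    return {gene: value / standard[gene] for gene, value in sample_values.items()}


# Hypothetical panel standard values for two cartridges.
standards = {
    "cart1": {"GENE_A": 2.0, "GENE_B": 0.5},
    "cart2": {"GENE_A": 4.0, "GENE_B": 1.5},
}
values = {"GENE_A": 1.0, "GENE_B": 1.0}
on_cart1 = panel_standard_normalize(values, "cart1", standards)
on_cart3 = panel_standard_normalize(values, "cart3", standards)  # no standard run
print(on_cart1, on_cart3)
```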

Reference Sample Normalization for PAM50 genes

Genes in the PAM50 signature are additionally normalized using the ratio of the housekeeper-normalized data to a reference sample. A reference sample run on the same codeset lot, taken from a NanoString archive, is used.

Final Adjustments

The housekeeper-normalized and panel standard-normalized data are log2-transformed. A constant of 8 is added to TIS so that it is on the same scale as investigational use only (IUO) TIS, making scores comparable across research use only (RUO) and IUO assays. Other, non-TIS signatures are also adjusted with constants so that their values fall in a similar range.
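A minimal sketch of these final adjustments follows. The constant of 8 for TIS comes from the text above; the input values, the example score, and the function names are hypothetical, and how individual gene values are combined into a signature score is not covered here.

```python
import math

TIS_OFFSET = 8  # constant stated in the report for the TIS signature


def log2_transform(normalized_values):
    """Apply a log2 transform to each normalized (strictly positive) value."""
    return {gene: math.log2(value) for gene, value in normalized_values.items()}


def adjust_score(score, offset):
    """Shift a signature score by a signature-specific constant."""
    return score + offset


# Hypothetical normalized values and an unadjusted TIS score.
logged = log2_transform({"GENE_A": 0.25, "GENE_B": 4.0})
tis_adjusted = adjust_score(-1.5, TIS_OFFSET)
print(logged, tis_adjusted)
```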

Differential Expression Analysis

Grouping Variable

Differential expression is fit on a per gene or per signature basis using a linear model for analyses without a blocking factor. The statistical model uses the expression value or signature score as the dependent variable and fits a grouping variable as a fixed effect to test for differences in the levels of that grouping variable.

Expression_(gene or signature) = μ + Group + ε

P-values are adjusted within each analysis type (gene or signature) and within each grouping-variable level comparison (t-test) using the Benjamini and Yekutieli False Discovery Rate (FDR) adjustment, which accounts for correlations among the tests. All models are fit using the limma package in R.
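The Benjamini and Yekutieli adjustment can be sketched in a few lines of Python. This is an illustrative standalone implementation of the standard step-up procedure, not the code used to produce the report (which relies on R); the function name and the example p-values are hypothetical.

```python
def benjamini_yekutieli(p_values):
    """Return Benjamini-Yekutieli adjusted p-values in the input order.

    Each sorted p-value p_(i) is multiplied by m * c(m) / i, where m is the
    number of tests and c(m) = 1 + 1/2 + ... + 1/m, and the results are then
    made monotone from the largest p-value downwards and capped at 1.
    """
    m = len(p_values)
    if m == 0:
        return []
    harmonic = sum(1.0 / k for k in range(1, m + 1))
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, keeping a running minimum.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = p_values[idx] * m * harmonic / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for three genes.
adjusted = benjamini_yekutieli([0.01, 0.04, 0.03])
print(adjusted)
```

The harmonic-sum factor is what makes this adjustment more conservative than the Benjamini-Hochberg procedure and valid under arbitrary dependence among tests.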

Survival Analysis

Grouping Variable

If a grouping variable is present, the survival analysis used to create the forest plot fits a proportional hazards model with the survival outcome as the dependent variable, the observed normalized gene expression or signature score as a continuous covariate, and the grouping variable as a strata variable, which makes the model a frailty model.

Survival(time, event) = μ + Expression_(gene or signature) + Group + ε

The analysis is performed on a per-gene or per-signature basis, as appropriate, and uses the regression routines implemented in the R package survival. All p-values are adjusted for the number of tests within each type of analysis (gene or signature) using the Benjamini and Yekutieli False Discovery Rate (FDR) method to account for correlations among the tests.

Kaplan-Meier curves are not available for frailty models, so they are not shown when the analysis is fit as a frailty model with a random effect.

PAM50 Subtypes

PAM50 subtype calls are the result of a three-step algorithm. The first step applies two sets of scaling factors to bring the housekeeper and reference sample expression values onto the scale required by the next step. The second step calculates the correlation between the observed scaled expression of the PAM50 genes and a centroid for each of the four subtypes, yielding four correlation values for each sample. The final step identifies the subtype with the greatest correlation and assigns it as the subtype call for that sample.
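The second and third steps amount to a nearest-centroid classification by correlation, which can be sketched as below. The centroids here are toy four-gene vectors invented for illustration (the real centroids cover the PAM50 genes and are not given in this report), and the use of Pearson correlation is an assumption, since the report does not state which correlation measure is used.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def call_subtype(scaled_expression, centroids):
    """Correlate a sample with each centroid and pick the best match."""
    correlations = {
        subtype: pearson(scaled_expression, centroid)
        for subtype, centroid in centroids.items()
    }
    best = max(correlations, key=correlations.get)
    return best, correlations


# Toy centroids over four hypothetical genes (not the real PAM50 centroids).
centroids = {
    "Luminal A": [1.0, 0.5, -0.5, -1.0],
    "Luminal B": [0.5, 1.0, -1.0, -0.5],
    "HER2-Enriched": [-0.5, -1.0, 1.0, 0.5],
    "Basal-like": [-1.0, -0.5, 0.5, 1.0],
}
sample = [-0.9, -0.4, 0.6, 1.1]  # already-scaled expression, hypothetical
subtype, correlations = call_subtype(sample, centroids)
print(subtype, correlations)
```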

Genomic Risk of Recurrence (Genomic Risk)

Genomic Risk scores are the result of a multi-step algorithm. The first step applies two sets of scaling factors to bring the housekeeper and reference sample expression values onto the scale required by the next step. The second step calculates the correlation between the observed scaled expression of the PAM50 genes and a centroid for each of the four subtypes; these centroids differ from those used for calling subtypes. This yields four correlation values for each sample. The next step calculates a proliferation score for each sample and then takes a weighted sum of the proliferation score and the four subtype correlations. This sum is then scaled to lie between 0 and 100. No tumor size information is used; only the genomic portion of ROR is used.
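The last two steps (weighted sum, then scaling to 0 to 100) can be sketched as follows. Every number in this example is hypothetical: the report does not give the weights, the proliferation score definition, or the scaling rule, so the weights, the bounds, and the linear rescaling with clipping are assumptions made purely for illustration.

```python
def genomic_risk(correlations, proliferation, weights, prolif_weight, lower, upper):
    """Weighted sum of subtype correlations and proliferation, scaled to 0-100.

    weights, prolif_weight, lower and upper are hypothetical placeholders;
    the real coefficients and scaling are not given in this report.
    """
    raw = prolif_weight * proliferation + sum(
        weights[subtype] * correlations[subtype] for subtype in weights
    )
    # Assumed scaling: linear map of [lower, upper] onto [0, 100], then clip.
    scaled = 100.0 * (raw - lower) / (upper - lower)
    return max(0.0, min(100.0, scaled))


# Hypothetical inputs for one sample.
weights = {
    "Luminal A": -0.3,
    "Luminal B": 0.2,
    "HER2-Enriched": 0.1,
    "Basal-like": 0.1,
}
correlations = {
    "Luminal A": 0.5,
    "Luminal B": 0.0,
    "HER2-Enriched": -0.5,
    "Basal-like": -0.5,
}
score = genomic_risk(correlations, 0.25, weights, 0.4, lower=-1.0, upper=1.0)
print(score)
```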

References

Wallden, Brett et al. Development and verification of the PAM50-based Prosigna breast cancer gene signature assay. BMC medical genomics 2015, (8)54: doi:10.1186/s12920-015-0129-6

Sestak I, et al. Prediction of Late Distant Recurrence After 5 Years of Endocrine Treatment: A Combined Analysis of Patients From the Austrian Breast and Colorectal Cancer Study Group 8 and Arimidex, Tamoxifen Alone or in Combination Randomized Trials Using the PAM50 Risk of Recurrence Score. Journal of Clinical Oncology 2015, 33(8): 916-922.

Parker JS, et al. Supervised Risk Predictor of Breast Cancer Based on Intrinsic Subtypes. Journal of Clinical Oncology 2009, 27(8): 1160–1167.

Geiss G, et al. Direct multiplexed measurement of gene expression with color-coded probe pairs. Nature Biotechnology 2008; 26: 317–25.

Benjamini Y and Yekutieli D. 2001. The control of the false discovery rate in multiple testing under dependency. Annals of Statistics 29:4:1165-1188 https://projecteuclid.org/euclid.aos/1013699998

Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W, Smyth GK (2015). limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Research, 43(7), e47. https://bioconductor.org/packages/release/bioc/html/limma.html

Therneau TM (2015). A Package for Survival Analysis in S. version 2.38, https://CRAN.R-project.org/package=survival.

Therneau TM and Grambsch PM (2000). Modeling Survival Data: Extending the Cox Model. Springer, New York. ISBN 0-387-98784-3.