For each pathway, an activity level is summarized into a single score based on the gene expression of all genes within the pathway.
Plot Descriptions and Notes
Note: Plots displayed depend on analysis options selected
Heatmap of the correlation matrix of pathway scores across all samples. Red indicates pairs of pathway score profiles that are highly correlated across all samples, grey indicates correlations close to zero, and blue denotes highly negative correlations.
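As a rough illustration of what such a correlation matrix contains, the sketch below computes pairwise Pearson correlations between pathway score profiles. The pathway names are reused from this page, but the score values and the helper names (`pearson`, `correlation_matrix`) are hypothetical and are not taken from the software.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def correlation_matrix(profiles):
    """profiles maps a pathway name to its scores, one score per sample."""
    names = list(profiles)
    return {a: {b: pearson(profiles[a], profiles[b]) for b in names} for a in names}

# Hypothetical pathway scores for four samples (illustrative values only).
scores = {
    "Notch": [0.1, 0.5, 0.9, 1.3],
    "Wnt": [0.2, 1.0, 1.8, 2.6],   # rises with Notch: would be drawn red
    "MAPK": [2.0, 1.5, 1.0, 0.5],  # falls as Notch rises: would be drawn blue
}
corr = correlation_matrix(scores)
```

Each cell of the heatmap corresponds to one entry of `corr`; the diagonal is always 1 because every profile is perfectly correlated with itself.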
Heatmap of all pathway scores. Red indicates low pathway score; yellow indicates high pathway score.
Principal components analysis maps high-dimensional datasets onto a smaller number of highly informative dimensions. Here, the first four principal components of the pathway scores are plotted against each other and colored by the selected covariate.
The selected pathway score is plotted against each selected covariate. The boxplot whiskers extend to 1.5 times the interquartile range (IQR), and the middle line denotes the median.
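The boxplot elements can be sketched as follows. This assumes the common Tukey convention, in which each whisker ends at the most extreme data point lying within 1.5 IQR of the nearer quartile; the exact quartile method used by the software is not stated here, so the "inclusive" method and the `boxplot_summary` helper below are assumptions for illustration.

```python
import statistics

def boxplot_summary(values):
    """Median, quartiles, whisker ends and outliers under the 1.5 * IQR rule."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        "whisker_low": min(inside),    # lowest point within the lower fence
        "whisker_high": max(inside),   # highest point within the upper fence
        "outliers": sorted(v for v in values if v < low_fence or v > high_fence),
    }

# Hypothetical pathway scores for one covariate level.
scores = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 20.0]
summary = boxplot_summary(scores)
```

In this example the value 20.0 lies beyond the upper fence, so it would be drawn as an individual point rather than being covered by the whisker.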
Each plot compares the selected pathway score (X-axis) to another pathway score (Y-axis).
Genes are tested for differential expression in response to each selected covariate. For each gene, a single linear regression is fit using all selected covariates to predict expression. This approach adjusts for confounding due to the measured covariates and isolates the independent association of each covariate with gene expression, measuring each variable's association with a gene while holding all other variables constant.
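To illustrate why fitting all covariates in one regression matters, the sketch below fits an ordinary least squares model for a single gene using plain Python. The data, the covariate names (treatment, age) and the helpers `solve` and `fit_ols` are hypothetical; the software's own fitting routine is not shown in this document and may differ.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit_ols(covariates, y):
    """Least squares fit; returns [intercept, coefficient_1, coefficient_2, ...]."""
    X = [[1.0] + list(row) for row in covariates]
    p = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)

# Hypothetical samples: (treatment indicator, age). Treated samples are older,
# so treatment and age are confounded.
samples = [(0, 30), (0, 40), (0, 50), (1, 50), (1, 60), (1, 70)]
# Simulated log2 expression: 5 + 1.0 * treatment + 0.1 * age, with no noise.
expression = [5 + 1.0 * t + 0.1 * age for t, age in samples]

adjusted = fit_ols(samples, expression)                        # treatment and age together
unadjusted = fit_ols([[t] for t, _ in samples], expression)    # treatment alone
```

With both covariates in the model, the treatment coefficient recovers the simulated effect of 1.0. With treatment alone it is 3.0, because the age difference between the groups is attributed to treatment.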
Plot Descriptions and Notes
Volcano plot displaying each gene's -log10(p-value) and log2 fold change with the selected covariate. Highly statistically significant genes fall at the top of the plot, and highly differentially expressed genes fall to either side. Horizontal lines indicate thresholds for p=0.01 and p=0.001. The 40 most statistically significant genes are named.
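The volcano plot coordinates follow directly from each gene's results. The sketch below, using hypothetical gene names and values, shows how a p-value maps to the vertical axis and where the two threshold lines fall.

```python
import math

def volcano_point(log2_fold_change, p_value):
    """Return (x, y) plot coordinates for one gene."""
    return (log2_fold_change, -math.log10(p_value))

# Heights of the horizontal threshold lines: p = 0.01 maps to 2, p = 0.001 maps to 3.
thresholds = {p: -math.log10(p) for p in (0.01, 0.001)}

# Hypothetical differential expression results: gene -> (log2 fold change, p-value).
results = {
    "GENE_A": (1.8, 0.0004),
    "GENE_B": (-0.2, 0.30),
    "GENE_C": (-1.1, 0.004),
}
points = {gene: volcano_point(lfc, p) for gene, (lfc, p) in results.items()}

# Genes lying above the stricter line (p < 0.001).
above_strict = sorted(g for g, (_, y) in points.items() if y > thresholds[0.001])
```

Because the vertical axis is -log10(p-value), smaller p-values plot higher, which is why the most significant genes appear at the top.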
Table presenting the 20 genes most significantly differentially expressed with the selected covariate. "Estimated log fold-change" is the estimated log2 fold change in a gene's expression. For a categorical covariate, a gene's expression in a given level is estimated to be 2^(log fold change) times its expression in baseline samples, holding all other variables in the analysis constant. For a continuous covariate, each unit increase in the covariate is estimated to multiply a gene's expression by 2^(log fold change), holding all other variables in the analysis constant. The 95% confidence interval for the log fold change is also presented, along with a p-value and, if requested, an adjusted p-value or FDR.
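A short worked example may help when reading the table. The numbers below are hypothetical, and the helper names are introduced only for illustration; the arithmetic is simply the 2^(log fold change) conversion described above.

```python
def fold_change(log2_fc):
    """Convert a log2 fold change to a multiplicative fold change."""
    return 2.0 ** log2_fc

def fold_change_for_increase(log2_fc_per_unit, units):
    """Fold change for a continuous covariate that increases by `units`."""
    return 2.0 ** (log2_fc_per_unit * units)

# Hypothetical table row: estimate and 95% confidence limits on the log2 scale.
estimate, ci_low, ci_high = 1.0, 0.5, 1.5

fc = fold_change(estimate)                             # expression doubles
fc_ci = (fold_change(ci_low), fold_change(ci_high))    # interval on the fold scale
halved = fold_change(-1.0)                             # negative values mean lower expression
two_units = fold_change_for_increase(0.5, 2)           # 0.5 per unit, over 2 units
```

Confidence limits convert the same way as the estimate, so an interval on the log2 scale that excludes 0 corresponds to a fold-change interval that excludes 1.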
The results of differential expression testing are summarized at the pathway level. Each pathway's most differentially expressed genes are identified, and the extent of differential expression in each pathway is summarized using a "global significance statistic."
Plot Descriptions and Notes
Note: Plots displayed depend on analysis options selected
Heatmap displaying each pathway's global significance scores for each covariate. Global significance statistics measure the extent of differential expression of a pathway's genes with a covariate, ignoring whether each gene is up- or down-regulated. Yellow denotes pathways whose genes exhibit extensive differential expression with the covariate; red denotes pathways with less differential expression.
Heatmap displaying each pathway's directed global significance scores for each covariate. Directed global significance statistics measure the extent to which a pathway's genes are up- or down-regulated with the covariate. Red denotes pathways whose genes exhibit extensive over-expression with the covariate; blue denotes pathways with extensive under-expression.
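This page does not give formulas for the two statistics. The sketch below assumes one plausible definition based on the per-gene t-statistics from the differential expression analysis: the undirected score as the root mean square of the t-statistics, and the directed score as a signed version of the same quantity. Both formulas and both function names are assumptions for illustration and should be checked against the software's documentation.

```python
import math

def global_significance(t_stats):
    """Assumed definition: square root of the mean squared t-statistic."""
    return math.sqrt(sum(t * t for t in t_stats) / len(t_stats))

def directed_global_significance(t_stats):
    """Assumed definition: signed square root of the mean signed squared t-statistic."""
    signed_mean = sum(math.copysign(t * t, t) for t in t_stats) / len(t_stats)
    return math.copysign(math.sqrt(abs(signed_mean)), signed_mean)

# Hypothetical t-statistics for the genes of three pathways.
mixed = [3.0, -3.0]    # strong effects in opposite directions
up = [3.0, 4.0]        # consistently up-regulated
down = [-3.0, -4.0]    # consistently down-regulated

mixed_scores = (global_significance(mixed), directed_global_significance(mixed))
up_scores = (global_significance(up), directed_global_significance(up))
down_scores = (global_significance(down), directed_global_significance(down))
```

Under these assumed definitions, a pathway with strong but opposing effects scores high on the undirected statistic and near zero on the directed one, which matches the distinction drawn between the two heatmaps.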
Volcano plot displaying each gene's -log10(p-value) and log2 fold change for the selected covariate. Highly statistically significant genes fall at the top of the plot, and highly differentially expressed genes fall to either side. Selected pathway genes are highlighted in blue. Horizontal lines indicate thresholds for p=0.01 and p=0.001.
Table presenting the 10 genes in the selected pathway that are most significantly differentially expressed with the selected covariate. "Estimated log fold-change" is the estimated log2 fold change in a gene's expression with the selected covariate. If the selected covariate is categorical, a gene's expression in a given level is estimated to be 2^(log fold change) times its expression in baseline samples, holding all other variables in the analysis constant. If the selected covariate is continuous, each unit increase in the covariate is estimated to multiply a gene's expression by 2^(log fold change), holding all other variables in the analysis constant. The 95% confidence interval for the log fold change is also presented, along with a p-value and, if requested, an adjusted p-value or FDR.
Pathview plots show differential expression results from the selected covariate in the selected pathway.
Plot Descriptions and Notes
Note: Only .png files are created for Pathview even if additional plot types are selected.
 
Note: If an invalid custom KEGG ID was selected, the pathway page will be created, but the image links will be broken. Please check your KEGG ID and re-run the analysis.
Each node represents a protein; node colors are derived from the differential expression of all genes corresponding to the protein. White indicates a pathway gene not present in the data. Red indicates a high log fold-change or t-statistic; green indicates a low log fold-change or t-statistic.
Displays plots that detail the impact of normalization on the data.
Plot Descriptions and Notes
Displays the geNorm pairwise variation statistic after successive genes are removed. This statistic cannot be computed for the final two genes, which are therefore not displayed. The ideal normalization gene set will minimize the pairwise variation statistic.
Histograms of the samples' mean log expression before and after normalization using the selected genes. Successful normalization will tend to produce a narrower range of mean log expression.
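The effect described above can be sketched with a simple reference-gene normalization. This assumes that each sample is shifted on the log2 scale so that the mean of its selected normalization genes matches the average across samples; the software's actual procedure may differ, and the gene names, sample names and values are hypothetical.

```python
import statistics

def normalize(log2_counts, normalization_genes):
    """Shift each sample so its normalization-gene mean equals the grand mean."""
    ref_mean = {
        sample: statistics.fmean(genes[g] for g in normalization_genes)
        for sample, genes in log2_counts.items()
    }
    grand_mean = statistics.fmean(ref_mean.values())
    return {
        sample: {g: v - ref_mean[sample] + grand_mean for g, v in genes.items()}
        for sample, genes in log2_counts.items()
    }

def sample_mean_range(log2_counts):
    """Range of the per-sample mean log2 expression (the histogram's spread)."""
    means = [statistics.fmean(genes.values()) for genes in log2_counts.values()]
    return max(means) - min(means)

# Hypothetical log2 counts; S2 has more input material and S3 has less.
raw = {
    "S1": {"HK1": 10.0, "HK2": 12.0, "G1": 8.0, "G2": 6.0},
    "S2": {"HK1": 11.0, "HK2": 13.0, "G1": 9.5, "G2": 7.0},
    "S3": {"HK1": 9.0, "HK2": 11.0, "G1": 7.0, "G2": 5.0},
}
normalized = normalize(raw, ["HK1", "HK2"])
range_before = sample_mean_range(raw)
range_after = sample_mean_range(normalized)
```

Here the spread of the sample means shrinks after normalization because the differences in input amount are removed, while the remaining difference in G1 for sample S2 is preserved.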
This module generates a series of high-level plots that describe the data overall and may be useful for identifying anomalous data and/or covariates.
Plot Descriptions and Notes
Note: Plots displayed depend on analysis options selected
The Pearson correlation coefficient of gene expression is calculated between each pair of samples to create a correlation matrix across all samples. Red indicates pairs of samples with highly correlated gene expression profiles, grey indicates correlations close to zero, and blue denotes highly negative correlations.
Heatmap of all normalized gene expression data, scaled to give all genes equal variance. Red indicates low expression; yellow indicates high expression. Genes are arranged by pathway (and duplicated when necessary) in the following order: Notch, Wnt, HH, ChromMod, TXmisReg, DNARepair, TGFB, MAPK, STAT, PI3K, RAS, Apop, CC.
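Scaling genes to equal variance can be sketched as a per-gene standardization. The sketch below assumes each gene is also centered on its mean, which is a common choice for heatmaps but is not stated on this page; the gene names and values are hypothetical.

```python
import statistics

def scale_gene(values):
    """Center one gene's expression values and scale them to unit variance."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation
    return [(v - mean) / sd for v in values]

# Hypothetical normalized log2 expression across three samples.
expression = {
    "GENE_A": [10.0, 12.0, 14.0],  # large absolute differences between samples
    "GENE_B": [5.0, 5.1, 5.2],     # small absolute differences between samples
}
scaled = {gene: scale_gene(values) for gene, values in expression.items()}
```

After scaling, both genes span the same range of values, so a gene with small absolute changes contributes as much color contrast to the heatmap as a gene with large ones.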
If normalization is performed, each gene's variance in the log-scaled, normalized data is plotted against its mean value across all samples. Highly variable genes are indicated by gene name. Housekeeping genes are color coded according to their use (or disuse) in normalization.
For each covariate included in the analysis, a histogram of the p-values testing each gene's univariate association with that covariate is displayed. Low p-values indicate strong evidence for an association. Covariates with largely flat histograms tend to have little association with gene expression; covariates whose histograms have substantially more mass on the left are either associated with the expression of many genes or confounded with a covariate that is.
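The difference between a flat and a left-heavy histogram can be sketched by binning two hypothetical sets of p-values. The bin count of 10 and the `pvalue_histogram` helper are arbitrary choices for illustration and do not reflect the software's plotting settings.

```python
def pvalue_histogram(p_values, bins=10):
    """Count p-values in equal-width bins covering the interval [0, 1]."""
    counts = [0] * bins
    for p in p_values:
        index = min(int(p * bins), bins - 1)  # a p-value of exactly 1.0 goes in the last bin
        counts[index] += 1
    return counts

# Hypothetical p-values for a covariate unrelated to expression: spread evenly.
flat = [0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95]
# Hypothetical p-values for a covariate associated with many genes: piled up near zero.
skewed = [0.001, 0.004, 0.01, 0.02, 0.03, 0.04, 0.25, 0.5, 0.75, 0.95]

flat_counts = pvalue_histogram(flat)
skewed_counts = pvalue_histogram(skewed)
```

In the second set, six of the ten p-values fall in the leftmost bin, which is the pattern the description above associates with a covariate related to the expression of many genes.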
Pairwise comparisons of all covariates in the analysis. The type of plot depends on the types of variables compared. A categorical vs. categorical comparison is shown as a bar chart of counts (Y-axis). A continuous vs. categorical comparison is shown as a boxplot with whiskers extending to 1.5 IQR. A continuous vs. continuous comparison is shown as a scatter plot. Variables that are correlated with a biological variable of interest are potential confounders that may influence downstream analyses.
Principal component analysis maps high-dimensional datasets onto a smaller number of highly informative dimensions. Here, the first four principal components of the gene expression data are plotted against each other and colored by the values of the selected covariate. This plot may be used to identify clusters in the data and to identify variables associated with prominent signal in the data. Variables that are associated with these leading principal components should be considered in downstream analyses.
Heatmap of the pathway's normalized gene expression data, scaled to give all genes equal variance. Yellow indicates high expression; red indicates low expression.
Each pathway's relationship with each covariate is viewed through two lenses: through the results of the differential expression analyses, and through its pathway score's association with the covariates.
Plot Descriptions and Notes
Two measures of a pathway's relationship with a covariate are displayed.
 
The first summarizes the pathways' behavior in the differential expression analysis using their global significance statistics. A high global significance statistic indicates a pathway's genes are extensively differentially expressed in response to the covariate.
 
The second measure summarizes the behavior of the pathway scores with the covariate. For each pathway score, a linear regression has been fit predicting the pathway score from the selected covariates. The horizontal axis plots each regression's -log10(p-value) for the association between its pathway score and the covariate. A high -log10(p-value) indicates a pathway score that is highly statistically significantly associated with the covariate.