How to use this website
About the browser
This site lets you browse through the gene expression data from 4 curated databases (GEO, TCGA, CCLE and GDSC), with over 34,000 samples combined, originating from patient tumor material, normal tissue, and about 2,000 cell line samples. To discern the subtle gene expression patterns contained in this large amount of data, the gene expression data is statistically analyzed and decomposed into estimated sources of gene expression, called Transcriptional Components. Each of these transcriptional components captures a part of the variance seen in mRNA expression: it therefore contains its own unique gene expression pattern, which may, for example, be under the control of a transcription factor, be the result of a copy number alteration, or reflect a biological process. A Transcriptional Component might be associated with a specific cancer subtype, or be broadly expressed across several types. In this metabolic browser specifically, we filtered the complete dataset for those transcriptional components that capture (part of) a metabolic process: a metabolic gene expression landscape, so to speak. Each of the four gene expression databases has its own set of (metabolic) Transcriptional Components, which can be linked to each other.
Using the gene browser
The gene browser lets you search for a single gene across all of the Transcriptional Components: in every component, the gene has a certain weight. This gene weight indicates the importance of that gene in the component. The higher the absolute weight, the larger the contribution of this gene to the expression pattern described by the component. The search result is a sorted list of these gene weights per Transcriptional Component. Click "browse" to view data on a specific Transcriptional Component, such as the weights of other (coregulated) genes within that component, gene set enrichment scores, and information on the activation score of the component in several tumor types.
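As a rough illustration of what such a search returns, the sketch below looks up one gene in a few components and sorts the hits. The component names, gene weights, and the function `search_gene` are hypothetical, and sorting by absolute weight is an assumption based on how the gene level tables are ordered; this is not the site's actual implementation.

```python
# Hypothetical data: gene weights per Transcriptional Component (TC).
# All component names and values are made up for illustration.
components = {
    "TC_1": {"HK2": 4.2, "LDHA": -0.3, "PKM": 1.1},
    "TC_2": {"HK2": -6.5, "LDHA": 2.0, "PKM": 0.4},
    "TC_3": {"HK2": 0.8, "LDHA": 5.1, "PKM": -2.2},
}


def search_gene(components, gene):
    """Return (component, weight) pairs for one gene, largest absolute weight first."""
    hits = [
        (name, weights[gene])
        for name, weights in components.items()
        if gene in weights
    ]
    # Assumption: a strongly negative weight is as informative as a strongly
    # positive one, so the list is sorted on the absolute value.
    return sorted(hits, key=lambda hit: abs(hit[1]), reverse=True)


results = search_gene(components, "HK2")
for name, weight in results:
    print(f"{name}\t{weight:+.1f}")
```

In this made-up example, TC_2 comes first even though its weight is negative, because the gene contributes most strongly to that component's pattern.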
Using the tissue type browser
Similarly, it is possible to search for a tumor and/or tissue type in the metabolic landscape: each sample of that tissue type has an activation score per Transcriptional Component (TC). When the activation score of a sample is multiplied by the corresponding gene expression pattern of a Transcriptional Component, we obtain the part of that sample's gene expression pattern that is explained by the component. If this is done for every TC and the results are added together, the total gene expression pattern of the sample is regenerated. Thus, the average activation score of all samples of a certain tissue type is a measure of the extent to which the gene expression pattern of that transcriptional component is active in that tissue. The highest and lowest sample scores are also given; together they give an estimate of the spread of the activation scores for that tissue type. In short: use this tool to find out whether your tissue type has a high average activation score in one of the transcriptional components, which indicates that this tissue type might have one or several (unique) gene expression patterns. Then click "browse" to view data on the gene expression pattern within a transcriptional component, as well as gene set enrichment scores and information on the activation score of the component in this and several other tumor types.
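The relation between activation scores and gene expression patterns can be sketched in a few lines. The example below assumes a simple linear model in which each TC contributes its gene weights scaled by the sample's activation score, and the contributions of all TCs are summed. The gene names, weights, scores, and the helper functions `contribution` and `reconstruct` are hypothetical.

```python
# Hypothetical gene order and TC patterns (gene weights); values are made up.
genes = ["HK2", "LDHA", "PKM"]
tc_patterns = {
    "TC_1": [2.0, 0.0, 1.0],
    "TC_2": [-1.0, 3.0, 0.5],
}
# Hypothetical activation scores of one sample, one score per TC.
activation = {"TC_1": 1.5, "TC_2": -2.0}


def contribution(score, pattern):
    """Scale a TC's gene expression pattern by the sample's activation score."""
    return [score * weight for weight in pattern]


def reconstruct(activation, tc_patterns, n_genes):
    """Sum the contributions of all TCs to regenerate the sample's expression pattern."""
    total = [0.0] * n_genes
    for name, pattern in tc_patterns.items():
        for i, value in enumerate(contribution(activation[name], pattern)):
            total[i] += value
    return total


profile = reconstruct(activation, tc_patterns, len(genes))
for gene, value in zip(genes, profile):
    print(f"{gene}\t{value:+.1f}")
```

Here TC_1 contributes 3.0, 0.0 and 1.5 to the three genes and TC_2 contributes 2.0, -6.0 and -1.0, so the regenerated pattern is 5.0, -6.0 and 0.5.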
Using the gene set browser
One might also be interested in a metabolic process as a whole, instead of the expression of a single gene. Therefore, Gene Set Enrichment Scores have been calculated for every Transcriptional Component. The (absolute) enrichment score indicates how strongly the genes of your gene set are enriched in each of the transcriptional components, and is thus an estimate of the importance of those genes within that component. Use this tool to find out whether a biological process might be represented by the gene expression pattern in one of the transcriptional components, and to dive into the biology of that component. Click "browse" to view data on the genes within that component, as well as the enrichment scores of other gene sets and information on the activation score of the component in several tumor types.
The Transcriptional Component and its data
As introduced above, every Transcriptional Component comes with its own sets of data: the gene level data, tissue information (activation scores of samples), gene set enrichment scores, and its correlation to other transcriptional components. Here, we introduce each of these datasets.
First, the gene level data. This table contains the gene expression pattern of the Transcriptional Component, i.e. the weight of every gene, ordered by absolute value. A gene weight can be negative or positive; whether the gene is over- or under-expressed in a sample depends on the sign of the activation score of the TC in that sample. For example, a strongly negative gene weight in a sample with a negative activation score means that the gene is in fact overexpressed in that sample: after all, negative times negative equals positive.
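The sign rule can be made explicit with a small sketch. It classifies the contribution of a single TC to a gene in one sample by the sign of the product of gene weight and activation score. The function `direction` and the example values are hypothetical, and the result only describes the effect of that one component, not the gene's total expression.

```python
def direction(gene_weight, activation_score):
    """Classify a TC's contribution to a gene in a sample by the sign of the product."""
    product = gene_weight * activation_score
    if product > 0:
        return "over-expressed"
    if product < 0:
        return "under-expressed"
    return "no contribution"


# Hypothetical (gene weight, activation score) pairs covering the sign combinations.
cases = [(-5.0, -2.0), (-5.0, 2.0), (5.0, -2.0), (5.0, 2.0)]
labels = [direction(weight, score) for weight, score in cases]

for (weight, score), label in zip(cases, labels):
    print(f"weight {weight:+.1f} x score {score:+.1f}: {label}")
```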
To investigate the biology behind the genes of a transcriptional component, one can look at the gene set enrichment scores. A high enrichment score for a gene set means that the genes of that gene set are relatively near the top (or bottom) of the complete list of genes. The top and bottom of the gene list contain the genes with the highest absolute weights, so a gene set with a high enrichment score might describe the biological process that underpins the transcriptional component. Included are the gene sets as defined by Biocarta, KEGG, GO molecular functions, GO biological processes, Reactome and Transcription Factor targets. To inspect the actual weights of the genes that are members of a gene set, click "show member genes", and a window will pop up with the data for only the genes of that gene set.
Importantly, to view the metabolic landscape of several cancer types, the "Tissue level data" plot shows the activation score of the transcriptional component for every sample within the selected database: each dot in the plot corresponds to a single sample or cell line. These activation scores are plotted per tissue (tumor) type, indicating how the transcriptional component is activated in a tissue type as a whole. More strongly positive or more strongly negative scores mean that the gene expression pattern found in the Transcriptional Component is more prominent in that sample.
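The per-tissue grouping behind such a plot can also be summarised numerically, using the same average, lowest and highest scores that the tissue type browser reports. The sketch below is a minimal illustration using only the Python standard library; the tissue names, scores, and the function `summarise_by_tissue` are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (tissue type, activation score) pairs for one TC; values are made up.
samples = [
    ("breast", 2.0), ("breast", 4.0), ("breast", -1.0),
    ("lung", -3.0), ("lung", -5.0),
]


def summarise_by_tissue(samples):
    """Group activation scores by tissue type and report mean, lowest and highest."""
    grouped = defaultdict(list)
    for tissue, score in samples:
        grouped[tissue].append(score)
    return {
        tissue: {"mean": mean(scores), "min": min(scores), "max": max(scores)}
        for tissue, scores in grouped.items()
    }


summary = summarise_by_tissue(samples)
for tissue, stats in summary.items():
    print(f"{tissue}\tmean {stats['mean']:+.2f}\trange {stats['min']:+.1f} to {stats['max']:+.1f}")
```

In this made-up example the lung samples all have negative scores, so the component's pattern would be active in that tissue with the opposite sign.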
Correlated Transcriptional Components
Finally, the gene expression pattern of a transcriptional component can be compared to that of others. In this way, correlated transcriptional components can be found in the datasets from other databases. To do so, select a database in the dropdown menu and then choose a transcriptional component from this database to browse through. Highly correlated components are interesting, as they extend findings from one gene expression database to others, suggesting that the patterns found are independent of the measurement platform. Furthermore, by comparing components from patient tissue databases to the cell line compendia (CCLE or GDSC), one might be able to find a cell line that contains a certain gene expression pattern also found in a tumor subtype.
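To give an idea of how two components from different databases could be compared, the sketch below correlates their gene weights over the genes they share. The site does not state which correlation measure it uses, so the Pearson correlation here is an assumption; the component contents and the functions `pearson` and `correlate_components` are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlate_components(weights_a, weights_b):
    """Correlate two TCs using only the genes present in both."""
    shared = sorted(set(weights_a) & set(weights_b))
    return pearson(
        [weights_a[gene] for gene in shared],
        [weights_b[gene] for gene in shared],
    )


# Hypothetical gene weights of one TC from a patient database and one from a
# cell line database; values are made up.
patient_tc = {"HK2": 1.0, "LDHA": 2.0, "PKM": 3.0, "ENO1": 4.0}
cell_line_tc = {"HK2": 2.0, "LDHA": 4.0, "PKM": 6.0, "GAPDH": 9.0}

r = correlate_components(patient_tc, cell_line_tc)
print(f"correlation over shared genes: {r:.2f}")
```

Genes that occur in only one of the two components (ENO1 and GAPDH here) are left out, since platforms do not necessarily measure the same genes.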