Displaying experimental data using the Omics Viewer
The Omics Viewer is a tool for evaluating the results of large-scale gene expression, metabolomic, or proteomic studies in a metabolic context.
  • Users can upload their own data to overlay onto the metabolic map.
  • Depending on the type of data, reaction lines or compounds will be colored according to the submitted values.

Note: The Omics Viewer can be used only in SINGLE SPECIES databases such as AraCyc.

We also provide a short visual presentation on the Omics Viewer as a PowerPoint or PDF file.

To access the Omics Viewer:

  • From any page, use the menu bar to select Tools -> Omics Viewer and then select the species of interest

  • From the Query page select the organism of interest using the Select a dataset drop-down menu and then click on the Cellular Overview:"Omics Viewer" link found halfway down the page.

  • From the PMN homepage, click on the "Overlay my data" button next to the single-species database of interest.

The Omics Viewer page gives instructions and also provides links to additional help materials from the Pathway Tools software.

This tutorial will provide information about:
  1. Preparing data files for analysis
  2. Entering the data into the program
  3. Understanding and analyzing the output

Preparing data files for analysis

To visualize data in the Omics Viewer:

  • The data should be formatted as a tab delimited text file.

  • Word documents or Excel spreadsheets MUST be saved as TEXT only -> .txt .

  • Use decimal points in numerical data - e.g. 1.24

  • Use minus signs to represent negative numbers, not parentheses - e.g. -2.36

  • The first column of the file (referred to as column zero) should only contain the name of the object, such as a gene or compound name.

    • It is important to know how the genes and/or proteins from your organism are named in the database to generate a usable input file

    • For example, when entering a gene from Arabidopsis, you may enter At1g23450, but not At1g23450.1.

    • Please contact us if you have any questions about how to select the right "names" for your input data.

  • The remaining columns (1 and higher) will contain the data values.

  • The program can accept absolute or relative data (e.g. fold-difference or absolute levels of transcripts, proteins, or metabolites)

  • The program can calculate ratios using two columns of different values (e.g. mutant versus wild-type, treated vs. untreated, or time point 1 vs. 0)

  • The program can display data series (e.g. changes in expression at time 1, 2, and 3)
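
As a minimal sketch of the formatting rules above, the following Python snippet shows one way to tidy names and numbers before saving them as tab-delimited text. The function names and the sample values are assumptions made up for illustration; they are not part of the Omics Viewer or the Pathway Tools software.

```python
import re


def clean_gene_name(name):
    """Drop a trailing splice-variant suffix, e.g. At1g23450.1 -> At1g23450."""
    return re.sub(r"\.\d+$", "", name.strip())


def clean_number(text):
    """Return a float, converting parenthesized negatives such as (2.36) to -2.36."""
    text = text.strip()
    if text.startswith("(") and text.endswith(")"):
        text = "-" + text[1:-1]
    # float() raises ValueError for anything that is not a plain decimal number.
    return float(text)


def format_row(name, values):
    """Build one tab-delimited line: column zero is the name, the rest are data."""
    cells = [clean_gene_name(name)] + [str(clean_number(v)) for v in values]
    return "\t".join(cells)


# Made-up values for illustration only.
print(format_row("At1g23450.1", ["1.24", "(2.36)"]))
```

Writing one such line per gene to a file saved with a .txt extension gives a file in the layout described above. Whether the name in column zero is recognized still depends on how the object is named in the database.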


You can practice using this tool with a sample file of transcriptional data from Arabidopsis.
Please note that you will have to select the species Arabidopsis thaliana COL at the Omics Viewer site to use this file.
  • This file contains 7 columns (column zero plus 6 data columns).

  • The first column (a.k.a. column zero) contains gene names.

  • The remaining columns display a series of time points showing expression values from a microarray experiment.



________________________________________________________________________________

Entering the data into the program

  1. On the Omics Viewer page, go to the 'Select a dataset' section and select a database from the drop-down menu (for example, Arabidopsis thaliana COL).

  2. Upload the data file from your computer. Type in the location of the file or use the browse function to locate and select the file.

  3. Set the data values to absolute or relative.


  • A. Select relative if your data express a fold-change.

    • If you have already calculated the fold-change and have this as a column of data, select the "a single data column" option.
    • If you want to calculate relative values from two or more columns in your dataset (for example, the ratio of expression values for a later time point compared to the starting (T=0) time point), select "the ratio of two data columns" option.

  • B. Select absolute if the values for each column should be displayed without taking a ratio.
    • If your dataset has log values (negative numbers), check the box next to 0-centered scale (e.g. log scale). This will ensure that negative values (e.g. decreased expression) will be displayed.

  • Indicate the types of data contained in your file. The options are "genes", "proteins", "reactions", "compounds", or "any of the above".

    • Note: Choosing "any of the above" might work well for combined data that include compounds plus one other type of data. However, a combination of gene, protein, and/or reaction data might be very hard to distinguish on the same Omics Viewer diagram.

  • Specify the data columns to display

    • In the sample file, the gene name is in the first column (Column 0).

    • There are data for six time points in the second (Column 1), third (Column 2), fourth (Column 3), fifth (Column 4), sixth (Column 5), and seventh (Column 6) columns.

    • The data in the second column (Column 1) is the zero time point.

    • To show the starting expression levels, select "absolute" and enter 1 in the left text box

    • To show the changing absolute expression levels over time, select "absolute" and enter 1,2,3,4,5,6 in the left text box. (see more about animations below)

    • To show the change in expression level from the zero time point to the first time point, select "relative" plus "the ratio of two columns" and enter 2 (the numerator) in the left box and enter 1 (the denominator) in the right box.

    • To show the change in expression level from the zero time point to every other time point, select "relative" plus "the ratio of two columns" and enter 2,3,4,5,6 (the numerators) in the left box and enter 1 (the denominator) in the right box. (see more about animations below)

  • Choose a color scheme for displaying values on the metabolic map overview diagram - either the default cutoff or a cutoff of your choice.

    • NOTE: The default cutoff takes its values from the dataset. So if your values are extreme or include a few outliers, the available colors will be spread out over a wider range. In most cases, it is better to set your own cutoff because:
      1. A smaller cutoff makes it easier to distinguish small differences in values.
      2. If you want to compare maps from more than one experiment, the color range associated with values should be the same.

  • Submit the data.
    • Please be patient. For large datasets it may take some time to draw the map.
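
The column choices described above can be illustrated with a short Python sketch. It only mirrors the arithmetic implied by "the ratio of two columns" and by a color range taken from the data; it is not the Omics Viewer's own code, and the function names and numbers are made up for illustration.

```python
def column_ratios(row, numerators, denominator):
    """Return row[n] / row[denominator] for each numerator column n.

    `row` is a list whose item 0 is the object name and whose items
    1 and higher are numeric values, matching the input file layout.
    """
    base = row[denominator]
    return [row[n] / base for n in numerators]


def data_range(values):
    """Smallest and largest value, i.e. the span a data-derived scale must cover."""
    return min(values), max(values)


# Made-up row: a gene name followed by six time points (columns 1 to 6).
row = ["At1g23450", 2.0, 4.0, 1.0, 8.0, 2.0, 6.0]

# Left box "2,3,4,5,6", right box "1": each later time point over time zero.
ratios = column_ratios(row, [2, 3, 4, 5, 6], 1)
print(ratios)  # [2.0, 0.5, 4.0, 1.0, 3.0]

# A single outlier widens the range that the colors have to cover.
print(data_range(ratios))            # (0.5, 4.0)
print(data_range(ratios + [250.0]))  # (0.5, 250.0)
```

The last two lines show why a cutoff of your own choice can help: with the outlier included, the differences among the remaining values occupy only a small part of the range.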

________________________________________________________________________________

Understanding and analyzing the output

Depending on the type of data used, the Omics Viewer returns either a single page (for a single time point or experiment) or, for data with multiple time points, an animation showing all the time points.

Note: An important consideration when uploading data with multiple data types (e.g. expression plus proteomic data) is that there is currently no way to distinguish color values from expression data vs. proteomic data. For both data types, the reaction lines are colored to reflect either changes in expression of the enzymes or an increase/decrease in protein concentration.

Also, the Omics Viewer currently does not distinguish between isozymes that may catalyze identical reactions, nor does it distinguish relative expression values for genes that are part of multimeric enzyme complexes. So if there are data for more than one subunit or isozyme, or for both the gene and the protein of a single reaction, the software determines which data point to use:
  • If the numeric values have opposite signs, the highest positive value is selected.
  • If the numeric values have the same sign (e.g. negative), the value with the larger absolute value is displayed.
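
The selection rule described in the note above can be written out as a small Python sketch. This is an illustration of the rule as stated here, not the software's actual implementation; the function name is hypothetical, and the treatment of zero values (counted as neither positive nor negative) is an assumption.

```python
def displayed_value(values):
    """Pick the one value shown for a reaction that has several data points."""
    has_positive = any(v > 0 for v in values)
    has_negative = any(v < 0 for v in values)
    if has_positive and has_negative:
        # Opposite signs: the highest positive value is selected.
        return max(values)
    # Same sign: the value with the larger absolute value is displayed.
    return max(values, key=abs)


# Made-up values for two or three isozymes of the same reaction.
print(displayed_value([-1.5, 2.0, 0.7]))  # 2.0
print(displayed_value([-1.5, -3.2]))      # -3.2
print(displayed_value([0.4, 1.1]))        # 1.1
```

In the first case the decrease of -1.5 is not visible on the map at all, which is worth keeping in mind when interpreting the color of a reaction with several associated genes or proteins.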

Color Scale

The color spectrum used ranges from yellow / green to red.
  • Red marks the highest values
  • Yellow marks the lowest values.
  • If there is little or no change, the color is in the blue range.
  • The color key to the right of the diagram shows the assignment of color values to numeric values.
  • Below the color key is a histogram of color values indicating the range of values within the dataset.
  • If there are values for data that cannot be shown in the overview (such as expression values for genes not included in AraCyc) they will also be displayed.
  • Comparing the histograms of two datasets can provide a quick sense of how differently the expression of some or all metabolic pathways is affected in one experiment compared to another.

Statistics

Along with the metabolic overview, a table of statistics is generated.

  • The first table lists any gene/protein/compound names that could not be found, or for which the data was missing or incorrectly formatted.
  • A second list may be generated for all of the data types in the sample set that could not be matched to anything (Objects not found).
  • Objects with names that could not be resolved are also listed. These might include names which match more than one data type.

  • Note: Statistics are not calculated for time series experiments.

Animation Controls

For time series experiments (or other series data), the Omics viewer uses Dynamic HTML to display the series.

  • Animation controls can be used to stop or start the animation at any point, or to step forward or backward one time point.

  • Note: Browsers that do not support DHTML will not be able to run the animations.

________________________________________________________________________________

Please contact us if you have any questions or check our FAQs page for answers to common questions.


