PMN 9.0 Release Notes - August 2014

There are two types of databases developed and maintained by the PMN project: a reference database called PlantCyc and species-specific databases, such as AraCyc.

PMN databases (created and/or maintained at the PMN)

PMN database information
PlantCyc 9.0
AraCyc 12.0 BarleyCyc 2.0 BrachypodiumCyc 2.0 CassavaCyc 4.0 ChineseCabbageCyc 2.0
ChlamyCyc 4.0 CornCyc 5.0 GrapeCyc 4.0 MossCyc 3.0 OryzaCyc 2.0
PapayaCyc 3.0 PoplarCyc 7.0 SelaginellaCyc 3.0 SetariaCyc 2.0 SorghumBicolorCyc 2.0
SoyCyc 5.0 SwitchgrassCyc 2.0

________________________________________________________________________

PlantCyc 9.0

PlantCyc is a metabolic pathway reference database containing more than 800 pathways, their catalytic enzymes and genes, and compounds from over 350 plant species (see Database Statistics).

  • The majority of the pathways have been curated from experimental literature by curators at the PMN and collaborators' sites. In addition, PlantCyc includes hypothetical pathways that are published in peer-reviewed journals based on the educated conjectures of experts, and computationally predicted pathways that have been manually validated by PMN curators.

  • Similarly, enzymes in PlantCyc may have experimental support or may be based solely on computational predictions.

  • For both pathways and enzymes, evidence codes are assigned to clearly indicate the type of support associated with these database items.

  • PlantCyc Pathways
  • PlantCyc Content Statistics
  • Taxonomic range: primarily Viridiplantae
  • Protein sequence source: Varies according to source: All enzymes have been imported from PMN databases or MetaCyc.
  • Enzyme functional annotation method: Varies according to source: All enzymes have been imported from PMN databases or MetaCyc.
  • Enzyme evidence:
    • Substantial manual curation of enzymes
    • In addition, large-scale computational predictions of enzyme function not subject to curator review
  • Pathway prediction program: All pathways have been manually imported from PMN databases or MetaCyc
  • SAVI pathway validation lists: UPP 5.0, NPP 5.0, MCP 1.0, AIPP 2.0, and CAPP 2.0
  • Pathway evidence:
    • Substantial manual curation of pathways
    • Some computational prediction of pathways followed by SAVI refinement and curator review
    • Additional pathways imported from MetaCyc based on curator inference

________________________________________________________________________

AraCyc 12.0

AraCyc is the most highly curated species-specific database present at the PMN. It has a large number of experimentally supported enzymes and metabolic pathways, but it also houses a substantial number of computationally predicted enzymes and pathways.

More information on AraCyc curation is available.

________________________________________________________________________

BarleyCyc 2.0
  • BarleyCyc Pathways
  • BarleyCyc Content Statistics
  • Taxonomic range: Hordeum vulgare
  • Protein sequence source: Helmholtz Zentrum Munchen - Munich Information Center for Protein Sequences (MIPS): barley_HighConf_genes_MIPS_23Mar12_ProteinSeq.fa
  • Enzyme functional annotation method: E2P2 v2.1
  • Enzyme evidence:
    • Almost exclusively large-scale computational predictions of enzyme function not subject to curator review
    • Small number of manually curated enzymes
  • Pathway prediction program: Pathway Tools 17.5
  • SAVI pathway validation lists: UPP 5.0, NPP 5.0, MCP 1.0, AIPP 2.0, and CAPP 2.0
  • Pathway evidence:
    • Primarily computational prediction of pathways followed by SAVI refinement and curator review
    • Additional pathways imported from MetaCyc based on experimental evidence or curator inference

________________________________________________________________________

BrachypodiumCyc 2.0
  • BrachypodiumCyc Pathways
  • BrachypodiumCyc Content Statistics
  • Taxonomic range: Brachypodium distachyon
  • Protein sequence source: Phytozome 9.0: Bdistachyon_192_peptide.fa
  • Enzyme functional annotation method: E2P2 v2.1
  • Enzyme evidence:
    • Almost exclusively large-scale computational predictions of enzyme function not subject to curator review
    • Small number of manually curated enzymes
  • Pathway prediction program: Pathway Tools 17.5
  • SAVI pathway validation lists: UPP 5.0, NPP 5.0, MCP 1.0, AIPP 2.0, and CAPP 2.0
  • Pathway evidence:
    • Primarily computational prediction of pathways followed by SAVI refinement and curator review
    • Additional pathways imported from MetaCyc based on experimental evidence or curator inference

________________________________________________________________________

CassavaCyc 4.0

________________________________________________________________________

ChineseCabbageCyc 2.0
  • ChineseCabbageCyc Pathways
  • ChineseCabbageCyc Content Statistics
  • Taxonomic range: Brassica rapa ssp. pekinensis
  • Protein sequence source: Phytozome 9.0: Brapa_197_peptide.fa
  • Enzyme functional annotation method: E2P2 v2.1
  • Enzyme evidence:
    • Almost exclusively large-scale computational predictions of enzyme function not subject to curator review
    • Small number of manually curated enzymes
  • Pathway prediction program: Pathway Tools 17.5
  • SAVI pathway validation lists: UPP 5.0, NPP 5.0, MCP 1.0, AIPP 2.0, and CAPP 2.0
  • Pathway evidence:
    • Primarily computational prediction of pathways followed by SAVI refinement and curator review
    • Additional pathways imported from MetaCyc based on experimental evidence or curator inference

________________________________________________________________________

ChlamyCyc 4.0
  • ChlamyCyc Pathways
  • ChlamyCyc Content Statistics
  • Taxonomic range: Chlamydomonas reinhardtii
  • Protein sequence source: Phytozome 9.0: Creinhardtii_236_protein.fa
  • Enzyme functional annotation method: E2P2 v2.1
  • Enzyme evidence:
    • Primarily large-scale computational predictions of enzyme function not subject to curator review, supplemented with data and references curated by the GoFORSYS project at the Max Planck Institute of Molecular Plant Physiology
    • Small number of manually curated enzymes
  • Pathway prediction program: Pathway Tools 17.5 (primarily)
  • SAVI pathway validation lists: CVP 3.0, NPP 5.0, MCP 1.0, AIPP 2.0, and CAPP 2.0
  • Pathway evidence:
    • Primarily computational prediction of pathways followed by SAVI refinement and curator review
    • Additional pathways imported from MetaCyc and ChlamyCyc 1.0 (created by the GoFORSYS project at the Max Planck Institute of Molecular Plant Physiology) based on experimental evidence or curator inference

________________________________________________________________________

CornCyc 5.0
  • CornCyc Pathways
  • CornCyc Content Statistics
  • Taxonomic range: Zea mays mays
  • Protein sequence source: MaizeGDB (downloaded from EnsemblPlants): Zea_mays.AGPv3.22.pep.all.fa
  • Enzyme functional annotation method: E2P2 v2.1
  • Enzyme evidence:
    • Almost exclusively large-scale computational predictions of enzyme function not subject to curator review
    • Small number of manually curated enzymes
  • Pathway prediction program: Pathway Tools 17.5
  • SAVI pathway validation lists: UPP 5.0, NPP 5.0, MCP 1.0, AIPP 2.0, and CAPP 2.0
  • Pathway evidence:
    • Primarily computational prediction of pathways followed by SAVI refinement and curator review
    • Additional pathways imported from MetaCyc and MaizeGDB based on experimental evidence or curator inference

________________________________________________________________________

GrapeCyc 4.0
  • GrapeCyc Pathways
  • GrapeCyc Content Statistics
  • Taxonomic range: Vitis vinifera
  • Protein sequence source: Phytozome 9.0: Vvinifera_145_peptide.fa
  • Enzyme functional annotation method: E2P2 v2.1
  • Enzyme evidence:
    • Almost exclusively large-scale computational predictions of enzyme function not subject to curator review
    • Small number of manually curated enzymes
  • Pathway prediction program: Pathway Tools 17.5
  • SAVI pathway validation lists: UPP 5.0, NPP 5.0, MCP 1.0, AIPP 2.0, and CAPP 2.0
  • Pathway evidence:
    • Primarily computational prediction of pathways followed by SAVI refinement and curator review
    • Additional pathways imported from MetaCyc based on experimental evidence or curator inference

________________________________________________________________________

MossCyc 3.0
  • MossCyc Pathways
  • MossCyc Content Statistics
  • Taxonomic range: Physcomitrella patens
  • Protein sequence source: Cosmoss (downloaded from Phytozome 9.0): Ppatens_152_peptide.fa
  • Enzyme functional annotation method: E2P2 v2.1
  • Enzyme evidence:
    • Almost exclusively large-scale computational predictions of enzyme function not subject to curator review
    • Small number of manually curated enzymes
  • Pathway prediction program: Pathway Tools 17.5
  • SAVI pathway validation lists: UPP 5.0, NPP 5.0, MCP 1.0, AIPP 2.0, and CAPP 2.0
  • Pathway evidence:
    • Primarily computational prediction of pathways followed by SAVI refinement and curator review
    • Additional pathways imported from MetaCyc based on experimental evidence or curator inference

________________________________________________________________________

OryzaCyc 2.0

________________________________________________________________________

PapayaCyc 3.0
  • PapayaCyc Pathways
  • PapayaCyc Content Statistics
  • Taxonomic range: Carica papaya
  • Protein sequence source: Phytozome 9.0: Cpapaya_113_peptide.fa
  • Enzyme functional annotation method: E2P2 v2.1
  • Enzyme evidence:
    • Almost exclusively large-scale computational predictions of enzyme function not subject to curator review
    • Small number of manually curated enzymes
  • Pathway prediction program: Pathway Tools 17.5
  • SAVI pathway validation lists: UPP 5.0, NPP 5.0, MCP 1.0, AIPP 2.0, and CAPP 2.0
  • Pathway evidence:
    • Primarily computational prediction of pathways followed by SAVI refinement and curator review
    • Additional pathways imported from MetaCyc based on experimental evidence or curator inference

________________________________________________________________________

PoplarCyc 7.0

More information on PoplarCyc curation is available.

________________________________________________________________________

SelaginellaCyc 3.0

________________________________________________________________________

SetariaCyc 2.0
  • SetariaCyc Pathways
  • SetariaCyc Content Statistics
  • Taxonomic range: Setaria italica
  • Protein sequence source: Phytozome 9.0: Sitalica_164_peptide.fa
  • Enzyme functional annotation method: E2P2 v2.1
  • Enzyme evidence:
    • Almost exclusively large-scale computational predictions of enzyme function not subject to curator review
    • Small number of manually curated enzymes
  • Pathway prediction program: Pathway Tools 17.5
  • SAVI pathway validation lists: UPP 5.0, NPP 5.0, MCP 1.0, AIPP 2.0, and CAPP 2.0
  • Pathway evidence:
    • Primarily computational prediction of pathways followed by SAVI refinement and curator review
    • Additional pathways imported from MetaCyc based on experimental evidence or curator inference

________________________________________________________________________

SorghumBicolorCyc 2.0
  • SorghumBicolorCyc Pathways
  • SorghumBicolorCyc Content Statistics
  • Taxonomic range: Sorghum bicolor
  • Protein sequence source: Phytozome 9.0 (early release): Sbicolor_v2.1_255_protein.fa
  • Enzyme functional annotation method: E2P2 v2.1
  • Enzyme evidence:
    • Almost exclusively large-scale computational predictions of enzyme function not subject to curator review
    • Small number of manually curated enzymes
  • Pathway prediction program: Pathway Tools 17.5
  • SAVI pathway validation lists: UPP 5.0, NPP 5.0, MCP 1.0, AIPP 2.0, and CAPP 2.0
  • Pathway evidence:
    • Primarily computational prediction of pathways followed by SAVI refinement and curator review
    • Additional pathways imported from MetaCyc based on experimental evidence or curator inference

________________________________________________________________________

SoyCyc 5.0
  • SoyCyc Pathways
  • SoyCyc Content Statistics
  • Taxonomic range: Glycine max
  • Protein sequence source: Phytozome 9.0 (early release): Gmax_275_Wm82.a2.v1.protein.fa
  • Enzyme functional annotation method: E2P2 v2.1
  • Enzyme evidence:
    • Almost exclusively large-scale computational predictions of enzyme function not subject to curator review
    • Small number of manually curated enzymes
  • Pathway prediction program: Pathway Tools 17.5
  • SAVI pathway validation lists: UPP 5.0, NPP 5.0, MCP 1.0, AIPP 2.0, and CAPP 2.0
  • Pathway evidence:
    • Primarily computational prediction of pathways followed by SAVI refinement and curator review
    • Additional pathways imported from MetaCyc based on experimental evidence or curator inference


________________________________________________________________________

SwitchgrassCyc 2.0
  • SwitchgrassCyc Pathways
  • SwitchgrassCyc Content Statistics
  • Taxonomic range: Panicum virgatum
  • Protein sequence source: Phytozome 9.0 (early release): Pvirgatum_v1.1_273_protein.fa
  • Enzyme functional annotation method: E2P2 v2.1
  • Enzyme evidence:
    • Almost exclusively large-scale computational predictions of enzyme function not subject to curator review
    • Small number of manually curated enzymes
  • Pathway prediction program: Pathway Tools 17.5
  • SAVI pathway validation lists: UPP 5.0, NPP 5.0, MCP 1.0, AIPP 2.0, and CAPP 2.0
  • Pathway evidence:
    • Primarily computational prediction of pathways followed by SAVI refinement and curator review
    • Additional pathways imported from MetaCyc based on experimental evidence or curator inference


For comparable data from past releases, see our Database Overview Archive.

________________________________________________________________________

External Metabolic Pathway Databases

Although several external single-species databases are affiliated with the PMN, those databases are developed and maintained exclusively by our collaborators. Some of the data from those databases have been incorporated into PlantCyc, as described in our release notes. We provide collaborator contact information and direct links to their homepages below.

Noble Foundation database


SGN (Solanaceae Genomics Network) databases

SRI International databases

SRI International plays a special role as a collaborator.

Through the MetaCyc database, SRI provides some of the data content for PlantCyc. Prior to the creation of PlantCyc, any plant metabolic pathways with experimental or literature support were entered into MetaCyc by curators from the PMN or collaborating groups. More information about the inclusion of MetaCyc data in PlantCyc is described in our release notes.

  • MetaCyc (one database / many organisms)

SRI also provides access to a number of externally generated species-specific databases from outside the plant kingdom through the BioCyc website. They are not part of the PMN, but they may prove useful for comparative studies.

  • BioCyc (multiple databases / many organisms)

The Pathway Tools software used for displaying existing PMN databases and for predicting new databases is also generated and supported by the Bioinformatics Research Group at SRI International.

________________________________________________________________________

Enzyme functional annotation method

E2P2 v2.1 - Ensemble Enzyme Prediction Pipeline version 2.1:
The functional annotation of protein sequences was performed by the PMN Ensemble Enzyme Prediction Pipeline (E2P2, version 2.1). E2P2 annotates protein sequences using homology transfer by integrating both single sequence (BLAST, E-value cutoff <= 1e-20) and multiple sequence (Priam) models of enzymatic function. The ensemble algorithm relies on an average weighted integration scheme (percentage weight cutoff >= 50%) where the weight of each predicted model was determined via a 5-by-3 nested cross-validation routine. The training of E2P2 and the reference databases used in the annotation process are based on the Reference Protein Sequence Dataset (RPSD) 2.0. Data for RPSD was compiled from protein sequences with experimental support of existence from SwissProt, MetaCyc, and BRENDA.
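
The weighted integration step described above can be illustrated with a minimal Python sketch. This is not the E2P2 implementation. The function name, the input shapes, the example weights, and the enzyme class labels are all hypothetical, and the 50% cutoff is interpreted here as "the classifiers that called a class must hold at least half of the total weight assigned to that class", which is an assumption about how the percentage is computed. Only the two cutoff values come from the text.

```python
# Cutoffs quoted in the release notes.
BLAST_EVALUE_CUTOFF = 1e-20
WEIGHT_FRACTION_CUTOFF = 0.5


def integrate_predictions(blast_hits, priam_calls, weights):
    """Combine BLAST and PRIAM enzyme-function calls by weighted voting.

    blast_hits:  list of (enzyme_class, e_value) pairs (hypothetical shape)
    priam_calls: list of enzyme_class labels (hypothetical shape)
    weights:     dict mapping (classifier, enzyme_class) to a weight, assumed
                 to have been learned beforehand (e.g. by cross-validation)
    Returns a dict of accepted enzyme_class -> fraction of weight in favour.
    """
    # Keep only BLAST hits that pass the E-value cutoff.
    blast_classes = {ef for ef, e_value in blast_hits
                     if e_value <= BLAST_EVALUE_CUTOFF}
    calls = {"blast": blast_classes, "priam": set(priam_calls)}

    accepted = {}
    for ef in sorted(calls["blast"] | calls["priam"]):
        # Total weight available for this class across all classifiers.
        total = sum(weights.get((clf, ef), 0.0) for clf in calls)
        if total == 0:
            continue  # no classifier has a learned weight for this class
        # Weight held by the classifiers that actually called this class.
        voted = sum(weights.get((clf, ef), 0.0)
                    for clf, called in calls.items() if ef in called)
        fraction = voted / total
        if fraction >= WEIGHT_FRACTION_CUTOFF:
            accepted[ef] = fraction
    return accepted


if __name__ == "__main__":
    # Illustrative weights only; not taken from E2P2 or RPSD.
    example_weights = {
        ("blast", "EC-1.1.1.1"): 0.6, ("priam", "EC-1.1.1.1"): 0.4,
        ("blast", "EC-2.7.1.1"): 0.3, ("priam", "EC-2.7.1.1"): 0.7,
    }
    hits = [("EC-1.1.1.1", 1e-50), ("EC-2.7.1.1", 1e-30), ("EC-3.1.1.1", 1e-5)]
    print(integrate_predictions(hits, ["EC-1.1.1.1"], example_weights))
```

In this sketch a class called by only the lower-weighted classifier is dropped, while a class called by both, or by the classifier holding at least half of the weight, is kept.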



