WHAT IS A PAG?

PAG stands for Pathway, Annotated-list, and Gene-signature. A PAG is a group of genes that belong to the same pathway, share common biological processes, are co-expressed in a particular experiment related to the same disease, or are grouped by any other definition (e.g. drug targets, microRNA targets, protein families, isozymes, and so on). Constructing a network of PAGs reveals higher-level relationships among PAGs. Potential uses of PAGs' networks include exploring the upstream and downstream biological pathways involved in the etiology of a disease, mining relationships between microRNAs and biological processes, repositioning drugs at the PAG level, and so on.

WHAT IS PAGER?

The Pathway and Annotated-list Gene-signature Electronic Repository (PAGER) is an online systems biology tool for constructing and visualizing gene and PAGs' networks from multiple PAG collections (PAG Data).

To construct PAGs' networks, users search for PAGs by terms or by a list of genes, then add the PAGs they are interested in to PAG Box. PAGER can construct m-type and r-type PAGs' networks of the PAGs in PAG Box.

  1) Regulatory PAGs' networks (r-PAGs' networks) connect a pair of PAGs when there is a significant number of gene regulations between the genes of the two PAGs. Gene regulation data were collected from several data sources to construct r-PAGs' networks, and only high-confidence gene regulations were used.
  2) Co-membership PAGs' networks (m-PAGs' networks) connect a pair of PAGs if they share a significant number of genes.
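
The two edge types rest on two different counts between a pair of PAGs. The following is a minimal sketch of those counts only, not PAGER's implementation; the function names, the placeholder gene identifiers, and the representation of regulations as (regulator, target) pairs are assumptions made for illustration. Deciding whether a count is "significant" is a separate statistical step.

```python
def shared_gene_count(pag_a, pag_b):
    """Number of genes that are members of both PAGs (basis of an m-type edge)."""
    return len(set(pag_a) & set(pag_b))


def regulation_count(source_pag, target_pag, regulations):
    """Number of gene regulations whose regulator is in source_pag and whose
    target is in target_pag (basis of a directed r-type edge)."""
    source, target = set(source_pag), set(target_pag)
    return sum(1 for regulator, regulated in regulations
               if regulator in source and regulated in target)


# Hypothetical toy data: gene identifiers are placeholders, not real genes.
pag_a = {"G1", "G2", "G3"}
pag_b = {"G3", "G4", "G5"}
regulations = [("G1", "G3"), ("G1", "G2"), ("G5", "G6")]

print(shared_gene_count(pag_a, pag_b))              # genes in both PAGs
print(regulation_count(pag_a, pag_b, regulations))  # regulations from A to B
print(regulation_count(pag_b, pag_a, regulations))  # regulations from B to A
```

Because the regulation count depends on direction, the r-type count from A to B can differ from the count from B to A, whereas the shared-gene count is symmetric.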

For a selected PAG, PAGER can construct gene interaction and gene regulation networks of genes within the PAG. For examples of how to use PAGER, please see Use Cases.

PAGER-2.0 contains 84,282 PAGs, 22,127 high-quality gene-gene regulations in human, and 579,037 gene-gene interactions rated 3 stars and above in the HAPPI-2.0 database.

PAG Data
PAGER version | Data Source | Description | Download Date | Number of PAGs
2 | GTEx | FPKM over or equal to 50 or 100 in tissue specific expression from NGS. (PMID:25954001) | 8/6/2016 | 93
2 | TargetScan | TargetScan predicts biological targets of miRNAs by searching for the presence of conserved 8mer, 7mer, and 6mer sites that match the seed region of each miRNA. | 2/6/2017 | 390
2 | MSigDB | Immune signatures for GSEA. | 2/23/2014 | 4,872
2 | GO term | Gene Ontology: the framework for the model of biology. The GO defines concepts/classes used to describe gene function, and relationships between these concepts. | 3/13/2015 | 11,750
2 | pfam | Pfam is a large collection of protein families, represented by multiple sequence alignments and hidden Markov models (HMMs). | 3/21/2017 | 1,167
2 | DSigDB | DSigDB organized drugs and small molecules related gene sets into four collections based on quantitative inhibition and/or drug-induced gene expression changes data. | 3/21/2017 | 22,527
2 | Isozyme | Collect enzymes that differ in amino acid sequence yet catalyze the same reaction. | 10/29/2015 | 491
2 | microcosm_targets | MicroCosm Targets (formerly miRBase Targets) is a web resource developed by the Enright Lab at the EMBL-EBI containing computationally predicted targets for microRNAs across many species. | 2/11/2017 | 851
2 | mirTARbase | miRTarBase has accumulated more than three hundred and sixty thousand miRNA-target interactions (MTIs), which are collected by manually surveying pertinent literature after NLP of the text systematically to filter research articles related to functional studies of miRNAs. | 11/04/2015 | 3,684
2 | Phewas | Phewas database collect the data from human disease results from complex interactions between genes and environmental risk factors, and that variants from a few (<20) susceptibility genes variants are responsible for >50% of this disease burden. | 10/22/2015 | 1,358
1 | GAD | Genetic Association Database | 8/26/2013 | 1,679
1 | GWAS Catalog | A Catalog of Published Genome-Wide Association Studies provided by NHGRI | 8/27/2013 | 1,238
1 | GeneSigDB | GeneSigDB: a manually curated database and resource for analysis of gene expression signatures | 8/23/2013 | 3,506
1 | MSigDB | The Molecular Signatures Database is a collection of annotated PAGs for use with GSEA software | 8/26/2013 | 10,295
1 | NGS Catalog | NGS Catalog: A database of next generation sequencing studies in humans | 8/26/2013 | 56
1 | PharmGKB | The Pharmacogenomics Knowledge Base | 8/26/2013 | 102
1 | Protein Lounge | Bioinformatics portal which integrates protein information, databases and research tools for researchers and students | 2009 | 388
1 | Spike | SPIKE is a database of highly curated human signaling pathways | 9/5/2013 | 28
1 | WikiPathway | WikiPathways is an open, public platform dedicated to the curation of biological pathways by and for the scientific community | 8/26/2013 | 200
1 | BioCarta | Online maps of metabolic and signaling pathways | 8/26/2013 | 252
1 | KEGG | KEGG PATHWAY is a collection of manually drawn pathway maps representing our knowledge on the molecular interaction, reaction and relation networks | 8/26/2013 | 199

Gene Regulation Data
Data Source | Description | Download Date | Publication | Criterion | Number of regulations
Spike (2012) | Gene regulations from pathway | 03/01/2013 | NAR, 2011 | Collect all gene regulations | 9,451
TRANSFAC 7.4 (Public version) | Transcription Factor binding site and genes | 2009 | NAR, 2003 | Binding site quality <= 5 | 7,460(8,556)
TRED | Transcriptional Regulatory Element Database | 03/01/2013 | NAR, 2007 | Remove all computational predicted records | 6,040
String 9.05 | Protein interaction database | 09/04/2013 | NAR, 2013 | Score >= 800 | 8,021(8,910)

PAGER FEATURES

PAGER has several features that are not provided by other existing tools.

  • In addition to two types of PAGs' networks, PAGER lets users construct a gene interaction network and a gene regulation network of the genes inside a PAG.
  • PAGER lets users construct three types of expanded PAGs' networks: networks of upstream, downstream, and co-membership PAGs. Expanded PAGs' networks help users find further PAGs that are relevant to their study.
  • Because PAGER offers gene networks, users can also construct expanded gene networks from a gene, including networks of upstream, downstream, and sibling genes.
  • PAGER provides an interactive visualization tool for studying gene and PAGs' networks, and offers two storage spaces, Gene Box and PAG Box, where users keep their genes and PAGs.

For examples of how to use PAGER, please see Use Cases.

PAG Box

A temporary space for users to save PAGs. Users can construct 2 types of PAGs' networks, regulatory and co-membership PAGs' networks, from the PAGs in PAG Box. Users click on the PAG Box status (at the top of the page) to access their PAG Box.

Gene Box

A temporary space for users to save genes. Users can construct 2 types of gene networks, gene interaction and gene regulation networks, from the genes in Gene Box. Users click on the Gene Box status (at the top of the page) to access their Gene Box.

PAGER IMPLEMENTATION

PAGER was implemented in PHP version 5 with CodeIgniter version 2.1.3. PAGER data were stored in an Oracle 12g database maintained by Indiana University, which was connected to the PHP server through the Oracle Instant Client software. The p-value for each edge in PAGs' networks was computed on the fly by using the hypergeometric function provided by PDL. Gene and PAGs' networks were visualized with cytoscape.js, a JavaScript graph library for network visualization.

CALCULATION METHODS

Hypergeometric Distribution

The hypergeometric distribution was used to calculate a p-value for each edge in PAGs' networks.

[Formula image: calculating significance values of regulatory edges]
[Formula image: calculating significance values of co-membership edges]
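
As an illustration of how a hypergeometric p-value can be obtained for a co-membership edge, here is a minimal sketch. It is not PAGER's formula, which is not recoverable from the text above. It assumes a one-sided upper-tail test in which N is the number of genes in the background universe, K and n are the sizes of the two PAGs, and k is the number of shared genes; all of these names and the choice of tail are assumptions. The parameterization for regulatory edges is not stated in the text and is not covered here.

```python
from math import comb


def hypergeom_upper_tail(k, N, K, n):
    """P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: size of the gene universe (assumed)
    K: size of the first PAG
    n: size of the second PAG
    k: observed number of shared genes
    """
    if not (0 <= K <= N and 0 <= n <= N):
        raise ValueError("K and n must lie between 0 and N")
    lowest = max(k, 0, n - (N - K))   # smallest overlap that is possible
    highest = min(K, n)               # largest overlap that is possible
    favourable = sum(comb(K, i) * comb(N - K, n - i)
                     for i in range(lowest, highest + 1))
    return favourable / comb(N, n)


# Toy numbers: universe of 10 genes, PAGs of size 4 and 3, 2 shared genes.
print(hypergeom_upper_tail(2, 10, 4, 3))
```

Python integers have arbitrary precision, so the binomial coefficients here are exact; only the final division is done in floating point.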

Similarity Score

This is the same calculation used in HPD, so please see Similarity Score Scoring Method for more information.

USE CASES

This section describes different ways of using PAGER to study systems biology.

Searching Genes and PAGs by Terms

Users go to the PAGER home page and enter search terms such as a disease name or a gene symbol. For this use case, we entered “non small cell lung” to search for PAGs related to non-small cell lung cancer. PAGER returned a list of results containing genes and PAGs that relate to the search terms in different ways.



In this case, “non small cell lung” matched the names of 160 PAGs and the descriptions of 720 PAGs. Because “non small cell lung” is not the name or symbol of a gene, PAGER returned 0 on the “member of PAG” line. However, when “BRAF”, which is a gene symbol, was entered, PAGER returned 501 PAGs that contain the BRAF gene.



The next step is to click on the 160 PAGs that relate to “non small cell lung”. PAGER displayed a list of the 160 PAGs, which can be sorted by name, size, organism, or data source. We kept only human PAGs whose sizes are between 5 and 500. PAGs can be added to PAG Box for further analysis; checkboxes in the leftmost column are checked for PAGs that are already in PAG Box.
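
The same filter can be applied offline to an exported list of PAGs. The sketch below is only an illustration: the `Pag` record, its field names, and the organism label "Homo sapiens" are assumptions, not PAGER's data model.

```python
from dataclasses import dataclass


@dataclass
class Pag:
    # Hypothetical record layout for one search result.
    name: str
    size: int
    organism: str
    source: str


def filter_pags(pags, min_size=5, max_size=500, organism="Homo sapiens"):
    """Keep PAGs of the given organism whose size lies in [min_size, max_size]."""
    return [p for p in pags
            if min_size <= p.size <= max_size and p.organism == organism]


# Hypothetical search results.
results = [
    Pag("pag_small", 3, "Homo sapiens", "SourceA"),
    Pag("pag_ok", 50, "Homo sapiens", "SourceB"),
    Pag("pag_mouse", 50, "Mus musculus", "SourceB"),
    Pag("pag_large", 800, "Homo sapiens", "SourceC"),
]

kept = sorted(filter_pags(results), key=lambda p: p.size)
print([p.name for p in kept])
```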



Searching PAGs by a List of Genes

On the PAGER home page, users can enter a list of genes obtained from their experiment or from other data sources in order to search for related PAGs. In this use case, a list of 94 non-small cell lung cancer genes was entered. PAGER displayed the PAGs related to the 94 genes, and 28 PAGs remained after we applied filters.

PAGER counted the number of shared genes and calculated a significance value for each PAG. Users can click on a PAG name to see more detail about the PAG, or add PAGs to PAG Box as in the previous use case.

Note that the significance value of each PAG was computed on the fly by using the hypergeometric function provided by an additional PHP library. As a result, if the number of shared genes or the size of a PAG is very large, the hypergeometric function may fail to compute a p-value because of numeric overflow.
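
One common way to sidestep this kind of overflow is to work with logarithms of the binomial coefficients and combine the terms with a log-sum-exp step. The following sketch shows that general technique under assumed parameter names (N for the gene universe, K and n for the two set sizes, k for the shared genes). It is not the routine PAGER uses.

```python
from math import exp, lgamma, log


def log_comb(n, r):
    """Natural log of the binomial coefficient C(n, r), via log-gamma."""
    return lgamma(n + 1) - lgamma(r + 1) - lgamma(n - r + 1)


def log_hypergeom_upper_tail(k, N, K, n):
    """Natural log of P(X >= k) for X ~ Hypergeometric(N, K, n)."""
    lowest = max(k, 0, n - (N - K))
    highest = min(K, n)
    if lowest > highest:
        return float("-inf")          # the tail is empty, so p = 0
    log_terms = [log_comb(K, i) + log_comb(N - K, n - i) - log_comb(N, n)
                 for i in range(lowest, highest + 1)]
    largest = max(log_terms)
    # log-sum-exp: factor out the largest term so exp() never overflows
    return largest + log(sum(exp(t - largest) for t in log_terms))


# Sizes chosen only to show that large inputs do not raise an error.
log_p = log_hypergeom_upper_tail(1500, 20000, 5000, 4000)
print(log_p, exp(log_p))
```

Keeping the result as a log p-value also preserves the ordering of extremely small p-values that would otherwise underflow to zero when converted back with `exp`.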



Constructing PAGs' Networks

Users can construct co-membership and regulatory PAGs' networks of their selected PAGs. For example, 20 PAGs were added into PAG Box.



On the PAG Box page, co-membership and regulatory PAGs' networks were constructed and displayed by clicking the “View PAGs' network” button.



Instructions for using the PAGs' network visualization are displayed on the right side. Users move the mouse over a PAG to see more detail about the PAG and its neighbors, and click on an edge to see the detail of the edge. Users drag the mouse over PAG nodes to select several at once. On the network page, users can also add a PAG to PAG Box or remove it by clicking on a checkbox. To see more detail about a PAG, users select it and click the link on the right-hand side.



There are two tabs: one displays the regulatory PAGs' network, which is a directed network, and the other displays the co-membership PAGs' network, which is an undirected network.

Generating Expanded PAGs' Networks

On a PAG detail page, users click on a diagram to construct expanded networks. The following example shows the PAG detail page of a non-small cell lung cancer PAG.



The expanded networks are networks of upstream, downstream, and co-membership PAGs. In this first version of PAGER, the expanded networks contain only the top 20 PAGs.

  • Upstream r-PAGs are PAGs which regulate the selected PAG.
  • Downstream r-PAGs are PAGs which are regulated by the selected PAG.
  • Co-membership m-PAGs are PAGs which share a significant number of genes with the selected PAG.
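
The three definitions above can be read as simple queries over edge lists. The sketch below assumes that regulatory edges are stored as directed (source, target, p-value) triples, that co-membership edges are undirected (PAG, PAG, p-value) triples, and that "top 20" means the 20 smallest p-values. The ranking criterion and all names are assumptions, since the text does not state them.

```python
def expand_pag(selected, r_edges, m_edges, top_n=20):
    """Return the upstream, downstream, and co-membership neighbours of a PAG.

    r_edges: directed edges as (source_pag, target_pag, p_value)
    m_edges: undirected edges as (pag_a, pag_b, p_value)
    """
    def top(candidates):
        # Rank by p-value, smallest first (assumed ranking criterion).
        ranked = sorted(candidates, key=lambda item: item[1])
        return [name for name, _ in ranked[:top_n]]

    upstream = [(s, p) for s, t, p in r_edges if t == selected]
    downstream = [(t, p) for s, t, p in r_edges if s == selected]
    co_members = [(b if a == selected else a, p)
                  for a, b, p in m_edges if selected in (a, b)]
    return {
        "upstream": top(upstream),
        "downstream": top(downstream),
        "co_membership": top(co_members),
    }


# Hypothetical edges between placeholder PAG identifiers.
r_edges = [("P1", "P0", 0.01), ("P2", "P0", 0.001), ("P0", "P3", 0.02)]
m_edges = [("P0", "P4", 0.03), ("P5", "P0", 0.002), ("P1", "P2", 0.5)]

expanded = expand_pag("P0", r_edges, m_edges)
print(expanded)
```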

Viewing Gene Networks of Genes inside a PAG

In addition to the expanded PAGs' networks, PAGER can display the gene interaction and gene regulation networks within a PAG. The following example shows the gene regulatory network of a non-small cell lung cancer PAG.



Constructing Disease Specific PAGs' Networks

This use case shows how PAGER can be used to construct disease-specific PAGs' networks. In this scenario, a list of disease-related genes is not required. For example, we searched for PAGs related to non-small cell lung cancer by using “non small cell lung”. The terms matched the names of 160 PAGs, and 5 sample human PAGs were added to PAG Box. On the PAG Box page, users create a new PAG by clicking the Create a new PAG button. PAGER displays a list of genes together with the number of PAGs that contain each gene. In this example, the AKAP12 gene has the highest frequency, suggesting that it is an important gene among the five PAGs in PAG Box. On this page, when users click Use all genes to search by a list of genes, PAGER automatically generates a list of genes and fills in the text area on the PAGER home page.
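
The per-gene frequency described in this use case amounts to counting, for each gene, how many of the selected PAGs contain it. A minimal sketch follows. The PAG and gene identifiers are placeholders, and the function name is an assumption rather than part of PAGER.

```python
from collections import Counter


def gene_frequencies(pag_box):
    """Count how many PAGs in the box contain each gene, most frequent first.

    pag_box: mapping from PAG identifier to an iterable of gene symbols.
    """
    counts = Counter()
    for genes in pag_box.values():
        counts.update(set(genes))     # each PAG contributes at most once per gene
    return counts.most_common()


# Hypothetical PAG Box with placeholder gene identifiers.
pag_box = {
    "PAG_1": ["G1", "G2", "G3"],
    "PAG_2": ["G1", "G2"],
    "PAG_3": ["G1", "G4"],
}

frequencies = gene_frequencies(pag_box)
print(frequencies[0])                       # the most frequent gene and its count
all_genes = sorted(gene for gene, _ in frequencies)
print(all_genes)                            # merged gene list for a new search
```

The merged list at the end corresponds to the list of genes that is passed on to a search by a list of genes.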