GeneMANIA finds other genes that are related to a set of input genes, using a very large set of functional association data. Association data include protein and genetic interactions, pathways, co-expression, co-localization and protein domain similarity. You can use GeneMANIA to find new members of a pathway or complex, find additional genes you may have missed in your screen or find new genes with a specific function, such as protein kinases. Your question is defined by the set of genes you input.
GeneMANIA returns:
Predicted related genes can be, for instance, in the same pathway or complex as your input genes, can be co-expressed or have similar enzymatic function. To determine how predicted genes are related to your input genes, you need to study the links in the network to find out how your input genes are connected to each other and how new genes are related to your input genes.
GeneMANIA recognizes Entrez, Ensembl, Standard gene symbols, Uniprot/SwissProt and RefSeq identifiers and unique gene names.
GeneMANIA searches many large, publicly available biological datasets to find related genes. These include protein-protein, protein-DNA and genetic interactions, pathways, reactions, gene and protein expression data, protein domains and phenotypic screening profiles. Data is regularly updated.
Network names describe the data source. They are either generated from the PubMed entry associated with the data source (first author-last author-year) or are simply the name of the data source (BioGRID, PathwayCommons-(original data source), Pfam).
You can upload your network to GeneMANIA and analyze it in the context of all publicly available networks that GeneMANIA knows about. Your network is deleted from the GeneMANIA server after your session ends, or within 24 hours. Please see our privacy policy for more information.
The upload network button can be found in the advanced options panel. Your network must be for one of the GeneMANIA supported organisms, be tab-delimited text, and be in the format GeneID <tab> GeneID <tab> Score. The score will vary depending on the type of network, but in general is a number ranging from 0 (no interaction) to 1 (strong interaction). For an interaction network or a pathway, where interactions either exist or don't exist, the score is 1 for all links.
For a co-expression network, the score could be the Pearson correlation coefficient between the expression profiles of the two genes, representing the similarity of their expression levels across several experiments. Note that networks are normalized to reduce the effect of highly connected nodes, so scores may change slightly once uploaded.
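As an illustration of the upload format described above, the following minimal sketch produces tab-delimited text with one GeneID, GeneID, Score row per link. The function name `format_network`, the example gene pairs, and the strict 0 to 1 range check are assumptions made for this example; they are not part of GeneMANIA.

```python
import csv
import io


def format_network(edges):
    """Return tab-delimited text with one 'GeneID<tab>GeneID<tab>Score' row per edge."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    for gene_a, gene_b, score in edges:
        # Assumption: scores lie between 0 (no interaction) and 1 (strong interaction).
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score out of range for {gene_a}-{gene_b}: {score}")
        writer.writerow([gene_a, gene_b, score])
    return buffer.getvalue()


# Hypothetical binary interaction network: every existing link gets a score of 1.
edges = [("PHYA", "COP1", 1), ("COP1", "HY5", 1)]
network_text = format_network(edges)
print(network_text)
```

The resulting text can be saved to a file and uploaded from the advanced options panel.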
GeneMANIA can use a few different methods to weight networks when combining all networks to form the final composite network that results from a search. The default settings are usually appropriate, but you can choose a weighting method in the advanced options panel.
These weighting methods are based on GO terms that have between 3 and 300 genes associated with them. Only the most reliable annotations were used (i.e. all annotations with an IEA evidence code were removed, as these are less reliable). There is one weighting method per GO branch.
Each network data source is represented as a weighted interaction network where each pair of genes is assigned an association weight, which is either zero, indicating no interaction, or a positive value that reflects the strength of the interaction or the reliability of the observation that they interact. For example, the association of a pair of genes in a gene expression dataset is the Pearson correlation coefficient of their expression levels across multiple conditions in an experiment. The more the genes are co-expressed, the higher the weight of the link between them, ranging up to 1.0, meaning perfectly correlated expression.
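To make the co-expression weight concrete, here is a minimal sketch that computes the Pearson correlation coefficient for two expression profiles. The function names and the expression values are hypothetical, and clipping negative correlations to zero is an assumption made here so that the weight is "either zero or positive"; the text does not say how GeneMANIA treats negative correlations.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric profiles."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no correlation information
    return cov / math.sqrt(var_x * var_y)


def association_weight(xs, ys):
    # Assumption: negative correlations are treated as "no association" (weight 0).
    return max(0.0, pearson(xs, ys))


# Hypothetical expression levels of three genes across four conditions.
gene_a = [1.0, 2.0, 3.0, 4.0]
gene_b = [2.0, 4.0, 6.0, 8.0]   # rises with gene_a
gene_c = [4.0, 3.0, 2.0, 1.0]   # falls as gene_a rises

print(association_weight(gene_a, gene_b))
print(association_weight(gene_a, gene_c))
```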
Direct interactions are used for networks where binary information is available (like protein interactions). When two proteins interact, their network link has a weight of 1.
Shared neighbours were used for networks where the profile of one gene was compared to that of a second gene and the Pearson correlation coefficient was calculated (like protein domain data).
The GeneMANIA database consists of genomics and proteomics data from a variety of sources, including data from gene and protein expression profiling studies and primary and curated molecular interaction networks and pathways. GeneMANIA relies on the following data sources:
We maintain a complete list of networks currently in the GeneMANIA system.
GeneMANIA stands for Multiple Association Network Integration Algorithm.
The GeneMANIA algorithm consists of two parts:
GeneMANIA treats gene function prediction as a binary classification problem. Each functional association network derived from the data sources is assigned a positive weight that reflects how useful that data source is for predicting the function. The weighted average of the association networks forms a function-specific association network. GeneMANIA uses separate objective functions to fit the weights; this simplifies the optimization problem and decreases the run time.
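The averaging step can be sketched as follows. This example only shows how a weighted average of several networks yields one composite network; the network weights are supplied by hand here, whereas GeneMANIA fits them, and the function name, gene names and values are all hypothetical.

```python
def combine_networks(networks, weights):
    """Weighted average of several networks.

    Each network is a dict mapping a (gene_a, gene_b) pair to an association weight.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("network weights must sum to a positive number")
    composite = {}
    for network, weight in zip(networks, weights):
        for (gene_a, gene_b), value in network.items():
            pair = tuple(sorted((gene_a, gene_b)))  # treat links as undirected
            composite[pair] = composite.get(pair, 0.0) + weight * value / total
    return composite


# Hypothetical networks and hand-picked network weights (not fitted).
coexpression = {("A", "B"): 0.8, ("B", "C"): 0.4}
physical = {("B", "A"): 1.0}
composite = combine_networks([coexpression, physical], [0.25, 0.75])
print(composite)
```

In this example the A-B link appears in both networks, so its composite weight is 0.25 × 0.8 + 0.75 × 1.0, while B-C is supported only by the co-expression network.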
GeneMANIA predicts gene function from the composite network using a variation of the Gaussian field label propagation algorithm that is appropriate for gene function prediction in which there are typically relatively few positive examples. Label propagation algorithms assign a score (the discriminant value) to each node in the network. This score reflects the computed strength of association that the node has to the seed list defining the given function. This value can be thresholded to enable predictions of a given gene function.
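The general idea of label propagation can be sketched with a small iterative example. This is a generic Gaussian-field-style propagation, not GeneMANIA's specific variation: it assumes labels of 1 for seed genes and 0 for all others, and repeatedly sets each gene's score to its own label plus the weighted scores of its neighbours, divided by one plus its total link weight. The function name, network, and threshold are hypothetical.

```python
def propagate_labels(weights, seeds, iterations=200):
    """Assign a discriminant score to every gene in a weighted, undirected network.

    weights: dict mapping (gene_a, gene_b) to a positive link weight.
    seeds:   set of genes that define the function of interest.
    """
    neighbours = {}
    for (gene_a, gene_b), weight in weights.items():
        neighbours.setdefault(gene_a, {})[gene_b] = weight
        neighbours.setdefault(gene_b, {})[gene_a] = weight
    for gene in seeds:
        neighbours.setdefault(gene, {})

    # Assumption: seeds are labelled 1 and every other gene 0.
    labels = {gene: (1.0 if gene in seeds else 0.0) for gene in neighbours}
    scores = dict(labels)
    for _ in range(iterations):
        # Jacobi update for (I + D - W) f = y, where D holds the total link weights.
        scores = {
            gene: (labels[gene] + sum(w * scores[n] for n, w in nbrs.items()))
            / (1.0 + sum(nbrs.values()))
            for gene, nbrs in neighbours.items()
        }
    return scores


# Hypothetical composite network: A-B-C form a chain, D-E are unconnected to the seed.
network = {("A", "B"): 1.0, ("B", "C"): 0.5, ("D", "E"): 1.0}
scores = propagate_labels(network, seeds={"A"})

# Thresholding the score turns it into a prediction (0.2 is an arbitrary choice).
predicted = sorted(g for g, s in scores.items() if s >= 0.2 and g != "A")
print(scores, predicted)
```

Genes closer to the seed through stronger links receive higher scores, and genes with no path to the seed keep a score of zero.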
GeneMANIA: a real-time multiple association network integration algorithm for predicting gene function
Mostafavi S, Ray D, Warde-Farley D, Grouios C, Morris Q (2008)
Genome Biology 9: S4.
Web browser: GeneMANIA supports the latest versions of Chrome, Firefox, Safari and Internet Explorer. For a faster, smoother experience with GeneMANIA, we recommend you use a standards compliant browser, such as Chrome or Firefox.
| | Windows | Mac OS | Linux |
|---|---|---|---|
| Very well supported | Chrome 5+, Firefox 3.6+ | Chrome 5+, Firefox 3.6+, and Safari 5+ | |
| Reasonably well supported | Internet Explorer 8+ | | |
| May work | | | Chrome 5+ and Firefox 3.6+ |
| Not supported | older versions and others | older versions and others | older versions and others |
Internet Connection: A fast internet connection such as DSL, Cable or T1.
Computer: A modern computer with at least a 1GHz CPU, 1GB RAM and a modern video card.
We recommend citing the NAR webserver issue, as follows.
The GeneMANIA prediction server: biological network integration for gene prioritization and predicting gene function
Warde-Farley D, Donaldson SL, Comes O, Zuberi K, Badrawi R, Chao P, Franz M, Grouios C, Kazi F, Lopes CT, Maitland A, Mostafavi S, Montojo J, Shao Q, Wright G, Bader GD, Morris Q
Nucleic Acids Res. 2010 Jul 1;38 Suppl:W214-20
Other gene function prediction programs are available, including STRING, bioPIXIE, Funcassociate, and FunCoup.
GeneMANIA has the advantage of flexibility, accuracy, and often speed of response over these other systems. In particular, in competitions on yeast (Mostafavi et al., 2008) and mouse (Pena-Castillo et al., 2008), GeneMANIA was shown to be more accurate than other gene function prediction methods. It is also generally faster, producing predictions within seconds, so it can return results while you wait. Users can select arbitrary subsets of networks that they want to query, and GeneMANIA automatically selects network weights based on the input gene list, generating a network specific to the user's gene list. Unlike other systems, GeneMANIA lets users upload their own network and also compensates for redundancies in the data, so users don't have to worry about double-counting interactions.
The linking URL in its simplest form is http://genemania.org/link?o=<tid>&g=<genes>, where:
- <tid> : NCBI taxonomy id for organism (A. thaliana=3702, C. elegans=6239, D. melanogaster=7227, H. sapiens=9606, M. musculus=10090, S. cerevisiae=4932)
- <genes> : one or more gene symbols separated by pipes ("|")

Examples of the simplest form:
- http://genemania.org/link?o=3702&g=rad50
- http://genemania.org/link?o=3702&g=PHYB|ELF3|COP1|SPA1|FUS9

Optional Parameters:
GeneMANIA linking supports some optional parameters (reference GeneMANIA help section on meaning of the various weighting methods):
- m : network combining method; must be one of the following:
  - automatic_relevance : Assigned based on query genes
  - automatic : Automatically selected weighting method
  - bp : biological process based
  - mf : molecular function based
  - cc : cellular component based
  - average : Equal by data type
  - average_category : Equal by network
- r : the number of results generated by GeneMANIA; must be a number in the range 1..100.

If no optional parameters are provided, GeneMANIA assumes the default values: m=automatic; r=10.
Examples using optional parameters:
The following query runs the GeneMANIA algorithm for A. thaliana with 6 input genes and the "average" method, and returns 50 additional genes:
http://genemania.org/link?o=3702&g=DET1|HY5|CIP1|CIP8|PHYA|HFR1&m=average&r=50
The following query runs the GeneMANIA algorithm for A. thaliana's CIP1 gene using the "biological process based" method and returns 100 related genes:
http://genemania.org/link?o=3702&g=CIP1&m=bp&r=100
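Link URLs of this form can also be assembled programmatically. The sketch below is one possible helper, not an official client: the function name `build_link` and its argument names are hypothetical, and the taxonomy id is passed through without being checked. It applies the documented constraints on m and r and keeps the pipe separators unescaped, as in the examples above.

```python
from urllib.parse import urlencode

BASE_URL = "http://genemania.org/link"
METHODS = {
    "automatic_relevance", "automatic", "bp", "mf", "cc",
    "average", "average_category",
}


def build_link(tax_id, genes, method=None, results=None):
    """Build a GeneMANIA link URL from a taxonomy id and a list of gene symbols."""
    if not genes:
        raise ValueError("at least one gene must be specified")
    params = {"o": tax_id, "g": "|".join(genes)}
    if method is not None:
        if method not in METHODS:
            raise ValueError(f"invalid method: {method}")
        params["m"] = method
    if results is not None:
        if not 1 <= results <= 100:
            raise ValueError("results must be in the range 1..100")
        params["r"] = results
    # safe="|" leaves the pipe separators between gene symbols unescaped.
    return BASE_URL + "?" + urlencode(params, safe="|")


link = build_link(
    3702, ["DET1", "HY5", "CIP1", "CIP8", "PHYA", "HFR1"],
    method="average", results=50,
)
print(link)
```

Omitting `method` and `results` leaves the m and r parameters out of the URL, so the server-side defaults apply.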
Invalid queries:
- http://genemania.org/link?o=3702 : at least one gene must be specified
- http://genemania.org/link?o=1000 : invalid taxonomy id
- http://genemania.org/link?o=3702&g=PHYA&m=super_smart&R=50 : invalid method
- http://genemania.org/link?o=3702&g=det1&r=1000 : results must be in the range 1..100

Happy linking!