SLiMSuite REST Server


rje_biogrid V1.6

BioGRID Database processing module

Module: rje_biogrid
Description: BioGRID Database processing module
Version: 1.6
Last Edit: 07/05/10

Copyright © 2007 Richard J. Edwards - See source code for GNU License Notice

Imported modules: rje rje_seq rje_uniprot rje_zen

See SLiMSuite Blog for further documentation. See rje for general commands.


This module is designed primarily for parsing the plain text ORGANISM downloads from the BioGRID database. These have names in the form:

BioGRID tables contain information that is useful for cross-referencing to other sources, namely protein names and gene symbols/aliases. The symbols and aliases are added to the dict['Mapping'] links dictionary of the BioGRID object, linking each symbol to the primary protein ID. These protein IDs are used for storing the PPI data (in dict['PPI']) and for extracting gene data from external sequence databases, which must be provided separately. The sequence data are read in and added to dict['Protein'], which also stores gene symbol data etc.
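The relationship between the alias mapping and the PPI store can be pictured with a minimal sketch. This is not the module's actual code: the function names and the protein IDs and symbols are hypothetical, and plain dictionaries stand in for dict['Mapping'] and dict['PPI'].

```python
# Illustrative sketch only: all IDs and symbols below are hypothetical.
mapping = {}   # alias or gene symbol -> primary protein ID (cf. dict['Mapping'])
ppi = {}       # primary protein ID -> set of interactor IDs (cf. dict['PPI'])

def add_aliases(primary_id, aliases):
    """Link the primary ID and each of its aliases to the primary protein ID."""
    mapping[primary_id] = primary_id
    for alias in aliases:
        mapping[alias] = primary_id

def add_interaction(name_a, name_b):
    """Store an interaction under primary IDs, resolving aliases first."""
    id_a = mapping.get(name_a, name_a)   # unknown names are kept as given
    id_b = mapping.get(name_b, name_b)
    ppi.setdefault(id_a, set()).add(id_b)

add_aliases("PROT_A", ["GENEA", "GA1"])
add_aliases("PROT_B", ["GENEB"])
add_interaction("GA1", "GENEB")   # recorded as PROT_A -> PROT_B
```

Resolving every name to a primary ID before storing it means that the same protein reported under different symbols ends up in a single entry.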

The selection of sequence files can be tricky, as different species use very different protein identifiers. I will add a list of recommended sequence sources as I find them:

  • Yeast = EnsLoci treatment of the EnsEMBL yeast genome

BioGRID contains data for a number of experimental types. Those of interest can be specified with the ppitype=LIST option. Choices include: Affinity Capture-MS; Affinity Capture-Western; Biochemical Activity; Co-crystal Structure; Co-fractionation; Co-purification; Dosage Lethality; Dosage Rescue; Far Western; FRET; Phenotypic Enhancement; Phenotypic Suppression; Protein-peptide; Reconstituted Complex; Synthetic Growth Defect; Synthetic Lethality; Synthetic Rescue; Two-hybrid.
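As a rough illustration of selecting experimental types, the sketch below filters a list of interactions by type. It is not taken from the module: the function names and the tuple layout are hypothetical, the comma-separated form of a LIST value is inferred from the badtype default shown in the options, and treating an empty list as "keep everything" is an assumption.

```python
def parse_list(value):
    """Split a comma-separated option value into a list of trimmed items."""
    return [item.strip() for item in value.split(",") if item.strip()]

def filter_by_type(interactions, ppitype):
    """Keep (protein_a, protein_b, experiment_type) tuples of a wanted type.

    Assumption: an empty ppitype list means that no filtering is applied.
    """
    if not ppitype:
        return list(interactions)
    wanted = set(ppitype)
    return [ppi for ppi in interactions if ppi[2] in wanted]

# Hypothetical interactions using experimental type names from the list above.
interactions = [
    ("PROT_A", "PROT_B", "Two-hybrid"),
    ("PROT_A", "PROT_C", "Synthetic Lethality"),
    ("PROT_B", "PROT_C", "FRET"),
]
physical = filter_by_type(interactions, parse_list("Two-hybrid, FRET"))
```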

IntAct experiment types include: anti bait coip | pull down | two hybrid pooling | two hybrid | tap | x-ray diffraction | anti tag coip | fluorescence imaging | cosedimentation | elisa | protein kinase assay | coip | biochemical | antibody array | confocal microscopy | beta galactosidase | imaging techniques | two hybrid array | molecular sieving | ion exchange chrom | affinity chrom | protein array | enzymatic study | inferred by curator | far western blotting | spr | fps | phosphatase assay | fret | dhfr reconstruction | bn-page | peptide array | nmr | facs | affinity techniques | crosslink | itc | one hybrid | fluorescence | solution sedimentati | ch-ip | emsa | complementation | density sedimentatio | comig non denat gel | filter binding | chromatography | ub reconstruction | reverse phase chrom | emsa supershift | electron microscopy | protein crosslink | competition binding | mappit | gallex | gtpase assay | in gel kinase assay | spa | biophysical | radiolabeled methyl | experimental interac | fluorescence spectr | cd | bret | protein tri hybrid | transcription compl | deacetylase assay | footprinting | yeast display | saturation binding | protease assay | lambda phage | light scattering | htrf | fcs | toxcat | phage display | t7 phage | kinase htrf | methyltransferase as

MINT txt downloads can also be parsed. Experiment types for MINT include: affinity chromatography technologies | affinity technologies | anti bait coimmunoprecipitation | anti tag coimmunoprecipitation | beta galactosidase complementation | beta lactamase complementation | biochemical | bioluminescence resonance energy transfer | biophysical | chromatography technologies | circular dichroism | classical fluorescence spectroscopy | coimmunoprecipitation | colocalization by fluorescent probes cloning | colocalization by immunostaining | colocalization/visualisation technologies | competition binding | copurification | cosedimentation | cosedimentation in solution | cosedimentation through density gradients | cross-linking studies | electron microscopy | enzymatic studies | enzyme linked immunosorbent assay | experimental interaction detection | far western blotting | filter binding | fluorescence-activated cell sorting | fluorescence microscopy | fluorescence polarization spectroscopy | fluorescence technologies | fluorescent resonance energy transfer | gst pull down | his pull down | imaging techniques | isothermal titration calorimetry | lambda phage display | mass spectrometry studies of complexes | molecular sieving | nuclear magnetic resonance | peptide array | phage display | protease assay | protein array | protein complementation assay | protein kinase assay | pull down | saturation binding | surface plasmon resonance | t7 phage display | two hybrid | two hybrid array | two hybrid fragment pooling approach | two hybrid pooling approach | ubiquitin reconstruction | unknown | x-ray crystallography

Reactome interactions are restricted to those of the "reaction" type. The "neighbouring_reaction", "direct_complex" and "indirect_complex" types also exist.

DIP interactions are restricted to those with two uniprotkb IDs. DIP has similar annotation to MINT, with MI numbers.

Domino interactions are restricted to those with two uniprotkb IDs. Domino has similar annotation to MINT, with MI numbers.
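The "two uniprotkb IDs" restriction can be sketched as a check on a single MITAB line. This is an illustration rather than the module's parser: the function name is hypothetical, the column positions follow the standard PSI-MI TAB 2.5 layout (interactor IDs in columns 1 and 2, taxon IDs in columns 10 and 11), and requiring both interactors to match the taxid list is an assumption.

```python
def uniprot_pair(line, taxids=("9606",)):
    """Return (acc_a, acc_b) if both interactors have a uniprotkb ID and an
    accepted taxon, otherwise None."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        return None
    accs = []
    for id_field, tax_field in ((fields[0], fields[9]), (fields[1], fields[10])):
        # ID fields may hold several cross-references separated by "|".
        acc = None
        for xref in id_field.split("|"):
            if xref.startswith("uniprotkb:"):
                acc = xref[len("uniprotkb:"):]
                break
        if acc is None:
            return None
        # Taxon fields look like "taxid:9606(Homo sapiens)".
        found = {t.split("(")[0] for t in tax_field.split("|")}
        if not any("taxid:" + wanted in found for wanted in taxids):
            return None
        accs.append(acc)
    return tuple(accs)

# Hypothetical 15-column MITAB line with made-up accession numbers.
example = "\t".join([
    "DIP-1N|uniprotkb:P11111", "uniprotkb:P22222", "-", "-", "-", "-",
    "MI:0018(two hybrid)", "-", "pubmed:1",
    "taxid:9606(Homo sapiens)", "taxid:9606(Homo sapiens)",
    "-", "-", "-", "-",
])
pair = uniprot_pair(example)
```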


BioGRID parsing and PPI Dataset Generation Options

ppifile=FILE : PPI database flat file [None]
seqin=FILE : Sequence file containing protein sequences with appropriate Accession Numbers/IDs [None]
genecards=FILE : File of links between IDs. For human, should have HGNC and EnsLoci columns. [None]
ppitype=LIST : List of acceptable interaction types to parse out []
badtype=LIST : List of bad interaction types, to exclude [indirect_complex,neighbouring_reaction]
symmetry=T/F : Enforce symmetry in interaction datasets [True]
dbsource=X : Source database (biogrid/dip/intact/mint/reactome) [biogrid]
mitab=T/F : Whether source file is in MITAB flat file format [True]
species=X : Name of species to use data for (will be read from file if BioGRID) [human]
taxid=LIST : List of NCBI Taxonomy IDs to use (for DIP and Domino) [9606]
unipath=PATH : Path to UniProt files [UniProt/]
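
One plausible reading of symmetry=T is that every stored A-B interaction also gets a B-A entry. The sketch below shows that idea on a plain dictionary of sets; the function name and data are hypothetical and the module's own handling may differ.

```python
def enforce_symmetry(ppi):
    """Return a copy of ppi in which every A -> B link also has B -> A."""
    symmetric = {hub: set(spokes) for hub, spokes in ppi.items()}
    for hub, spokes in ppi.items():
        for spoke in spokes:
            symmetric.setdefault(spoke, set()).add(hub)
    return symmetric

# Hypothetical input: PROT_A was only ever recorded as the first interactor.
one_way = {"PROT_A": {"PROT_B", "PROT_C"}}
two_way = enforce_symmetry(one_way)
```

Working on a copy leaves the original dictionary untouched, so the asymmetric data remain available for comparison.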

Output Options

ppifas=T/F : Whether to output PPI datasets as fasta files into Species/BIOGRID_Datasets/ [True]
minseq=X : Minimum number of PPI sequences in order to output fasta file [3]
ppitab=T/F : Whether to output PPI table with aliases etc. [True]
alltypes=T/F : Output a full list of PPITypes. (Will populate the PPIType list) [False]
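
The interplay of ppifas and minseq can be sketched as follows. This is an assumption-laden illustration: the function names are hypothetical, and counting only interactors with a known sequence (and not the hub protein itself) towards minseq is a guess at the intended behaviour.

```python
def datasets_to_output(ppi, sequences, minseq=3):
    """Return {hub: [(id, sequence), ...]} for hubs with enough sequences.

    Assumption: only interactors found in `sequences` count towards minseq.
    """
    selected = {}
    for hub, spokes in ppi.items():
        found = [(s, sequences[s]) for s in sorted(spokes) if s in sequences]
        if len(found) >= minseq:
            selected[hub] = found
    return selected

def to_fasta(records):
    """Format (name, sequence) pairs as fasta text."""
    return "".join(">%s\n%s\n" % (name, seq) for name, seq in records)

# Hypothetical data: PROT_X has too few interactors with sequences.
ppi = {"PROT_A": {"P1", "P2", "P3", "P4"}, "PROT_X": {"P1", "P2"}}
sequences = {"P1": "MAAA", "P2": "MCCC", "P3": "MDDD"}
selected = datasets_to_output(ppi, sequences, minseq=3)
fasta_text = to_fasta(selected["PROT_A"])
```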

Special Options

hostvirus=T/F : Whether to pull out host-virus interactions only (MINT/IntAct only) [False]
vcodes=LIST : List/File of viral species codes for IntAct hostvirus=T []
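
A host-virus filter driven by a list of viral species codes might look like the sketch below. Everything here is hypothetical: it assumes protein names of the form NAME_SPECIES, uses a made-up viral code, and keeps a pair only when exactly one partner is viral. The module's actual criteria are not documented here.

```python
def species_code(entry_name):
    """Return the text after the last underscore, e.g. 'HUMAN' from 'P53_HUMAN'."""
    return entry_name.rsplit("_", 1)[-1] if "_" in entry_name else ""

def host_virus_pairs(pairs, vcodes):
    """Keep (a, b) pairs in which exactly one partner has a viral species code."""
    viral = set(vcodes)
    kept = []
    for a, b in pairs:
        if (species_code(a) in viral) != (species_code(b) in viral):
            kept.append((a, b))
    return kept

# Hypothetical proteins; XVIRU is a made-up viral species code.
pairs = [
    ("HOSTA_HUMAN", "VP1_XVIRU"),    # host-virus: kept
    ("HOSTA_HUMAN", "HOSTB_HUMAN"),  # host-host: dropped
    ("VP1_XVIRU", "VP2_XVIRU"),      # virus-virus: dropped
]
kept = host_virus_pairs(pairs, ["XVIRU"])
```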

Module Version History

    # 0.0 - Initial Compilation with BioGRID Flat-File parsing information and sequence extraction for yeast.
    # 1.0 - Added cross-referencing via GeneCards output to generate Human Datasets.
    # 1.1 - Added IntAct and MINT parsing.
    # 1.2 - Added option to pull out host-virus interactions.
    # 1.3 - Added Reactome & DIP parsing.
    # 1.4 - Added rje_genemap object functionality.
    # 1.5 - Added Domino parsing and tracking of evidence codes.
    # 1.6 - Updated BioGRID parsing to use mitab format.

© 2015 RJ Edwards. Contact: