BioGRID Database processing module
Copyright © 2007 Richard J. Edwards - See source code for GNU License Notice
This module is designed primarily for parsing the plain text ORGANISM downloads from the BioGRID database. These have names in the form: BIOGRID-ORGANISM-Saccharomyces_cerevisiae-2.0.27.tab.txt.
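As a rough illustration of the naming scheme above, the sketch below splits such a file name into organism and version. The regular expression and the function name parse_biogrid_filename are assumptions inferred from the single example file name; they are not part of the module.

```python
import re

# Hypothetical pattern, inferred only from the example file name above.
BIOGRID_NAME = re.compile(
    r"^BIOGRID-ORGANISM-(?P<organism>.+)-(?P<version>\d+(?:\.\d+)*)\.tab\.txt$"
)


def parse_biogrid_filename(filename):
    """Return (organism, version) for a BioGRID ORGANISM file name, or None."""
    match = BIOGRID_NAME.match(filename)
    if match is None:
        return None
    return match.group("organism"), match.group("version")


example = parse_biogrid_filename(
    "BIOGRID-ORGANISM-Saccharomyces_cerevisiae-2.0.27.tab.txt"
)
```

File names that do not follow this pattern return None rather than raising an error, so a caller can fall back to other input handling.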
BioGRID tables contain information that is useful for cross-referencing to other sources, namely the protein names and gene symbols/aliases. The latter are added to the dict['Mapping'] links dictionary of the BioGRID object, linking each symbol to the primary protein ID. These protein IDs are used for storing the PPI data (in dict['PPI']) and for extracting gene data from external sequence databases, which need to be provided separately. The sequence data are read in and added to dict['Protein'], which also stores gene symbol data etc.
The selection of sequence files might turn out to be quite tricky, as different species use very different protein identifiers. I will add a list of recommended sequence sources as I find them:
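The relationship between the three dictionaries can be pictured with a minimal sketch. The record layout (keys 'id_a', 'id_b', 'aliases_a', 'aliases_b'), the function name build_links, the 'Symbols' key and the use of sets for interaction partners are all assumptions made for illustration; the module's actual internal structures may differ.

```python
def build_links(records):
    """Build Mapping, PPI and Protein dictionaries from interaction records.

    Each record is a hypothetical dict with primary protein IDs under
    'id_a'/'id_b' and lists of gene symbols/aliases under
    'aliases_a'/'aliases_b'.
    """
    data = {"Mapping": {}, "PPI": {}, "Protein": {}}
    for rec in records:
        pairs = ((rec["id_a"], rec["aliases_a"]), (rec["id_b"], rec["aliases_b"]))
        for protein_id, aliases in pairs:
            entry = data["Protein"].setdefault(protein_id, {"Symbols": []})
            for symbol in aliases:
                # Link each symbol/alias back to the primary protein ID.
                data["Mapping"][symbol] = protein_id
                if symbol not in entry["Symbols"]:
                    entry["Symbols"].append(symbol)
        # Store the interaction in both directions.
        data["PPI"].setdefault(rec["id_a"], set()).add(rec["id_b"])
        data["PPI"].setdefault(rec["id_b"], set()).add(rec["id_a"])
    return data


# Made-up identifiers, for illustration only.
demo = build_links([
    {"id_a": "PROT1", "id_b": "PROT2",
     "aliases_a": ["GENE1", "ALIAS1"], "aliases_b": ["GENE2"]},
])
```

In this sketch the sequence data from the external databases would be added to the same dict['Protein'] entries, keyed by the primary protein ID.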
BioGRID contains data for a number of experimental types. Those of interest can be specified with the
IntAct has the following: anti bait coip | pull down | two hybrid pooling | two hybrid | tap | x-ray diffraction | anti tag coip | fluorescence imaging | cosedimentation | elisa | protein kinase assay | coip | biochemical | antibody array | confocal microscopy | beta galactosidase | imaging techniques | two hybrid array | molecular sieving | ion exchange chrom | affinity chrom | protein array | enzymatic study | inferred by curator | far western blotting | spr | fps | phosphatase assay | fret | dhfr reconstruction | bn-page | peptide array | nmr | facs | affinity techniques | crosslink | itc | one hybrid | fluorescence | solution sedimentati | ch-ip | emsa | complementation | density sedimentatio | comig non denat gel | filter binding | chromatography | ub reconstruction | reverse phase chrom | emsa supershift | electron microscopy | protein crosslink | competition binding | mappit | gallex | gtpase assay | in gel kinase assay | spa | biophysical | radiolabeled methyl | experimental interac | fluorescence spectr | cd | bret | protein tri hybrid | transcription compl | deacetylase assay | footprinting | yeast display | saturation binding | protease assay | lambda phage | light scattering | htrf | fcs | toxcat | phage display | t7 phage | kinase htrf | methyltransferase as
MINT txt downloads can also be parsed. Experiment types for MINT include: affinity chromatography technologies | affinity technologies | anti bait coimmunoprecipitation | anti tag coimmunoprecipitation | beta galactosidase complementation | beta lactamase complementation | biochemical | bioluminescence resonance energy transfer | biophysical | chromatography technologies | circular dichroism | classical fluorescence spectroscopy | coimmunoprecipitation | colocalization by fluorescent probes cloning | colocalization by immunostaining | colocalization/visualisation technologies | competition binding | copurification | cosedimentation | cosedimentation in solution | cosedimentation through density gradients | cross-linking studies | electron microscopy | enzymatic studies | enzyme linked immunosorbent assay | experimental interaction detection | far western blotting | filter binding | fluorescence-activated cell sorting | fluorescence microscopy | fluorescence polarization spectroscopy | fluorescence technologies | fluorescent resonance energy transfer | gst pull down | his pull down | imaging techniques | isothermal titration calorimetry | lambda phage display | mass spectrometry studies of complexes | molecular sieving | nuclear magnetic resonance | peptide array | phage display | protease assay | protein array | protein complementation assay | protein kinase assay | pull down | saturation binding | surface plasmon resonance | t7 phage display | two hybrid | two hybrid array | two hybrid fragment pooling approach | two hybrid pooling approach | ubiquitin reconstruction | unknown | x-ray crystallography
Reactome interactions are restricted to those of the "reaction" type. Reactome also has "neighbouring_reaction", "direct_complex" and "indirect_complex" types.
DIP interactions are restricted to those with two uniprotkb IDs. DIP has similar annotation to MINT, with MI numbers.
Domino interactions are restricted to those with two uniprotkb IDs. Domino has similar annotation to MINT, with MI numbers.
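The "two uniprotkb IDs" restriction can be sketched as a filter over tab-delimited interaction lines. This assumes a PSI-MI TAB (mitab) style layout in which the first two columns hold pipe-separated "database:accession" identifiers and the seventh column holds the detection method with its MI number. The function names and the example accessions are hypothetical, and the module's own parsing may work differently.

```python
import re

# MI numbers are assumed to appear in the form "MI:" plus four digits.
MI_CODE = re.compile(r"MI:\d{4}")


def uniprot_ids(field):
    """Return uniprotkb accessions from a pipe-separated identifier field."""
    ids = []
    for entry in field.split("|"):
        database, _, accession = entry.partition(":")
        if database == "uniprotkb" and accession:
            ids.append(accession)
    return ids


def parse_mitab_line(line):
    """Return (id_a, id_b, mi_codes), or None if either uniprotkb ID is missing."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) < 7:
        return None
    ids_a, ids_b = uniprot_ids(cols[0]), uniprot_ids(cols[1])
    if not ids_a or not ids_b:
        return None  # Interaction is skipped: fewer than two uniprotkb IDs.
    return ids_a[0], ids_b[0], MI_CODE.findall(cols[6])


# Made-up accessions, for illustration only.
kept = parse_mitab_line(
    "DIP-1N|uniprotkb:P00001\tDIP-2N|uniprotkb:P00002\t-\t-\t-\t-\tMI:0018(two hybrid)\n"
)
skipped = parse_mitab_line(
    "DIP-1N|uniprotkb:P00001\tDIP-3N\t-\t-\t-\t-\tMI:0018(two hybrid)\n"
)
```

Taking the first uniprotkb accession per interactor is a simplification; a line listing several accessions for one protein would need a rule for choosing between them.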
BioGRID parsing and PPI Dataset Generation Options
History
Module Version History
# 0.0 - Initial Compilation with BioGRID Flat-File parsing information and sequence extraction for yeast.
# 1.0 - Added cross-referencing via GeneCards output to generate Human Datasets.
# 1.1 - Added IntAct and MINT parsing.
# 1.2 - Added option to pull out host-virus interactions.
# 1.3 - Added Reactome & DIP parsing.
# 1.4 - Added rje_genemap object functionality.
# 1.5 - Added Domino parsing and tracking of evidence codes.
# 1.6 - Updated BioGRID parsing to use mitab format.
© 2015 RJ Edwards. Contact: email@example.com.