SLiMSuite REST Server


SLiMBench V2.10.1

Short Linear Motif prediction Benchmarking

Module: SLiMBench
Description: Short Linear Motif prediction Benchmarking
Version: 2.10.1
Last Edit: 25/11/15
Citation: Palopoli N, Lythgow KT & Edwards RJ. Bioinformatics 2015; doi: 10.1093/bioinformatics/btv155

Copyright © 2012 Richard J. Edwards - See source code for GNU License Notice


Imported modules: rje rje_db rje_obj rje_ppi rje_seq rje_seqlist rje_slim rje_slimcore rje_slimlist rje_uniprot comparimotif_V3 slimmaker slimprob slimsearch


See the SLiMSuite Blog for further documentation.

Function

SLiMBench has two primary functions:

1. Generating SLiM prediction benchmarking datasets from ELM (or other data in a similar format). This includes options for generating random and/or simulated datasets for ROC analysis etc.

2. Assessing the results of SLiM predictions against a Benchmark. This program is designed to work with SLiMFinder and QSLiMFinder output, so some prior results parsing may be needed for other methods.

If generate=F benchmark=F, SLiMBench will check and optionally download the input files but perform no additional processing or analysis.
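
The mode selection described above can be sketched in Python. This is an illustration only, not SLiMBench's own argument handling: parse_options, as_bool and run_mode are hypothetical helper names, and the sketch assumes options arrive as key=value strings with T/F booleans, in the style of the Commandline section. Whether generate=T and benchmark=T are meant to be combined in one run is not stated here; the sketch simply reports both.

```python
def parse_options(args):
    """Split 'key=value' strings into a dict with lower-case keys (hypothetical helper)."""
    options = {}
    for arg in args:
        key, sep, value = arg.partition("=")
        if sep:  # skip anything that is not of the form key=value
            options[key.lower()] = value
    return options


def as_bool(value):
    """Interpret T/F style values; anything starting with 't' or 'T' counts as True."""
    return value.strip().lower().startswith("t")


def run_mode(args):
    """Return which of the documented behaviours the given options would select."""
    # Documented defaults: generate=False and benchmark=False.
    options = {"generate": "F", "benchmark": "F"}
    options.update(parse_options(args))
    modes = []
    if as_bool(options["generate"]):
        modes.append("generate")
    if as_bool(options["benchmark"]):
        modes.append("benchmark")
    # With both switched off, only the input file check/download remains.
    return modes or ["download-only"]


print(run_mode(["generate=F", "benchmark=F"]))
print(run_mode(["generate=T", "genspec=HUMAN"]))
```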

Please see the SLiMBench manual for more details.

Commandline

SOURCE DATA OPTIONS

sourcepath=PATH/ : Will look in this directory for input files if not found ['SourceData/']
sourcedate=DATE : Source file date (YYYY-MM-DD) to preferentially use [None]
elmclass=FILE : Download from ELM website of ELM classes ['elm_classes.tsv']
elminstance=FILE : Download from ELM website of ELM instances ['elm_instances.tsv']
elminteractors=FILE : Download from ELM website of ELM interactors ['elm_interactions.tsv']
elmdomains=FILE : Download from ELM website of ELM Pfam domain interactors ['elm_interaction_domains.tsv']
elmdat=FILE : File of downloaded UniProt entries (See rje_uniprot for more details) ['ELM.dat']
ppisource=X : Source of PPI data. (See documentation for details.) (HINT/FILE) ['HINT']
ppispec=LIST : List of PPI files/species/databases to generate PPI datasets from [HUMAN,MOUSE,DROME,YEAST]
ppid=X : PPI source protein identifier type (gene/uni/none; will work out from headers if None) [None]
randsource=FILE : Source for random/simulated dataset sequences. If species, will extract from UniProt [HUMAN]
download=T/F : Whether to download files directly from websites where possible if missing [True]
integrity=T/F : Whether to quit by default if source data integrity is breached [False]
unipath=PATH : Path to UniProt download. Will query website if "URL" [URL]

GENERAL/ELM BENCHMARK GENERATION OPTIONS

genpath=PATH : Output path for datasets generated with SLiMBench file generator [./SLiMBenchDatasets/]
generate=T/F : Whether to generate SLiMBench datasets from ELM input. [False]
genspec=LIST : Restrict ELM/OccBench datasets to listed species (restricts ELM instances) []
slimmaker=T/F : Whether to use SLiMMaker to "reduce" ELMs to more findable SLiMs [True]
minupc=X : Minimum number of UPC for benchmark dataset [3]
maxseq=X : Maximum number of sequences for benchmark datasets [0]
minic=X : Min information content for a motif (1 fixed position = 1.0) [2.0; 1.1 for OccBench]
filterdir=X : Directory suffix for filtered benchmarking datasets [_Filtered/]
queries=T/F : Whether to generate datasets with specific Query proteins [False]
flankmask=LIST : List of flanking mask options (used with queries and simbench) [none,win100,flank5,site]
elmbench=T/F : Whether to generate ELM datasets [True]
ppibench=T/F : Whether to generate ELM PPI datasets [True]
domlink=T/F : Link ELMs to PPI via Pfam domains (True) or just use direct protein links (False) [True]
itype=X : Interaction identifier for PPI datasets [first element of ppisource]
dombench=T/F : Whether to generate Pfam domain ELM PPI datasets [True]
occbench=T/F : Whether to generate ELM OccBench datasets [True]

RANDOM/SIMULATION BENCHMARK GENERATION OPTIONS

simbench=T/F : Whether to generate simulated datasets using reduced ELMs (if found) [False]
ranbench=T/F : Whether to generate randomised datasets (part of simulation if simbench=T) [False]
randreps=X : Number of replicates for each random (or simulated) dataset [8]
simcount=LIST : Number of "TPs" to have in dataset [4,8,16]
simratios=LIST : List of simulated ELM:Random ratios [0,1,3,7,15,31]
randir=PATH : Output path for creation of randomised datasets [./SLiMBenchDatasets/Random/]
randbase=X : Base for random dataset name if simbench=F [ran]
masking=T/F : Whether to use SLiMCore masking for query selection [True]
searchini=FILE : INI file containing SLiMProb search options that restrict returned positives []
maxseq=X : Maximum number of randsource sequences for SLiM to hit (also maxaa and maxupc limits) [1000]
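
To get a feel for how many simulated datasets the defaults above might imply, the following sketch enumerates the simcount x simratios x randreps combinations. It assumes that a ratio r means r random sequences per simulated "TP" sequence (so a dataset holds simcount * (1 + r) sequences) and that every combination is generated once per replicate. Neither assumption is confirmed by the option descriptions, and dataset_grid is a hypothetical name.

```python
from itertools import product


def dataset_grid(simcount=(4, 8, 16), simratios=(0, 1, 3, 7, 15, 31), randreps=8):
    """Enumerate (TPs, ratio, replicate, total sequences) under the stated assumptions."""
    grid = []
    for tps, ratio, rep in product(simcount, simratios, range(1, randreps + 1)):
        # ASSUMPTION: ratio = number of random sequences added per TP sequence.
        total = tps * (1 + ratio)
        grid.append((tps, ratio, rep, total))
    return grid


grid = dataset_grid()
print(len(grid))                    # number of combinations with the default lists
print(max(row[3] for row in grid))  # largest dataset size under the assumption
```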

BENCHMARK ASSESSMENT OPTIONS

benchmark=T/F : Whether to perform SLiMBench benchmarking assessment against motif file [False]
datatype=X : Type of data to be generated and/or benchmarked (occ/elm/ppi/sim/simonly) [elm]
queries=T/F : Whether datasets have specific Query proteins [False]
resfiles=LIST : List of (Q)SLiMFinder results files to use for benchmarking [*.csv]
compdb=FILE : Motif file to be used for benchmarking [elmclass file] (reduced unless occ/ppi)
occbenchpos=FILE : File of all positive occurrences for OccBench [genpath/ELM_OccBench/ELM.full.ratings.csv]
benchbase=X : Basefile for SLiMBench benchmarking output [slimbench]
runid=LIST : List of factors to split RunID column into (on '.') ['Program','Analysis']
bycloud=X : Whether to compress results into clouds prior to assessment (True/False/Both) [Both]
sigcut=LIST : Significance thresholds to use for assessment [0.1,0.05,0.01,0.001,0.0001]
iccut=LIST : Minimum IC for (Q)SLiMFinder results for elm/sim/ppi benchmark assessment [2.0,2.1,3.0]
slimlencut=LIST : List of individual SLiM lengths to return results for (0=All) [0,3,4,5]
noamb=T/F : Filter out ambiguous patterns [False]
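
The runid option splits each RunID value on '.' into named factors. A minimal sketch of that split follows, using the invented example value 'SLiMFinder.masked' and a hypothetical helper name split_runid. How SLiMBench itself handles a RunID with too few or too many elements is not documented here, so the sketch simply raises an error in that case.

```python
def split_runid(runid, factors=("Program", "Analysis")):
    """Split a RunID on '.' and label the pieces with the runid=LIST factor names."""
    parts = runid.split(".")
    if len(parts) != len(factors):
        # ASSUMPTION: a mismatch is treated as an error in this sketch.
        raise ValueError(
            "RunID %r has %d elements but %d factors were given"
            % (runid, len(parts), len(factors))
        )
    return dict(zip(factors, parts))


print(split_runid("SLiMFinder.masked"))
```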

GENERAL OPTIONS

force=T/F : Whether to force regeneration of outputs (True) or assume existing outputs are right [False]
backups=T/F : Whether to (prompt if interactive and) generate backups before overwriting files [True]

See also rje.py generic commandline options.

History

Module Version History

    # 0.0 - Initial Compilation.
    # 0.1 - Functional version with benchmarking dataset generation.
    # 1.0 - Consolidation of "working" version with additional basic benchmarking analysis.
    # 1.1 - Added simulated dataset construction and benchmarking.
    # 1.2 - Added MinIC filtering to benchmark assessment. Sorted beginning/end of line for reduced ELMs.
    # 1.3 - Made SimCount a list rather than Integer. Sorted CompariMotif assessment issue.
    # 1.4 - Added ICCut and SLiMLenCut as lists and output columns.
    # 1.5 - Added Summary Results output table. Removed PropRes.
    # 1.6 - Added "simonly" to datatype - calculates both SN and FPR from "sim" data (ignores "ran") to check query bias.
    # 1.7 - Added Benchmarking of ELM datasets without queries.
    # 1.8 - Partially added Benchmarking dataset generation from PPI data and 3DID.
    # 1.9 - Added memsaver option. Replaced SLiMSearch with SLiMProb. Altered default IO paths.
    # 1.9 - Removed 3DID again: new ELM interaction_domains file has position-specific PPI details.
    # 2.0 - Major overhaul of input options to standardise/clarify. Implemented auto-downloads and PPI datasets.
    # 2.1 - Fixed memsaver=T unless in development mode (dev=T). Removed old Assessment. Tested with simbench analysis.
    # 2.2 - Replaced searchini=LIST with searchini=FILE and moved to SimBench commands.
    # 2.2 - Modified the FN/TN and ResNum calculations. No longer rate TP in random data as OT.
    # 2.3 - Changed the default to queries=F. SearchINI bug fix. Added occbench generation.
    # 2.4 - Improved error messages.
    # 2.5 - Basic OccBench assessment benchmarking. Added ELM Uniprot acclist output. (Download issues?)
    # 2.6 - Added ELM domain interactions table: http://www.elm.eu.org/infos/browse_elm_interactiondomains.tsv.
    # 2.6 - Fixed issues introduced with new SLiMCore V2.0 SLiMSuite code.
    # 2.7 - Reinstate filtering. (Not sure why disabled.) Add genspec=LIST to filter by species. Added domlink=T/F.
    # 2.8.0 - Implemented PPIBench benchmarking for datasets without Motifs in name.
    # 2.8.1 - Removed use of Protein name for ELM Uniprot entries due to problems mapping old IDs.
    # 2.9.0 - Added SLiMMaker ELM reduction table and output.
    # 2.9.1 - Enabled download only with generate=F benchmark=F.
    # 2.10.0 - Add generation of table mapping PPIBench dataset generation.
    # 2.10.1 - Updated ELM Source URLs.

SLiMBench REST Output formats

Run with &rest=help for general options. Run with &rest=full to get full server output as text or &rest=format
for more user-friendly formatted output. Individual outputs can be identified/parsed using &rest=OUTFMT.
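
As a rough illustration, the sketch below builds the option suffix that would be appended to a server request, ending with the &rest=... setting. It assumes that other SLiMBench options are passed in the same &key=value style, which this page does not state explicitly. The rest_suffix helper is hypothetical, and no server address is included because none is given here.

```python
from urllib.parse import urlencode


def rest_suffix(rest="format", **options):
    """Return '&key=value...&rest=X' for appending to a REST request (hypothetical helper)."""
    params = dict(options)   # e.g. generate="F"
    params["rest"] = rest    # help, full, format or a specific output name
    return "&" + urlencode(params)


print(rest_suffix("help"))
print(rest_suffix("full", generate="F", benchmark="F"))
```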

© 2015 RJ Edwards. Contact: richard.edwards@unsw.edu.au.