Module:	SLiMBench
Description:	Short Linear Motif prediction Benchmarking
Version:	1.9
Last Edit:	06/08/13

Imported modules: rje rje_db rje_obj rje_seq rje_seqlist rje_slim rje_slimcore rje_slimlist rje_uniprot rje_zen comparimotif_V3 slimmaker slimprob slimsearch

See SLiMSuite Blog for further documentation. See rje for general commands.

Function

SLiMBench has two primary functions:

1. Generating SLiM prediction benchmarking datasets from ELM (or other data in a similar format). This includes options for generating random and/or simulated datasets for ROC analysis etc.

2. Assessing the results of SLiM predictions against a Benchmark. This program is designed to work with SLiMFinder and QSLiMFinder output, so some prior results parsing may be needed for other methods.

Documentation for SLiMBench is currently under development. Please contact the author for more details.

Commandline

INPUT OPTIONS

: Will look in this directory for input files if not found ['SourceData/']
: Download from ELM website of ELM classes ['elm_classes.tsv']
: Download from ELM website of ELM instances ['elm_instances.tsv']
: Download from ELM website of ELM Pfam domain interactors ['elm_interaction_domains.tsv']
: File of downloaded UniProt entries (See rje_uniprot for more details) ['ELM.dat']

ELM BENCHMARK GENERATION OPTIONS

: Output path for datasets generated with SLiMBench file generator [./SLiMBenchDatasets/]
: Whether to quit by default if input integrity is breached [True]
: Whether to generate SLiMBench datasets from ELM input [False]
: Whether to use SLiMMaker to "reduce" ELMs to more findable SLiMs [True]
: Minimum number of UPC for ELM dataset [True]
: Min information content for a motif (1 fixed position = 1.0) [2.0]
: Whether to generate datasets with specific Query proteins [True]
: List of flanking mask options [none,win300,win100,flank5,site]
: List of INI files containing search options (should have runid setting) []

ELM PPI/3DID BENCHMARK GENERATION OPTIONS

: Download from ELM website of ELM Pfam domain interactors ['elm_interaction_domains.tsv']
: File mapping PFam domains onto genes/proteins (BioMart or HMM search) []
: File of gene identifier cross-reference data from rje_genemap []
: Path to 3DID sql data. Use rje_mysql sqldump to extract 3DID DMI data. []
: File of 3DID DMI data ['3did.DMI.csv']
: File mapping PDB identifiers onto genes/proteins []

RANDOM/SIMULATION BENCHMARK GENERATION OPTIONS

: Whether to generate simulated datasets using reduced ELMs (if found) [False]
: Whether to generate randomised datasets (part of simulation if simulate=T) [False]
: Number of replicates for each random (or simulated) dataset [10]
: List of simulated ELM:Random ratios [1,4,9,19]
: Number of "TPs" to have in dataset [5,10]
: Output path for creation of randomised datasets [./SLiMBenchDatasets/Random/]
: Base for random dataset name if simulate=F [ran]
: Source for new sequences for random datasets [None]
: Whether to use SLiMCore masking for query selection [True]

BENCHMARK ASSESSMENT OPTIONS

benchmark=T/F : Whether to perform SLiMBench benchmarking assessment against motif file [False]
datatype=X : Type of data to be generated and/or benchmarked (elm/sim/simonly) [elm]
resfiles=LIST : List of (Q)SLiMFinder results files to use for benchmarking [*.csv]
compdb=FILE : Motif file to be used for benchmarking (default = reduced elmclass file) []
benchbase=X : Basefile for SLiMBench benchmarking output [slimbench]
runid=LIST : List of factors to split RunID column into (on '.') ['Program','Analysis']
bycloud=X : Whether to compress results into clouds prior to assessment (True/False/Both) [Both]
sigcut=LIST : Significance thresholds to use for assessment [0.05,0.01,0.001,0.0001]
iccut=LIST : Minimum IC for (Q)SLiMFinder results for benchmark assessment [2.0,2.1,3.0]
slimlencut=LIST : List of individual SLiM lengths to return results for (0=All) [0,3,4,5]
noamb=T/F : Filter out ambiguous patterns [False]
# Add CompariMotif settings here for OT/TP etc.
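
The runid, sigcut, iccut and slimlencut settings all take lists. The sketch below illustrates one way such lists could be used: splitting a RunID on '.' into named factors and enumerating threshold combinations. It is not SLiMBench's own code. The function names, the result field names (Sig, IC, Len), the "at or below sigcut" comparison and the idea of assessing every combination are assumptions made for illustration.

```python
from itertools import product


def split_runid(runid, factors=("Program", "Analysis")):
    """Split a RunID such as 'qslimfinder.flank5' on '.' into named factors."""
    parts = runid.split(".")
    if len(parts) != len(factors):
        raise ValueError(
            f"RunID {runid!r} does not have {len(factors)} '.'-separated parts"
        )
    return dict(zip(factors, parts))


def threshold_grid(sigcut, iccut, slimlencut):
    """Enumerate every (significance, IC, length) combination."""
    return list(product(sigcut, iccut, slimlencut))


def passes(result, sig, ic, slimlen):
    """Check one prediction against one combination.

    'Sig', 'IC' and 'Len' are hypothetical field names, not taken from
    (Q)SLiMFinder output. A slimlen of 0 accepts all lengths (0=All).
    """
    if result["Sig"] > sig or result["IC"] < ic:
        return False
    return slimlen == 0 or result["Len"] == slimlen


grid = threshold_grid(
    [0.05, 0.01, 0.001, 0.0001], [2.0, 2.1, 3.0], [0, 3, 4, 5]
)
factors = split_runid("qslimfinder.flank5")
```

With the default lists shown above, the grid holds 4 x 3 x 4 = 48 combinations.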

GENERAL OPTIONS

force=T/F : Whether to force regeneration of outputs (True) or assume existing outputs are right [False]
backups=T/F : Whether to (prompt if interactive and) generate backups before overwriting files [True]

See also rje.py generic commandline options.
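
Options are given as name=value pairs, with T/F for booleans and comma-separated values for lists. The following is a minimal sketch of assembling such a command from Python. The helper names and the script name slimbench.py are assumptions for illustration; check the actual script location in your SLiMSuite installation.

```python
def format_option(name, value):
    """Render one option in name=value form (T/F booleans, comma-joined lists)."""
    if isinstance(value, bool):
        text = "T" if value else "F"
    elif isinstance(value, (list, tuple)):
        text = ",".join(str(v) for v in value)
    else:
        text = str(value)
    return f"{name}={text}"


def build_command(options, script="slimbench.py"):
    """Return an argument list; 'slimbench.py' is an assumed script name."""
    return ["python", script] + [format_option(k, v) for k, v in options.items()]


command = build_command(
    {"benchmark": True, "datatype": "elm", "sigcut": [0.05, 0.01], "noamb": False}
)
```

The resulting list can be passed to subprocess.run, which avoids shell quoting problems with values such as *.csv.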

History

Module Version History

    # 0.0 - Initial Compilation.
    # 0.1 - Functional version with benchmarking dataset generation.
    # 1.0 - Consolidation of "working" version with additional basic benchmarking analysis.
    # 1.1 - Added simulated dataset construction and benchmarking.
    # 1.2 - Added MinIC filtering to benchmark assessment. Sorted beginning/end of line for reduced ELMs.
    # 1.3 - Made SimCount a list rather than Integer. Sorted CompariMotif assessment issue.
    # 1.4 - Added ICCut and SLiMLenCut as lists and output columns.
    # 1.5 - Added Summary Results output table. Removed PropRes.
    # 1.6 - Added "simonly" to datatype - calculates both SN and FPR from "sim" data (ignores "ran") to check query bias.
    # 1.7 - Added Benchmarking of ELM datasets without queries.
    # 1.8 - Partially added Benchmarking dataset generation from PPI data and 3DID.
    # 1.9 - Added memsaver option. Replaced SLiMSearch with SLiMProb. Altered default IO paths.

SLiMBench REST Output formats

Run with &rest=docs for program documentation and options. A plain text version is accessed with &rest=help.
&rest=OUTFMT can be used to retrieve individual parts of the output, matching the tabs in the default
(&rest=format) output. Individual OUTFMT elements can also be parsed from the full (&rest=full) server output,
which is formatted as follows:

###~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~###
# OUTFMT:
... contents for OUTFMT section ...
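
The layout above (a ###~~~### divider, a "# OUTFMT:" header, then the section contents) can be split into sections with a short parser. This is a sketch based only on the format shown here, not on the server's code. The function name and the section names used in the sample (status, log) are hypothetical.

```python
import re

DIVIDER = re.compile(r"^###~+###\s*$")   # e.g. ###~~~~~~~~###
HEADER = re.compile(r"^#\s*(\S+):\s*$")  # e.g. "# OUTFMT:"


def parse_rest_full(text):
    """Split full REST output into a {OUTFMT: contents} dictionary."""
    sections = {}
    current = None
    lines = text.splitlines()
    i = 0
    while i < len(lines):
        # A section starts with a divider line followed by a header line.
        if DIVIDER.match(lines[i]) and i + 1 < len(lines):
            header = HEADER.match(lines[i + 1])
            if header:
                current = header.group(1)
                sections[current] = []
                i += 2
                continue
        # Anything else belongs to the section currently being read.
        if current is not None:
            sections[current].append(lines[i])
        i += 1
    return {name: "\n".join(body).strip("\n") for name, body in sections.items()}


sample = (
    "###~~~~~~~~###\n"
    "# status:\n"
    "run finished\n"
    "###~~~~~~~~###\n"
    "# log:\n"
    "line 1\n"
    "line 2\n"
)
parsed = parse_rest_full(sample)
```

Text before the first divider is ignored, and a divider that is not followed by a header line is kept as part of the current section.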

Available REST Outputs

There is currently no specific help available on REST output for this program.
