Module:	SLiMMutant
Description:	Short Linear Motif Mutation Analysis Tool
Version:	1.3
Last Edit:	16/09/14

Imported modules: rje rje_db rje_obj rje_seq rje_seqlist rje_slim rje_slimlist rje_slimcore rje_uniprot pingu_V4 slimprob

See SLiMSuite Blog for further documentation. See rje for general commands.

Function

SLiMMutant is a Short Linear Motif Mutation Analysis Tool, designed to identify and assess mutations that create and/or destroy SLiMs. There are three main run modes:

- Mode 1. Generating mutant datasets for analysis [generate=T]
The main inputs are: (1) a file of protein sequence mutations [mutfile=FILE] in a delimited text format with aa substitution [mutfield=X] and protein accession number [protfield=X] data; (2) a corresponding sequence file [seqin=FILE]; (3) a file of SLiMs to analyse [motifs=FILE]. This will process the data and generate two sequence files: *.wildtype.fas and *.mutant.fas. These files will be named after the input mutfile unless basefile=X is used.

- Mode 2. Run SLiMProb on datasets [slimprob=T]
This will run SLiMProb on the two datasets, once per *.ini file given by slimini=LIST. These runs should have distinct runid=X settings. If no *.ini files are given, a single run will be made using commandline settings.

- Mode 3. Compile results of SLiMProb runs [analyse=T]
This will compare the *.wildtype.fas and *.mutant.fas results from the *.occ.csv file produced by SLiMProb. All mutations analysed will be identified from *.mutant.fas. SLiM occurrences are then matched up between wildtype and mutant versions of the same sequence. If none of the mutations have affected the SLiM prediction, then the wildtype and all mutant sequences will return the motif. If, on the other hand, mutations have created or destroyed motifs, occurrences will be missing from the wildtype and/or one or more mutant sequences. All unaffected SLiM instances are first removed and altered SLiM instances are output to *.MutOcc.csv. Differences between mutants and wildtypes are calculated for each RunID-Motif combination and summary results are output to *.Mut_vs_WT.csv. If motlist=LIST is given, analysis is restricted to that subset of motifs.
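
The wildtype versus mutant comparison described for Mode 3 can be pictured as a set operation on motif occurrences. The sketch below is only an illustration under stated assumptions, not SLiMMutant's own code: occurrences are represented as hypothetical (motif, start position) tuples, the function name and motif labels are made up, and the real *.occ.csv layout is not reproduced.

```python
def altered_occurrences(wildtype, mutants):
    """Return occurrences lost or gained per mutant, dropping unaffected ones.

    wildtype: set of (motif, start) tuples found in the wildtype sequence.
    mutants:  dict mapping a mutation label to the set of (motif, start)
              tuples found in that mutant sequence.
    Both structures are assumptions made for this example.
    """
    altered = {}
    for mutation, occurrences in mutants.items():
        lost = wildtype - occurrences      # present in wildtype only: motif destroyed
        gained = occurrences - wildtype    # present in mutant only: motif created
        if lost or gained:                 # unaffected mutants are not reported
            altered[mutation] = {"lost": sorted(lost), "gained": sorted(gained)}
    return altered


# Hypothetical motif names and positions, for illustration only.
wildtype = {("LIG_SH3", 12), ("MOD_CK2", 40)}
mutants = {
    "P13A": {("MOD_CK2", 40)},                                    # destroys a motif
    "S41A": {("LIG_SH3", 12), ("MOD_CK2", 40)},                   # no effect
    "G77P": {("LIG_SH3", 12), ("MOD_CK2", 40), ("LIG_SH3", 75)},  # creates a motif
}
result = altered_occurrences(wildtype, mutants)
for mutation in sorted(result):
    print(mutation, result[mutation])
```

Because substitutions do not change sequence length, start positions can be compared directly between wildtype and mutant in this simplified picture.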

Unless basefile=FILE is given, output files will be named after mutfile=FILE but output into the current run directory. If running in batch mode, basefile cannot be used.

NOTE: SLiMMutant is still in development and has not been thoroughly tested or benchmarked.

Commandline

Sequence Generation Methods

generate=T/F : Whether to run sequence generation pipeline [False]
mutfile=FILE : Delimited text file with sequence mutation info. Sets basefile. []
mutfield=X : Field in mutfile corresponding to AA substitution data ['AAChange']
protfield=X : Field in mutfile corresponding to protein accession number ['Uniprot']
splitfield=X : Field in mutfile to split data on (saved as basefile.X.tdt) []
seqin=FILE : Input file with protein sequences []
motifs=FILE : Input file of SLiMs []
motlist=LIST : List of input SLiMs to restrict analysis to []
mutflanks=X : Generate sequences for use with casemask=Upper, with X aa flanking each mutation in upper case (None if < 1) [0]
minmutant=X : Minimum number of mutants for output [100]
maxmutant=X : Maximum number of mutants for output [100000]
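
As a rough illustration of what the generation step involves, the sketch below applies single amino acid substitutions to wildtype sequences. It is not SLiMMutant's own code: the `R3W`-style substitution format, the function names, the accession numbers and the in-memory dictionaries are all assumptions made for the example. Only the default field names 'AAChange' and 'Uniprot' come from the options above.

```python
import re

# Assumed substitution format: wildtype residue, 1-based position, mutant residue (e.g. "R3W").
SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitution(sequence, aachange):
    """Return the mutant sequence, or None if the substitution does not fit the wildtype."""
    match = SUBSTITUTION.match(aachange)
    if match is None:
        return None
    wild, mutant = match.group(1), match.group(3)
    index = int(match.group(2)) - 1  # convert 1-based position to 0-based index
    if index < 0 or index >= len(sequence) or sequence[index] != wild:
        return None  # position out of range or wildtype residue mismatch
    return sequence[:index] + mutant + sequence[index + 1:]


def generate_mutants(rows, sequences, mutfield="AAChange", protfield="Uniprot"):
    """Build {(accession, aachange): mutant sequence} from parsed mutfile rows."""
    mutants = {}
    for row in rows:
        acc, change = row[protfield], row[mutfield]
        if acc not in sequences:
            continue  # no matching sequence for this accession
        mutated = apply_substitution(sequences[acc], change)
        if mutated is not None:
            mutants[(acc, change)] = mutated
    return mutants


# Made-up accession numbers and sequence, for illustration only.
sequences = {"P00001": "MKRPLSW"}
rows = [
    {"Uniprot": "P00001", "AAChange": "R3W"},
    {"Uniprot": "P00001", "AAChange": "A3W"},  # wildtype mismatch: skipped
    {"Uniprot": "Q99999", "AAChange": "M1V"},  # no sequence: skipped
]
mutants = generate_mutants(rows, sequences)
print(mutants)
```

Checking the wildtype residue before substituting is a design choice of this sketch: it catches mutation records that refer to a different isoform or sequence version than the one supplied.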

SLiMProb Run Methods

slimprob=T/F : Whether to run SLiMProb on *.wildtype.fas and *.mutant.fas (*=basefile) [False]
slimini=LIST : List of INI files with settings for SLiMProb runs. Each should include runid=X and resdir=PATH. []
resdir=PATH : Location of output files. SLiMProb resdir should be in slimini [SLiMMutant/ (and SLiMProb/)]

SLiMProb Results Analysis

analyse=T/F : Whether to analyse the results of a SLiMProb run [False]
resfile=FILE : Main SLiMProb results table (*.csv and *.occ.csv) [slimprob.csv]
runid=X : Limit analysis to SLiMProb RunID (blank = analyse all) []
buildpath=PATH : Alternative path to look for existing intermediate files (e.g. *.upc) [SLiMProb/]

SLiMProb PPI Analysis

slimppi=T/F : Whether to perform SLiMPPI analysis (will set analyse=T) [False]
sourcepath=PATH/ : Will look in this directory for input files if not found ['SourceData/']
sourcedate=DATE : Source file date (YYYY-MM-DD) to preferentially use [None]
ppisource=X : Source of PPI data. (HINT/FILE) FILE needs 'Hub' and 'SpokeUni' fields. ['HINT']
dmifile=FILE : Delimited text file containing domain-motif interaction data ['elm_interaction_domains.tsv']

Batch running

batch=FILELIST : List of mutfiles to run in batch mode. Wildcards allowed. []
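
Batch mode takes a list of mutfiles with wildcards, and output is named after each mutfile. A minimal sketch of that behaviour is given below, assuming standard shell-style wildcards and assuming the name is formed by stripping the directory and file extension from the mutfile. Both function names are hypothetical and this is not how SLiMMutant itself is implemented.

```python
import glob
import os


def expand_batch(patterns):
    """Expand wildcard patterns into a sorted, de-duplicated list of existing files."""
    files = set()
    for pattern in patterns:
        files.update(glob.glob(pattern))
    return sorted(files)


def default_basefile(mutfile):
    """Name output after the mutfile, dropping its directory so that output
    lands in the current run directory (the extension stripping is an assumption)."""
    return os.path.splitext(os.path.basename(mutfile))[0]


if __name__ == "__main__":
    for mutfile in expand_batch(["*.tdt"]):
        print(mutfile, "->", default_basefile(mutfile))
```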

See also rje.py generic commandline options.

Module Version History

    # 0.0 - Initial Compilation.
    # 1.0 - Working version with standalone functionality.
    # 1.1 - Minor tweaks to generate method to increase speed. (Make index in method.) Added splitfield=X.
    # 1.2 - Added a batch mode for mutfiles - all other options will be kept fixed. Added maxmutant and minmutant.
    # 1.3 - Added SLiMPPI analysis (will set analyse=T). Started basing on SLiMCore.

SLiMMutant REST Output formats

Run with &rest=help for general options. Run with &rest=full to get full server output as text or &rest=format
for more user-friendly formatted output. Individual outputs can be identified/parsed using &rest=OUTFMT for:

disorder = List of predicted disorder scores for proteins (if consmask=T or using special disorder rest call)
rlc = List of RLC scores for proteins (if consmask=T or using special rlc rest call)
upc = Groupings of unrelated proteins (if efilter=T)

Note that SLiMCore can either be run as http://rest.slimsuite.unsw.edu.au/slimcore or special runs can be used to
try and directly access RLC or Disorder scores for an individual protein:

http://rest.slimsuite.unsw.edu.au/disorder&acc=X
http://rest.slimsuite.unsw.edu.au/rlc&acc=X&spcode=X

If these data already exist, they will be returned directly as plain text. If not, a jobid will be returned,
which will have the desired output once run.
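
The special REST calls above follow a simple pattern, which the following sketch assembles as strings. The helper name and the example accession and species code are hypothetical. The snippet only builds URLs in the form shown above and makes no request to the server.

```python
REST_BASE = "http://rest.slimsuite.unsw.edu.au"


def special_rest_url(call, acc, spcode=None):
    """Build a URL of the form <base>/<call>&acc=X[&spcode=X] as shown in the text."""
    url = "{0}/{1}&acc={2}".format(REST_BASE, call, acc)
    if spcode:
        url += "&spcode={0}".format(spcode)
    return url


# Placeholder accession and species code, for illustration only.
print(special_rest_url("disorder", "P12345"))
print(special_rest_url("rlc", "P12345", spcode="HUMAN"))
```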
