### Basic Input/Output Options ###
seqin=FILE : Sequence file to search [
batch=LIST : List of files to search, wildcards allowed. (Over-ruled by
query=LIST : Return only SLiMs that occur in 1+ Query sequences (Name/AccNum/Seq Number) [
addquery=FILE : Adds query sequence(s) to batch jobs from FILE [
maxseq=X : Maximum number of sequences to process [
maxupc=X : Maximum UPC size of dataset to process [
sizesort=X : Sorts batch files by size prior to running (+1 small->big; -1 big->small; 0 none) [
walltime=X : Time in hours before program will abort search and exit [
resfile=FILE : Main QSLiMFinder results table [
resdir=PATH : Redirect individual output files to specified directory (and look for intermediates) [
buildpath=PATH : Alternative path to look for existing intermediate files [
force=T/F : Force re-running of BLAST, UPC generation and SLiMBuild [
pickup=T/F : Pick-up from aborted batch run by identifying datasets in resfile using RunID [
dna=T/F : Whether the sequence files are DNA rather than protein [
alphabet=LIST : List of characters to include in search (e.g. AAs or NTs) [default AA or NT codes]
megaslim=FILE : Make/use precomputed results for a proteome (FILE) in fasta format [
megablam=T/F : Whether to create and use all-by-all GABLAM results for (gablamdis) UPC generation [
ptmlist=LIST : List of PTM letters to add to alphabet for analysis and restrict PTM data 
ptmdata=DSVFILE : File containing PTM data, including AccNum, ModType, ModPos, ModAA, ModCode
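As a rough illustration of how the options above combine, the following Python sketch assembles a command line from key=value pairs. The script name qslimfinder.py, the file names and the accession P12345 are placeholders assumed for this example and are not taken from this documentation; only the option names (seqin, query, resfile, dna) come from the list above.

```python
import shlex

def build_command(script, options):
    """Return an argument list of the form [python, script, key=value, ...]."""
    args = ["python", script]
    for key, value in options.items():
        if isinstance(value, bool):
            value = "T" if value else "F"  # T/F switches
        elif isinstance(value, (list, tuple)):
            # LIST options are assumed to be comma-separated
            value = ",".join(str(v) for v in value)
        args.append(f"{key}={value}")
    return args

# Hypothetical values, chosen only to show the key=value form
options = {
    "seqin": "example.fas",
    "query": ["P12345"],
    "resfile": "results.csv",
    "dna": False,
}
cmd = build_command("qslimfinder.py", options)
print(shlex.join(cmd))
```

Building the arguments as a list (rather than one long string) keeps each option intact if the command is later passed to subprocess.run.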
### SLiMBuild Options I ###
efilter=T/F : Whether to use evolutionary filter [
blastf=T/F : Use BLAST Complexity filter when determining relationships [
blaste=X : BLAST e-value threshold for determining relationships [
altdis=FILE : Alternative all by all distance matrix for relationships [
gablamdis=FILE : Alternative GABLAM results file [None] (!!!Experimental feature!!!)
homcut=X : Max number of homologues to allow (to reduce large multi-domain families) [
### SLiMBuild Options II ###
masking=T/F : Master control switch to turn off all masking if False [
dismask=T/F : Whether to mask ordered regions (see rje_disorder for options) [
consmask=T/F : Whether to use relative conservation masking [
ftmask=LIST : UniProt features to mask out (
imask=LIST : UniProt features to inversely ("inclusively") mask. (Seqs MUST have 1+ features) 
compmask=X,Y : Mask low complexity regions (same AA in X+ of Y consecutive aas) [
casemask=X : Mask Upper or Lower case [
motifmask=X : List (or file) of motifs to mask from input sequences 
metmask=T/F : Masks the N-terminal M (can be useful if
posmask=LIST : Masks list of position-specific aas, where list = pos1:aas,pos2:aas [
aamask=LIST : Masks list of AAs from all sequences (reduces alphabet) 
qregion=X,Y : Mask all but the region of the query from (and including) residue X to residue Y [
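The compmask=X,Y rule above ("same AA in X+ of Y consecutive aas") can be sketched in Python as follows. This is one simplified reading of the rule, not the program's own code: it masks every copy of the over-represented residue inside each qualifying window, and the mask character "X" and the example values 5 and 8 are assumptions made for illustration.

```python
def mask_low_complexity(seq, x, y, mask_char="X"):
    """Mask any residue type that fills x or more of y consecutive positions."""
    to_mask = set()
    for start in range(max(len(seq) - y + 1, 0)):
        window = seq[start:start + y]
        for aa in set(window):
            if aa != mask_char and window.count(aa) >= x:
                # Record positions of the over-represented residue in this window
                to_mask.update(start + i for i, res in enumerate(window) if res == aa)
    return "".join(mask_char if i in to_mask else res for i, res in enumerate(seq))

print(mask_low_complexity("MKAAAAAAPLT", x=5, y=8))  # MKXXXXXXPLT
```

Sequences shorter than Y residues contain no full window and are returned unchanged in this sketch.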
### SLiMBuild Options III ###
termini=T/F : Whether to add termini characters (^ & $) to search sequences [
minwild=X : Minimum number of consecutive wildcard positions to allow [
maxwild=X : Maximum number of consecutive wildcard positions to allow [
slimlen=X : Maximum length of SLiMs to return (no. non-wildcard positions) [
minocc=X : Minimum number of unrelated occurrences for returned SLiMs. (Proportion of UP if < 1) [
absmin=X : Used if minocc<1 to define absolute min. UP occ [
alphahelix=T/F : Special i, i+3/4, i+7 motif discovery [
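The interaction between minocc and absmin described above can be sketched as a small Python function. The function name and the choice to round a proportional threshold up are assumptions for this example; the documentation only states that minocc is a proportion of UP when below 1 and that absmin then sets an absolute minimum.

```python
import math

def effective_minocc(minocc, absmin, n_upc):
    """Return the occurrence threshold applied to a dataset of n_upc unrelated proteins."""
    if minocc < 1:
        # Proportional threshold, never allowed below the absolute minimum
        return max(absmin, math.ceil(minocc * n_upc))
    return int(minocc)

print(effective_minocc(0.25, 2, 10))  # 3: a quarter of 10 UP, rounded up
print(effective_minocc(0.25, 3, 4))   # 3: absmin overrides the proportional value of 1
print(effective_minocc(5, 3, 100))    # 5: absolute value used as given
```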
### SLiMBuild Options IV ###
ambiguity=T/F : (preamb=T/F) Whether to search for ambiguous motifs during motif discovery [
ambocc=X : Min. UP occurrence for subvariants of ambiguous motifs (minocc if 0 or > minocc) [
absminamb=X : Used if ambocc<1 to define absolute min. UP occ [
equiv=LIST : List (or file) of TEIRESIAS-style ambiguities to use [
wildvar=T/F : Whether to allow variable length wildcards [
combamb=T/F : Whether to search for combined amino acid degeneracy and variable wildcards [
### SLiMBuild Options V ###
musthave=LIST : Returned motifs must contain one or more of the AAs in LIST (reduces search space) 
focus=FILE : FILE containing focal groups for SLiM return (see Manual for details) [
focusocc=X : Motif must appear in X+ focus groups (0 = all) [
- * See also rje_slimcalc options for occurrence-based calculations and filtering *
### SLiMChance Options ###
cloudfix=T/F : Restrict output to clouds with 1+ fixed motif (recommended) [
slimchance=T/F : Execute main QSLiMFinder probability method and outputs [
sigprime=T/F : Calculate more precise (but more computationally intensive) statistical model [
sigv=T/F : Use the more precise (but more computationally intensive) fix to mean UPC probability [
qexact=T/F : Calculate exact Query motif space (True) or over-estimate from dimers (False) (quicker) [
probcut=X : Probability cut-off for returned motifs [
maskfreq=T/F : Whether to use masked AA Frequencies (True), or (False) mask after frequency calculations [
aafreq=FILE : Use FILE to replace individual sequence AAFreqs (FILE can be sequences or aafreq) [
aadimerfreq=FILE : Use empirical dimer frequencies from FILE (fasta or *.aadimer.tdt) (!!!Experimental!!!) [
negatives=FILE : Multiply raw probabilities by under-representation in FILE (!!!Experimental!!!) [
smearfreq=T/F : Whether to "smear" AA frequencies across UPC rather than keep separate AAFreqs [
seqocc=T/F : Whether to upweight for multiple occurrences in same sequence (heuristic) [
probscore=X : Score to be used for probability cut-off and ranking (Prob/Sig) [
### Advanced Output Options I ###
clouds=X : Identifies motif "clouds" which overlap at 2+ positions in X+ sequences (0=minocc / -
runid=X : Run ID for resfile (allows multiple runs on same data) [
logmask=T/F : Whether to log the masking of individual sequences [
slimcheck=FILE : Motif file/list to add to resfile output 
### Advanced Output Options II ###
teiresias=T/F : Replace TEIRESIAS, making *.out and *.mask.fasta files [
slimdisc=T/F : Emulate SLiMDisc output format (*.rank & *.dat.rank + TEIRESIAS *.out & *.fasta) [
extras=X : Whether to generate additional output files (alignments etc.) [
- -1 = No output beyond main results file
- 0 = Generate occurrence file and cloud file
- 1 = Generate occurrence file, alignments and cloud file
- 2 = Generate all additional QSLiMFinder outputs
- 3 = Generate SLiMDisc emulation too (equiv
targz=T/F : Whether to tar and zip dataset result files (UNIX only) [
savespace=0 : Delete "unnecessary" files following run (best used with targz): [
- 0 = Delete no files
- 1 = Delete all bar *.upc and *.pickle
- 2 = Delete all bar *.upc (pickle added to tar)
- 3 = Delete all dataset-specific files including *.upc and *.pickle (not *.tar.gz)
### Advanced Output Options III ###
topranks=X : Will only output top X motifs meeting probcut [
minic=X : Minimum information content for returned motifs [
allsig=T/F : Whether to also output all SLiMChance combinations (Sig/SigV/SigPrime/SigPrimeV) [
- * See also rje_slimcalc options for occurrence-based calculations and filtering *
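The combined effect of probcut and topranks ("top X motifs meeting probcut") can be sketched in Python as below. The dictionary keys, the example patterns and scores, and the treatment of topranks values of 0 or less as "no limit" are all assumptions made for this illustration rather than documented behaviour.

```python
def select_motifs(motifs, probcut, topranks):
    """Keep motifs scoring at or below probcut, best first, limited to topranks."""
    passing = sorted(
        (m for m in motifs if m["score"] <= probcut),
        key=lambda m: m["score"],
    )
    # Assumption: a non-positive topranks means no limit on the number returned
    return passing[:topranks] if topranks > 0 else passing

# Invented scores, for illustration only
motifs = [
    {"pattern": "P.LP", "score": 0.20},
    {"pattern": "R..S", "score": 0.01},
    {"pattern": "KL.F", "score": 0.04},
]
selected = select_motifs(motifs, probcut=0.1, topranks=1)
print([m["pattern"] for m in selected])  # ['R..S']
```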
Module Version History
# 0.0 - Initial Compilation based on SLiMFinder 3.5.
# 1.0 - Tested and modified to include AA masking.
# 1.1 - Added sizesort.
# 1.2 - Added the addquery function.
# 1.3 - Updated the output for Max/Min filtering and the pickup options.
# 1.4 - Added additional dictionary and list to store Query dimers and SLiMs for motif space calculations.
# 1.4 - Added qexact=T/F option for calculating Exact Query motif space (True) or estimating from dimers (False).
# 1.5 - Implemented SigV calculation. Modified extras setting.
# 1.6 - Removed excess module imports.
# 1.7 - Fixed "MustHave=LIST" correction of motif space.
# 1.8 - Added cloudfix=T/F Restrict output to clouds with 1+ fixed motif (recommended) [False]. Consolidating output.
# 1.9 - Preparation for QSLiMFinder V2.0 & SLiMCore V2.0 using newer RJE_Object.
# 2.0 - Converted to use rje_obj.RJE_Object as base. Version 1.9 moved to legacy/.
# 2.1.0 - Added PTMData and PTMList options.
# 2.1.1 - Switched feature masking OFF by default to give consistent UniProt versus FASTA behaviour.
# 2.2.0 - Added map and failed outputs for uniprotid=LIST input.
QSLiMFinder REST Output formats
SLiMs and SLiMFinder
Short linear motifs (SLiMs) in proteins are functional microdomains of fundamental importance in many biological
systems. SLiMs typically consist of a 3 to 10 amino acid stretch of the primary protein sequence, of which as few
as two sites may be important for activity. SLiMFinder is a SLiM discovery program building on the principles of
the SLiMDisc software for accounting for evolutionary relationships between input proteins. This stops results
being dominated by motifs shared for reasons of history, rather than function. SLiMFinder runs in two phases:
(1) SLiMBuild constructs the motif search space based on number of defined positions, maximum length of "wildcard
spacers" and allowed amino acid ambiguities; (2) SLiMChance assesses the over-representation of all motifs,
correcting for the size of the SLiMBuild search space. This gives SLiMFinder high specificity.
Protein sequences can be masked prior to SLiMBuild. Disorder masking (using IUPred predictions) is highly
recommended. Other masking options are described in the manual and/or literature.
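The idea of assessing over-representation and then correcting for the size of the search space can be illustrated with a generic calculation in Python. This is a textbook-style sketch, not the SLiMChance implementation: it assumes a single per-cluster occurrence probability p, a binomial model for occurrence in k or more of n unrelated clusters, and a correction of the form 1 - (1 - P)^B for a motif space of size B. All function and parameter names are hypothetical.

```python
from math import comb

def prob_k_or_more(n, k, p):
    """Binomial probability of a motif occurring in k or more of n unrelated clusters."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

def corrected_significance(p_motif, motif_space):
    """Probability of seeing one or more motifs this extreme among motif_space candidates."""
    return 1 - (1 - p_motif) ** motif_space

# Invented numbers: a motif seen in 4 of 10 clusters, with p = 0.05 per cluster
raw = prob_k_or_more(n=10, k=4, p=0.05)
print(raw, corrected_significance(raw, motif_space=1000))
```

The correction makes clear why a large search space demands a very small raw probability before a motif is reported as significant.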
The standard REST server call for SLiMFinder is in the form:
for program documentation and options. A plain text version is accessed with
can be used to retrieve individual parts of the output, matching the tabs in the default
) output. Individual
elements can also be parsed from the full (
) server output,
which is formatted as follows:
... contents for OUTFMT section ...
More options are available through the SLiMFinder server: http://www.slimsuite.unsw.edu.au/servers/slimfinder.php
After running, click on the
tab to see overall SLiM predictions. If any SLiMs have been predicted, the
tab will have details of which proteins (and where) they occur.
If no SLiMs are returned:  Try altering the masking settings. (Disorder masking is recommended. Conservation
masking can sometimes help but it depends on the dataset.)  Try relaxing the probability cutoff. Set
to see the best motifs, regardless of significance. (You may also want to reduce the
Available REST Outputs
= Main results table of predicted SLiM patterns (if any) [
= Occurrence table showing individual SLiM occurrences in input proteins [
= List of Unrelated Protein Clusters (UPC) used for evolutionary corrections [
= Predicted SLiM "cloud" output, which groups overlapping motifs [
= Input sequence data [
= Parsed input sequences in fasta format, used for UPC generation etc. [
= Masked input sequences (masked residues marked with
= Fasta format with positions of SLiM occurrences aligned [
= Fasta format of individual SLiM alignments (unmasked sequences) [
= Fasta format of individual SLiM alignments (masked sequences) [
Additional REST Outputs [extras>1]
To get additional REST outputs, set
. This may increase run times noticeably,
depending on the number of SLiMs returned.
= SLiM predictions reformatted in plain motif format for CompariMotif [
= Results of all-by-all CompariMotif search of predicted SLiMs [
= SLiMs, occurrences and motif relationships in a Cytoscape-compatible network [
= Input sequence distance matrix [
= Main table in SLiMDisc output format [
= Occurrence table in SLiMDisc output format [
= Motif prediction output in TEIRESIAS format [
= TEIRESIAS masked fasta output [