|
QSLiMFinder V2.3.0: Query Short Linear Motif Finder
Copyright © 2008 Richard J. Edwards - See source code for GNU License Notice
Imported modules:
See SLiMSuite Blog for further documentation. See

Function
QSLiMFinder is a modification of the basic SLiMFinder tool to specifically look for SLiMs shared by a query sequence and one or more additional sequences. To do this, SLiMBuild first identifies all motifs that are present in the query sequence before removing it (and its UPC) from the dataset. The rest of the search and the statistics then use the remainder of the dataset, but only for motifs found in the query. The final correction for multiple testing is made using a motif space defined by the original query sequence, rather than the full potential motif space used by the original SLiMFinder. This smaller motif space is offset by the increased probability of the observed motif support values, because removing the query sequence reduces support, but the approach could still identify SLiMs with increased significance. Note that minocc and ambocc values *include* the query sequence, e.g.

Commandline
Basic Input/Output Options
SLiMBuild Options I - Evolutionary Filtering
SLiMBuild Options II - Input Masking
SLiMBuild Options III - Basic Motif Construction
SLiMBuild Options IV - Ambiguity
SLiMBuild Options V - Advanced Motif Filtering
SLiMChance Options
Advanced Output Options I - Output data
Advanced Output Options II - Output formats
Advanced Output Options III - Additional Motif Filtering
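The query-restricted search described under Function can be illustrated with a minimal Python sketch. It is not the QSLiMFinder implementation: the function names (`query_motifs`, `support`, `query_restricted_search`), the toy sequences, and the restriction to two-position motifs are assumptions made for illustration, and support is counted per sequence here rather than per UPC. The one detail taken from the text above is that the occurrence count compared against minocc includes the query.

```python
import re

def query_motifs(query, max_wildcard=2):
    """Collect two-position motifs (e.g. 'P.L') that occur in the query."""
    motifs = set()
    for i, first in enumerate(query):
        for gap in range(max_wildcard + 1):
            j = i + gap + 1
            if j < len(query):
                # '.' is a regex wildcard standing in for a spacer position
                motifs.add(first + "." * gap + query[j])
    return motifs

def support(motif, sequences):
    """Number of sequences with at least one match to the motif."""
    pattern = re.compile(motif)
    return sum(1 for seq in sequences if pattern.search(seq))

def query_restricted_search(query, others, minocc=2):
    """Score only motifs found in the query against the remaining sequences."""
    results = {}
    for motif in sorted(query_motifs(query)):
        occ = 1 + support(motif, others)  # +1 because minocc includes the query
        if occ >= minocc:
            results[motif] = occ
    return results

# Hypothetical toy data: one query and two other sequences
print(query_restricted_search("PTL", ["PSL", "MKW"], minocc=2))
```

Because only motifs present in the query are ever scored, the number of tests to correct for is bounded by the size of `query_motifs(query)` rather than by the full motif space.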
History
Module Version History
# 0.0 - Initial Compilation based on SLiMFinder 3.5.
# 1.0 - Test & Modified to include AA masking.
# 1.1 - Added sizesort.
# 1.2 - Added the addquery function.
# 1.3 - Updated the output for Max/Min filtering and the pickup options.
# 1.4 - Added additional dictionary and list to store Query dimers and SLiMs for motif space calculations.
# 1.4 - Added qexact=T/F option for calculating Exact Query motif space (True) or estimating from dimers (False).
# 1.5 - Implemented SigV calculation. Modified extras setting.
# 1.6 - Removed excess module imports.
# 1.7 - Fixed "MustHave=LIST" correction of motif space.
# 1.8 - Added cloudfix=T/F Restrict output to clouds with 1+ fixed motif (recommended) [False]. Consolidating output.
# 1.9 - Preparation for QSLiMFinder V2.0 & SLiMCore V2.0 using newer RJE_Object.
# 2.0 - Converted to use rje_obj.RJE_Object as base. Version 1.9 moved to legacy/.
# 2.1.0 - Added PTMData and PTMList options.
# 2.1.1 - Switched feature masking OFF by default to give consistent Uniprot versus FASTA behaviour.
# 2.2.0 - Added map and failed outputs for uniprotid=LIST input.
# 2.3.0 - Modified qregion=X,Y to be 1-L numbering.

QSLiMFinder REST Output formats

SLiMs and SLiMFinder
Short linear motifs (SLiMs) in proteins are functional microdomains of fundamental importance in many biological systems. SLiMs typically consist of a 3 to 10 amino acid stretch of the primary protein sequence, of which as few as two sites may be important for activity. SLiMFinder is a SLiM discovery program building on the principles of the SLiMDisc software for accounting for evolutionary relationships between input proteins. This stops results being dominated by motifs shared for reasons of history, rather than function.
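To make the point about evolutionary relationships concrete, the sketch below counts motif support per cluster of related proteins instead of per sequence, so that a motif shared only by close relatives is counted once. This is an illustrative assumption about how such a correction can work, not SLiMFinder's code: the function name `cluster_support`, the sequence names, and the hand-assigned cluster labels are all hypothetical.

```python
import re

def cluster_support(motif, sequences, clusters):
    """Count clusters (not sequences) containing at least one motif match.

    sequences: dict mapping sequence name -> amino acid string
    clusters:  dict mapping sequence name -> cluster label
    """
    pattern = re.compile(motif)
    hit_clusters = {
        clusters[name]
        for name, seq in sequences.items()
        if pattern.search(seq)
    }
    return len(hit_clusters)

# Hypothetical toy data: two related proteins share cluster 0
sequences = {"a": "APSLK", "a_paralog": "APSLR", "b": "MKPTLW"}
clusters = {"a": 0, "a_paralog": 0, "b": 1}

print(cluster_support("P.L", sequences, clusters))  # three sequences match, two clusters
```

In this toy input the exact motif `PSL` is found in two sequences, but both belong to the same cluster, so its cluster-level support is 1.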
SLiMFinder runs in two phases: (1) SLiMBuild constructs the motif search space based on number of defined positions, maximum length of "wildcard spacers" and allowed amino acid ambiguities; (2) SLiMChance assesses the over-representation of all motifs, correcting for the size of the SLiMBuild search space. This gives SLiMFinder high specificity. Protein sequences can be masked prior to SLiMBuild. Disorder masking (using IUPred predictions) is highly recommended. Other masking options are described in the manual and/or literature.

Running SLiMFinder
The standard REST server call for SLiMFinder is in the form:
slimfinder
Run with &rest=docs for program documentation and options. A plain text version is accessed with &rest=help. &rest=OUTFMT can be used to retrieve individual parts of the output, matching the tabs in the default (&rest=format) output. Individual OUTFMT elements can also be parsed from the full (&rest=full) server output, which is formatted as follows:
###~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~###
# OUTFMT:
... contents for OUTFMT section ...
More options are available through the SLiMFinder server: http://www.slimsuite.unsw.edu.au/servers/slimfinder.php
After running, click on the main tab to see overall SLiM predictions. If any SLiMs have been predicted, the occ tab will have details of which proteins (and where) they occur.
If no SLiMs are returned:
[1] Try altering the masking settings. (Disorder masking is recommended. Conservation masking can sometimes help but it depends on the dataset.)
[2] Try relaxing the probability cutoff. Set probcut=1.0 to see the best motifs, regardless of significance. (You may also want to reduce the topranks=X setting.)
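The &rest=full layout described above can be split into named sections with a few lines of Python. This is a sketch under stated assumptions rather than an official client: it assumes each section starts with a separator line beginning `###~`, followed by a header line of the form `# OUTFMT:`, with the section contents on the lines after that. The function name `split_rest_output` and the sample text are hypothetical, and no server call is made.

```python
import re

# Header line assumed to look like "# main:" or "# occ:"
HEADER = re.compile(r"^# (\S+):\s*(.*)$")

def split_rest_output(text):
    """Split full REST output text into a dict of {OUTFMT name: contents}."""
    sections = {}
    current = None
    after_separator = False
    for line in text.splitlines():
        if line.startswith("###~"):
            after_separator = True
            continue
        # Only treat a line as a header if it directly follows a separator
        match = HEADER.match(line) if after_separator else None
        after_separator = False
        if match:
            current = match.group(1)
            sections[current] = []
            if match.group(2):
                sections[current].append(match.group(2))
        elif current is not None:
            sections[current].append(line)
    return {name: "\n".join(lines).strip() for name, lines in sections.items()}

# Hypothetical sample, not real server output
sample = (
    "###~~~~~~###\n# main:\nDataset,Pattern\nquery,P.L\n"
    "###~~~~~~###\n# occ:\nSeq,Pos\n"
)
parsed = split_rest_output(sample)
print(sorted(parsed))
```

Requiring a separator before each header keeps ordinary content lines that happen to start with `# ` from being mistaken for a new section.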
Available REST Outputs
main = Main results table of predicted SLiM patterns (if any) [extras=-1]
occ = Occurrence table showing individual SLiM occurrences in input proteins [extras=0]
upc = List of Unrelated Protein Clusters (UPC) used for evolutionary corrections [extras=0]
cloud = Predicted SLiM "cloud" output, which groups overlapping motifs [extras=1]
seqin = Input sequence data [extras=-1]
slimdb = Parsed input sequences in fasta format, used for UPC generation etc. [extras=0]
masked = Masked input sequences (masked residues marked with X) [extras=1]
mapping = Fasta format with positions of SLiM occurrences aligned [extras=1]
motifaln = Fasta format of individual SLiM alignments (unmasked sequences) [extras=1]
maskaln = Fasta format of individual SLiM alignments (masked sequences) [extras=1]

Additional REST Outputs [extras>1]
To get additional REST outputs, set extras to a value greater than 1, as listed for each output below. This may increase run times noticeably, depending on the number of SLiMs returned.
motifs = SLiM predictions reformatted in plain motif format for CompariMotif [extras=2]
compare = Results of all-by-all CompariMotif search of predicted SLiMs [extras=2]
xgmml = SLiMs, occurrences and motif relationships in a Cytoscape-compatible network [extras=2]
dismatrix = Input sequence distance matrix [extras=3]
rank = Main table in SLiMDisc output format [extras=3]
dat.rank = Occurrence table in SLiMDisc output format [extras=3]
teiresias = Motif prediction output in TEIRESIAS format [extras=3 teiresias=T]
teiresias.fasta = TEIRESIAS masked fasta output [extras=3 teiresias=T]

© 2015 RJ Edwards. Contact: richard.edwards@unsw.edu.au. |