Function
This module is based on the old rje_motifstats module. It is primarily for calculating empirical attributes of SLiMs
and their occurrences, such as Conservation, Hydropathy and Disorder.
Commandline
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#
### Motif Occurrence Attribute Options ###
slimcalc=LIST
: List of additional attributes to calculate for occurrences - Cons,SA,Hyd,Fold,IUP,Chg,Comp []
winsize=X
: Used to define flanking regions for calculations. If negative, will use flanks *only* [0]
relconwin=X
: Window size for relative conservation scoring [30]
iupath=PATH
: The full path to the IUPred executable [c:/bioware/iupred/iupred.exe]
iucut=X
: Cut-off for IUPred results (0.0 will report mean IUPred score) [0.0]
iumethod=X
: IUPred method to use (long/short) [short]
percentile=X
: Percentile steps to return in addition to mean [0]
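
The `winsize` and `iucut` descriptions above can be read as two small rules, sketched here in Python. This is an illustrative sketch only, not the module's code: the function names, the half-open `(start, end)` coordinates, and the behaviour for `iucut > 0` (reporting the fraction of residues at or above the cut-off) are assumptions made for the example.

```python
def window_regions(start, end, seq_len, winsize=0):
    """Return the (start, end) half-open regions used for a calculation.

    Assumed reading of winsize: a value >= 0 extends the match by that many
    residues on each side; a negative value returns the flanks only.
    """
    width = abs(winsize)
    left = max(0, start - width)        # clip at the sequence start
    right = min(seq_len, end + width)   # clip at the sequence end
    if winsize >= 0:
        return [(left, right)]
    flanks = [(left, start), (end, right)]
    return [(a, b) for a, b in flanks if b > a]  # drop empty flanks


def summarise_disorder(scores, iucut=0.0):
    """Summarise per-residue disorder scores for one region.

    iucut=0.0 reports the mean score (as documented). For iucut > 0 this
    sketch ASSUMES the fraction of residues scoring >= iucut is reported.
    """
    if not scores:
        raise ValueError("no scores supplied")
    if iucut <= 0.0:
        return sum(scores) / len(scores)
    return sum(1 for s in scores if s >= iucut) / len(scores)
```

For example, `window_regions(10, 14, 100, -5)` would give the two flanks `(5, 10)` and `(14, 19)` and exclude the match itself.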
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#
### Alignment Settings ###
usealn=T/F
: Whether to search for and use alignments where present. [False]
alndir=PATH
: Path to pre-made alignment files [./]
alnext=X
: File extension of alignment files, AccNum.X (checked before Gopher used) [orthaln.fas]
gopherdir=PATH
: Path from which to look for GOPHER alignments (if not found in alndir) and/or run GOPHER [./]
usegopher=T/F
: Use GOPHER to generate orthologue alignments missing from alndir - see gopher.py options [False]
fullforce=T/F
: Whether to force regeneration of alignments using GOPHER [False]
orthdb=FILE
: File to use as source of orthologues for GOPHER []
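
The first step of the alignment lookup described above (check `alndir` for `AccNum.X` before GOPHER is considered) might look like the following Python sketch. The function name `find_alignment` is hypothetical, and the GOPHER fallback is deliberately left to the caller because the layout of `gopherdir` is not described here.

```python
import os


def find_alignment(accnum, alndir="./", alnext="orthaln.fas"):
    """Return the path to a pre-made alignment for accnum, or None.

    Builds AccNum.X from the documented alndir/alnext options. A None
    result means the caller would need to look in gopherdir or run GOPHER.
    """
    candidate = os.path.join(alndir, f"{accnum}.{alnext}")
    return candidate if os.path.isfile(candidate) else None
```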
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#
### Conservation Parameters ###
conspec=LIST
: List of species codes for conservation analysis. Can be name of file containing list. [None]
conscore=X
: Type of conservation score used: [rlc]
- abs = absolute conservation of motif using RegExp over matched region
- pos = positional conservation: each position treated independently
- prob = conservation based on probability from background distribution
- prop = conservation of amino acid properties
- rlc = relative local conservation
- all = all methods for comparison purposes
consamb=T/F
: Whether to calculate conservation allowing for degeneracy of motif (True) or of fixed variant (False) [True]
consinfo=T/F
: Weight positions by information content (does nothing for conscore=abs) [True]
consweight=X
: Weight given to global percentage identity for conservation, giving more weight to closer sequences [0]
- 0 gives equal weighting to all. Negative values will upweight distant sequences.
minhom=X
: Minimum number of homologues for making conservation score [1]
homfilter=T/F
: Whether to filter homologues using seqfilter options [False]
alngap=T/F
: Whether to count proteins in alignments that have 100% gaps over the motif (True) or ignore them
as putative sequence fragments (False) [False] (NB. All X regions are ignored as sequence errors.)
posmatrix=FILE
: Score matrix for amino acid combinations used in pos weighting. (conscore=pos builds from propmatrix) [None]
aaprop=FILE
: Amino Acid property matrix file. [aaprop.txt]
masking=T/F
: Whether to use seq.info['MaskSeq'] for Prob cons, if present (else 'Sequence') [True]
vnematrix=FILE
: BLOSUM matrix file to use for VNE relative conservation []
relgappen=T/F
: Whether to invoke the "Gap Penalty" during relative conservation calculations [True]
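
To make the `consweight` behaviour concrete, the sketch below weights each homologue by its global identity raised to the power `consweight`. The exponent form is an assumption, chosen only because it reproduces the documented behaviour (0 weights all homologues equally, positive values favour closer sequences, negative values favour distant ones); the module's actual formula is not given in this section. The function name and the `(identity, conservation)` input pairs are likewise hypothetical.

```python
def weighted_conservation(homologues, consweight=0.0):
    """Combine per-homologue conservation into one weighted mean.

    homologues: list of (identity, conservation) pairs, with identity in
    (0, 1] and conservation in [0, 1].
    ASSUMPTION: weight = identity ** consweight (illustrative only).
    """
    if any(identity <= 0 for identity, _ in homologues):
        raise ValueError("identity must be greater than 0")
    weights = [identity ** consweight for identity, _ in homologues]
    total = sum(weights)
    if total == 0:  # no homologues supplied
        return 0.0
    weighted = sum(w * cons for w, (_, cons) in zip(weights, homologues))
    return weighted / total
```

With one close homologue that conserves the motif, `(0.9, 1.0)`, and one distant homologue that does not, `(0.3, 0.0)`, a `consweight` of 0 scores the pair at 0.5, while positive values pull the score towards the close homologue and negative values towards the distant one.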
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#
### SLiM/Occ Filtering Options ###
slimfilter=LIST
: List of stats to filter (remove matching) SLiMs on, consisting of X*Y []
- X is an output stat (the column header),
- * is an operator in the list >, >=, !=, =, <=, <
- Y is a value that X must have, assessed using *.
This filtering is crude and may behave strangely if X is not a numerical stat!
!!! Remember to enclose in "quotes" for <> filtering !!!
occfilter=LIST
: Same as slimfilter but for individual occurrences []
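
The `X*Y` filter format can be illustrated with a short Python sketch. It is not the module's parser: the function names are hypothetical, rows are assumed to be dictionaries keyed by column header, and `<=` is included in the operator set as an assumption. Non-numeric values fall back to string comparison, which is one way the "crude" behaviour warned about above could arise.

```python
import operator
import re

# Operator symbols mapped to comparison functions ("<=" is assumed).
OPS = {">": operator.gt, ">=": operator.ge, "!=": operator.ne,
       "=": operator.eq, "<=": operator.le, "<": operator.lt}
# Two-character operators are listed first so ">=" is not read as ">".
FILTER_RE = re.compile(r"^(.+?)(>=|<=|!=|>|<|=)(.+)$")


def parse_filter(text):
    """Split a filter such as 'Cons<0.5' into (stat, operator, value)."""
    match = FILTER_RE.match(text.strip())
    if not match:
        raise ValueError(f"cannot parse filter: {text!r}")
    stat, op, value = match.groups()
    return stat.strip(), op, value.strip()


def is_filtered(row, text):
    """True if the row matches the filter and should be removed."""
    stat, op, value = parse_filter(text)
    observed = row[stat]
    try:
        observed, value = float(observed), float(value)
    except (TypeError, ValueError):
        observed, value = str(observed), str(value)  # crude fallback
    return OPS[op](observed, value)


def apply_filters(rows, filters):
    """Keep only rows that match none of the filters."""
    return [r for r in rows if not any(is_filtered(r, f) for f in filters)]
```

Here `apply_filters(rows, ["Cons<0.5"])` would remove every row whose `Cons` column is below 0.5, matching the "remove matching" semantics described above.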
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#
History
Module Version History
# 0.0 - Initial Compilation based on rje_motifstats methods.
# 0.1 - Added new probability-based method, inspired by (but different to) Dinkel & Sticht (2007)
# 0.2 - Mended OccPos finding for wildcards. Added new relative conservation score.
# 0.3 - Added von Neumann entropy code.
# 0.4 - Added Webserver pickling of RLC lists.
# 0.5 - Altered to use GOPHER V3 and handle nested alignment directories.
# 0.6 - Minor tweak to avoid unwanted GOPHER directory generation.
# 0.7 - Added RLC to "All" conscore running.
# 0.8 - Made RLC the default.
# 0.9 - Improvements to use of GOPHER.
# 0.9.1 - Modified combining of motif stats to handle expectString format for individual values.
# 0.9.2 - Changed default conscore in docstring to RLC.
# 0.9.3 - Changed fudge error to warning.
# 0.10.0 - Added extra disorder methods to slimcalc.