Short Linear Motif class module
Copyright © 2007 Richard J. Edwards - See source code for GNU License Notice
See SLiMSuite Blog for further documentation.
This module contains the new SLiM class, which replaces the old Motif class, for use with both SLiMFinder and SLiMSearch. In addition, this module encodes some general motif methods. Note that the new methods are not designed with Mass Spec data in mind and so some of the more complicated regexp designations for unknown amino acid order etc. have been dropped. Because the SLiM class explicitly deals with *short* linear motifs, wildcard gaps are capped at a max length of 9.
The basic SLiM class stores its pattern in several forms:
- info['Sequence'] stores the original pattern given to the Motif object.
- info['Slim'] stores the pattern as a SLiMFinder-style string of defined elements and wildcard spacers.
- dict['MM'] stores lists of Slim strings for each number of mismatches, with flexible lengths enumerated. This is used for actual searches in SLiMSearch.
- dict['Search'] stores the actual regular expression variants used for searching, with a separate entry for each length variant - otherwise Python RegExp gets confused! Keys for this dictionary relate to the number of mismatches allowed in each variant and match those of dict['MM'].
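As a rough illustration of the "separate entry for each length variant" idea, the sketch below expands every variable-length wildcard in a regular expression into fixed-length alternatives and files them under a mismatch count. It is not the module's own code: the function name length_variants(), the `.{m,n}` input notation and the example pattern are assumptions made for this example only.

```python
import itertools
import re

def length_variants(pattern):
    """Expand each '.{m,n}' wildcard into fixed-length regex variants.

    Hypothetical helper for illustration; not part of the SLiM class.
    """
    # re.split with two capture groups yields [literal, m, n, literal, m, n, ..., literal]
    parts = re.split(r'\.\{(\d+),(\d+)\}', pattern)
    literals = parts[0::3]
    ranges = [range(int(lo), int(hi) + 1)
              for lo, hi in zip(parts[1::3], parts[2::3])]
    variants = []
    # One variant per combination of wildcard lengths
    for lengths in itertools.product(*ranges):
        built = literals[0]
        for gap, literal in zip(lengths, literals[1:]):
            built += '.' * gap + literal
        variants.append(built)
    return variants

# Assumed layout: mismatch count -> list of fixed-length search patterns
search = {0: length_variants('[RK].{1,2}L.{0,1}P')}
```

Each entry in search[0] is then a plain fixed-length regular expression that can be compiled and searched independently, so a hit can be traced back to one specific length variant.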
The following were previously used by the Motif class and may be revived for the new SLiM class if needed:
- list['Variants'] stores simple strings of all the basic variants - length and ambiguity - for identifying the "best" variant for any given match.
The SLiM class is designed for use with the SLiMList class. When a SLiM is added to a SLiMList object, the SLiM.format() command is called, which generates the 'Slim' string. After this - assuming it is to be kept - SLiM.makeVariants() makes the 'Variants' list. If creating a motif object in another module, these methods should be called before any sequence searching is performed. If mismatches are being used, the SLiM.misMatches() method must also be called.
SLiM occurrences are stored in the dict['Occ'] attribute. The keys for this are Sequence objects and values are either a simple list of positions (1 to L) or a dictionary of attributes with positions as keys.
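The two value shapes described for dict['Occ'] can be pictured with the sketch below. Everything in it is an assumption for illustration: Seq is a stand-in for the real Sequence object, find_occurrences() is a hypothetical helper, and the 'Match' attribute key is invented here rather than taken from the module.

```python
import re

class Seq:
    """Hypothetical stand-in for a Sequence object; not the SLiMSuite class."""
    def __init__(self, name, sequence):
        self.name = name
        self.sequence = sequence

def find_occurrences(regex, sequence):
    """Return (position, matched text) pairs with 1-based positions.

    A lookahead is used so that overlapping occurrences are all reported.
    """
    finder = re.compile('(?=(%s))' % regex)
    return [(m.start() + 1, m.group(1)) for m in finder.finditer(sequence)]

seq = Seq('demo', 'MPLPPLPAAPSAP')
hits = find_occurrences('P..P', seq.sequence)

# Shape 1: Sequence object -> simple list of positions (1 to L)
occ_simple = {seq: [pos for pos, _ in hits]}

# Shape 2: Sequence object -> {position: attribute dictionary}
occ_detailed = {seq: {pos: {'Match': match} for pos, match in hits}}
```

Using the object itself as the dictionary key works here because Python objects are hashable by identity by default; the same Seq instance must be used to look the occurrences up again.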
These options should be listed in the docstring of the module using the motif class:
History Module Version History
# 0.0 - Initial Compilation.
# 1.0 - Initial working version.
# 1.1 - Added DNA option.
# 1.2 - Added "N of M or B" format options.
# 1.3 - Fixed the terminal variant bug.
# 1.4 - Added makeSLiM method for converting a list of instances into a regexp
# 1.5 - Added method to report whether motif splitting is necessary.
# 1.6 - Fixed splitting bug introduced by lower case motifs.
# 1.7 - Fixed import slimFix(slim) error that was reporting slimProb().
# 1.8 - Modified use of aa/dna defaults to (hopefully) not break when using extended alphabets.
# 1.9 - Reinstated ambcut for slimToPattern()
# 1.10.0 - Added varlength option to makeSlim() method.
# 1.10.1 - Fixed varlength and terminal position compatibility.
# 1.10.2 - Fixed issue of  returns.
# 1.10.3 - Fixed makeSlim bug with variable length wildcards at start of sequence.
# 1.11.0 - Added splitMotif() function.
# 1.12.0 - Added equiv to makeSlim() function.
# 1.12.1 - Modified error message.
© 2015 RJ Edwards. Contact: firstname.lastname@example.org.