rje_sequence V2.7.1
DNA/Protein sequence object
Copyright © 2006 Richard J. Edwards - See source code for GNU License Notice
See SLiMSuite Blog for further documentation.

Function
This module contains the Sequence Object used to store sequence data for all PEAT applications that use DNA or protein sequences. It has no standalone functionality.

It also contains all the methods for parsing out sequence information, including species and source database, based on the format of the input sequences. If using a consistent but custom format for fasta description lines, please contact me and I can add it to the list of formats currently recognised.

History
Module Version History
# 1.0 - Separated Sequence object from rje_seq.py
# 1.1 - Rudimentary opt['GeneSpAcc'] added
# 1.2 - Modified RegExp for sequence detail extraction
# 1.3 - Added list of secondary accession numbers and hasID() method to check ID and all AccNum (and gnspacc combos)
# 1.4 - Added Peptide Design methods
# 1.5 - Added storing of case in a dictionary self.dict['Case'] = {'Upper':[(start,stop)],'Lower':[(start,stop)]}
# 1.6 - Added disorder and case masking
# 1.7 - Added FudgeFactor and AA codes
# 1.8 - Added position-specific AA masking
# 1.9 - Added EST translation functions. Fixed fudging. Added dna() method.
# 1.10 - Fixed sequence name bug
# 1.11 - Added recognition of UniRef
# 1.12 - Added AA masking
# 1.13 - Added Taxonomy list and UniProt dictionary for UniProt sourced sequences (primarily).
# 1.14 - Added maskRegion()
# 1.15 - Added disorder proportion calculations.
# 1.16 - Added additional Genbank and EnsEMBL BioMart sequence header recognition.
# 1.17 - Added nematode sequence conversion.
# 2.0 - Replaced RJE_Object with RJE_ObjectLite.
# 2.1 - Added re_unirefprot = re.compile('^([A-Za-z0-9\-]+)\s+([A-Za-z0-9]+)_([A-Za-z0-9]+)\s+')
# 2.2 - Added more yeast species.
# 2.3 - Added alternative self.info keys for sequence (for UniProt splice variants). Added SpliceVar dict.
# 2.4 - Added recognition of modified IPI format. Added standalone low complexity masking.
# 2.4.1 - Moved the gnspacc fragment recognition to reduce issues. Should perhaps remove completely?
# 2.5.0 - Added yeast genome renaming.
# 2.5.1 - Modified reverse complement code.
# 2.5.2 - Tried to speed up dna2prot code.
# 2.5.3 - Fixed genetic code warning error.
# 2.6.0 - Added mutation dictionary to Ks calculation.
# 2.7.0 - Added shift=X to maskRegion() for 1-L input. Fixed cterminal maskRegion.
# 2.7.1 - Added spCode() to sequence.

© 2015 RJ Edwards. Contact: richard.edwards@unsw.edu.au.
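As a rough illustration of the format-based header parsing described under Function, the sketch below applies the regular expression quoted in the version 2.1 history entry to two made-up description lines. The helper name split_header and both sample headers are assumptions for illustration only; how rje_sequence itself uses the three captured groups is not shown in this documentation.

```python
import re

# Pattern quoted from the version 2.1 history entry, written here as a raw string.
re_unirefprot = re.compile(r'^([A-Za-z0-9\-]+)\s+([A-Za-z0-9]+)_([A-Za-z0-9]+)\s+')


def split_header(description):
    """Return the three captured fields, or None if the line does not match.

    Hypothetical helper: it is not part of rje_sequence.
    """
    match = re_unirefprot.match(description)
    if match is None:
        return None
    return match.groups()


# Hypothetical description lines, invented for this example.
print(split_header('P12345 ALBU_HUMAN Serum albumin'))     # ('P12345', 'ALBU', 'HUMAN')
print(split_header('sp|P12345|ALBU_HUMAN Serum albumin'))  # None
```

The second line fails because the first group only allows letters, digits and hyphens before the first run of whitespace, so a header in a different layout needs its own pattern. This is presumably why the module recognises a list of formats rather than a single one.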