|
rje_blast V2.27.0
BLAST+ Control Module
Copyright © 2013 Richard J. Edwards - See source code for GNU License Notice
Imported modules:
See SLiMSuite Blog for further documentation. See

Function
This is an updated BLAST module to utilise the improved BLAST+ library rather than the old Legacy BLAST. During the upgrade, other improvements are also being made to the module organisation in line with more recent tools in the SeqSuite package. In particular, BLAST Search, Hit and PWAln objects are being replaced by rje_Database tables and entries. This will allow greater flexibility in summary outputs in future. GABLAM statistics will also be entered directly into a Database table. To minimise memory requirements, these tables can be cleared as each BLAST result is read in if the data is not needed. This revised structure will also enable reading of tabular results from other searches as required in future.

In Version 2.0, the old BLAST options are included but these will be upgraded to newer BLAST+ options. To run the old version, use

Commandline
## Search Options ##
## Output options ##
See also rje.py generic commandline options.

History
Module Version History
# 2.0 - Initial Compilation from rje_blast_V1 V1.14.
# 2.1 - Tweaking code to work with GOPHER 3.x - removing self.info etc. Added blastObj() method.
# 2.2 - Added gablamData() to return old-style GABLAM dictionary from table.
# 2.3 - Added blastCluster() method to return UPC clustering and GABLAM distance matrix from a file.
# 2.4 - Scrapped BLAST "Run" field to simplify code - keep a single run per BLASTRun object.
# 2.5 - Minor modifications for SLiMCore UPC generation.
# 2.6 - Minor bug fixes.
# 2.7 - Fixed occasional oneline versus description mismatch error. Fixed some localhits bugs.
# 2.7.1 - Added capacity to keep alignments following GABLAM calculations.
# 2.7.2 - Fixed bug with hitToSeq fasta output for rje_seqlist.SeqList objects.
# 2.8.0 - A more significant BLAST e-value setting will filter read results.
# 2.9.0 - Added qassemble=T/F : Whether to fully assemble query stats from all hits [False].
# 2.9.1 - Updated default BLAST and BLAST+ paths to '' for added modules.
# 2.10.0 - Added nocoverage calculation based on local alignment table.
# 2.11.0 - Added localFragFas output method.
# 2.11.1 - Fixed snp local table revcomp bug. [Check this!]
# 2.11.2 - Fixed GABLAM calculation bug when '*' in protein sequences.
# 2.12.0 - Added localidcut %identity filter for GABLAM calculations.
# 2.13.0 - Added GFF and SAM output for BLAST local tables for GABLAM, PAGSAT etc.
# 2.14.0 - Updated gablamfrag=X and fragmerge=X usage. Fixed localFragFas position output.
# 2.15.0 - Fragmerge no longer removes flanks and can be negative for enforced overlap!
# 2.16.0 - Added qassemblefas mode for generating fasta file from outfmt 4 run.
# 2.16.1 - Improved error messages for BLAST QAssembly.
# 2.17.0 - qconsensus=X : Whether to convert QAssemble alignments to consensus sequences (None/Hit/Full) [None]
# 2.17.1 - Modified QAssembleFas output sequence names for better combining of hits. Added QFasDir.
# 2.17.2 - Modified QAssembleFas output file names for better re-running. Fixed major QConsensus Bug.
# 2.18.0 - Added REST output. Fixed QConsensus=Full bug.
# 2.19.0 - Added blastgz=T/F : Whether to zip and unzip BLAST results files [False]
# 2.19.1 - Fixed erroneous i=-1 blastprog over-ride but not sure why it was happening.
# 2.20.0 - Added localGFF output
# 2.21.0 - Added blasttask=X setting for BLAST -task ['megablast']
# 2.22.0 - Added dust filter for blastn and setting blastprog based on blasttask
# 2.22.1 - Added trimLocal error catching for exonerate issues.
# 2.22.2 - Fixed GFF attribute case issue.
# 2.23.3 - Fixed LocalIDCut error for GABLAM and QAssemble stat filtering.
# 2.24.0 - Added checkblast=T/F : Whether to check BLAST paths etc. on initiation [True]
# 2.24.1 - Fixed GFF output for atypical local tables.
# 2.25.0 - Added bitscore=T/F toggle to switch between BitScore (True) and regular Score (False) [True]
# 2.26.0 - Initial Python3 code conversion.
# 2.26.1 - Tweaked to handle BLAST v5 formatting.
# 2.27.0 - Modified to handle NCBI nr without main fasta file.

rje_blast REST Output formats
The standard REST call is in the form: blast . By default, results will be parsed into summary tables. If qassemblefas=T, the summary tables will be replaced with alignments of each query with its hits. Each local alignment is a separate sequence in the alignment unless qconsensus=X is used to convert QAssemble alignments to consensus sequences.
* qconsensus=Hit converts QAssemble alignments to consensus sequences by Hit sequence.
* qconsensus=Full converts all QAssemble alignments to a single consensus sequence.
When qconsensus=X is used, the most abundant amino acid or nucleotide at each position is used.
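As a rough illustration of this per-position rule, together with the tie-break stated in the next sentence, the following Python sketch builds a consensus from a query and a set of aligned hit sequences. It is not the rje_blast implementation: the function names are hypothetical, the aligned sequences are assumed to be the same length as the query, gap characters ('-') are assumed to be ignored when counting, the query is assumed not to vote, and "highest ranked" is interpreted here as first in alphabetical order.

```python
from collections import Counter

def column_consensus(query_char, hit_chars):
    """Return the consensus character for one alignment column (hypothetical helper)."""
    # Assumption: gap characters do not count towards abundance.
    counts = Counter(c for c in hit_chars if c != '-')
    if not counts:
        # No hit covers this position, so fall back to the query character.
        return query_char
    top = max(counts.values())
    tied = [c for c, n in counts.items() if n == top]
    if len(tied) == 1:
        return tied[0]
    # Tie: prefer the query character if it is one of the tied options.
    if query_char in tied:
        return query_char
    # Assumption: "highest ranked" is taken to mean first alphabetically.
    return sorted(tied)[0]

def consensus(query, hits):
    """Build a consensus sequence column by column (hits must match query length)."""
    return ''.join(
        column_consensus(q, [h[i] for h in hits])
        for i, q in enumerate(query)
    )
```

For example, `consensus("ACGT", ["ACGA", "TCGA", "TGCA"])` takes the majority character in each of the four columns, and `column_consensus('A', ['A', 'T'])` resolves its tie in favour of the query character.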
In the case of a tie, the query sequence is used if it's one of the options, else the highest ranked one is used.

Run with &rest=docs for program documentation and more options. A plain text version is accessed with &rest=help. &rest=OUTFMT can be used to retrieve individual parts of the output, matching the tabs in the default (&rest=format) output. Individual OUTFMT elements can also be parsed from the full (&rest=full) server output, which is formatted as follows:

###~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~###
# OUTFMT:
... contents for OUTFMT section ...

After running, click on the blastres tab to see overall BLAST search results. The following tables and sequence file will usually also be output:

Available REST Outputs
blastres = Main BLAST results output file
search = Delimited summary of each search [tab]
hit = Delimited summary of all hits [tab]
local = Delimited table of all local BLAST alignments [tab]
blasti = BLAST search input (queries) [fas]
blastd = BLAST search database [fas]
qassembly = Assembled QAssembly hits as aligned fasta for first query [fas]

NOTE: If run in mode, the search, hit and local tables will not be output. The first query will have QAssembly output in the qassembly tab. All queries will also have their own tab with their qassembly alignment, named after their accession numbers.

© 2015 RJ Edwards. Contact: richard.edwards@unsw.edu.au. |
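The &rest=full layout described above (a `###~~~...~~~###` separator line, then a `# OUTFMT:` name line, then that section's contents) could be split into its parts with a short parser along the following lines. This is a minimal sketch, not part of rje_blast: the function name `split_rest_output` is hypothetical, and it assumes the separator, the name line and the contents each start on their own line, which the flattened example does not confirm.

```python
import re

# Assumed separator shape: three hashes, a run of tildes, three hashes.
SECTION_BREAK = re.compile(r'^###~+###\s*$')
# Assumed name line shape: "# NAME:" on a line of its own.
SECTION_NAME = re.compile(r'^#\s*(\S+?):\s*$')

def split_rest_output(text):
    """Split full REST output text into a dict of {section name: contents}."""
    sections = {}
    current = None
    expect_name = False
    for line in text.splitlines():
        if SECTION_BREAK.match(line):
            # The line after a separator should name the next section.
            expect_name = True
            continue
        if expect_name:
            expect_name = False
            match = SECTION_NAME.match(line)
            if match:
                current = match.group(1)
                sections[current] = []
                continue
        if current is not None:
            sections[current].append(line)
    return {name: '\n'.join(lines) for name, lines in sections.items()}
```

With the server response held in a string, `split_rest_output(response)["hit"]` would then give the tab-delimited hit table, assuming a section of that name is present in the output.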