BLAST+ Control Module
Copyright © 2013 Richard J. Edwards - See source code for GNU License Notice
This is an updated BLAST module that utilises the improved BLAST+ library rather than the old Legacy BLAST. During the upgrade, other improvements are also being made to the module organisation in line with more recent tools in the SeqSuite package. In particular, BLAST Search, Hit and PWAln objects are being replaced by rje_Database tables and entries. This will allow greater flexibility in summary outputs in future. GABLAM statistics will also be entered directly into a Database table. To minimise memory requirements, these tables can be cleared as each BLAST result is read in if the data are not needed. This revised structure will also enable tabular results from other searches to be read in future, as required.
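The memory-saving idea described above (read one BLAST result into a table, summarise it, then clear it) can be illustrated with a minimal sketch. This is not the rje_Database implementation: it assumes plain BLAST+ tabular output (`-outfmt 6`, default columns) with each query's rows contiguous, and the names `iter_query_tables` and `summarise` are hypothetical helpers invented for this example.

```python
import csv
import io
from itertools import groupby

# Default column order of BLAST+ tabular output (-outfmt 6).
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def iter_query_tables(handle):
    """Yield (query, rows) for one query at a time.

    Assumes rows for the same query are contiguous, so only one query's
    rows need to be held in memory at once.
    """
    reader = csv.DictReader(handle, fieldnames=FIELDS, delimiter="\t")
    for query, rows in groupby(reader, key=lambda row: row["qseqid"]):
        yield query, list(rows)


def summarise(handle):
    """Keep a small summary per query; the full hit rows are discarded."""
    summary = {}
    for query, rows in iter_query_tables(handle):
        best = max(rows, key=lambda row: float(row["bitscore"]))
        summary[query] = {
            "hits": len({row["sseqid"] for row in rows}),
            "best_hit": best["sseqid"],
            "best_bitscore": float(best["bitscore"]),
        }
    return summary


# Made-up rows for illustration only.
EXAMPLE = (
    "q1\ts1\t98.0\t100\t2\t0\t1\t100\t1\t100\t1e-50\t180.0\n"
    "q1\ts2\t80.0\t90\t18\t0\t5\t94\t3\t92\t1e-20\t95.5\n"
    "q2\ts1\t75.0\t60\t15\t0\t1\t60\t10\t69\t1e-10\t60.2\n"
)

if __name__ == "__main__":
    print(summarise(io.StringIO(EXAMPLE)))
```

Because each query's rows are dropped once summarised, peak memory depends on the largest single result rather than on the whole search.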
In Version 2.0, the old BLAST options are included but these will be upgraded to newer BLAST+ options. To run the
old version, use
## Search Options ##
## Output options ##
See also rje.py generic commandline options.
History
Module Version History
# 2.0 - Initial Compilation from rje_blast_V1 V1.14.
# 2.1 - Tweaking code to work with GOPHER 3.x - removing self.info etc. Added blastObj() method.
# 2.2 - Added gablamData() to return old-style GABLAM dictionary from table.
# 2.3 - Added blastCluster() method to return UPC clustering and GABLAM distance matrix from a file.
# 2.4 - Scrapped BLAST "Run" field to simplify code - keep a single run per BLASTRun object.
# 2.5 - Minor modifications for SLiMCore UPC generation.
# 2.6 - Minor bug fixes.
# 2.7 - Fixed occasional oneline versus description mismatch error. Fixed some localhits bugs.
# 2.7.1 - Added capacity to keep alignments following GABLAM calculations.
# 2.7.2 - Fixed bug with hitToSeq fasta output for rje_seqlist.SeqList objects.
# 2.8.0 - A more significant BLAST e-value setting will filter read results.
# 2.9.0 - Added qassemble=T/F : Whether to fully assemble query stats from all hits [False].
# 2.9.1 - Updated default BLAST and BLAST+ paths to '' for added modules.
# 2.10.0 - Added nocoverage calculation based on local alignment table.
# 2.11.0 - Added localFragFas output method.
# 2.11.1 - Fixed snp local table revcomp bug. [Check this!]
# 2.11.2 - Fixed GABLAM calculation bug when '*' in protein sequences.
# 2.12.0 - Added localidcut %identity filter for GABLAM calculations.
# 2.13.0 - Added GFF and SAM output for BLAST local tables for GABLAM, PAGSAT etc.
# 2.14.0 - Updated gablamfrag=X and fragmerge=X usage. Fixed localFragFas position output.
# 2.15.0 - Fragmerge no longer removes flanks and can be negative for enforced overlap!
# 2.16.0 - Added qassemblefas mode for generating fasta file from outfmt 4 run.
# 2.16.1 - Improved error messages for BLAST QAssembly.
# 2.17.0 - qconsensus=X : Whether to convert QAssemble alignments to consensus sequences (None/Hit/Full) [None]
# 2.17.1 - Modified QAssembleFas output sequence names for better combining of hits. Added QFasDir.
# 2.17.2 - Modified QAssembleFas output file names for better re-running. Fixed major QConsensus Bug.
# 2.18.0 - Added REST output. Fixed QConsensus=Full bug.
# 2.19.0 - Added blastgz=T/F : Whether to zip and unzip BLAST results files [False]
# 2.19.1 - Fixed erroneous i=-1 blastprog over-ride but not sure why it was happening.
# 2.20.0 - Added localGFF output
# 2.21.0 - Added blasttask=X setting for BLAST -task ['megablast']
# 2.22.0 - Added dust filter for blastn and setting blastprog based on blasttask
# 2.22.1 - Added trimLocal error catching for exonerate issues.
# 2.22.2 - Fixed GFF attribute case issue.
# 2.23.3 - Fixed LocalIDCut error for GABLAM and QAssemble stat filtering.
# 2.24.0 - Added checkblast=T/F : Whether to check BLAST paths etc. on initiation [True]
# 2.24.1 - Fixed GFF output for atypical local tables.
# 2.25.0 - Added bitscore=T/F toggle to switch between BitScore (True) and regular Score (False) [True]
rje_blast REST Output formats
The standard REST call is in the form:
will be parsed into summary tables. If
each query with its hits. Each local alignment is a separate sequence in the alignment unless
used to convert QAssemble alignments to consensus sequences.
a tie, the query sequence is used if it's one of the options, else the highest ranked one is used.
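The tie-break rule above can be sketched as follows. This is an illustration only, not the module's code: it assumes the consensus is taken column by column as a majority vote over the hit residues, that hits are supplied in rank order (best first), and that all sequences are already aligned to the same length. The function names are hypothetical.

```python
from collections import Counter


def consensus_residue(query_res, hit_residues):
    """Choose one residue for an alignment column.

    hit_residues must be ordered by hit rank, best first (an assumption
    of this sketch). A tie goes to the query residue if it is among the
    tied options, otherwise to the highest-ranked tied hit.
    """
    counts = Counter(hit_residues)
    if not counts:
        return query_res  # no hits cover this column
    top = max(counts.values())
    tied = [res for res in counts if counts[res] == top]
    if len(tied) == 1:
        return tied[0]
    if query_res in tied:
        return query_res
    # First tied residue encountered in rank order.
    return next(res for res in hit_residues if res in tied)


def consensus_sequence(query, hits):
    """Build a consensus over equal-length, pre-aligned sequences."""
    return "".join(
        consensus_residue(q, [hit[i] for hit in hits])
        for i, q in enumerate(query)
    )


if __name__ == "__main__":
    print(consensus_sequence("ACGT", ["ACGA", "TCGA", "TGCC"]))
```

Gap characters are treated like any other symbol here; a real implementation would need a policy for them.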
which is formatted as follows:
###~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~### # OUTFMT: ... contents for OUTFMT section ...
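Output in this layout can be split back into its sections with a few lines of Python. The sketch below assumes that every section starts with a `###~~~...~~~###` separator line followed by a `# OUTFMT:` header line on its own, and that everything up to the next separator belongs to that section. The function name and the section names in the example text are made up for illustration.

```python
import re

SEPARATOR = re.compile(r"^###~+###\s*$")
HEADER = re.compile(r"^# (\w+):\s*$")


def split_rest_sections(text):
    """Return {section name: section text} from separator-delimited output."""
    sections = {}
    current = None
    expect_header = False
    for line in text.splitlines():
        if SEPARATOR.match(line):
            # A separator closes the current section; a header should follow.
            expect_header = True
            current = None
            continue
        if expect_header:
            expect_header = False
            match = HEADER.match(line)
            if match:
                current = match.group(1)
                sections[current] = []
                continue
        if current is not None:
            sections[current].append(line)
    return {name: "\n".join(lines).strip() for name, lines in sections.items()}


# Illustrative text only; the section names are hypothetical.
EXAMPLE = "\n".join([
    "###~~~~~~~~~~###",
    "# LOG:",
    "example log line",
    "###~~~~~~~~~~###",
    "# HITSUM:",
    "Query\tHitNum",
    "q1\t2",
])

if __name__ == "__main__":
    for name, body in split_rest_sections(EXAMPLE).items():
        print(name, repr(body))
```

Lines that appear before the first recognised header are ignored, which keeps the parser tolerant of any preamble the server may emit.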
After running, click on the
file will usually also be output:
Available REST Outputs
NOTE: If run in
query will have QAssembly output in the
qassembly alignment, named after their accession numbers.
© 2015 RJ Edwards. Contact: email@example.com.