SLiMSuite REST Server



rje_blast V2.22.2

BLAST+ Control Module

Module: rje_blast
Description: BLAST+ Control Module
Version: 2.22.2
Last Edit: 14/05/18

Copyright © 2013 Richard J. Edwards - See source code for GNU License Notice


Imported modules: rje rje_blast_V1 rje_db rje_obj rje_seq rje_seqlist rje_sequence rje_zen rje_dismatrix_V2


See SLiMSuite Blog for further documentation. See rje for general commands.

Function

This is an updated BLAST module that uses the improved BLAST+ library rather than the old Legacy BLAST. During the upgrade, other improvements are also being made to the module organisation in line with more recent tools in the SeqSuite package. In particular, BLAST Search, Hit and PWAln objects are being replaced by rje_Database tables and entries. This will allow greater flexibility in summary outputs in future. GABLAM statistics will also be entered directly into a Database table. To minimise memory requirements, these tables can be cleared as each BLAST result is read in if the data is not needed. This revised structure will also enable tabular results from other searches to be read in future, as required.

In Version 2.0, the old BLAST options are included but these will be upgraded to newer BLAST+ options. To run the old version, use BLASTRun.run(oldblast=True).

Commandline

## Search Options ##
blastprog=X : BLAST program to use. blastp=X also recognised. (BLAST -p X) [blastp]
blasttask=X : Flavour of blast to use (BLAST -task X) (NOTE: blastn default is megablast) [blast+ default]
blasti=FILE : Input file (BLAST -i FILE) [None]
blastd=FILE : BLAST database (BLAST -d FILE) [None]
formatdb=T/F : Whether to (re)format BLAST database [False]
blaste=X : E-Value cut-off for BLAST searches (BLAST -e X) [1e-4]
blastv=X : Number of one-line hits per query (BLAST -v X) [500]
blastb=X : Number of hit alignments per query (BLAST -b X) [500]
tophits=X : Sets max number of BLAST hits returned (blastb and blastv) [500]
blastf=T/F : Complexity Filter (BLAST -F X) [True]
blastcf=T/F : Use BLAST Composition-based statistics (BLAST -C X) [False]
blastg=T/F : Gapped BLAST (BLAST -g X) [True]
softmask=T/F : Whether to use soft masking for searches [True]
blastopt=FILE : File containing raw BLAST options (applied after all others) []
## Standalone Run Options ##
savelocal=LIST : Whether to generate extra output for the local BLAST hits table (GFF3/SAM/TDT/TDTSEQ) []
reftype=X : Whether to map SAM/GFF3 hits onto the Qry, Hit, Both or Combined [Hit]
qassemblefas=T/F : Special mode for running with outfmt=4 and then converting to fasta file [False]
qcomplete=T/F : Whether the query sequence should be full-length in qassemblefas output [False]
qconsensus=X : Whether to convert QAssemble alignments to consensus sequences (None/Hit/Full) [None]
qfasdir=PATH : Output directory for QAssemble alignments [./QFAS/]
## GABLAM Parameters ##
gablamfrag=X : Length of gaps between mapped residues for fragmenting local hits [100]
fragmerge=X : Max Length of gaps between fragmented local hits to merge [0]
localcut=X : Cut-off length for local alignments contributing to global GABLAM stats [0]
localidcut=PERC : Cut-off local %identity for local alignments contributing to global GABLAM stats [0.0]
qassemble=T/F : Whether to fully assemble query stats from all hits [False]
selfsum=T/F : Whether to also include self hits in qassemble output [False] * qassemble must also be T *

## Output options ##
blasto=FILE : Output file (BLAST -o FILE) [*.blast]
restab=LIST : Whether to output summary results tables (Run/Search/Hit/Local/GABLAM) [Search,Hit]
runfield=T/F : Whether to include Run Field in summary tables. (Useful if appending.) [False]
## System Parameters ##
blastpath=PATH : Path to BLAST programs ['']
blast+path=PATH : Path to BLAST+ programs (will use blastpath if not given) ['']
legacy=T/F : Whether to run in "legacy" mode using old BLAST commands etc. (Currently uses BLAST) [False]
oldblast=T/F : Whether to run with old BLAST programs rather than new BLAST+ ones [False]
blasta=X : Number of processors to use (BLAST -a X) [1]
blastforce=T/F : Whether to force regeneration of new BLAST results if already existing [False]
ignoredate=T/F : Ignore date stamps when deciding whether to regenerate files [False]
blastgz=T/F : Whether to gzip (and gunzip) BLAST results files if keeping (not Windows) [False]

See also rje.py generic commandline options.
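The options above are all given in `key=value` form, with `T/F` for flags and comma-separated values for LIST options (as in the `restab` default). A minimal Python sketch of assembling such a command line follows. The script name `rje_blast.py`, the helper `build_command` and the file names are assumptions made for illustration and are not taken from the module itself.

```python
import shlex


def build_command(options, script="rje_blast.py"):
    """Return an argv list of key=value options for a standalone run.

    `script` is an assumed file name for the module; adjust to the local install.
    """
    argv = ["python", script]
    for key, value in options.items():
        if isinstance(value, bool):
            value = "T" if value else "F"            # flags are written as T/F
        elif isinstance(value, (list, tuple)):
            value = ",".join(str(v) for v in value)  # LIST options are comma-separated
        argv.append(f"{key}={value}")
    return argv


# Hypothetical input files; the option names are those documented above.
cmd = build_command({
    "blastprog": "blastp",
    "blasti": "queries.fas",
    "blastd": "database.fas",
    "blaste": "1e-4",
    "formatdb": True,
    "restab": ["Search", "Hit", "Local"],
})
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run` unchanged, which avoids shell quoting problems with file paths.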

History

Module Version History

    # 2.0 - Initial Compilation from rje_blast_V1 V1.14.
    # 2.1 - Tweaking code to work with GOPHER 3.x - removing self.info etc. Added blastObj() method.
    # 2.2 - Added gablamData() to return old-style GABLAM dictionary from table.
    # 2.3 - Added blastCluster() method to return UPC clustering and GABLAM distance matrix from a file.
    # 2.4 - Scrapped BLAST "Run" field to simplify code - keep a single run per BLASTRun object.
    # 2.5 - Minor modifications for SLiMCore UPC generation.
    # 2.6 - Minor bug fixes.
    # 2.7 - Fixed occasional oneline versus description mismatch error. Fixed some localhits bugs.
    # 2.7.1 - Added capacity to keep alignments following GABLAM calculations.
    # 2.7.2 - Fixed bug with hitToSeq fasta output for rje_seqlist.SeqList objects.
    # 2.8.0 - A more significant BLAST e-value setting will filter read results.
    # 2.9.0 - Added qassemble=T/F : Whether to fully assemble query stats from all hits [False].
    # 2.9.1 - Updated default BLAST and BLAST+ paths to '' for added modules.
    # 2.10.0 - Added nocoverage calculation based on local alignment table.
    # 2.11.0 - Added localFragFas output method.
    # 2.11.1 - Fixed snp local table revcomp bug. [Check this!]
    # 2.11.2 - Fixed GABLAM calculation bug when '*' in protein sequences.
    # 2.12.0 - Added localidcut %identity filter for GABLAM calculations.
    # 2.13.0 - Added GFF and SAM output for BLAST local tables for GABLAM, PAGSAT etc.
    # 2.14.0 - Updated gablamfrag=X and fragmerge=X usage. Fixed localFragFas position output.
    # 2.15.0 - Fragmerge no longer removes flanks and can be negative for enforced overlap!
    # 2.16.0 - Added qassemblefas mode for generating fasta file from outfmt 4 run.
    # 2.16.1 - Improved error messages for BLAST QAssembly.
    # 2.17.0 - Added qconsensus=X : Whether to convert QAssemble alignments to consensus sequences (None/Hit/Full) [None]
    # 2.17.1 - Modified QAssembleFas output sequence names for better combining of hits. Added QFasDir.
    # 2.17.2 - Modified QAssembleFas output file names for better re-running. Fixed major QConsensus Bug.
    # 2.18.0 - Added REST output. Fixed QConsensus=Full bug.
    # 2.19.0 - Added blastgz=T/F : Whether to zip and unzip BLAST results files [False]
    # 2.19.1 - Fixed erroneous i=-1 blastprog over-ride but not sure why it was happening.
    # 2.20.0 - Added localGFF output
    # 2.21.0 - Added blasttask=X setting for BLAST -task ['megablast']
    # 2.22.0 - Added dust filter for blastn and setting blastprog based on blasttask
    # 2.22.1 - Added trimLocal error catching for exonerate issues.
    # 2.22.2 - Fixed GFF attribute case issue.

rje_blast REST Output formats

The standard REST call is in the form: blast&blasti=FASFILE&blastd=FASFILE&blastprog=X. By default, results
will be parsed into summary tables. If qassemblefas=T, the summary tables will be replaced with alignments of
each query with its hits. Each local alignment is a separate sequence in the alignment unless qconsensus=X is
used to convert QAssemble alignments to consensus sequences.

* qconsensus=Hit converts QAssemble alignments to consensus sequences by Hit sequence.
* qconsensus=Full converts all QAssemble alignments to a single consensus sequence.

When qconsensus=X is used, the most abundant amino acid or nucleotide at each position is used. In the case of
a tie, the query sequence is used if it's one of the options, else the highest ranked one is used.
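The consensus rule described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the module's own code: gaps (`-`) are ignored when counting, only the hit residues are counted, and "highest ranked" is taken to mean first in alphabetical order, since the text does not define the ranking. The function names are hypothetical.

```python
from collections import Counter


def consensus_column(residues, query_residue):
    """Pick the consensus residue for one alignment column."""
    counts = Counter(r for r in residues if r != "-")  # assumption: gaps are not counted
    if not counts:
        return "-"
    top = max(counts.values())
    tied = [r for r, n in counts.items() if n == top]
    if len(tied) == 1:
        return tied[0]            # clear majority
    if query_residue in tied:
        return query_residue      # tie broken in favour of the query
    return sorted(tied)[0]        # assumption: "highest ranked" = alphabetical order


def consensus(query, aligned_hits):
    """Build a consensus from equal-length aligned hit sequences."""
    return "".join(
        consensus_column([hit[i] for hit in aligned_hits], query[i])
        for i in range(len(query))
    )


print(consensus("MKTA", ["MKSA", "MRSA", "MR-V"]))  # MRSA
```

Here the second column has two R against one K, so R is used even though the query has K. The query residue only matters when the counts are tied.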

Run with &rest=docs for program documentation and more options. A plain text version is accessed with &rest=help.
&rest=OUTFMT can be used to retrieve individual parts of the output, matching the tabs in the default
(&rest=format) output. Individual OUTFMT elements can also be parsed from the full (&rest=full) server output,
which is formatted as follows:
###~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~###
# OUTFMT:
... contents for OUTFMT section ...
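A minimal Python sketch of splitting the full output into its OUTFMT sections, assuming each section starts with a `###~~~...~~~###` separator line followed by a `# OUTFMT:` header line, as shown above. The function name and the example table contents are hypothetical.

```python
def parse_full_output(text):
    """Split &rest=full server output into {OUTFMT: contents}."""
    sections = {}
    name = None
    expect_header = False
    for line in text.splitlines():
        if line.startswith("###~") and line.endswith("~###"):
            expect_header = True   # the next line should name the section
            continue
        if expect_header and line.startswith("# ") and line.rstrip().endswith(":"):
            name = line.rstrip()[2:-1].strip()
            sections[name] = []
            expect_header = False
            continue
        expect_header = False
        if name is not None:
            sections[name].append(line)
    return {key: "\n".join(lines).strip("\n") for key, lines in sections.items()}


# Made-up example content, for illustration only.
separator = "###" + "~" * 70 + "###"
full = "\n".join([
    separator, "# search:", "Query\tHits", "seq1\t3",
    separator, "# hit:", "Query\tHit", "seq1\thitA",
])
sections = parse_full_output(full)
print(sorted(sections))
```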


After running, click on the blastres tab to see overall BLAST search results. The following tables and sequence
file will usually also be output:

Available REST Outputs

blastres = Main BLAST results output file
search = Delimited summary of each search [tab]
hit = Delimited summary of all hits [tab]
local = Delimited table of all local BLAST alignments [tab]
blasti = BLAST search input (queries) [fas]
blastd = BLAST search database [fas]
qassembly = Assembled QAssembly hits as aligned fasta for first query [fas]

NOTE: If run in &qassemblefas=T mode, the search, hit and local tables will not be output. The first
query will have QAssembly output in the qassembly tab. All queries will also have their own tab with their
qassembly alignment, named after their accession numbers.
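The standard call form given at the start of this section (tool name followed by `&key=value` pairs) can be assembled with a small Python helper. This is a sketch only: the helper name `rest_call` and the file names are hypothetical, the server address is deliberately left out, and percent-encoding of values is an assumption rather than a documented requirement of the server.

```python
from urllib.parse import quote


def rest_call(tool, **options):
    """Return 'tool&key=value&...' in the form used for REST calls."""
    parts = [tool]
    for key, value in options.items():
        # Assumption: values are percent-encoded so they are safe inside a URL.
        parts.append(f"{key}={quote(str(value), safe='')}")
    return "&".join(parts)


# rest=hit asks for just the hit table, one of the outputs listed above.
call = rest_call("blast", blasti="queries.fas", blastd="database.fas",
                 blastprog="blastp", rest="hit")
print(call)  # blast&blasti=queries.fas&blastd=database.fas&blastprog=blastp&rest=hit
```

The returned string is appended to the REST server address to form the complete request.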

© 2015 RJ Edwards. Contact: richard.edwards@unsw.edu.au.