Module: taxolotl
Description: Taxolotl genome assembly taxonomy summary and assessment tool
Version: 0.1.1
Last Edit: 25/11/21
Copyright © 2021 Richard J. Edwards - See source code for GNU License Notice
Imported modules:
rje
rje_db
rje_gff
rje_obj
rje_rmd
rje_seqlist
rje_kat
saaga
See SLiMSuite Blog for further documentation. See rje for general commands.
Commandline
Input/Output options
seqin=FILE
: Protein annotation file to assess [annotation.faa]
gffin=FILE
: Protein annotation GFF file [annotation.gff]
cdsin=FILE
: Optional transcript annotation file for renaming and/or longest isoform extraction [annotation.fna]
assembly=FILE
: Optional genome fasta file (required for some outputs) [None]
basefile=X
: Prefix for output files [$SEQBASE]
gffgene=X
: Label for GFF gene feature type ['gene']
gffcds=X
: Label for GFF CDS feature type ['CDS']
gffmrna=X
: Label for GFF mRNA feature type ['mRNA']
taxlevels=LIST
: List of taxonomic levels to report (* for superkingdom and below) ['*']
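The options above are passed as key=value arguments. As a minimal sketch of assembling such a command from Python, the helper below builds an argument list. The script name `taxolotl.py`, the input file names, the comma-separated form for LIST options and the helper `build_taxolotl_args` are all assumptions made for illustration, not taken from the Taxolotl source.

```python
import shlex


def build_taxolotl_args(options, script="taxolotl.py"):
    """Return an argument list in the key=value style of the option listing.

    The script path is a hypothetical placeholder; adjust it to the local install.
    """
    args = ["python", script]
    for key, value in options.items():
        if isinstance(value, bool):
            # T/F options are written as T or F in the listing.
            value = "T" if value else "F"
        elif isinstance(value, (list, tuple)):
            # Assumption: LIST options are given as comma-separated values.
            value = ",".join(str(item) for item in value)
        args.append(f"{key}={value}")
    return args


# Hypothetical inputs, using the option names documented above.
options = {
    "seqin": "annotation.faa",
    "gffin": "annotation.gff",
    "assembly": "assembly.fasta",
    "basefile": "myrun",
    "taxlevels": ["class", "family", "genus"],
}
command = build_taxolotl_args(options)
print(shlex.join(command))
```

The resulting list can be handed to `subprocess.run(command)` once the script path points at a real installation.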
Run mode options
dochtml=T/F
: Generate HTML Taxolotl documentation (*.docs.html) instead of main run [False]
Taxolotl options
taxdb=FILE
: MMseqs2 taxonomy database for taxonomy assignment [seqTaxDB]
taxbase=X
: Output prefix for taxonomy output [$SEQBASE.$TAXADB]
taxorfs=T/F
: Whether to generate ORFs from assembly if no seqin=FILE given [True]
taxbyseq=T/F
: Whether to parse and generate taxonomy output for each assembly (GFF) sequence [True]
taxbycontig=T/F
: Whether to generate taxonomy output for each contig if the assembly is loaded [True]
taxbyseqfull=T/F
: Whether to generate full easy taxonomy report outputs for each assembly (GFF) sequence [False]
taxsubsets=FILELIST
: Files (fasta/id) with sets of assembly input sequences (matching GFF) to summarise []
taxwarnrank=X
: Taxonomic rank (and above) at which to warn when deviating from consensus [family]
bestlineage=T/F
: Whether to enforce a single lineage for best taxa ratings [True]
mintaxnum=INT
: Minimum gene count in main dataset to keep taxon, else merge with higher level [2]
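To make the mintaxnum=INT behaviour concrete, here is a minimal sketch of one plausible reading of "merge with higher level": any taxon with fewer genes than the threshold has its count folded into its parent taxon. The function `merge_small_taxa`, the parent lookup and the toy counts are hypothetical illustrations, not the Taxolotl implementation.

```python
def merge_small_taxa(counts, parents, mintaxnum=2):
    """Fold taxa with fewer than mintaxnum genes into their parent taxon.

    counts  : dict of taxon name -> gene count
    parents : dict of taxon name -> parent taxon name (roots are absent)
    """
    merged = dict(counts)
    changed = True
    while changed:
        changed = False
        for taxon in list(merged):
            parent = parents.get(taxon)
            if merged[taxon] < mintaxnum and parent is not None:
                # Move this taxon's genes up one level and drop the taxon.
                merged[parent] = merged.get(parent, 0) + merged.pop(taxon)
                changed = True
    return merged


# Toy gene counts per family (invented for illustration only).
counts = {"Elapidae": 40, "Colubridae": 1, "Viperidae": 1}
parents = {
    "Elapidae": "Serpentes",
    "Colubridae": "Serpentes",
    "Viperidae": "Serpentes",
    "Serpentes": "Squamata",
}
merged = merge_small_taxa(counts, parents, mintaxnum=2)
print(merged)
```

In this toy case the two single-gene families are combined into their shared parent, which then meets the threshold and is kept.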
TabReport options
tabreport=FILE
: Convert MMseqs2 report into taxonomy table with counts (if True use taxbase=X) [None]
taxhigh=X
: Highest taxonomic level for tabreport [class]
taxlow=X
: Lowest taxonomic level for tabreport [species]
taxpart=T/F
: Whether to output entries with partial taxonomic levels to tabreport [False]
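The taxhigh=X, taxlow=X and taxpart=T/F options together define which levels and entries reach the table. The sketch below shows how such a selection could work, assuming a conventional superkingdom-to-species rank order. The `RANKS` list and both helper functions are hypothetical, and the ranks Taxolotl actually uses may differ.

```python
# Assumed rank order, highest to lowest; not taken from the Taxolotl source.
RANKS = ["superkingdom", "kingdom", "phylum", "class",
         "order", "family", "genus", "species"]


def ranks_between(taxhigh="class", taxlow="species", ranks=RANKS):
    """Return the ranks from taxhigh down to taxlow, inclusive."""
    high = ranks.index(taxhigh)
    low = ranks.index(taxlow)
    if high > low:
        raise ValueError(f"{taxhigh} is below {taxlow} in the rank order")
    return ranks[high:low + 1]


def keep_entry(lineage, levels, taxpart=False):
    """Keep an entry if it has every requested level, or if taxpart is set."""
    return taxpart or all(lineage.get(level) for level in levels)


levels = ranks_between("class", "species")
print(levels)

# A hypothetical entry that lacks a species-level assignment.
partial = {"class": "Lepidosauria", "order": "Squamata",
           "family": "Elapidae", "genus": "Notechis"}
print(keep_entry(partial, levels), keep_entry(partial, levels, taxpart=True))
```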
System options
forks=X
: Number of parallel sequences to process at once [0]
killforks=X
: Number of seconds of no activity before killing all remaining forks [36000]
forksleep=X
: Sleep time (seconds) between cycles of forking out more processes [0]
tmpdir=PATH
: Temporary directory path for running mmseqs2 [./tmp/]
History
Module Version History
# 0.0.0 - Initial Compilation.
# 0.1.0 - Added tabreport function.
# 0.1.1 - Fix bug with contig output. Added seqname, start and end to contig summary.