Function
BADASP implements the previously published Burst After Duplication (BAD) algorithm, plus two variants that have been used
successfully to identify functionally interesting sites in platelet signalling proteins and that can identify Type I and
Type II divergence. In addition, several other measures of functional specificity and conservation are calculated and
output in plain text format for easy import into other applications. See the Manual for details.
Commandline
# General Dataset Input/Output Options #
seqin=FILE
: Loads sequences from FILE
query=X
: Selects query sequence by name (or part of name, e.g. Accession Number)
basefile=X
: Basic 'root' for all files X.* [By default will use 'root' of seqin=FILE if given or haq_AccNum if qblast]
v=X
: Sets verbosity (-1 for silent) [0]
i=X
: Sets interactivity (-1 for full auto) [0]
log=FILE
: Redirect log to FILE [Default = calling_program.log or basefile.log]
newlog=T/F
: Create new log file. [Default = False: append log file]
rank=T/F
: Whether to output ranks as well as scores [True]
append=FILE
: Append results to FILE instead of standard output to *.badasp
trimtrunc=T/F
: Whether to trim the leading and trailing gaps (within groups) -> change to X [False]
winsize=X
: Window size for window scores
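
The options above are passed as key=value arguments. A minimal sketch of assembling such a command line in Python follows. The script name `badasp.py`, the input file `example.fas` and the query name `P12345` are illustrative assumptions and do not come from this help text.

```python
import shlex

def build_command(script, options):
    """Return an argument list of key=value options for a BADASP-style call."""
    args = ["python", script]
    for key, value in options.items():
        # Booleans are written as T/F, matching the option style listed above.
        if isinstance(value, bool):
            value = "T" if value else "F"
        args.append(f"{key}={value}")
    return args

# Hypothetical script, file and query names, for illustration only.
cmd = build_command(
    "badasp.py",
    {"seqin": "example.fas", "query": "P12345", "rank": True, "i": -1},
)
print(shlex.join(cmd))
```

The list form can be handed directly to `subprocess.run` if the script is available locally; `shlex.join` is used here only to show the equivalent shell line.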
# BADASP Statistics #
[funcspec=X1,X2,..] : List of functional specificity methods to apply X1,X2,..,XN
- BAD = Burst After Duplication (2 Subfamilies)
- BADX = Burst After Duplication Extra (Query Subfam versus X subfams)
- BADN = Burst After Duplication vs N Subfams (2+ Subfams)
- SSC = Livingstone and Barton Score
- PDAD = Variant of Livingstone and Barton
- ETA = Evolutionary Trace Analysis (Basic)
- ETAQ = Evolutionary Trace Analysis (Quantitative)
- all = All of the above!
[seqcon=X1,X2,..] : List of sequence conservation measures to apply X1,X2,..,XN
- Info = Information content
- PCon = Property Conservation (Absolute)
- MPCon = Mean Property Conservation
- QPCon = Mean Property Conservation with Query
- all = All of the above
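
As an illustration of the kind of measure listed under seqcon, the sketch below computes one common definition of per-column information content: the maximum entropy for a 20-letter alphabet minus the observed Shannon entropy. Whether BADASP's Info score uses exactly this form (for example, how it treats gaps) is not stated here, so the function name and gap handling are assumptions.

```python
import math
from collections import Counter

def column_information(column, alphabet_size=20):
    """Information content (bits) of one alignment column, ignoring gaps."""
    residues = [aa for aa in column if aa != "-"]
    if not residues:
        return 0.0  # assumption: an all-gap column carries no information
    counts = Counter(residues)
    total = len(residues)
    # Shannon entropy of the observed residue frequencies.
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return math.log2(alphabet_size) - entropy

# A fully conserved column scores higher than a mixed one.
print(column_information("AAAA"), column_information("ACAC"))
```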
# Tree and Grouping Options #
nsfin=FILE
: File name for Newick Standard Format tree
root=X
: Rooting of tree (rje_tree.py), where X is:
- mid = midpoint root tree. [Default]
- ran = random branch.
- ranwt = random branch, weighted by branch lengths.
- man = always ask for rooting options (unless i<0).
- FILE = with seqs in FILE as outgroup. (Any option other than above)
bootcut=X
: cut-off percentage of tree bootstraps for grouping.
mfs=X
: minimum family size [3]
fam=X
: minimum number of families (If 0, no subfam grouping) [0]
orphan=T/F
: Whether orphan sequences (not in a subfam) are allowed. [True]
allowvar=T/F
: Allow variants of same species within a group. [False]
gnspacc=T/F
: Convert sequences into gene_SPECIES__AccNum format wherever possible. [True]
groupspec=X
: Species for duplication grouping [None]
group=X
: Grouping of tree
- man = manual grouping (unless i<0).
- dup = duplication (all species unless groupspec specified).
- qry = duplication with species of Query sequence (or Sequence 1) of treeseq
- one = all sequences in one group
- None = no group (case sensitive)
- FILE = load groups from file
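
To illustrate how thresholds such as bootcut and mfs might gate candidate subfamilies, here is a minimal sketch. The clade representation (a dict with a `bootstrap` percentage and a `members` list) and the use of `>=` comparisons are assumptions made for this example; they are not the data structures or exact rules used by rje_tree.py.

```python
def filter_groups(clades, bootcut=0.0, mfs=3):
    """Keep clades whose bootstrap support and size meet the thresholds."""
    kept = []
    for clade in clades:
        well_supported = clade["bootstrap"] >= bootcut
        large_enough = len(clade["members"]) >= mfs
        if well_supported and large_enough:
            kept.append(clade)
    return kept

# Hypothetical candidate clades, for illustration only.
candidates = [
    {"name": "A", "bootstrap": 95.0, "members": ["s1", "s2", "s3", "s4"]},
    {"name": "B", "bootstrap": 40.0, "members": ["s5", "s6", "s7"]},
    {"name": "C", "bootstrap": 99.0, "members": ["s8", "s9"]},
]
groups = filter_groups(candidates, bootcut=70.0, mfs=3)
print([g["name"] for g in groups])
```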
# GASP ancestral sequence prediction options #
useanc=FILE
: Gives file of predicted ancestral sequences
pamfile=FILE
: Sets PAM1 input file [jones.pam]
pammax=X
: Initial maximum PAM matrix to generate [100]
pamcut=X
: Absolute maximum PAM matrix [1000]
fixpam=X
: PAM distance fixed to X [100].
rarecut=X
: Rare aa cut-off [0.05].
fixup=T/F
: Fix AAs on way up (keep probabilities) [True].
fixdown=T/F
: Fix AAs on initial pass down tree [False].
ordered=T/F
: Order ancestral sequence output by node number [False].
pamtree=T/F
: Calculate and output ancestral tree with PAM distances [True].
desconly=T/F
: Limits ancestral AAs to those found in descendants [True].
xpass=X
: How many extra passes to make down & up tree after initial GASP [1].
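
For context on pamfile and pammax: PAM-n substitution probabilities are conventionally obtained by multiplying the PAM1 matrix by itself n times. The sketch below generates a series of matrices up to a chosen maximum in that way. It uses a toy two-state matrix rather than real amino acid data, and whether GASP builds its matrices exactly like this is an assumption.

```python
def mat_mult(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]

def pam_series(pam1, pammax):
    """Return {n: PAM-n matrix} for n = 1..pammax by repeated multiplication."""
    matrices = {1: pam1}
    for n in range(2, pammax + 1):
        matrices[n] = mat_mult(matrices[n - 1], pam1)
    return matrices

# Toy two-state PAM1 matrix (each row sums to 1); not real amino acid data.
toy_pam1 = [[0.99, 0.01], [0.02, 0.98]]
series = pam_series(toy_pam1, 100)
print(series[100])
```

Each row of every generated matrix still sums to 1, since the product of row-stochastic matrices is row-stochastic.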
# System Info Options #
win32=T/F
: Run in Win32 Mode [False]
Please see help for rje_tree.py and rje_seq.py for additional options not covered here.