Miscellaneous Utilities for PacBio Sequencing
Copyright © 2015 Richard J. Edwards - See source code for GNU License Notice
This module estimates the percentage genome coverage and accuracy at different depths (X coverage) of PacBio sequencing of a genome, assuming a non-biased error distribution. Calculations use binomial/Poisson distributions and assume that sites are independent. A base is counted as accurate when more than 50% of the reads covering it have the correct call. With random calls at the remaining positions, 25% of the "wrong" positions will be correct by chance. In reality the proportion will be higher than this when majority calls are used, because wrong calls are split between the three possible incorrect bases. Accuracy is therefore a conservative estimate.
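The calculation described above can be sketched in Python as follows. This is a minimal illustration and not the module's own code: the function names (`poisson_pmf`, `majority_correct`, `expected_accuracy`, `frac_at_least`), the `read_accuracy` parameter, and the depth cut-off are all assumptions made for the example. Depth at each site is modelled as Poisson with mean equal to the X coverage, and a site counts as accurate only when strictly more than half of its reads are correct.

```python
import math


def poisson_pmf(k, lam):
    """P(depth == k) for Poisson mean lam > 0, computed in log space to avoid overflow."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def majority_correct(depth, read_accuracy):
    """P(strictly more than 50% of `depth` reads are correct), binomial model."""
    need = depth // 2 + 1  # smallest count that is a strict majority
    return sum(
        math.comb(depth, k) * read_accuracy ** k * (1 - read_accuracy) ** (depth - k)
        for k in range(need, depth + 1)
    )


def expected_accuracy(xcov, read_accuracy, max_depth=None):
    """Return (fraction of sites covered, fraction of sites with a correct majority call)."""
    if max_depth is None:
        # Hypothetical cut-off: mean plus ten standard deviations, plus a margin.
        max_depth = int(xcov + 10 * math.sqrt(xcov) + 10)
    covered = 0.0
    correct = 0.0
    for depth in range(1, max_depth + 1):
        p_depth = poisson_pmf(depth, xcov)
        covered += p_depth
        correct += p_depth * majority_correct(depth, read_accuracy)
    return covered, correct


def frac_at_least(xcov, n):
    """Fraction of sites with depth >= n under the same Poisson model."""
    return 1.0 - sum(poisson_pmf(k, xcov) for k in range(n))


if __name__ == "__main__":
    # 0.85 is an illustrative per-read accuracy, not a value taken from the module.
    for x in (5, 10, 30):
        cov, acc = expected_accuracy(x, 0.85)
        print(f"{x}X: covered={cov:.4f} accurate={acc:.4f} >=10X={frac_at_least(x, 10):.4f}")
```

Because ties at even depths are counted as incorrect and no credit is given for chance-correct calls, this sketch errs on the conservative side, in line with the description above.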
All calculations are based on *assembled* reads, and therefore using the full
NOTE: This module has been superseded by SMRTSCAPE.
Main output is a results table containing the following fields:
Genome Coverage Options
SubRead Summary Options
Assembly Parameter Options
Module Version History
# 0.0.0 - Initial Compilation.
# 1.0.0 - Initial working version for server.
# 1.1.0 - Added xnlist=LIST : Additional columns giving % sites with coverage >= Xn [10,25,50,100].
# 1.2.0 - Added assessment -> now PAGSAT.
# 1.3.0 - Added seed and anchor read coverage generator (calculate=T).
# 1.3.1 - Deleted assessment function. (Now handled by PAGSAT.)
# 1.4.0 - Added new coverage=T function that incorporates seed and anchor subreads.
# 1.5.0 - Added parseparam=FILES with paramlist=LIST to parse restricted sets of parameters.
# 1.6.0 - Added seqstats=T/F function to add assembly sequence stats (if files found) to parseparam run.
© 2015 RJ Edwards. Contact: firstname.lastname@example.org.