The gene modeling program Prodigal has been used at Oak Ridge National Laboratory since November 2007. The improvements incorporated into Prodigal are based on years of experience hand-editing and correcting genomes at the laboratory. Prodigal is fast and publicly available. A paper describing its methodology, accuracy, and specificity is in preparation.
Gene model parameters: Three gene modeling programs were run on all contigs, using default settings that permit overlapping genes. Generation (ORNL) relies predominantly on 6-mer statistics to recognize coding regions; it calls starts with a proximity rule, considering ATG and GTG as potential starts. Glimmer uses interpolated Markov models (IMMs) to identify coding regions; it considers ATG, GTG, and TTG as potential starts. Critica (v1.05) uses blastn to produce alignments from the entire dataset and derives dicodon statistics to recognize coding sequences; it uses a Shine-Dalgarno (SD) sensor with ATG, GTG, and TTG as potential starts. The training set selected for Generation and Glimmer consisted of non-overlapping ORFs longer than 900 bp. ORFs with "non-standard" start (and stop) codons will occasionally be found. These arise from 1) sequence ambiguity, 2) ORFs that run off the end of a contig, or 3) ORFs with high coding scores but no candidate start codon. Such ORFs need to be examined with care.
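The training-set rule and the start-codon check described above can be illustrated with a minimal sketch. It is not the code used by Generation or Glimmer. The `Orf` record, the function names, the half-open coordinate convention, and the greedy "earliest end first" way of resolving overlaps are all assumptions made for illustration; the source states only that the training ORFs were non-overlapping and longer than 900 bp.

```python
from typing import List, NamedTuple


class Orf(NamedTuple):
    """Hypothetical ORF record; coordinates are 0-based, end-exclusive."""
    contig: str
    start: int
    end: int
    strand: str


MIN_TRAINING_LENGTH = 900          # bp, threshold taken from the text
STANDARD_STARTS = {"ATG", "GTG", "TTG"}  # starts named in the text


def select_training_orfs(orfs: List[Orf],
                         min_length: int = MIN_TRAINING_LENGTH) -> List[Orf]:
    """Keep ORFs longer than min_length that do not overlap each other.

    Assumption: overlaps are resolved greedily by earliest end coordinate,
    and ORFs on opposite strands still count as overlapping.
    """
    long_orfs = [o for o in orfs if o.end - o.start > min_length]
    long_orfs.sort(key=lambda o: (o.contig, o.end, o.start))

    selected: List[Orf] = []
    for orf in long_orfs:
        last = selected[-1] if selected else None
        if last and last.contig == orf.contig and orf.start < last.end:
            continue  # overlaps the previously kept ORF on this contig
        selected.append(orf)
    return selected


def has_standard_start(orf_sequence: str) -> bool:
    """Return True if the ORF begins with ATG, GTG, or TTG."""
    return orf_sequence[:3].upper() in STANDARD_STARTS


candidates = [
    Orf("contig1", 0, 1000, "+"),
    Orf("contig1", 500, 1600, "+"),    # overlaps the first ORF
    Orf("contig1", 1000, 2000, "-"),
    Orf("contig1", 2100, 2500, "+"),   # too short
    Orf("contig2", 0, 950, "+"),
]
training_set = select_training_orfs(candidates)
```

An ORF whose first codon contains an ambiguous base (for example `NTG`) fails `has_standard_start`, which corresponds to the first of the three cases that call for manual review.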