mapDamage2.0: Fast approximate Bayesian estimates of ancient DNA damage parameters – University of Copenhagen

25 April 2013

Researchers from the Orlando group have released an updated version of their software mapDamage, two years after the publication of the first version.

Damage reactions affect DNA molecules after death and during the fossilization process, leaving a specific signature in the sequences generated by high-throughput sequencing platforms. mapDamage2.0 exploits this signature within a built-in DNA damage model to quantify a series of key DNA damage parameters and to provide information about the average structure of ancient DNA templates.

Although the posterior distribution of those parameters is the main output of mapDamage2.0, all features from the previous version are still available. These include the well-known nucleotide misincorporation and fragmentation patterns that can be used to authenticate sequences as truly ancient rather than by-products of modern contamination.
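To make the idea of a misincorporation pattern concrete, here is a minimal sketch in plain Python. It is not mapDamage2.0's code and does not use its interface: the function name `c_to_t_frequency`, its parameters and the toy alignments are all hypothetical. It assumes gap-free (reference, read) pairs oriented 5' to 3' along the read, and tabulates how often a reference C is read as T at each of the first few read positions. An excess of such C-to-T mismatches near the 5' end is the kind of signal commonly used to argue that reads are ancient.

```python
from collections import Counter


def c_to_t_frequency(pairs, max_pos=10):
    """Per-position C->T mismatch frequency, counted from the 5' end.

    pairs   -- iterable of (reference, read) strings; each pair is assumed
               to be gap-free, of equal length, and oriented 5'->3'.
    max_pos -- number of leading read positions to tabulate.

    Returns a list of length max_pos. Entry i is the fraction of reference
    C bases at position i that appear as T in the read, or None when no
    reference C was observed at that position.
    """
    ref_c = Counter()   # positions where the reference base is C
    c_to_t = Counter()  # subset of those where the read shows T
    for ref, read in pairs:
        if len(ref) != len(read):
            raise ValueError("reference and read must have equal length")
        for pos, (r, q) in enumerate(zip(ref.upper(), read.upper())):
            if pos >= max_pos:
                break
            if r == "C":
                ref_c[pos] += 1
                if q == "T":
                    c_to_t[pos] += 1
    return [c_to_t[p] / ref_c[p] if ref_c[p] else None
            for p in range(max_pos)]


# Toy alignments (made up for illustration only).
toy_pairs = [
    ("CACGT", "TACGT"),
    ("CCGTA", "TCGTA"),
    ("CAGTC", "CAGTC"),
    ("GCATT", "GCATT"),
]
profile = c_to_t_frequency(toy_pairs, max_pos=5)
print(profile)
```

In practice the pairs would come from real alignments, for example a BAM file read with pysam, and a full analysis would also track the complementary G-to-A pattern at the 3' end as well as fragment lengths.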

New applications

The statistical model of DNA damage implemented in mapDamage2.0 opens up many new applications. For instance, with accurate DNA damage estimates in hand, we can now study the kinetics of post-mortem DNA degradation over time in different environments.

We can also tease apart the substitutions that likely originate from post-mortem degradation and thereby limit their impact in downstream analyses. This significantly improves the analysis of ancient DNA sequences and the quality of ancient genomes, where damage-related artifactual misincorporations often outnumber genuine biological mutations.

mapDamage2.0 is available here, together with its documentation.

Huge performance gain

The code is written in Python using pysam, resulting in a large performance gain compared to the previous version. As a result, mapDamage2.0 does not require large RAM and CPU capacities and can analyze millions of sequences within minutes, even on laptop computers. Of note, mapDamage2.0 is compatible with any UNIX-like operating system and with all types of DNA libraries, including the most recent ones based on single-strand ligation.

Performance and examples of possible applications are presented in a companion article published in Bioinformatics (2013) doi: 10.1093/bioinformatics/btt193. First published online: April 23, 2013.