Methodological improvements for increasing the yield of ancient DNA templates – University of Copenhagen


22 June 2012

Methodological improvements for increasing the yield of ancient DNA templates

In two new articles, researchers at the Centre for GeoGenetics follow up on a previous study that demonstrated the feasibility of sequencing ancient DNA on a third-generation sequencing platform (the Helicos HeliScope platform) and examined methods for improving DNA extraction and template preparation of ancient samples. The new articles demonstrate additional methods for improving the recovery of ancient DNA, both at the sample preparation step and in the bioinformatic pipeline.

As extraction methods are destructive and samples rare, optimizing the number of recovered endogenous sequences is of utmost importance in ancient DNA research.

Improving sequence extraction

In a paper published in BMC Genomics [1], researchers from the Orlando group investigated the impact of treating template molecules with DNA phosphatase prior to sequencing, and the effect of oligonucleotide spiking. Additionally, the researchers confirmed the existence of molecular preservation niches in large bone crystals from which DNA could be preferentially extracted.

Oligonucleotide spiking serves to compensate for potentially low concentrations of template molecules, which on the Helicos platform may prevent the proper alignment of flow-cell images across cycles. Spiking has been shown to improve the ability to sequence modern extracts of low concentration. However, the researchers found that spiking ancient extracts reduced the fraction of endogenous reads recovered, and also led to the inclusion of a significant fraction of oligonucleotide sequences that had not been filtered out, potentially due to sequencing errors.

The DNA phosphatase treatment aimed to remove phosphate groups from the 3’-ends of DNA templates, where they can remain after breaks in the DNA backbone. Such groups could prevent sequencing on the Helicos HeliScope platform, as this platform requires a free hydroxyl group at the 3’-ends of template molecules to allow poly-A tailing and subsequent capture by oligo-dT probes on the surface of the flow cell. The researchers found that this treatment increased the number of endogenous sequences recovered by up to 3.3-fold, while still providing molecular signatures of endogenous ancient DNA damage.

This study will be presented during the poster session at the SMBE conference in Dublin (23rd – 26th July 2012).

Improving sequence identification

In a companion paper published in the same issue of BMC Genomics [2], researchers at the Centre for GeoGenetics demonstrated that optimizing the bioinformatic pipeline used to process ancient DNA could yield further gains in the amount of endogenous ancient DNA recovered. This publication is the result of a collaborative effort, encouraged by the strong expertise in genomics and bioinformatics available in the research group of Anders Krogh within the Centre.

Alignment software typically assumes that high-quality base-calls are found at the 5’-end of reads, an assumption that is known to be false for ancient DNA templates, where damage occurs preferentially at fragment termini. Furthermore, most alignment software assumes low rates of machine-generated indels, an assumption that does not hold for third-generation sequencing platforms.
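The effect of the first assumption can be illustrated with a small sketch. It is not the logic of any particular aligner: the function names, the ungapped comparison, the seed length of 8 and the limits of 2 seed mismatches and 4 total mismatches are all hypothetical values chosen for illustration. It shows how a read carrying a few damage-like C-to-T substitutions at its 5’-end can be rejected by a seed check even though the read as a whole matches the reference well.

```python
def mismatch_positions(read, ref):
    """Return the 0-based positions where an ungapped read differs from the reference."""
    if len(read) != len(ref):
        raise ValueError("this sketch only compares sequences of equal length")
    return [i for i, (a, b) in enumerate(zip(read, ref)) if a != b]


def accepted(read, ref, seed_len, max_seed_mm, max_total_mm):
    """Toy acceptance rule: limit mismatches in the 5' seed and in the whole read.

    Setting seed_len to 0 disables the seed check entirely.
    All thresholds are illustrative assumptions, not real aligner defaults.
    """
    positions = mismatch_positions(read, ref)
    seed_mm = sum(1 for i in positions if i < seed_len)
    return seed_mm <= max_seed_mm and len(positions) <= max_total_mm


# Hypothetical reference and a read with C->T substitutions at positions 0, 1 and 3.
reference = "CCGCATCGGATTACAGGCTA"
damaged_read = "TTGTATCGGATTACAGGCTA"

with_seed = accepted(damaged_read, reference, seed_len=8, max_seed_mm=2, max_total_mm=4)
without_seed = accepted(damaged_read, reference, seed_len=0, max_seed_mm=2, max_total_mm=4)
print(with_seed, without_seed)
```

In this toy case the read fails when the seed check is active and passes when it is disabled, which is the kind of behavior that tailoring the aligner to ancient DNA is meant to address.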

Examining shotgun sequencing reads produced on second- and third-generation sequencing platforms, the researchers showed that tailoring the behavior of alignment software to the specific characteristics of ancient DNA templates and to the behavior of the sequencing platform could yield significant improvements in the amount of endogenous DNA recovered.

Furthermore, the researchers determined that the in silico removal of putatively damaged bases could further increase the number of endogenous DNA sequences successfully recovered. They also tested a simple improvement to the filtering of putative contamination, which significantly reduced the number of false positives and thereby increased the pool of endogenous sequences available for downstream analysis.
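As a rough illustration of what in silico removal of putatively damaged bases can look like, the sketch below trims a fixed number of bases from both ends of a read before alignment. This is an assumption made for the example, not the procedure described in the paper: the function name `trim_termini` and the default of two bases per end are hypothetical.

```python
def trim_termini(seq, qual, n5=2, n3=2):
    """Remove n5 bases from the 5' end and n3 bases from the 3' end of a read.

    seq and qual are the base and quality strings of one read and must have
    the same length. Returns the trimmed (seq, qual) pair, or two empty
    strings if nothing would be left after trimming.
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    if n5 < 0 or n3 < 0:
        raise ValueError("trim lengths must be non-negative")
    end = len(seq) - n3
    if end <= n5:
        return "", ""
    return seq[n5:end], qual[n5:end]


# Hypothetical read with damage-like bases at both termini.
trimmed_seq, trimmed_qual = trim_termini("TTACGTAA", "IIIIHHHH")
print(trimmed_seq, trimmed_qual)
```

Trimming trades read length for fewer terminal mismatches, so the number of bases removed would need to be chosen with the observed damage pattern and the read length distribution in mind.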


Taken together, these methodological improvements help increase the yield of endogenous DNA when targeting ancient templates, and also allow old data sets to be mined for additional endogenous sequences.

[1] Ginolhac A, Vilstrup J, Stenderup J, Rasmussen M, Stiller M, Shapiro B, Zazula G, Froese D, Steinmann KE, Thompson JF, Al-Rasheid KAS, Gilbert TMP, Willerslev E and Orlando L. Improving the performance of True Single Molecule Sequencing for ancient DNA. BMC Genomics 2012, 13:177.

[2] Schubert M, Ginolhac A, Lindgreen S, Thompson JF, Al-Rasheid KAS, Willerslev E, Krogh A and Orlando L. Improving ancient DNA read mapping against modern reference genomes. BMC Genomics 2012, 13:178.