How DRAGEN is being used by research and clinical genomics customers

Application Case Study


The Customer

The Genome Analysis Centre (TGAC) is a world-class research institute focusing on the development of genomics and computational biology. TGAC offers a state-of-the-art DNA sequencing facility, unique in its operation of multiple complementary technologies for data generation. The Institute is a UK hub for innovative bioinformatics through research, analysis and interpretation of multiple, complex data sets. It hosts one of the largest computing hardware facilities dedicated to life science research in Europe. It is also actively involved in developing novel platforms that give multiple academic and industrial users access to computational tools and processing capacity, and in promoting applications of computational bioscience. TGAC specializes in the sequencing and analysis of microbial, plant and animal genomes to advance a sustainable bioeconomy and protect the UK’s food security. For food security, it focuses particularly on the wheat genome.

The Challenge

One of the challenges for the team at TGAC is the wheat genome, which is five times bigger than the human genome and much more complex. Wheat is a staple food for over 30% of the world’s population, so TGAC is focused on improving yields for a growing population (estimates predict nine billion people worldwide by 2050) despite shrinking space to grow wheat and the heat, drought and pathogens that severely deplete wheat yields worldwide. By understanding the genomic building blocks of wheat and its diversity, TGAC can help breeders overcome some of these obstacles.

Accelerating genomics analysis remains one of the toughest challenges in life science research. Many optimizations are in use, including disk streaming, optimized parallel file systems, algorithm tweaks, faster processors and hardware accelerators, all with varying results. Alignment against reference genomes is a fundamental task undertaken daily by TGAC researchers, and it is critical to many sequencing projects. Given the high throughput of genomic data processed at TGAC, efficiency gains in alignment were needed.

The Solution

The collaboration between Edico Genome and TGAC resulted in the first adaptation of the DRAGEN technology for the analysis of non-human genomes as part of the Institute’s endeavors to sequence the DNA of plant, animal and microbial species to promote a sustainable bioeconomy.

The hardware modifications were carried out by Edico Genome engineers, based on in-house testing with datasets provided by TGAC. The test sets included rice and horse genomes, and each time slight adaptations were made to the pipelines to handle the new genome. The DRAGEN system at TGAC contains highly optimized analysis pipelines for both genome and transcriptome data.

The DRAGEN system can be housed close to the sequencing machines or in the Cloud. Users can interact with the system using a GUI or an API. TGAC plans to incorporate the system into the existing HPC platform as a resource within the batch submission system.

The Results

Initial evaluations of DRAGEN showed that mapping against the ash tree genome was 177 times faster per processing core than on TGAC’s local HPC systems, requiring only 7 minutes instead of 3 hours on one of the larger datasets. The fidelity of the results was comparable with that obtained on TGAC’s local HPC cluster. Alignment runs on the rice genome that take approximately two hours on TGAC’s HPC servers took just three minutes using DRAGEN.

TGAC reports a ~20-fold speedup (a ~1900% increase in speed) over HPC cluster runs, with equal alignment accuracy.
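
As a rough check on the arithmetic behind these figures, the short Python sketch below computes wall-clock speedups from the run times quoted above and converts an N-fold speedup into a percentage increase. The function names are illustrative and not part of any DRAGEN tooling. The 177-times per-core figure cannot be reproduced from run times alone, because the number of cores used on each system is not stated here.

```python
def speedup(baseline_minutes: float, accelerated_minutes: float) -> float:
    """Return how many times faster the accelerated run is (wall-clock ratio)."""
    if baseline_minutes <= 0 or accelerated_minutes <= 0:
        raise ValueError("run times must be positive")
    return baseline_minutes / accelerated_minutes


def fold_to_percent_increase(fold: float) -> float:
    """Convert an N-fold speedup into a percentage increase in speed."""
    return (fold - 1.0) * 100.0


# Run times quoted in the text, converted to minutes.
ash_speedup = speedup(3 * 60, 7)   # ash tree: 3 hours vs 7 minutes
rice_speedup = speedup(2 * 60, 3)  # rice: about 2 hours vs 3 minutes

print(f"ash tree wall-clock speedup: {ash_speedup:.1f}x")
print(f"rice wall-clock speedup: {rice_speedup:.1f}x")
print(f"20-fold speedup = {fold_to_percent_increase(20):.0f}% increase")
```

The wall-clock ratios (roughly 26 times for ash and 40 times for rice) differ from the per-core figure because they are not normalized by the number of processing cores involved.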