OmniTier's CompStor Genome Assembly Software Stack Now Available

The disruptive memory-centric acceleration compute architecture delivers supercomputing assembly times with only distributed x86 servers

OmniTier, an application acceleration company, announces the availability of its flagship scientific computing product, CompStor™ Assembly, for de novo DNA assembly and reference alignment. Designed for short-read next-generation sequencing (NGS) data, CompStor Assembly delivers dramatically reduced assembly times and improved assembly quality on low-cost hardware compared to currently available assemblers. Using CompStor Assembly and eight CompStor Assembly server nodes, de novo assembly of a human genome is achieved in about eight minutes. This performance equals the assembly time previously achieved with NERSC’s Cray XC30 supercomputer, using 15,360 processor cores and DRAM-based algorithm implementations. By contrast, CompStor Assembly nodes are standard servers based on low-cost x86 Intel processors and tiered DRAM and NVMe SSDs.

CompStor Assembly delivers on the need for fast, affordable, and high-quality de novo DNA assembly to facilitate genomic workflows and analytics, and to advance the field of genomics. Interest in de novo sequencing (that is, sequencing without the benefit of a reference genome) remains stronger than ever as researchers set their sights on mapping the entire set of the world’s biological domains. The quest for genomics-based personalized healthcare is driving the need for affordable, accurate, and near real-time genome sequencing. These needs present new problems in data storage and computational algorithms. OmniTier’s memory-centric acceleration approach provides the required breakthrough.

A single-node CompStor Assembly server successfully performed high-quality de novo DNA assembly on several organisms of varying genome lengths: Staphylococcus aureus (a bacterium with 2.82 Mbp); Aspergillus nidulans (a fungus with 30.24 Mbp); Bombus terrestris (a bumblebee with 216.8 Mbp); and Homo sapiens (a primate with 3.84 Gbp). Compared to commonly used open-source assemblers running on the same hardware platform, CompStor Assembly provides about 17x acceleration in assembly time while maintaining best-in-class assembly quality, as measured in the industry-standard terms of NGA50, mismatches, and misassemblies. An eight-node CompStor Assembly compute cluster magnifies that acceleration to 100x, enabling de novo assembly of a human genome in about eight minutes.
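As a rough illustration of what those speedup figures imply, the short Python sketch below works through the arithmetic. It assumes, which the release does not state explicitly, that the 17x and 100x figures refer to the same human-genome workload and the same open-source baseline; the function and variable names are hypothetical, and the derived times are back-of-the-envelope estimates rather than measured results.

```python
# Figures quoted in the release (all stated as approximate).
SINGLE_NODE_SPEEDUP = 17.0   # one node vs. open-source assemblers, same hardware
CLUSTER_SPEEDUP = 100.0      # eight-node cluster vs. the same baseline
NODES = 8
CLUSTER_MINUTES = 8.0        # human genome on the eight-node cluster


def scaling_summary(single=SINGLE_NODE_SPEEDUP, cluster=CLUSTER_SPEEDUP,
                    nodes=NODES, cluster_minutes=CLUSTER_MINUTES):
    """Derive implied scaling figures from the quoted speedups.

    Assumption: both speedups share one workload and one baseline.
    """
    node_scaling = cluster / single              # gain from going 1 -> 8 nodes
    efficiency = node_scaling / nodes            # fraction of ideal linear scaling
    baseline_minutes = cluster_minutes * cluster  # implied open-source run time
    single_node_minutes = baseline_minutes / single
    return {
        "node_scaling": node_scaling,
        "parallel_efficiency": efficiency,
        "baseline_minutes": baseline_minutes,
        "single_node_minutes": single_node_minutes,
    }


if __name__ == "__main__":
    for name, value in scaling_summary().items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions, moving from one node to eight yields a little under a sixfold gain, and the implied baseline run time is on the order of hours rather than minutes.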

CompStor Assembly optimizes cost and performance for large genome assemblies by intelligently managing large data sets and compute resources across clusters ranging from one to many standard x86 Intel servers. The eight-minute result employed eight standard NEC dual-socket Intel Xeon servers, with 22 cores per socket and 512GB of installed DRAM per server, connected by a 10GbE network. For a human DNA input data set at 50x coverage, CompStor Assembly consumes 128GB of DRAM and 330GB of NVMe SSD on each server. Using low-cost, off-the-shelf server components, customers can seamlessly scale up performance by increasing the number of identical servers.
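To put the cluster configuration in perspective, the following Python sketch totals the resources across the eight servers and compares the core count with the 15,360 cores cited for the Cray XC30 run. The inputs come from the figures quoted in this release; the function name and dictionary keys are hypothetical, and the comparison counts cores only, ignoring differences in processor generation, clock speed, and interconnect.

```python
# Per-server figures quoted in the release.
NODES = 8
SOCKETS_PER_NODE = 2
CORES_PER_SOCKET = 22
DRAM_INSTALLED_GB = 512   # installed per server
DRAM_USED_GB = 128        # consumed per server at 50x human coverage
NVME_USED_GB = 330        # consumed per server at 50x human coverage
CRAY_CORES = 15_360       # cores cited for the Cray XC30 result


def cluster_totals(nodes=NODES):
    """Aggregate per-server figures over the whole cluster."""
    total_cores = nodes * SOCKETS_PER_NODE * CORES_PER_SOCKET
    return {
        "total_cores": total_cores,
        "dram_installed_gb": nodes * DRAM_INSTALLED_GB,
        "dram_used_gb": nodes * DRAM_USED_GB,
        "nvme_used_gb": nodes * NVME_USED_GB,
        # Share of installed DRAM that the assembly actually consumes.
        "dram_utilization": DRAM_USED_GB / DRAM_INSTALLED_GB,
        # How many times more cores the supercomputer run used.
        "cray_core_ratio": CRAY_CORES / total_cores,
    }


if __name__ == "__main__":
    for name, value in cluster_totals().items():
        print(f"{name}: {value}")
```

By this count the eight-server cluster has 352 cores, roughly one forty-fourth of the cores cited for the supercomputer run, and the assembly uses a quarter of each server's installed DRAM.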

CompStor Assembly, in common with other scientific computing applications to be announced under the CompStor logo, can be configured to run on local commercial compute servers, proprietary compute systems, cloud compute farms, and integrated genomics workflow platforms.

CompStor Assembly is sampling this month with selected partners.