HPC in the Cloud


Dedicated to covering high-end cloud computing
in science, industry and the datacenter

DNAnexus and Google Team Up To Improve Access to Next-Gen DNA Sequencing Data


MOUNTAIN VIEW, Calif., Oct. 12 -- DNAnexus, Inc., today announced that it will provide a long-term solution for researchers who require access to the vast repository of DNA sequencing data contained in the public Sequence Read Archive (SRA) database. As part of this initiative, DNAnexus will provide a freely accessible web-based search interface that simplifies searching and accessing these datasets, and improves their usability for life science research. Google Cloud Storage will support the hosting of the SRA data repository. This new community resource was made publicly available today at sra.dnanexus.com.

The SRA has been the primary repository designated for data generated by the latest DNA sequencing technologies. For example, it houses all primary sequence data for National Institutes of Health-sponsored next-generation sequencing projects. In February 2011, the main U.S. source of public genomic data, NCBI (National Center for Biotechnology Information), announced that it would phase out hosting support of the SRA in its current form due to federal funding cuts. Uncertainty persists about whether there will be a resource available in the future to store and disseminate public DNA sequence data, but DNAnexus is aiming to provide a long-term solution.

"As a public repository of unique DNA sequencing data, the SRA has been an invaluable resource to the research community. However, the ever increasing size of datasets being submitted and the need to easily integrate them into downstream analyses has tested the limits of its utility," said Richard M. Myers, Ph.D., President, Director and Investigator of the HudsonAlpha Institute for Biotechnology. "I am very pleased to see private entities such as DNAnexus step in to keep this resource freely accessible and provide a more intuitive and user-friendly portal for searching and retrieving these important genomic datasets."

Access to a wide range of DNA sequence data and the ability to easily integrate, analyze and manage these data are critical to advancing life science research. The DNAnexus SRA website provides an intuitive interface for quickly identifying and browsing datasets of interest based on a number of query options. Through this interface, researchers can also download sequence read files, including all sequences from the 1000 Genomes Project, for investigation using their own tools.

"The DNAnexus SRA website is an example of a 'big data' initiative that benefits from rethinking the interface in a 100% web-enabled world," said Eric Morse, head of business development, Google Cloud Storage. "Combining Google's massively scalable data storage infrastructure with DNAnexus' expertise in web-based interfaces, genomics data analysis, and visualization, researchers can quickly access the world's genomic information from any web browser."

Andreas Sundquist, Ph.D., CEO and co-founder of DNAnexus, Inc. added: "Life science research will be characterized by the ever growing presence of diverse genomic datasets and the ability to easily integrate them. Through this effort, as well as other initiatives announced today in support of academic users, we're helping to ensure that scientists can easily access a critical archive of genomics information in a hosted environment that allows them to focus on science, not software."

Users of the DNAnexus SRA website can also import SRA datasets into the commercial DNAnexus platform to access additional functionality such as mapping, RNA-seq, ChIP-seq, variant analysis, and data visualization, as well as tools for integrating SRA data with their own sequence data. To further support the academic research community, DNAnexus has also reduced its standard academic pricing by half and allows researchers to import SRA data into DNAnexus for free.

The company also announced today an investment from Google Ventures, as well as other firms, in its latest funding round. Krishna Yeshwant of Google Ventures has joined the DNAnexus board of directors.

For more information and to access this hosted SRA database, visit http://sra.dnanexus.com.

About the Sequence Read Archive

In the spring of 2007, the Cold Spring Harbor Laboratory submitted the first DNA sequence data -- James Watson's 454 sequencing reads -- to what was then called the Short Read Archive at NCBI. The SRA has been considered a critical component of the genomics community infrastructure, providing two-way access to enormous datasets and integrating with the European (EBI) and Japanese (DDBJ) repositories. Deposition of data in the SRA is a mandatory requirement of some funding agencies and open-access journals. The most active SRA submitters include the Broad Institute of MIT and Harvard, Washington University in St. Louis, the Wellcome Trust Sanger Institute and Baylor College of Medicine. The largest individual global project generating next-generation sequence data is the 1000 Genomes Project (http://www.1000genomes.org), which has generated nearly half of all data submitted to the SRA. The most sequenced organisms are Homo sapiens, at 65 percent, and Mus musculus (house mouse), at 4 percent of all bases in the SRA.

About DNAnexus

DNAnexus is powering the genomics revolution. The company's mission is to unlock the potential of DNA-based medicine and biotechnology by creating scalable and collaborative data technologies. The company has created a DNA data management and analysis platform that provides instant online genomics data centers for researchers and sequencing service providers alike. For more information, visit https://dnanexus.com.

-----

Source: DNAnexus, Inc.
 
