HPC in the Cloud


Dedicated to covering high-end cloud computing
in science, industry and the datacenter


Blue Waters to Use Globus Online File Transfer Service


When Blue Waters comes online this year, researchers will be able to spend more of their valuable time on actual research, thanks in part to Globus Online, a cloud-based managed file transfer service. The National Center for Supercomputing Applications (NCSA) has selected Globus Online as the data movement solution for the Blue Waters supercomputer. Globus Online allows end users to move and synchronize files reliably between multiple locations without having to install and manage complex software.

Blue Waters is on track to become one of the world's premier supercomputers, with an expected peak performance of more than 11.5 petaflops. The powerful system will enable cutting-edge scientific breakthroughs in a broad range of disciplines, but first researchers need a reliable, secure, and preferably easy way to move their data. That's what Globus Online was made for, explains Steve Tuecke, co-PI for the Globus project at University of Chicago and Argonne National Laboratory's Computation Institute.

While there are other file transfer tools out there, they are more targeted to the consumer space or commercial market, says Tuecke. What sets Globus Online apart is that it's geared toward the research world, where high-speed networks go hand-in-hand with complex security requirements. Globus Online caters specifically to the research community, and has targeted that audience every step of the way.

In light of this, it's perhaps not surprising that when NCSA needed a data transfer solution for Blue Waters, Globus Online was its first choice. Traditionally, Globus would make the first move with prospective customers, but Tuecke said that in this instance, it was the other way around: NCSA reached out to them. The Blue Waters team, unbeknownst to Globus, had performed its own trial runs with the service, liked what it offered, and approached Globus to work out a partnership.

Michelle Butler, NCSA technical program manager, shares a similar story. She notes that there are other solutions out there that can either transfer data or manage data, but not with the ease of use, reliability, and security that Globus offers. According to Butler, when it comes to "a push-button, easy data transfer mechanism that uses the tools [they've] been using for years (Globus tools like GridFTP), nothing else compares."

Using Globus Online

Butler explains that Globus Online will be used for transferring data into and out of the Blue Waters machine. The SaaS-based tool will also be tasked with moving data within NCSA, and into both the Blue Waters and NCSA archives. "Anywhere data needs to be transferred within NCSA or Blue Waters, it will be transferred across the Globus Online mechanism," says Butler.

When asked about the unique requirements of Blue Waters, Butler cited the use of one-time passwords, an authentication method that relies on the use of key fobs. The file transfer service must also keep pace with Blue Waters' powerful servers and interconnects, including 40 10-Gigabit Ethernet links, as well as an integrated file system capable of providing more than a terabyte per second of aggregate storage bandwidth. Globus Online is also supplying Blue Waters with a custom interface for a more seamless user experience.
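To put those link figures in perspective, here is a rough back-of-the-envelope sketch. Only the link count (40) and the per-link rate (10 Gigabit Ethernet) come from the article; the function names, the decimal unit conversions, the hypothetical 1-petabyte dataset, and the assumption of zero protocol overhead are illustrative, so real-world throughput would be lower.

```python
# Back-of-the-envelope arithmetic; only LINK_COUNT and LINK_RATE_GBIT_PER_S
# are taken from the article. Everything else is an illustrative assumption.
LINK_COUNT = 40
LINK_RATE_GBIT_PER_S = 10


def aggregate_gbytes_per_s(link_count, link_rate_gbit_per_s):
    """Theoretical aggregate rate in gigabytes per second.

    Assumes 8 bits per byte and ignores protocol overhead.
    """
    return link_count * link_rate_gbit_per_s / 8


def transfer_hours(dataset_tbytes, rate_gbytes_per_s):
    """Hours needed to move a dataset at a sustained rate.

    Uses decimal units (1 TB = 1000 GB).
    """
    seconds = dataset_tbytes * 1000 / rate_gbytes_per_s
    return seconds / 3600


aggregate = aggregate_gbytes_per_s(LINK_COUNT, LINK_RATE_GBIT_PER_S)

# Hypothetical dataset size of 1 PB (1000 TB), chosen only for illustration.
hours = transfer_hours(1000, aggregate)

print(f"Aggregate external bandwidth: {aggregate:.0f} GB/s")
print(f"Time to move 1 PB at that rate: {hours:.2f} hours")
```

Under these assumptions the 40 links together top out at about 50 gigabytes per second, well below the terabyte-per-second figure quoted for the internal file system, which suggests the external links rather than storage would bound wide-area transfers.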

Tuecke notes that satisfying the requirements of the Blue Waters project has contributed to improvements across the entire Globus software ecosystem, including the original offering, Globus Toolkit. The latest release, Globus Toolkit 5.2, now comes with native packaging to support the various flavors of Linux.

The Globus Online service is based on the open-source Globus Toolkit, a fundamental enabling technology for grid computing. Developed and provided by the Globus Alliance, the popular middleware toolkit is used in HPC production grids worldwide. At the heart of Globus Online is GridFTP, a data transfer protocol optimized for high-speed wide-area networks. Although GridFTP is a highly effective data transfer tool, it can be difficult to implement. Enter Globus Online, which delivers the same functionality in an easy-to-use format. It works across and between any servers that have the GridFTP software installed, but the underlying complexity is hidden from the user. "Under the covers," says Tuecke, "Globus Online is really just a fancy client to our traditional Globus Toolkit to FTP, but we've made it adoptable by any researcher out there."

Security is another major feature of Globus Online. In a typical research environment, data needs to be able to cleanly traverse multiple security domains, so Globus created a technique for managing cross-organizational, multi-domain security environments. "We were able to do the big data with the cross-organizational security," says Tuecke.

Globus Online made its debut in November 2010 at the Supercomputing Conference, where it was also announced that NERSC (the National Energy Research Scientific Computing Center) had selected the file transfer tool. Since then, adoption has increased steadily. Globus Online is the data transfer tool of choice at a number of universities and research centers, including the UK National Grid Service.

Tuecke sees big opportunities for Globus Online going forward, perhaps even going beyond simple transfer to a more intensive management role. He believes the market for cloud services will continue to expand as the technology matures. And as cloud-based systems evolve to support a broader range of workloads, Globus' mission is to make the process as easy as possible on the data side, says Tuecke.

As a final note, Tuecke says the software-as-a-service path is all about delivering needed capability in a fashion that is easily adoptable and employable by end users without having to install software and manage updates and so on. "Users just want to point their browser at a site and use the capability," he says.
