November 22, 2011
Let's face it, the success of any cloud solution is only as strong as the underlying network; this is especially true when it comes to running HPC workloads. The communications network becomes even more important if we're to make significant inroads in running data-intensive workloads. For that to happen, we're going to need bigger pipes and more robust protocols and standards. For the bleeding edge in networking, what better place to turn than the annual supercomputing conference?
On the heels of SC11, which seemingly packed the entire HPC community into the Seattle Convention Center last week, Indiana University is announcing the results of its SCinet Research Sandbox entry, called "The Data Superconductor: An HPC cloud using data-intensive scientific applications, Lustre-WAN and OpenFlow over 100Gb Ethernet." As a key component of SCinet, the SRS program gives researchers with innovative network approaches a chance to test out their ideas in the unique environment of the SCinet network. The 100 Gbps environment provided by SCinet, ESnet, and Internet2 is ten times faster than the current standard and, of course, many thousands of times faster than public Internet speeds.
The IU researchers set out to address a major concern of data-intensive research: how to transfer massive amounts of data to supercomputing facilities for analysis. With collaborators from Brocade, Ciena, Data Direct Networks, IBM, Internet2, Whamcloud and ZIH, the IU team created two compute clusters with Lustre file systems, one in Indianapolis and the other in Seattle, connected by a 2,300-mile 100 Gbps link. The series of demonstrations, performed during the conference, achieved a throughput of 96 Gbps on network benchmarks and 6.5 GB/s using IOR, a standard file system benchmark, while a mix of eight real-world applications returned a result of 5.2 GB/s.
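Note that the figures above mix units: the network benchmark is reported in gigabits per second (Gbps), while the file-system results are in gigabytes per second (GB/s). A minimal sketch of the conversion (the numbers are taken from the reported results; the helper function is illustrative, not part of the demo):

```python
def gbps_to_GBps(gbps: float) -> float:
    """Convert gigabits per second to gigabytes per second (1 byte = 8 bits)."""
    return gbps / 8.0

link_gbps = 100.0          # SCinet/ESnet/Internet2 link capacity
network_bench_gbps = 96.0  # reported network benchmark result
ior_GBps = 6.5             # reported IOR file-system throughput
apps_GBps = 5.2            # reported mixed real-application throughput

# 96 Gbps corresponds to 12 GB/s of payload, so the 6.5 GB/s IOR figure
# represents roughly half of the 100 Gbps link's theoretical capacity.
print(f"network benchmark: {gbps_to_GBps(network_bench_gbps):.1f} GB/s")
print(f"IOR as a share of link capacity: {ior_GBps / gbps_to_GBps(link_gbps):.0%}")
```

Seen this way, the file-system results are remarkable: moving data through a parallel file system over a 2,300-mile WAN at about half of raw line rate leaves little overhead on the table.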
IU believes these results are record-worthy, saying that "this appears to be the fastest data transfer ever achieved with a 100Gbps network at a distance of thousands of miles."
Stephen Simms, manager of the High Performance File Systems group at Indiana University, examines the implications for researchers as we enter the age of big data:
"100 Gigabit per second networking combined with the capabilities of the Lustre file system could enable dramatic changes in data-intensive computing. Lustre's ability to support distributed applications, and the production availability of 100 gigabit networks connecting research universities in the US, will provide much needed and exciting new avenues to manage, analyze, and wrest knowledge from the digital data now being so rapidly produced."