July 15, 2010
Today Purdue University’s Coates Cluster, which is ranked at the #103 spot on the TOP500 supercomputer roll, was declared to be the first native 10Gb Ethernet cluster system to be ranked on the honor roll, which means, of course, that the clusters ranked before this one have all been employing the mighty InfiniBand to sate their low-latency imperatives.
There is little room for questioning that the purist side of the high performance computing community sees InfiniBand as the gold standard. Once my surprise at the announcement of Amazon’s new HPC-inspired Cluster Compute Instances had worn off (they have the power to place at the equivalent of the #145 position on the TOP500 list), I figured that the word “InfiniBand” would follow, but it didn’t. Amazon instead went with 10GbE, a decision that has ruffled a few feathers because some still see it as inferior on the low-latency front.
In an interview with HPCwire’s Michael Feldman, Deepak Singh, Business Development Manager at Amazon Web Services, responded to a question that many were asking after they’d had a day to sit on Amazon’s news: why did they opt for a 10GbE network rather than InfiniBand, for instance?
Singh replied that Amazon looked to the customer base to understand what technology options were best-suited to their needs, saying, “we know that for HPC, microseconds matter. We specifically engineered Cluster Compute Instances with 10Gbps Ethernet bandwidth to give customers the low-latency network performance required for tightly-coupled, node-to-node communication. Cluster Compute Instances will provide more CPU than any other instance type and customers can expect to find the same performance provided by custom-built infrastructure but with the additional benefits of elasticity, flexibility and low per-hour pricing.”
When asked whether or not they had plans to add InfiniBand networked clusters, Singh stated that Amazon would “continue to evaluate all technologies as we receive customer feedback on the new instance type,” which translates roughly into: no, not anytime soon, but we appreciate that you asked.
Amazon revealed a surprising amount of information for this new instance type, at least compared to their other releases, which offered just enough information for users to have a rough idea, a scarcity of detail that is another big weakness in the EC2 option for running HPC-type applications. While they did share the hardware specs this time around, the specifics are still cloudy. For instance, when HPCwire asked about the configuration details (i.e., adapters, switches and so on) and for metrics on the node-to-node latency, or any latency information at all, Singh’s response went back to the EC2 generalities. He stated that Amazon “does not share details on the specifics of network implementation. What I can tell you is that the new Cluster Compute instances operate on a 10GbE network that provides full cross-sectional bandwidth to members of a cluster and very low latency.”
Gilad Shainer, Senior Director of HPC and Technical Computing at Mellanox Technologies, a company that is definitely an advocate of InfiniBand (although it still caters to the 10GbE market), weighed in: “Many of the HPC systems around the world are being built for maximum performance and efficiency, hence InfiniBand, GPUs, etc. People using HPC want to be able to run their simulations as fast as possible and as many as possible per day. Amazon’s new entry includes 10GigE for the I/O and incorporates the latest CPUs, but is currently limited in the amount of CPUs that users can utilize. I believe that Amazon will need to continue to improve their HPC cloud offering to include technology being used in most of today’s HPC systems to provide more compute resources per user.”
Now that the thrill of the news has worn off, people are taking a much closer look at not only the Linpack results that delivered Amazon’s virtual placement (it takes more than a test to get on the TOP500; this was more of an exercise to demonstrate CCI’s capabilities) but also the viability of this offering as an alternative to in-house HPC clusters. It delivers way more than standard EC2 and answers the concerns of many in the community that they just weren’t getting enough out of what was being offered.
I look forward to seeing how others rise to the challenge, since it’s clear now that the HPC market must be important enough to cater to. If someone else ups the ante with InfiniBand and more CPU horsepower (via magic, of course), what will this mean?
Would love to hear some thoughts on this issue. How important is the network, or do the other drawbacks still stand in the way, even with the capabilities provided by CCI? In short, is it about more than just the latency imperative?
Posted by Nicole Hemsoth - July 15 @ 2:39PM, Eastern Daylight Time
Nicole Hemsoth is the managing editor of HPC in the Cloud and will discuss a range of overarching issues related to HPC-specific cloud topics in posts, which will appear several times per week in Behind the Cloud.