August 16, 2012
Last week, HPC in the Cloud discussed what types of HPC applications are best suited for cloud technologies. While capabilities offered by cloud providers (minimal upfront costs, high scalability and quick time to deployment) remain attractive to HPC users, the needs of their workloads are sometimes at odds with the technology. One particular hurdle is the amount of bandwidth between the end user and their provider of choice. Earlier this week, a scalability.org blog covered this dilemma, calling it a "non-trivial" issue.
Most public cloud providers are best suited for Web hosting, email services and similar ongoing tasks. Their infrastructures are geared toward these purposes, scaling up capacity relative to end user demand. However, if a single user wants to store and process massive datasets, the lack of high-bandwidth connectivity can severely hinder their research.
NASA is familiar with this problem. The agency recently launched a program called NEX, which houses 40 years of Earth satellite data in a storage cluster next to its Pleiades supercomputer. NASA Ames Earth scientist Ramakrishna Nemani spoke to us about the project. He described how long it took to migrate a large collection of Landsat images from a datacenter in South Dakota to the Ames facility.
"I'll give you an example about how difficult this has been. We brought about 400 terabytes of data from the EROS datacenter in Sioux Falls, South Dakota. I was blown away, it took us nearly 6 ½ months."
With a turnaround time like that, it probably would have been easier to FedEx the dataset on a set of hard drives. The scalability.org blog blames this kind of issue on a lack of competition among ISPs in the US.
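To put Nemani's figure in perspective, the average throughput implied by moving 400 terabytes in about 6 ½ months can be estimated with a few lines of Python. This is only a rough sketch: it assumes decimal units (1 TB = 10^12 bytes) and a 30-day month, neither of which is stated in the quote, and the function name is purely illustrative.

```python
# Back-of-the-envelope estimate of the average throughput implied by the quote.
# Assumptions (not from the article): decimal units (1 TB = 10**12 bytes)
# and a 30-day month.

TB = 10**12  # bytes in one terabyte, decimal convention
SECONDS_PER_DAY = 24 * 60 * 60


def average_throughput_mbit_s(size_tb: float, days: float) -> float:
    """Return the average rate in Mbit/s needed to move size_tb in the given days."""
    bits = size_tb * TB * 8
    seconds = days * SECONDS_PER_DAY
    return bits / seconds / 10**6


# 400 TB moved in roughly 6.5 months (taken here as 195 days)
rate = average_throughput_mbit_s(400, 6.5 * 30)
print(f"{rate:.0f} Mbit/s average")
```

Under those assumptions the transfer averaged roughly 190 Mbit/s, or about 24 MByte/s sustained over the whole period.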
The blog's author priced an asymmetric connection delivering 100 Mbit/s down and 10-15 Mbit/s up at roughly $300 per month. That translates to 12.5 MByte/s down and roughly 1.25 to 1.9 MByte/s up.
At that rate, an end user could download roughly one terabyte per day. But at the low end of the upload range, which is 10 percent of the download speed, it would take approximately 10 days to upload a single terabyte.
Although standard service providers have struggled to match throughput with demand, Google may give them more incentive. The Internet search giant has thrown itself into the mix, launching its own fiber service in Kansas City. For $70 a month, users can get symmetrical 1,000 Mbit/s (1 Gbit/s) connectivity. At that speed, the 10-day-per-terabyte upload becomes a more practical transfer of a little over two hours.
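The transfer times quoted above follow from simple arithmetic, sketched here in Python. The sketch assumes decimal units (1 TB = 10^12 bytes) and an ideal link running at its full advertised rate with no protocol overhead, so real-world transfers would take longer. The function name and labels are illustrative only.

```python
# Idealized time to move one terabyte at the link speeds discussed above.
# Assumptions (not from the article): 1 TB = 10**12 bytes, and the link
# sustains its full advertised rate with no protocol overhead.

TB = 10**12  # bytes


def transfer_hours(size_bytes: float, link_mbit_s: float) -> float:
    """Return the hours needed to move size_bytes over a link of link_mbit_s."""
    bytes_per_second = link_mbit_s * 10**6 / 8
    return size_bytes / bytes_per_second / 3600


links = [
    ("100 Mbit/s download", 100),
    ("10 Mbit/s upload", 10),
    ("1,000 Mbit/s fiber", 1000),
]

for label, mbit in links:
    hours = transfer_hours(TB, mbit)
    print(f"{label}: {hours:.1f} hours ({hours / 24:.1f} days)")
```

Under these assumptions, one terabyte takes about 22 hours at 100 Mbit/s, a little over nine days at 10 Mbit/s, and a little over two hours at 1,000 Mbit/s, in line with the rounded figures in the text.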
By effectively eliminating the bandwidth bottleneck, a connection like this lets end users adopt a new range of cloud-based services, including high-capacity storage and data-intensive research. Unfortunately, Google's service is limited to Kansas City, and no plans to expand the program have been announced.