August 16, 2012
Last week, HPC in the Cloud discussed what types of HPC applications are best suited for cloud technologies. While capabilities offered by cloud providers (minimal upfront costs, high scalability and quick time to deployment) remain attractive to HPC users, the needs of their workloads are sometimes at odds with the technology. One particular hurdle is the amount of bandwidth between the end user and their provider of choice. Earlier this week, a scalability.org blog covered this dilemma, calling it a "non-trivial" issue.
Most public cloud providers are best suited for Web hosting, email services and similar ongoing tasks. Their infrastructures are geared toward these purposes, scaling up capacity relative to end user demand. However, if a single user wants to store and process massive datasets, the lack of high bandwidth connectivity can severely hinder their research.
NASA is familiar with this problem. The agency recently launched a program called NEX, which houses 40 years of Earth satellite data in a storage cluster next to its Pleiades supercomputer. NASA Ames Earth scientist Ramakrishna Nemani spoke to us about the project. He described how long it took to migrate a large collection of Landsat images from a datacenter in South Dakota to the Ames facility.
"I'll give you an example about how difficult this has been. We brought about 400 terabytes of data from the EROS datacenter in Sioux Falls, South Dakota. I was blown away, it took us nearly 6 ½ months."
With a turnaround time like that, it probably would have been easier to FedEx the dataset on a set of hard drives. The scalability.org post blames this kind of problem on a lack of competition among ISPs in the US.
The post priced an asymmetric connection delivering 100 Mbit/s down and 10-15 Mbit/s up at roughly $300/mo. That translates to 12.5 MByte/s down and, at the low end of the upload range, 1.25 MByte/s up.
At those rates, an end user could download roughly one terabyte per day. But because the upload runs at about 10 percent of the download speed, uploading a single terabyte would take approximately 10 days.
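The arithmetic behind those figures can be checked with a short sketch. It is only a back-of-the-envelope calculation under assumptions not stated in the article: decimal units (1 TB = 10^12 bytes, 1 Mbit = 10^6 bits), a link that sustains its full rated speed with no protocol overhead, and the low end (10 Mbit/s) of the quoted upload range. The function name `transfer_days` is made up for illustration.

```python
# Assumption: decimal units, 1 TB = 10**12 bytes and 1 Mbit = 10**6 bits.
TERABYTE_BYTES = 10**12
SECONDS_PER_DAY = 86_400


def transfer_days(size_bytes, link_mbit_per_s):
    """Days needed to move size_bytes over a link running at its full rated speed."""
    bytes_per_second = link_mbit_per_s * 10**6 / 8  # 8 bits per byte
    return size_bytes / bytes_per_second / SECONDS_PER_DAY


download_days = transfer_days(TERABYTE_BYTES, 100)  # 100 Mbit/s down
upload_days = transfer_days(TERABYTE_BYTES, 10)     # 10 Mbit/s up (low end of 10-15)

print(f"download 1 TB: {download_days:.2f} days")  # 0.93 days
print(f"upload 1 TB:   {upload_days:.2f} days")    # 9.26 days
```

Under these idealized assumptions the results line up with the article's rounded "one terabyte per day" and "10 days" figures; real transfers would likely be slower once overhead and contention are included.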
Although standard service providers have struggled to match throughput with demand, they may get more incentive from Google. The Internet search giant has entered the market, launching its own fiber service in Kansas City. For $70 a month, users can get symmetrical 1,000 Mbit/s (1 Gbit/s) connectivity. At that speed, the 10-day upload of a terabyte becomes a more practical transfer of roughly two hours.
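To put the gigabit figure next to the NASA transfer quoted earlier, the sketch below estimates two things: how long one terabyte takes at 1 Gbit/s, and the average rate implied by moving about 400 TB in about 6 ½ months. The assumptions are loud ones: decimal units, a link running at full rated speed, and 6 ½ months approximated as 195 days. Both helper names are hypothetical, and the NASA number is only an implied average, not a measured link speed.

```python
# Assumption: decimal units, 1 TB = 10**12 bytes and 1 Mbit = 10**6 bits.
TERABYTE_BYTES = 10**12
SECONDS_PER_HOUR = 3_600
SECONDS_PER_DAY = 86_400


def transfer_hours(size_bytes, link_mbit_per_s):
    """Hours needed to move size_bytes at the link's full rated speed."""
    bytes_per_second = link_mbit_per_s * 10**6 / 8
    return size_bytes / bytes_per_second / SECONDS_PER_HOUR


def implied_mbit_per_s(size_bytes, days):
    """Average rate in Mbit/s implied by moving size_bytes in the given number of days."""
    return size_bytes * 8 / (days * SECONDS_PER_DAY) / 10**6


gigabit_hours = transfer_hours(TERABYTE_BYTES, 1000)  # 1 TB at 1 Gbit/s

# Assumption: "nearly 6 1/2 months" is approximated as 6.5 * 30 = 195 days.
nasa_rate = implied_mbit_per_s(400 * TERABYTE_BYTES, 6.5 * 30)

print(f"1 TB at 1 Gbit/s: {gigabit_hours:.1f} hours")         # 2.2 hours
print(f"NASA transfer, implied average: {nasa_rate:.0f} Mbit/s")  # 190 Mbit/s
```

The first result matches the "roughly two hours" figure. The second suggests the NASA migration averaged somewhere near 190 Mbit/s, again only under the stated approximations.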
With the bandwidth bottleneck effectively eliminated, end users could adopt a new range of cloud-based services, including high-capacity storage and data-intensive research. Unfortunately, Google's service is limited to Kansas City, and no plans to expand the program have been announced.