November 19, 2010
New architecture can double analytics processing speed
NEW ORLEANS, Nov. 19 -- At the Supercomputing 2010 conference (SC10), IBM (NYSE: IBM) today unveiled details of a new storage architecture design, created by IBM scientists, that will convert terabytes of pure information into actionable insights twice as fast (1) as previously possible. Ideally suited for cloud computing applications and data-intensive workloads such as digital media, data mining and financial analytics, this new architecture will shave hours off complex computations without requiring heavy infrastructure investment. IBM won the Storage Challenge competition for presenting the most innovative and effective design in high-performance computing with the best measurements of performance, scalability and storage subsystem utilization.
Running analytics applications on extremely large data sets is becoming increasingly important, but organizations can expand their storage facilities only so far. As businesses search for ways to harness their large stores of data to achieve new levels of business insight, they need alternative solutions like cloud computing to keep up with growing data requirements and to provide workload flexibility through the rapid provisioning of system resources for different types of workloads.
"Businesses are literally running into walls, unable to keep up with the vast amounts of data generated on a daily basis," said Prasenjit Sarkar, master inventor, Storage Analytics and Resiliency, IBM Research – Almaden. "We constantly research and develop the industry's most advanced storage technologies to solve the world's biggest data problems. This new way of storage partitioning is another step forward on this path as it gives businesses faster time-to-insight without concern for traditional storage limitations."
Created at IBM Research – Almaden, the new General Parallel File System-Shared Nothing Cluster (GPFS-SNC) architecture is designed to provide higher availability through advanced clustering technologies, dynamic file system management and advanced data replication techniques. By "sharing nothing," the design makes new levels of availability, performance and scaling achievable. GPFS-SNC is a distributed computing architecture in which each node is self-sufficient; tasks are divided among these independent computers, and no node waits on another.
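The shared-nothing idea described above can be illustrated with a minimal Python sketch. This is not GPFS-SNC code or an IBM API; the function names (`partition`, `node_task`, `run_cluster`) and the word-count workload are assumptions chosen only for illustration. Each simulated node receives its own slice of the data, works on that slice alone, and the partial results are merged at the end.

```python
from collections import Counter


def partition(records, num_nodes):
    """Assign every record to exactly one node (round-robin), so no data is shared."""
    shards = [[] for _ in range(num_nodes)]
    for i, record in enumerate(records):
        shards[i % num_nodes].append(record)
    return shards


def node_task(local_records):
    """Work done by one node, using only the records that node owns."""
    counts = Counter()
    for line in local_records:
        counts.update(line.split())
    return counts


def run_cluster(records, num_nodes):
    """Simulate a shared-nothing job: independent per-node work, then a merge."""
    shards = partition(records, num_nodes)
    # Each call touches only its own shard, so the calls could run in parallel
    # on separate machines without any node waiting on another.
    partials = [node_task(shard) for shard in shards]
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


if __name__ == "__main__":
    sample = ["risk data risk", "data cloud", "risk"]
    print(run_cluster(sample, num_nodes=2))
```

In a real cluster the per-node calls would run on separate machines against locally attached disks; the loop here only stands in for that parallelism.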
IBM's current GPFS technology offering is the core technology for IBM's High Performance Computing Systems, IBM's Information Archive, IBM Scale-Out NAS (SONAS), and the IBM Smart Business Compute Cloud. These research lab innovations enable future expansion of those offerings to further tackle tough big data problems.
For instance, large financial institutions run complex algorithms to analyze risk based on petabytes of data. With billions of files spread across multiple computing platforms and stored across the world, these mission-critical calculations require significant IT resources and cost because of their complexity. With this GPFS-SNC design, running such a complex analytics workload could become much more efficient, as the design provides a common file system and namespace across disparate computing platforms, streamlining the process and reducing the disk space required.
For more information about IBM Research, visit www.ibm.com/research.
(1) MapReduce benchmarks on a 16-node cluster with 4 SATA disks per node, comparing GPFS-SNC to HDFS.