November 19, 2007
Yahoo! Inc., a leading global Internet company, today announced that it will be the first in the industry to launch an open source program aimed at advancing the research and development of systems software for distributed computing. Yahoo’s program is intended to leverage its leadership in Hadoop, an open source distributed computing sub-project of the Apache Software Foundation, to enable researchers to modify and evaluate the systems software running on a 4,000-processor supercomputer provided by Yahoo. Unlike other companies and traditional supercomputing centers, which focus on providing users with computers for running applications and for coursework, Yahoo’s program focuses on pushing the boundaries of large-scale systems software research.
Currently, academic researchers lack the hardware and software infrastructure to support Internet-scale systems software research. To date, Yahoo has been the primary contributor to Hadoop, an open source distributed file system and parallel execution environment that enables its users to process massive amounts of data. Hadoop has been adopted by many groups and is the software of choice for supporting university coursework in Internet-scale computing. Researchers have been eager to collaborate with Yahoo and tap the company’s technical leadership in Hadoop-related systems software research and development.
As a key part of the program, Yahoo intends to make Hadoop available in a supercomputing-class datacenter to the academic community for systems software research. Yahoo’s supercomputing cluster, called M45 after one of the best-known open star clusters, has approximately 4,000 processors, 3 TB of memory, 1.5 petabytes of disk storage, and a peak performance of more than 27 teraflops, placing it among the top 50 fastest supercomputers in the world.
M45 is expected to run the latest version of Hadoop and other state-of-the-art, Yahoo-supported open source distributed computing software, such as the Pig parallel programming language developed by Yahoo Research, the central advanced research organization of Yahoo! Inc.
Carnegie Mellon University will be the first institution to take advantage of Yahoo’s M45. Leading systems software researchers Garth Gibson and Greg Ganger, both professors at Carnegie Mellon, will instrument the system and evaluate its performance. Simultaneously, Carnegie Mellon computer science professors Jamie Callan and Christos Faloutsos, academic leaders in text and Web mining, will solve challenging information retrieval and large-scale graph problems on the cluster. Carnegie Mellon faculty members Alexei Efros, Noah Smith, and Stephan Vogel will also use the cluster to tackle large-scale computer graphics, natural language processing, and machine translation problems, respectively. In the future, Yahoo plans to make M45 available to researchers from other universities for open, collaborative research.
“Hadoop has become an important computing environment for data-intensive applications and Yahoo is playing a leading role in its development. We are excited about collaborating with Yahoo on systems software research, helping to advance the state-of-the-art, and creating new research possibilities in this critical area,” said Randall E. Bryant, dean of the School of Computer Science at Carnegie Mellon. “We look forward to working with Yahoo and jointly contributing back to the open source community.”
“Yahoo is dedicated to working with leading universities to solve some of the most critical computing challenges facing our industry,” said Ron Brachman, vice president and head of Yahoo academic relations. “Launching this program and M45 is a significant milestone in creating a global, collaborative research community working to advance the new sciences of the Internet. This milestone is a key element of Yahoo’s growing Academic Relations effort.”
Yahoo! Inc. is a leading global Internet brand and one of the most trafficked Internet destinations worldwide. Yahoo is focused on powering its communities of users, advertisers, publishers, and developers by creating indispensable experiences built on trust. Yahoo is headquartered in Sunnyvale, Calif. For more information, visit pressroom.yahoo.com or the company’s blog, Yodel Anecdotal.