HPC in the Cloud


Dedicated to covering high-end cloud computing
in science, industry and the datacenter


Gemini Releases Real-Time Log Processing Based on Flume and Cassandra


FOSTER CITY, Calif., March 3, 2011 -- Gemini Mobile Technologies ("Gemini") today released as open source a real-time log processing system based on Flume and Cassandra (the "Flume-Cassandra Log Processor"). The Flume-Cassandra Log Processor collects massive volumes of production system logs and processes them into graphical reports in real time. Logs from multiple data centers can also be aggregated and analyzed simultaneously in a single database. By supporting real-time analysis at these volumes, Gemini's Flume-Cassandra Log Processor lets businesses improve both the quality and the timeliness of the business intelligence gained from their online operations. This scalability, at low cost and with a small footprint, is enabled by NOSQL (Not Only SQL) technology, which originates from cloud storage technologies at Google, Facebook, and Amazon.

Logs record the online activities of users and applications in real time. Web service providers, telecom carriers, eCommerce providers, and corporate websites analyze logs to improve customer experience and business operations. Traditional systems process logs offline because log files are too big for relational databases, and they typically produce reports days to weeks after the logs were generated. Gemini's solution uses Flume to stream log entries as they occur; data is extracted on the fly and inserted in real time into a Cassandra NOSQL database. This allows reports to be updated less than a second after the log event. Offline analysis is also supported, by querying the Cassandra database and running MapReduce jobs. The Flume-Cassandra Log Processor scales from a few PCs to multiple clusters, allowing hundreds of terabytes of logs to be stored, analyzed, and automatically deleted upon expiration.
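The flow described above (extract fields from each log entry as it arrives, update a report counter immediately, and let old data expire) can be illustrated with a minimal Python sketch. It is not Gemini's implementation and uses no Flume or Cassandra APIs. The Apache-style log format, the per-minute row keys, and the names parse_line, InMemoryStore, and ingest are all assumptions made for illustration, with an in-memory dictionary standing in for the database.

```python
import re
from collections import defaultdict
from datetime import datetime, timezone

# Assumed Apache-style access log format; the release does not specify one.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
)


def parse_line(line):
    """Extract the fields of interest from one log entry, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    return {
        "host": match.group("host"),
        "path": match.group("path"),
        "status": int(match.group("status")),
        "ts": datetime.strptime(match.group("ts"), "%d/%b/%Y:%H:%M:%S %z"),
    }


class InMemoryStore:
    """Hypothetical stand-in for the database: row key -> column -> (count, expiry)."""

    def __init__(self, ttl_seconds):
        self.ttl_seconds = ttl_seconds
        self.rows = defaultdict(dict)

    def increment(self, row_key, column, event_time):
        # Bump the counter and push its expiry to event time plus the TTL.
        count, _ = self.rows[row_key].get(column, (0, None))
        expiry = event_time.timestamp() + self.ttl_seconds
        self.rows[row_key][column] = (count + 1, expiry)

    def read(self, row_key, now):
        # Return only counters that have not yet expired.
        columns = self.rows.get(row_key, {})
        return {
            column: count
            for column, (count, expiry) in columns.items()
            if expiry > now.timestamp()
        }


def ingest(store, line):
    """Process one streamed log line: parse it and update a per-minute status counter."""
    event = parse_line(line)
    if event is None:
        return False
    # One row per UTC minute, one column per HTTP status code (an assumed layout).
    minute = event["ts"].astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M")
    store.increment(minute, str(event["status"]), event["ts"])
    return True
```

Because each entry updates its counter as soon as it is ingested, a report that reads a minute's row sees new events without waiting for a batch job. The expiry timestamp mimics the automatic deletion the release mentions; a real deployment would rely on the database's own expiration mechanism rather than a filter at read time.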

"Business intelligence is the key ingredient for enterprises and service providers to provide a personal and effective web experience to their customers," said Michael Tso, co-founder and COO of Gemini.  "Our Flume-Cassandra Log Processor delivers business intelligence in real-time, at data volumes and velocities that were previously unattainable.  We are pleased to release it as open source, allowing easy customization and improvement by the community." 

"Gemini's real-time log processing solution is an impressive demonstration of Cassandra's capabilities for solving big data problems in real-time," said Matt Pfeil, CEO of DataStax, the commercial leader in Apache Cassandra.  "Cassandra is quickly gaining momentum as the scalable platform for web and enterprise applications. We look forward to continuing our innovation and collaboration with Gemini." 

The Flume-Cassandra Log Processor can be downloaded from GitHub: https://github.com/geminitech/logprocessing

About Gemini

Gemini is a leading provider of high-performance, cloud-enabled messaging platforms. Gemini is a pioneer in real-time Big Data and NOSQL database technology, developing the Hibari® Key Value Store, Hibari Gigabyte-Maildir Email Store, and the Cloudian™ Multi-Tenant Cloud Storage System. Gemini has offices in San Francisco, Tokyo, and Beijing. Gemini's customers include NTT DOCOMO, Softbank Mobile, Vodafone, and Nextel International; the company also has OEM partnerships with Alcatel-Lucent and Bytemobile. Gemini is backed by Goldman Sachs, Mitsubishi UFJ Capital, Nomura Securities, Mizuho Capital, Access, and Aplix. For more information, visit http://www.geminimobile.com/.

-----

Source: Gemini
