HPC in the Cloud


Dedicated to covering high-end cloud computing
in science, industry and the datacenter

Gemini Releases Real-Time Log Processing Based on Flume and Cassandra


FOSTER CITY, Calif., March 3, 2011 -- Gemini Mobile Technologies ("Gemini") released a Real-Time Log Processing System based on Flume and Cassandra ("Flume-Cassandra Log Processor") as open source today. The Flume-Cassandra Log Processor enables massive volumes of production system logs to be collected and processed into graphical reports in real time. In addition, logs from multiple data centers can be simultaneously aggregated and analyzed in a single database. With its capacity for real-time analysis at unprecedented volumes, Gemini's Flume-Cassandra Log Processor enables businesses to vastly improve both the quality and timeliness of business intelligence gained from their online operations. This scalability, at low cost and with a small footprint, is enabled by NOSQL (Not Only SQL) technology, which originates from Cloud Storage technologies at Google, Facebook, and Amazon.

Logs record online activities of users and applications in real time.  Web service providers, telecom carriers, eCommerce providers, and corporate websites analyze logs to improve customer experience and their business operations.  Traditional systems process logs offline because log files are too big for relational databases, typically producing reports days to weeks after the logs were generated.  Gemini's solution uses Flume to stream log entries as they occur; data is extracted on the fly and inserted in real time into a Cassandra NOSQL database.  This allows reports to be updated in real time, less than a second after the log event.  Offline analysis is also supported by querying the Cassandra database and running MapReduce jobs.  The Flume-Cassandra Log Processor can be easily scaled from a few PCs to multiple clusters, allowing hundreds of terabytes of logs to be stored, analyzed, and automatically deleted upon expiration.
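
The stream-extract-insert flow described above can be illustrated with a minimal Python sketch. It is not Gemini's implementation and uses neither Flume nor Cassandra: the log format (Apache common log format), the names `parse_line`, `minute_bucket`, `InMemoryStore`, and `process_stream`, and the row-key scheme are all assumptions made for illustration. An in-memory dictionary stands in for a wide-row store keyed by time bucket, and expiration of old rows is not modeled.

```python
import re
from collections import defaultdict
from datetime import datetime, timezone

# Assumed input format: Apache common log format (hypothetical choice).
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)


def parse_line(line):
    """Extract fields from one log line; return None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    timestamp = datetime.strptime(match.group("ts"), "%d/%b/%Y:%H:%M:%S %z")
    return {
        "host": match.group("host"),
        "ts": timestamp,
        "path": match.group("path"),
        "status": int(match.group("status")),
    }


def minute_bucket(timestamp):
    """Round a timestamp down to its UTC minute, used as part of the row key."""
    return timestamp.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M")


class InMemoryStore:
    """Stand-in for a wide-row store: row key -> column name -> counter."""

    def __init__(self):
        self.rows = defaultdict(lambda: defaultdict(int))

    def increment(self, row_key, column, amount=1):
        self.rows[row_key][column] += amount


def process_stream(lines, store):
    """Consume log lines as they arrive and update per-minute counters."""
    for line in lines:
        event = parse_line(line)
        if event is None:
            continue  # skip lines that do not match the assumed format
        bucket = minute_bucket(event["ts"])
        store.increment("hits:" + bucket, event["path"])
        store.increment("status:" + bucket, str(event["status"]))
```

Because each event updates its counters as soon as it is parsed, a report that reads the row for the current minute sees new events immediately, which is the property the release attributes to the streaming design. `process_stream` accepts any iterable of lines, so it can be fed from a list, an open file, or a generator that follows a growing log.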

"Business intelligence is the key ingredient for enterprises and service providers to provide a personal and effective web experience to their customers," said Michael Tso, co-founder and COO of Gemini.  "Our Flume-Cassandra Log Processor delivers business intelligence in real-time, at data volumes and velocities that were previously unattainable.  We are pleased to release it as open source, allowing easy customization and improvement by the community." 

"Gemini's real-time log processing solution is an impressive demonstration of Cassandra's capabilities for solving big data problems in real-time," said Matt Pfeil, CEO of DataStax, the commercial leader in Apache Cassandra.  "Cassandra is quickly gaining momentum as the scalable platform for web and enterprise applications. We look forward to continuing our innovation and collaboration with Gemini." 

The Flume-Cassandra Log Processor can be downloaded from GitHub: https://github.com/geminitech/logprocessing

About Gemini

Gemini is a leading provider of high-performance, cloud-enabled messaging platforms.  Gemini is a pioneer in real-time Big Data and NOSQL database technology, developing the Hibari® Key Value Store, Hibari Gigabyte-Maildir Email Store, and the Cloudian™ Multi-Tenant Cloud Storage System.  Gemini has offices in San Francisco, Tokyo and Beijing.  Gemini's customers include NTT DOCOMO, Softbank Mobile, Vodafone, and Nextel International; the company also has OEM partnerships with Alcatel-Lucent and Bytemobile.  Gemini is backed by Goldman Sachs, Mitsubishi UFJ Capital, Nomura Securities, Mizuho Capital, Access, and Aplix.  For more information, visit http://www.geminimobile.com/.

-----

Source: Gemini
