December 03, 2007
In this Q&A, ScaleOut Software founder and CEO William Bain discusses his company's distributed data caching solutions, including how they are being adopted by e-commerce and financial services customers, as well as how ScaleOut's products differentiate themselves from the competition.
GRIDtoday: First, can you give an explanation of ScaleOut Software's distributed caching technology? How does it work and what are the important distinctions between distributed caching and traditional database-driven architectures?
WILLIAM BAIN: ScaleOut StateServer (SOSS) creates a distributed, in-memory object cache that spans the servers in a grid or server farm. Stored objects are globally accessible across the grid using intuitive application programming interfaces (APIs). SOSS employs a highly integrated architecture that combines automatic data partitioning and dynamic load balancing for scalability, transparent local caching for fast access, and intelligent data replication for high availability. For fast deployment and simplified management, caching servers automatically discover and join the distributed cache, which self-heals after server or network outages.
SOSS automatically partitions all of the distributed cache's stored objects across the grid and simultaneously processes access requests on all servers. This reduces access times and scales the overall throughput of the distributed cache. It also avoids "hot spots" that can arise if objects are stored on the servers where they are created.
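The idea of hashing keys across servers so that objects don't pile up on the host that created them can be illustrated with a minimal sketch. This is a simplification for illustration only (real products like SOSS rebalance partitions dynamically rather than using a fixed modulo; the function and key names here are invented):

```python
import hashlib

def partition_for(key: str, server_count: int) -> int:
    """Map a cache key to one of `server_count` partitions by hashing,
    so objects spread evenly regardless of where they were created."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % server_count

# Keys scatter across the grid instead of forming a "hot spot"
# on the creating server.
keys = [f"session-{i}" for i in range(1000)]
counts = [0] * 4
for k in keys:
    counts[partition_for(k, 4)] += 1
```

Because the hash determines placement, any client can locate an object's partition directly, without consulting a central directory.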
As servers are added to the grid, SOSS automatically repartitions and rebalances the storage workload to scale throughput. Likewise, if servers are removed, SOSS coalesces stored objects on the surviving servers and rebalances the storage workload as necessary.
Two levels of internal caching are employed to ensure the fastest possible access times. These caches are integrated into SOSS to automatically accelerate performance without involving the developer in configuring and coordinating multiple caches. One level of internal caching holds objects within the StateServer service process on each server. This speeds up repeated accesses to these objects by avoiding the networking overheads required to copy them from remote hosts. A second level of internal caching holds de-serialized objects within SOSS's client libraries. This cache sidesteps CPU overheads required to retrieve objects when they are repeatedly accessed from the distributed cache.
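The client-side level of caching described above can be sketched as a "near cache" that memoizes deserialized objects, so repeated reads skip both the network round trip and the deserialization cost. This is a hypothetical illustration, not SOSS's implementation; coherency and invalidation (which a real product must handle) are omitted:

```python
import pickle

class NearCache:
    """Client-side cache of deserialized objects (illustrative sketch).

    `fetch_serialized` stands in for a remote lookup returning the
    object's serialized bytes; repeated gets of the same key avoid
    both the fetch and the deserialization work.
    """

    def __init__(self, fetch_serialized):
        self._fetch = fetch_serialized   # callable: key -> bytes
        self._deserialized = {}          # key -> live object

    def get(self, key):
        if key not in self._deserialized:
            self._deserialized[key] = pickle.loads(self._fetch(key))
        return self._deserialized[key]
```

The design point is that deserialization is a CPU cost paid per access, not just per network transfer, so caching the live object saves work even when the bytes are already local.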
SOSS ensures that cached data is never lost -- even if a server in the grid fails -- by replicating all cached objects on up to two additional servers. If a server goes offline or loses network connectivity, SOSS retrieves its objects from replicas stored on other servers in the grid, and it creates new replicas to maintain redundant storage as part of its "self-healing" process. SOSS uses a patent-pending, scalable, point-to-point heartbeat architecture that efficiently detects failures without flooding the server grid's network with multicast heartbeat packets. Heartbeat failures automatically trigger SOSS's self-healing technology, which quickly restores access to cache partitions and dynamically rebalances the storage load across the grid.
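The replication scheme described above (a primary copy plus up to two replicas, with redundancy restored after a failure) can be sketched as follows. The placement rule here is a deliberately simple wrap-around over the server list, invented for illustration; it is not ScaleOut's actual algorithm:

```python
def placement(partition: int, servers: list, replicas: int = 2) -> list:
    """Choose a primary plus up to `replicas` backups for a partition,
    wrapping around the server list so every copy lands on a distinct host."""
    n = len(servers)
    copies = min(replicas + 1, n)
    return [servers[(partition + i) % n] for i in range(copies)]

def recover(partition: int, servers: list, failed: str) -> list:
    """Self-healing step: recompute placement over the surviving servers,
    creating a new replica to restore full redundancy."""
    survivors = [s for s in servers if s != failed]
    return placement(partition, survivors)
```

After a heartbeat failure is detected, re-running the placement over the surviving servers is what restores the redundancy level without any data loss, since at least one surviving copy of each partition still exists.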
Using a distributed cache in place of a database server (DBMS) to store data has the dual advantages of very high performance and essentially unlimited scalability. The most common data stored in a distributed cache is mission-critical, but relatively short-lived data, called "workload data." This type of data includes session-state, shopping carts, cached database results, SOAP requests, financial data, grid computing results, and other rapidly accessed, fast-changing application data. To provide global accessibility, applications historically have stored workload data in a centralized, back-end DBMS so that it can be retrieved from any server and preserved in case of server outages. However, database servers are designed to handle long-term, line-of-business data, such as inventory, purchase orders, billing records, and other long-lived business data. As the following table illustrates, workload data has different characteristics that make it poorly suited for storage in database servers:
Workload Data                      Line of Business Data
Short-lived, fast-changing         Long-lived
Critical, but reproducible         Critical and irreplaceable
Fast access and update             Long-term storage and retrieval
In addition, database servers can be costly (especially if clustering is used), and traffic to and from the data storage tier creates a bottleneck that impacts both performance and scalability. Database caches alone aren't the answer because they can't accelerate updates to fast-changing workload data. Distributed, in-memory caching solves these problems.
Gt: What business problems are driving the need for distributed caching solutions? In what markets are these needs most pressing?
BAIN: There are two general types of business problems that are driving the adoption of distributed caching. One is the need of e-commerce sites running on server farms to simultaneously scale and be highly available as their traffic increases. Without distributed caching, developers have to choose between high availability (using a DBMS) and scalability (using in-process storage). Distributed caching solves both of these problems at once.
The second driver is the grid computing market, especially the financial services vertical, where there are extreme pressures to wring out every microsecond of latency and to maximize application throughput. Over the past few years, the decline in exponential growth of CPU speed has stimulated a resurgence in the use of grid-based computing, which has provided important performance gains. However, data access technology continues to lag far behind. Most data used in grid computing today is either maintained in a database until needed by the grid or delivered sequentially to the grid by a master control node, and even interim results are frequently stored in a database. These techniques drastically and unnecessarily lengthen the overall compute time. Distributed caching avoids these limitations by reliably hosting application data in memory within the compute grid's servers, making it simultaneously available to all compute nodes.
Gt: What does ScaleOut's product line look like? How do its different solutions address different needs?
BAIN: ScaleOut Software's flagship product, StateServer, provides a distributed cache as described earlier. SOSS includes comprehensive APIs for storing data objects, and it also transparently stores ASP.NET session state on ASP.NET server farms. Some customers only need a solution for ASP.NET session-state, and to meet that need, we provide ScaleOut SessionServer, which transparently stores ASP.NET session-state but does not include the APIs.
ScaleOut Software also has released two other important products. ScaleOut GeoServer replicates stored objects between SOSS caches running on server farms at different sites. This enables multiple datacenters to stay fully protected against site-wide failures. GeoServer's capabilities help IT managers meet the stringent performance and uptime needs of high-end Web sites and other mission-critical applications. ScaleOut Remote Client lets client applications running on networked computers remotely access an SOSS distributed cache. In many situations, it is more convenient to deploy the SOSS distributed cache on its own dedicated server farm instead of co-locating it on a Web server farm or compute grid. This adds a new level of flexibility to the deployment of SOSS's distributed cache by allowing it to be hosted on a server farm tailored for distributed caching and accessed by numerous remote clients.
ScaleOut Software's currently released products support .NET/Windows environments. In early 2008, we also plan to offer a Java/Linux version of our products. This version will be completely interoperable between .NET/Windows and Java/Linux, allowing any combination of clients and servers to coexist and use a single distributed cache storage environment.
Gt: How has ScaleOut approached the area of distributed caching differently than other providers in this space? What about distributed database solutions?
BAIN: First, it is important to distinguish SOSS from an object-oriented database like Objectivity/DB or GemStone. These products were designed for long-term storage of object-oriented data, not for distributed data caching that scales application performance with high availability.
[Oracle] Coherence, [GemStone] GemFire EDF, and GigaSpaces all provide competitive distributed caching products. These products generally have a Linux/Unix/Java heritage, although they provide interoperability with Windows/.NET. In contrast, ScaleOut StateServer was designed from the outset to be portable across Windows and Linux environments and to deliver fully native performance (instead of just interoperability) in both environments. Also, we have focused on .NET deployments to date, so we have a strong, long-term presence in .NET that these competitors cannot match.
From an architectural viewpoint, SOSS is distinguished from its competitors by its highly integrated design: automatic data partitioning and dynamic load balancing, transparent multi-level internal caching, and a patent-pending, point-to-point heartbeat architecture for scalable failure detection and self-healing.
Gt: Can you speak a little about your customer base -- what it looks like in terms of numbers, specific users/ use cases, major industries, etc.?
BAIN: ScaleOut Software's products are running on thousands of servers at nearly 150 customers across a wide range of industries. Initial adoption of SOSS came from customers with e-commerce applications needing to transparently store session-state information on Web server farms. In late 2005, we observed strong interest in the use of our APIs to cache many types of application data, and this trend has grown quickly. More recently, the financial services segment has rapidly ramped up with its need to handle very large computational loads over shrinking time frames. Most of these companies have deployed grid computing environments running tens to hundreds of servers in each compute grid. They have turned to distributed caching to boost performance and scalability while maintaining high availability for stored data.
Gt: How would you rate the overall demand for distributed caching solutions right now, and how do you see the level of demand changing in the years -- or even months -- to come?
BAIN: Both the demand for distributed caching and the range of applications are rapidly growing. Developers have discovered that data access is a bottleneck which limits the performance and scalability of their grid-based applications. Distributed caching is the key technology that can address these challenges. At the same time, distributed caching solutions have evolved in both their functionality and ease of use, opening up this technology to an ever widening group of architects and application developers who lack specialized training in distributed computing. In fact, we are seeing department-level organizations increasingly taking advantage of distributed caching, thereby creating a much larger market opportunity.
Especially in financial services, we see an exploding interest in distributed caching because the volume of daily transactions that need to be processed is growing exponentially, and the market is extremely competitive. Application developers need effective, easy to use tools that can extract the highest possible performance with the least amount of programming effort. Distributed caching fills an important need in this market, and ScaleOut Software's focus on high integration and ease of use keeps the learning curve for developers to a minimum.
Gt: Speaking of the future, do you see distributed databases -- whether in-memory caches or simply data grids -- becoming the norm at any point? Are traditional approaches just too slow for today's increasing needs in terms of low latency, etc.?
BAIN: With the tail-off in the historic, exponential growth rate of CPU clock rates, multi-core and multi-server architectures will increase in dominance. This also will accelerate the adoption of server virtualization. Distributed caching provides the "glue" which ties all of these elements together to form a scalable processing platform for tomorrow's applications.
As with many other technologies, a tipping point of adoption occurs when a critical mass of understanding, experience and a positive and proven cost-benefit ratio is reached. Starting with applications where there is pressure to remove latency and increase performance, we expect distributed caching to emerge as an essential component of application development platforms within the next few years.
Gt: Is there anything else you'd like to add about ScaleOut Software's technology or products, or about the distributed data market in general?
BAIN: ScaleOut Software is focused on playing a leadership role in this exciting market. Our architectural approach is based on almost 30 years of experience developing parallel computing solutions for scientific and commercial applications. This experience shows us that distributed caching is the foundation of a distributed computing platform that can dramatically reduce development time and boost application performance. At the core, distributed computing platforms need to be easy to target by application developers, and they need to be easy to deploy and manage. You can expect ScaleOut Software to roll out new technologies over time that further advance our technology leadership in distributed computing.