A lot has changed since 1997, when SGI put on the first public Grid
demonstration at the Supercomputing show. Walter Stewart, SGI's
business development manager for Grid, spoke with GRIDtoday editor
Derrick Harris about just what has changed since then, as well as about
the company's tactics in the battle against increasingly large data volumes.
How do the new products mentioned in SGI's current
release [the Altix 1350, Altix Hybrid Cluster, InfiniteStorage Total
Performance 9700 and Silicon Graphics Prism] advance SGI's Grid
strategy?

We have, for some considerable time, ensured
that any new product that we bring out is Grid-enabled. We have been
looking to bring out products that bring a unique functionality to
Grids, and we believe that these four products advance the kinds of
functionalities that SGI is able to make available to people who are
operating Grids. Particularly in the case of the 1350, we are bringing
in functionality at a much more attractive price point than has been
available before.

I think that speaks to SGI's overall Grid position. We're out there to
be a toolmaker for the Grid and to make sure that the kind of power SGI
brings to stand-alone compute facilities is available to hugely
distributed environments.

Do you know of any projects off-hand that are using or planning on using any of these new solutions?
Because we've only just released them, I'm not aware of
any that are looking at them for Grid at the moment. Certainly, some
products from the Altix family have already been installed in major
Grid installations around the world. We've had circumstances where
we've had customers who are not interested in large, shared-memory
machines, but who are interested in what might be described as "robust
node clusters" for their Grids, and that's precisely what the Altix
1350 addresses. We are certainly very much involved with the Altix
family in a number of major Grid installations around the world.
Which leads to my next question. There are certainly some
well-established projects currently powered by SGI solutions, including
the TeraGyroid, COSMOS and SARA projects. Could you talk a little about
how SGI products are being used in these, and other, Grid projects?
One interesting one, in your own country, is that we
installed at the beginning of the year a 1,000-plus processor machine
at NCSA, which will be one of the resources on the TeraGrid. This is,
as far as I'm aware, the first shared-memory resource that has been
available to TeraGrid users. So that's one very recent, and North
American, example.

I think I'm right in saying that it's next month that we're installing
another 1,600-plus processor machine at the Australian Partnership for
Advanced Computing, which will become a major Grid resource there.

The COSMOS Grid has been around for some time, and we've been through a
couple of generations with COSMOS Grid. We first installed Origin
there, and have subsequently installed Altix. This is all because the
COSMOS Grid people are in the business of setting up the data
environment, including processing and visualization, in order to be
ready for the data that will come flooding in from the Planck satellite
in 2007. This is an example of bringing real power to Grids with a very
strong emphasis on a very close connection among compute, visualization
and data management.
SGI is focused on addressing four primary challenges of
Grid, and I want to talk about two in particular. First: Why is
security such a big issue in Grid computing, what are some of the major
security issues and what is SGI doing to improve Grid security?
We've been doing a lot of work with the open source
community in transferring a lot of IP from our experience with our own
operating system, IRIX. There are some security capabilities that we're
hopeful will be picked up by the open source community. Because a lot
of our security work with IRIX was done right in the OS, we feel obliged
to work with the open source community, and move at their speed, on the
introduction of some of those attributes.
One area that isn't talked about in security, or security-related
issues, that SGI is very preoccupied with is the whole issue of
versioning. It's one thing to talk about the security of data, it's
another thing to talk about the integrity of data. With our CXFS
storage-area network running over a wide-area network on the Grid, we are
solving the problem of having to make multiple copies of data. So if
you're in San Diego and I'm in Toronto, we could be working on the same
data set without having to make a copy for each of us. Therefore, we can
be confident that the data I'm working with is the same data you're
working with, because we're using the same copy.
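The integrity point here, one shared copy instead of per-site duplicates, can be illustrated with a short sketch. With duplicated copies, each site would have to verify, for example by comparing checksums, that its copy has not diverged; a single shared copy makes that question moot. This is an illustrative Python sketch of that verification problem, not SGI's CXFS mechanism:

```python
import hashlib

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so even very large files never sit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()

def same_data(path_a: str, path_b: str) -> bool:
    """True only if the two files hold byte-identical data."""
    return sha256_of(path_a) == sha256_of(path_b)
```

With a single copy on a shared SAN, both collaborators open the same file and this check is unnecessary; with duplicated copies, any independent edit at one site makes `same_data` return False, and someone has to reconcile the versions.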
We also keep a watchful eye on the work that goes on in the standards bodies like GGF.
I haven't heard a lot of companies state their dedication to
the cause of visualization capabilities on Grid networks. Can you give
me a little more detail on why this is so important to SGI?
It's important to SGI because we think it's important
to the world of Grid and the world of next-generation computing. Let me
give you an example. We work with an engineering firm that was working
in a fairly conventional IT environment where they had a number of
workstations around the company, in a few locations, and they were
copying files from workstation to workstation as different people had
to work on them. Those files were in the neighborhood of 200GB, and it
was taking about three hours on their network to move the files. But
then they came to us and said that they were going to have a problem
because their next data set was going to be a terabyte in size, and
that was probably going to take something in the order of 22 hours to
move -- that was not going to be acceptable. We said, "Well, we
wouldn't worry about it anyway if we were you."
And they said, "Why is that?"
"Because you can't load it on the workstation anyway. It'll crash it."
Before we presented the solution to them, they came back and said they
had made a mistake. It was, in fact, not going to be 1TB; it was going
to be 4TB. I might say this as an aside: this is an increasingly common
occurrence among companies and among research organizations. The spike
in data is so profound.
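The scaling in this story is easy to check with back-of-envelope arithmetic. At the quoted rate of roughly 200GB in three hours, naive linear scaling gives about 15 hours for a terabyte (the 22-hour figure in the interview presumably reflects real-world overhead and contention), and a 4TB set would be several times worse again:

```python
# Rate implied by the example above: roughly 200 GB moved in 3 hours.
RATE_GB_PER_HOUR = 200 / 3  # ~66.7 GB/hour of naive sustained throughput

def transfer_hours(size_gb: float, rate_gb_per_hour: float = RATE_GB_PER_HOUR) -> float:
    """Idealized transfer time, ignoring protocol overhead and contention."""
    return size_gb / rate_gb_per_hour

for size_gb in (200, 1_000, 4_000):  # 200 GB, 1 TB, 4 TB
    print(f"{size_gb:>5} GB -> about {transfer_hours(size_gb):.0f} hours")
```

At these idealized rates, a 4TB data set would tie up the network for roughly two and a half days per move, which is why streaming pixels instead of moving data becomes attractive.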
Quite clearly in that circumstance, there was no way those workstations
could cope with 4TB, nor could the company's network. So we designed a
system for them that allows them to have those remote legacy
workstations, have the users there, send instructions into a SAN
(Storage Area Network) to cause the compute server to compute the data,
then the visualization piece of the compute server visualizes that
data, then we strip the pixels from the data and stream the pixels in
real time back to the remote user on the legacy workstation. They have
full interactivity not only with the data set, but also with the
computation of that data, from a legacy workstation -- stressing neither
the network nor the workstation's capacity.
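The pipeline just described, instructions go in, pixels come back, can be sketched in miniature. All names here (`ComputeServer`, `LegacyWorkstation`, `Frame`) are hypothetical illustrations of the pattern, not SGI's actual software:

```python
from dataclasses import dataclass

@dataclass
class Frame:
    """A rendered image: kilobytes of pixels standing in for terabytes of data."""
    width: int
    height: int
    pixels: bytes

class ComputeServer:
    """Lives next to the SAN; computes and renders where the data is stored."""
    def __init__(self, dataset_size_gb: float):
        self.dataset_size_gb = dataset_size_gb  # the data never leaves this machine

    def render(self, command: str) -> Frame:
        # Placeholder for the real compute + visualization step: the full
        # dataset is processed server-side and only pixels are produced.
        pixels = f"pixels for: {command}".encode()
        return Frame(width=1280, height=1024, pixels=pixels)

class LegacyWorkstation:
    """Sends instructions and displays streamed frames; never loads the data."""
    def __init__(self, server: ComputeServer):
        self.server = server

    def interact(self, command: str) -> int:
        frame = self.server.render(command)
        return len(frame.pixels)  # bytes over the wire: tiny versus the dataset

server = ComputeServer(dataset_size_gb=4_000)  # the 4TB data set
workstation = LegacyWorkstation(server)
bytes_received = workstation.interact("rotate model 15 degrees")
```

The point of the pattern is the asymmetry: the workstation's network traffic is proportional to the frame size, not the data set, so a 4TB problem is no heavier on the wire than a 4GB one.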
In that circumstance, visualization is a critical tool for working with
big data. More and more, as people look at these data spikes, even if
you are able to get the data moved -- which is becoming increasingly
impractical -- if the data or the data results are being expressed
alpha-numerically, it's going to take you too long to read it. Ask a
big data question and you frequently get an even bigger data answer. And
if it takes you six months to read the answer ...
If you can look at it visually, you often can understand it in a
fraction of the time it would take you to internalize the information
if it's expressed to you alpha-numerically. We believe that kind of
infrastructure is critical for all sorts of users, in all sorts of
places, working on all sorts of devices, with all sorts of OS's. It's
SGI's role to bring that core power to Grid installations so that
people at the various points along the Grid can have access to it.
Finally, I want to go back, for a few questions, to SGI's
groundbreaking demonstration at Supercomputing '97. What kind of effect
did it have on the Grid movement? Did it add an element of legitimacy?
I think it certainly got Grid going and began a process
of people seeing that there is a possibility to design this very
different kind of infrastructure. I think that, in truth, if the
community could have kept the momentum of that activity in 1997, we'd
be further ahead today. I think we got sidetracked for a number of
years, particularly in North America, with the cycle scavenging model
as the single approach to Grid computing. I'm happy to say that
single-approach view has very much ended. While cycle scavenging is
still a very legitimate part of Grid, it's no longer seen as the whole
of Grid. People are recognizing that Grid users should have access to a
variety of different devices and a variety of different kinds of tools
to work with.
So, I think that 1997 was critical. I just wish we had maintained the
momentum that [the demo in] 1997 started; we might be further ahead
today. Things have changed dramatically in the last year or two, with a
much greater focus on building the kind of infrastructure that's
required to deal with big data.
That was almost seven-and-a-half years ago, an eternity in information technology, and a whole
lot has changed with Grid since then. What do you see as some of the biggest differences?
Grids are now deployed in working environments -- there
are lots of Grids. I would characterize the Grid as having three phases
so far. From 1997 until about 2001, you were looking at Grids deployed
for research on Grids. Starting around 2001, you increasingly saw Grids
deployed in a research environment to serve research goals of multiple
disciplines. We moved away from the Grid being the object of the
research to the Grid being a tool to enable research. Starting roughly
around late 2003-2004, we really began to see a major ramp-up of Grids
being installed in enterprise situations and in corporate situations.
Certainly, I might comment with one other hat that I wear as co-chair
of the Plenary Program Committee at GGF. Our Plenary Program at GGF12
in Brussels last September was quite extraordinary [in regard to] the
number of companies that turned up at the event that either already had
Grids or were seriously looking at installing Grids and came to find
out about it. There was nothing like that attendance previously. If you
go back to GGF in 2002, there would be no one there but vendors and
researchers. By now, we believe we are seeing a strong corporate
engagement in the whole issue of Grid.
My final question is: If you had a crystal ball, what do you
think you would see in another seven-and-a-half years, in 2012? Where
will Grid be, and what role will SGI have in helping it get there?
I very much see Grid in this context: Starting in the
middle of the 18th century, right up well through the 20th century, we
built evermore elaborate distribution mechanisms, or infrastructures
for distribution, in order to move raw materials, processed materials
and finished products. That was the absolute foundation of the
industrial economy. We began, sometime late in the 20th century, to
begin creating the infrastructure for the knowledge economy. Data is
the raw material for the knowledge economy. Grid is the nascent stage,
or the beginning, of the infrastructure that is going to allow us to
move from data to information to knowledge and, therefore, to value.
I would say that we are going to increasingly see infrastructure built
around the principles that are related to Grid computing that enable
users in every conceivable location to have access to the tools that
they need for data. If I chose to, I could drive about a mile from my
house and be able to buy a lemon picked off a tree in California. We
have the infrastructure in place to make it possible for me to have
that in the middle of winter in Toronto. We're going to see the kind of
infrastructure that will make it possible for me, regardless of where I
am, to be able to access the power to deal with the data that I need in
order to be a knowledge worker.
Is there anything else you would like to add in regard to this announcement or SGI's Grid strategy in general?
I believe that SGI will always be looking forward to
your 2012 date. SGI will always be there, designing the tools that are
ready to deal with that next spike in the volumes of data people have
to work with. We're not going to be down at the commodity level, and
we're not going to be there for the problems that are already solved.
We are going to be there for the people who are tackling the next data
spike.

To read the release cited in this article, please see "New Solutions
Extend SGI's Drive to Advance Grid Computing"
in this issue of GRIDtoday.