January 10, 2005
TONY HEY ON THE NEED FOR WEB SERVICES STANDARDS
By Tony Hey, Contributing Editor
The UK e-Science Initiative, which started in 2001, began by evaluating
the then-current NASA Information Power Grid software -- which
primarily consisted of the Globus Toolkit, Condor and the Storage
Resource Broker packages -- in order to understand the problems of
implementing distributed middleware services that cross institutional
boundaries. But even in 2001, it was clear that any
future distributed middleware that wished to have support and tooling
from the IT industry would have to be based on Web services. However,
it is an unfortunate fact that, although the Web services movement is
supported by all the IT companies, even at the end of 2004, this "grand
project" is still very much "work in progress." The presently accepted
Web services specifications certainly do not constitute a satisfactory
basis for constructing a robust, international Cyberinfrastructure --
"e-Infrastructure" in Europe -- on which to build novel and demanding
e-Science and business applications.
This is a problem. Funding agencies in the United States, Europe and
Asia are funding many hundreds of e-Science or "Grid" projects, all of
which involve one or more forms of distributed data, computation and
collaboration. In the United States alone, even in the absence of a
long-delayed Cyberinfrastructure initiative along the lines recommended
by the Atkins Report, the NSF is funding over $400 million worth of
"e-Science" projects per year. In the United Kingdom, with the present
dollar exchange rate, the e-Science program amounts to some $500
million over five years. Germany and the Netherlands have just
announced 90-million- and 50-million-Euro e-Science programs,
respectively, and the European Commission have launched over 400
million Euros worth of new Grid projects. China and Japan also have
ambitious and significant e-Science programs.
To underpin all of this activity we need a set of standard middleware
services that enable the coordinated, collaborative use of distributed
resources (computation, data sets, facilities). This set of middleware
services -- determined by the application requirements -- is what I
call the "Grid," as a shorthand for distributed middleware
infrastructure or Cyberinfrastructure, according to my definition.
There is clearly a whole community of scientists and engineers -- in
both academia and industry -- all gearing up to make scientific and
commercial e-Science and Cyberinfrastructure applications a reality.
So what is the problem?
The problem lies with the slow pace of the standards process and the
ongoing Web services standards "wars."
A quick aside on Web services. This is the distributed computing
technology that the IT industry is defining as the set of building
blocks for loosely-coupled, distributed applications, based on
Service Oriented Architecture principles. Web services interact by
exchanging messages in SOAP format while the contracts for the message
exchanges that
implement those interactions are described via WSDL and other metadata
formats. When a SOAP message arrives at a Web service, it is first
handled by the service's message processing logic which transforms
network level SOAP messages into something more tangible for
applications to deal with (such as domain-specific objects). Once the
message has been consumed, its contents are then processed by the
application logic, making use of the resources available to the
service. Typically, some response is then generated which is fed back
via one or more messages.
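
The message flow described above can be sketched in a few lines of
Python. This is only an illustration under stated assumptions, not any
vendor's or toolkit's implementation: the "urn:example:jobs" namespace,
the SubmitJob message, the JobRequest class and the 100-CPU-hour rule
are all hypothetical names invented for the example. Only the SOAP 1.1
envelope namespace and the standard-library XML parser are real.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "urn:example:jobs"  # hypothetical application namespace (assumption)


@dataclass
class JobRequest:
    """Domain-specific object handed to the application logic."""
    job_id: str
    cpu_hours: float


def parse_request(soap_text: str) -> JobRequest:
    """Message processing logic: turn a SOAP envelope into a domain object."""
    root = ET.fromstring(soap_text)
    body = root.find(f"{{{SOAP_NS}}}Body")
    request = body.find(f"{{{APP_NS}}}SubmitJob")
    return JobRequest(
        job_id=request.findtext(f"{{{APP_NS}}}JobId"),
        cpu_hours=float(request.findtext(f"{{{APP_NS}}}CpuHours")),
    )


def handle(job: JobRequest) -> str:
    """Application logic: a toy admission rule standing in for real resources."""
    return "ACCEPTED" if job.cpu_hours <= 100 else "REJECTED"


def build_response(job_id: str, status: str) -> str:
    """Wrap the result in a SOAP envelope to send back to the consumer."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    response = ET.SubElement(body, f"{{{APP_NS}}}SubmitJobResponse")
    ET.SubElement(response, f"{{{APP_NS}}}JobId").text = job_id
    ET.SubElement(response, f"{{{APP_NS}}}Status").text = status
    return ET.tostring(envelope, encoding="unicode")


def service(soap_text: str) -> str:
    """The whole service: parse the message, run the logic, build the reply."""
    job = parse_request(soap_text)
    return build_response(job.job_id, handle(job))


# Hypothetical request message used to exercise the sketch.
REQUEST = f"""<soap:Envelope xmlns:soap="{SOAP_NS}" xmlns:j="{APP_NS}">
  <soap:Body>
    <j:SubmitJob>
      <j:JobId>job-42</j:JobId>
      <j:CpuHours>12.5</j:CpuHours>
    </j:SubmitJob>
  </soap:Body>
</soap:Envelope>"""

if __name__ == "__main__":
    print(service(REQUEST))
```

The consumer sees only the request and response messages; the
JobRequest object and the admission rule never leave the service.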
By encapsulating the internal resources within the service, and
providing a layer of application logic between those resources and the
consumers, the owners of the service are free to evolve its internal
structure over time (for
example to improve its performance or dependability), without making
changes to the message exchange patterns that are used by existing
service consumers. This encourages loose-coupling between consumers and
service providers, which is important in inter-enterprise computing, as
no one party is in complete control of all parts of the distributed
application. However, loose-coupling does not mean that the
functionality of applications is compromised, since the set of existing
and emerging Web services specifications should allow distributed
application builders to model complex interactions between services.
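
A small sketch may make the point about encapsulation concrete. It
assumes a hypothetical dataset-location service whose message contract
is a plain dictionary in and a plain dictionary out; the class names,
the dataset key and the location string are invented for illustration
and do not come from any Grid toolkit. The service owner swaps the
internal store for a cached one, and the consumer's messages do not
change.

```python
from typing import Dict, Protocol


class Store(Protocol):
    """Internal resource interface; consumers never see this."""

    def lookup(self, key: str) -> str: ...


class DictStore:
    """First version of the internals: a simple in-memory table."""

    def __init__(self, data: Dict[str, str]) -> None:
        self._data = dict(data)

    def lookup(self, key: str) -> str:
        return self._data.get(key, "")


class CachedStore:
    """Later version of the internals: adds a cache in front of another store."""

    def __init__(self, inner: Store) -> None:
        self._inner = inner
        self._cache: Dict[str, str] = {}

    def lookup(self, key: str) -> str:
        if key not in self._cache:
            self._cache[key] = self._inner.lookup(key)
        return self._cache[key]


class DatasetService:
    """Application logic layer; the message shapes are the only contract."""

    def __init__(self, store: Store) -> None:
        self._store = store

    def handle(self, message: Dict[str, str]) -> Dict[str, str]:
        name = message["dataset"]
        return {"dataset": name, "location": self._store.lookup(name)}


if __name__ == "__main__":
    base = DictStore({"climate-2004": "archive-a/climate-2004"})  # hypothetical data
    request = {"dataset": "climate-2004"}
    old_reply = DatasetService(base).handle(request)
    new_reply = DatasetService(CachedStore(base)).handle(request)
    print(old_reply == new_reply)
```

Because both versions answer the same request with the same reply,
existing consumers need no changes when the internals evolve.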
Web services specifications can be divided into two classes.
Infrastructure specifications define generic aspects of services (or
other specifications), e.g. WSDL, WS-Security and the proposed "Grid"
service WSRF. High-level
specifications define domain specific aspects of services, e.g. a data
access and integration service specification. Policy also plays a key
role in a service oriented architecture. While WSDL describes the
functional
characteristics of a Web service -- such as operations supported,
messages sent and consumed -- the non-functional requirements
associated with service invocation are also a very important aspect of
Web services and service oriented architectures in general. WS-Policy
and WS-PolicyAttachment describe a foundation policy framework within
which the behaviors associated with a service -- such as security,
transactionality, reliability and so on -- can be specified.
Conceptually, WSDL and WS-Policy are peers in the Web services stack.
Now read on ...
By leveraging the developments in Web services technologies, Grid
architects will be able to exploit the tools, documentation,
educational materials, and experience from the Web services community
when building applications, without having to create a parallel set of
solutions. This will allow the Grid community to concentrate on
building the higher-level services that are specific to the Grid
application domain while the responsibility for the
underlying infrastructure is left to the IT industry. The software vendors
will work on standardizing the Web services technologies, developing
production-quality tooling, achieving wide adoption, testing for the
interoperability of the implementations of those standards, educating
developers, etc.
This all sounds very desirable and obvious. So, again, where is the
problem? At this point in time there are a large number of industry-led
standardization efforts, only some of which are being developed within
an open standards
organization. This makes it difficult for a user to identify those that
have completed the standardization process, those that are proprietary
standards or indeed those that may have little future in terms of broad
acceptance. The
sheer number of specifications and the mixed signals coming from
industry due to competing specifications in similar areas can leave
application architects with the impression that there is no single
clear vision for Web services. Even where there is a clear need for a
standard (e.g., workflow, security, transactions, notification), it is
still taking a long time for a widely accepted one to emerge. Different
sets of vendors are producing competing
specifications and it will therefore take time to resolve the
differences in a manner that is both technically and commercially
acceptable.
The uncertainty that this range of specifications creates becomes a
real problem for developers who must choose which specifications to use
in their implementations. If a specification is chosen too early in its
lifecycle, then
developers may suffer from lack of tool support as well as instability
due to changes incurred as the specification evolves through a
standardization process. In the worst case, a specification may never
be widely adopted, and
so will over time wither and die, adversely impacting any services that chose to adopt it.
What can be done about this and who should take the lead? The "Men in
Black" theory of standards suggests that a Web services specification
that is supported by both Microsoft and IBM is most likely to achieve
widespread
acceptance. Indeed, it was not so long ago that Bill Gates from
Microsoft and Steve Mills from IBM shared a stage and gave guarantees
that their implementations of Web services would interoperate with each
other. The problem is that this agreement apparently does not extend to
Web services for Grids. Examples of competing or overlapping
specifications relevant to Grids are WS-Eventing, WS-Transfer,
WS-Enumeration and WS-Management, supported by Microsoft and others;
and WS-Notification, WS-ResourceFramework and WS-DistributedManagement,
supported by IBM and their friends. The resolution is not so obvious --
since both companies approach Web services from very different
commercial perspectives. Microsoft is concerned with keeping Web
services as simple as possible and easy to implement efficiently on
Intel architectures. By contrast, IBM is concerned with defining more
sophisticated Web services that can be used to create robust
applications for commercial data centers.
However, by not reaching some compromise, both Microsoft and IBM risk
confusing and antagonizing their major commercial customers. At a
recent meeting in Europe, I counted dozens of major companies involved
in the latest
set of European Grid projects -- Atos Origin, DataMat, Telenor, EADS,
ESI, MSC, BAE Systems, Boeing, SAP, T-Systems, DaimlerChrysler, Audi,
GlaxoSmithKline and BT, among others, as well as Microsoft and IBM. All
of these are multi-national companies with multi-vendor IT systems and it
makes no sense for Microsoft and IBM to continue to talk past each
other about Grids. Agreement on these low-level standards is a matter
of urgency for the world-wide e-Science and Grid research and business
community and needs to be reached as soon as possible. Only Microsoft
and IBM can provide the necessary leadership and this is what they need
to show now. Only when these Web services standards have been
stabilized can the Global Grid Forum concentrate on defining and
standardizing, where appropriate, the higher level services that will
constitute the Open Grid Services Architecture.