The momentum behind Web services is building. If you haven't heard
about them by now, you've probably been living in a jungle or some
remote area for the last few years. The underlying technologies of
Web services are XML, HTTP, SOAP, WSDL, and UDDI. XML provides an
open standard for data exchange, HTTP an open transport protocol,
SOAP a remote method invocation protocol, WSDL the service
description language, and UDDI a service discovery registry.
The next logical step might be creating a means to investigate and
query the information available. Interestingly, Web services provide
both a means for remote querying and a reason to need an XML
database. As Web services grow in popularity, more and more XML
content will be generated, all with a need to be managed. This
article describes how to use SOAP and XML query techniques to manage
the XML data stored in remote XML databases.
XML Query and XML Databases
XML has established itself as an open standard format for exchanging
data and information in many areas. With the tremendous growth of XML
utilization, managing XML data is fast becoming an IT headache. To
manage XML data effectively, two things are needed: a query language
for search and update operations, and a database to collect and
organize the XML.
There are actually two XML query languages: XPath and XQuery. Both
have their uses. XPath is an XML query language already standardized
by the W3C. It gets its name through its use of a path notation for
navigating through the hierarchical structure of an XML document.
XPath operates on the abstract, logical structure of an XML document
as a tree of nodes. Still in the standardization process in the W3C,
XQuery is a query language that uses XML structure to express
intelligent queries across all structured and semistructured XML data
sources. XQuery was originally derived from the XML query language
Quilt and builds on useful features incorporated from several other
languages. In XQuery, a query is represented as an expression.
Expression types include path expressions (based upon XPath syntax),
element constructors, For Let Where Return (or FLWR, pronounced
"flower") expressions, expressions involving functions and arithmetic
operators, quantified expressions, conditional expressions, and
expressions used for modifying or testing data types.
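To make the path notation concrete, here is a minimal sketch that evaluates an XPath expression with the standard javax.xml.xpath package from the JDK. The sample document is made up for illustration (it is not the article's Listing 5), and the class and method names are arbitrary.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class XPathSketch {

    // Made-up sample data, shaped like the bib example used later on.
    static final String SAMPLE =
        "<bib><vendor>"
      + "<book><author><firstname>Lar</firstname><lastname>Kaufman</lastname></author></book>"
      + "<book><author><firstname>Ann</firstname><lastname>Example</lastname></author></book>"
      + "</vendor></bib>";

    // Parses the XML text and returns the text content of every node the path selects.
    static List<String> select(String xml, String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getTextContent());
            }
            return values;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // The predicate in brackets filters authors by last name; prints [Lar]
        System.out.println(select(SAMPLE, "/bib/vendor/book/author[lastname='Kaufman']/firstname"));
    }
}
```

Each step of the path walks one level down the node tree, and the bracketed predicate keeps only the author elements whose lastname child equals the given value.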
There are several options for storing XML data. The simplest is to
use the file system. While this may be adequate for small amounts of
information, it can
create a lot of overhead in searching and updating since each
potential document would need to be loaded and parsed. A single
transaction could span multiple documents and cause a lot of wasted
time in parsing irrelevant XML. What we need is a set of indexes that
describes the contents of the XML and helps to optimize the queries.
Two options here are to use a relational database or a native XML
database. Several of the relational database vendors provide
utilities for storing XML and mapping XML queries to SQL. Native XML
databases provide indexing and query capabilities optimized for XML,
so we'll choose this option for our example.
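As a rough illustration of why an index helps, the sketch below records which element names occur in each document at load time, so that a later query only has to open the documents that can possibly match. It is a toy under stated assumptions: real native XML databases use far richer index structures, and every name here is hypothetical.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class ElementIndexSketch {

    // Element name -> names of the stored documents that contain that element.
    private final Map<String, Set<String>> index = new HashMap<>();

    // Parses a document once, at load time, and records which element names it uses.
    void add(String docName, String xml) {
        try {
            NodeList elements = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getElementsByTagName("*"); // "*" matches every element
            for (int i = 0; i < elements.getLength(); i++) {
                index.computeIfAbsent(elements.item(i).getNodeName(), k -> new TreeSet<>())
                     .add(docName);
            }
        } catch (Exception e) {
            throw new IllegalArgumentException("Not well-formed XML: " + docName, e);
        }
    }

    // Documents that could match a query touching this element; all others can be skipped.
    Set<String> candidates(String elementName) {
        return index.getOrDefault(elementName, Collections.emptySet());
    }
}
```

With plain files and no index, a query for author elements would have to load and parse every document; here, candidates("author") narrows the work to the documents that actually contain one.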
There are a few commercial native XML databases in the market. For
convenience, we'll use Ipedo's XML database as an example. Figure 2
illustrates the deployment model of the XML database.
The database is a client/server system. A remote client communicates
with the server using client APIs. The SOAP implementation includes a
SOAP client
API, a SOAP client implementation, a SOAP server implementation, and
the SOAP server. The SOAP server layer is basically a facade, which
includes all the services you want to publish. This is the interface
for the SOAP client to talk to the XML database server. We'll look at
how to publish your service in the following section.
Before we start to write the code to publish the service, we need to
understand the programming model of the XML database. Figure 3 shows
the xAPI object model. A session is the entry point to all other
objects. From a session, you can create database instances, to which
all the database management functionality is attached, as well as
XPathStatement and XQueryStatement objects, through which XPath and
XQuery queries are invoked. Transformer is used for XSLT style
sheet transformation. From the database object, you can instantiate a
collection object and use it to manipulate documents: adding,
deleting, or updating content.
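The hierarchy described above can be pictured with a small in-memory mock. Every class and method name below is hypothetical and chosen only to mirror the session, database, and collection relationship; it is not the vendor's actual xAPI, and the statement and transformer objects are left out for brevity.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical names throughout: this mirrors the object hierarchy described in the
// text, not the vendor's actual xAPI classes or method signatures.
final class ObjectModelSketch {

    // Entry point: every other object is reached through a session.
    static final class Session {
        private final Map<String, Database> databases = new LinkedHashMap<>();

        Database getDatabase(String name) {
            return databases.computeIfAbsent(name, k -> new Database());
        }
    }

    // Database-level management hangs off this object; it hands out collections.
    static final class Database {
        private final Map<String, XmlCollection> collections = new LinkedHashMap<>();

        XmlCollection getCollection(String name) {
            return collections.computeIfAbsent(name, k -> new XmlCollection());
        }
    }

    // A collection holds named documents and supports add, update, and delete.
    static final class XmlCollection {
        private final Map<String, String> documents = new LinkedHashMap<>();

        void addDocument(String name, String xml) { documents.put(name, xml); }

        // Replaces the content only if the document already exists.
        boolean updateDocument(String name, String xml) {
            return documents.replace(name, xml) != null;
        }

        boolean deleteDocument(String name) { return documents.remove(name) != null; }

        String getDocument(String name) { return documents.get(name); }
    }
}
```

A caller would start from a session and drill down, for example session.getDatabase("demo").getCollection("bib").addDocument("bib.xml", xml), where the database and collection names are placeholders.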
Publishing a Web Service
Now that we've covered the basics, it's time to get our hands dirty.
The SOAP implementation is based on the open source Apache SOAP
(version 2.1). Apache SOAP runs as a Java servlet, so you first need a
servlet engine. We'll use Apache's Tomcat.
Writing the Server Code
I'll start with some code. The code in Listing 1 contains the facade
classes with methods that are exposed to the SOAP clients (code for
this article can be found at www.sys-con.com/webservices/sourcec.cfm).
This code allows a new XML document to be added and the document to
be searched with a specified XPath query. One thing to point out here
is that the SOAPServer class has no knowledge about SOAP. This means
you can take your existing Java classes and expose them through SOAP.
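The point about the facade having no knowledge of SOAP can be seen in a stand-in like the one below. This is not the article's Listing 1: it swaps the XML database for an in-memory map and uses the JDK's XPath support, and the class name and return values are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Stand-in for a facade class; note there is not a single SOAP import.
final class QueryFacadeSketch {

    // In-memory substitute for the XML database: "collection/docName" -> XML text.
    private final Map<String, String> store = new HashMap<>();

    // Stores the document and returns a simple ID as a String.
    public String addDocument(String collection, String docName, String xml) {
        store.put(collection + "/" + docName, xml);
        return String.valueOf(store.size());
    }

    // Evaluates the XPath query against one stored document and returns the
    // string value of the first matching node ("" if nothing matches).
    public String executeQuery(String collection, String docName, String query) {
        String xml = store.get(collection + "/" + docName);
        if (xml == null) {
            throw new IllegalArgumentException("No such document: " + docName);
        }
        try {
            return XPathFactory.newInstance().newXPath()
                    .evaluate(query, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad query: " + query, e);
        }
    }
}
```

Because the class is plain Java with String parameters and return values, it can be unit tested locally and then exposed through a SOAP server without modification.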
Deploying the Service
To deploy the SOAP service with Apache SOAP, you need to define a
deployment descriptor for the Java class. The descriptor tells the
SOAP server several key things:
- The URN of the SOAP service for clients to access
- The method or methods available to clients
- The serialization and deserialization handlers for any custom classes
The URN is similar to a URL and is required for a client to connect
to any SOAP server. The second item lists the methods that a SOAP
client is allowed to invoke, which also tells the SOAP server which
requests to accept. The third item tells the SOAP server how to
handle any custom parameter types.
To simplify our implementation, we use String for all parameters.
Listing 2 shows the deployment descriptor for the SOAP server. The
URN for the service is supplied in the id attribute. This needs to be
unique across services and descriptive of the service. The method
names are listed in the methods attribute of the provider element.
The java element specifies the class to expose, including its package
name (through the class attribute), and indicates that the methods
being exposed are not static (through the static attribute).
Next, a fault listener implementation is specified. Apache's SOAP
implementation provides two; the first one, DOMFaultListener, is
used here. This listener returns any exception and fault information
through an additional DOM element in the response to the client. The
other fault listener implementation is
org.apache.soap.server.ExceptionFaultListener.
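For readers without Listing 2 at hand, a descriptor along these lines would match the description above. It is a sketch in the Apache SOAP descriptor format, not the article's actual listing: the method list is taken from the method names mentioned in this article, the class is assumed to live in the default package, and the scope value is an assumption.

```xml
<isd:service xmlns:isd="http://xml.apache.org/xml-soap/deployment"
             id="urn:SOAPServer">
  <!-- methods: space-separated names the server will accept (assumed list) -->
  <isd:provider type="java"
                scope="Application"
                methods="addDocument executeQuery appendChild">
    <!-- class: the facade to expose; static="false" means instance methods -->
    <isd:java class="SOAPServer" static="false"/>
  </isd:provider>
  <isd:faultListener>org.apache.soap.server.DOMFaultListener</isd:faultListener>
</isd:service>
```

No type mappings appear here because all parameters are Strings, so no custom serialization handlers are needed.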
At this point, we have both the server code and deployment descriptor
ready for deployment. Now we can deploy our service. Apache SOAP
comes with a utility to deploy the service. Assume that the XML
database is already set up. The next thing we need to do is make the
classes for our service available to the SOAP server. The best way to
do this is to jar up the service class from the previous section:
C:\>jar cvf SOAPServer.jar SOAPServer.class
Then drop the jar into the lib/ directory of your Tomcat installation
and restart Tomcat. To deploy the service, use Apache SOAP's
org.apache.soap.server.ServiceManagerClient utility class:
C:\>java org.apache.soap.server.ServiceManagerClient
http://localhost:8080/soap/servlet/rpcrouter deploy SOAPServerDD.xml
Three arguments are provided: the first is the SOAP server endpoint.
Here it's my local machine; it could be anywhere else on the
Internet. The second is the action to take, and the third is the
deployment descriptor file. Once this has executed, list the deployed
services to verify that the service was added:
C:\> java org.apache.soap.server.ServiceManagerClient
http://localhost:8080/soap/servlet/rpcrouter list
Deployed Services:
urn:SOAPServer
Now that the service is published, you need to provide a WSDL file so
that clients can learn the service specification (see
Listing 3).
Querying a Remote XML Database with SOAP and XPath
The next task is to write a client that adds a document and queries
the XML database using SOAP. If you are using a toolkit such as the
Idoox SOAP wizard integrated with Borland JBuilder, you can generate
the SOAP client proxy from the WSDL file of the SOAP server and use it
directly in your client application. In Listing 4 I demonstrate how
to write these proxies using the Apache SOAP implementation.
Listing 4 creates proxies for the server class methods listed in
Listing 1: addDocument, executeQuery, and appendChild. Within each
proxy method, after setting all the input parameters, we call a
method invoke, which is a wrapper for the Apache SOAP client APIs.
The main program shows how to add an XML document and then do a query
against it; it takes five parameters from the command line. The first
parameter is the SOAP server endpoint, the second is the name of the
collection to which the document is added, the third is the assigned
name of the XML document in the database, the fourth is the XML file
name, and the last is the XPath query string. For example, to add the
document shown in Listing 5 and then query it, type:
C:\>java SOAPClientProxy http://localhost:8080/soap/servlet/rpcrouter
bib bib.xml c:\xmldata\bib.xml
/bib/vendor/book/author[lastname='Kaufman']
The result will look like:
Document bib.xml added, docID = 8
Result of query /bib/vendor/book/author[lastname='Kaufman']:
<?xml version="1.0" encoding="UTF-8"?>
<XMLQueryResult>
<author>
<firstname>Lar</firstname>
<lastname>Kaufman</lastname>
</author>
</XMLQueryResult>
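Behind a proxy call like the one above, the client sends a SOAP envelope to the rpcrouter endpoint. The sketch below builds such a SOAP 1.1 RPC request by hand with plain string handling. It deliberately does not use the Apache SOAP client API that Listing 4 wraps, the parameter element names (arg0, arg1, ...) are placeholders, and the XML Schema namespace revision is an assumption that a given server may not share.

```java
final class SoapEnvelopeSketch {

    // Escapes characters that are special in XML text and attribute values.
    static String escape(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;")
                   .replace("\"", "&quot;");
    }

    // Builds a SOAP 1.1 RPC request; arg0, arg1, ... are placeholder parameter names.
    static String buildCall(String urn, String method, String... params) {
        StringBuilder xml = new StringBuilder()
            .append("<SOAP-ENV:Envelope")
            .append(" xmlns:SOAP-ENV=\"http://schemas.xmlsoap.org/soap/envelope/\"")
            .append(" xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\"")
            .append(" xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\">")
            .append("<SOAP-ENV:Body>")
            // The service URN becomes the namespace of the method element.
            .append("<ns1:").append(method)
            .append(" xmlns:ns1=\"").append(escape(urn)).append("\"")
            .append(" SOAP-ENV:encodingStyle=\"http://schemas.xmlsoap.org/soap/encoding/\">");
        for (int i = 0; i < params.length; i++) {
            xml.append("<arg").append(i).append(" xsi:type=\"xsd:string\">")
               .append(escape(params[i]))
               .append("</arg").append(i).append(">");
        }
        return xml.append("</ns1:").append(method).append(">")
                  .append("</SOAP-ENV:Body>")
                  .append("</SOAP-ENV:Envelope>")
                  .toString();
    }

    public static void main(String[] args) {
        System.out.println(buildCall("urn:SOAPServer", "executeQuery",
                "bib", "/bib/vendor/book/author[lastname='Kaufman']"));
    }
}
```

The resulting text would be sent as the body of an HTTP POST with a text/xml content type and a SOAPAction header. Note that the query string must be escaped, since XPath expressions can contain characters such as < that are not legal in XML text.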
Conclusion
You can do much more with SOAP and XML query. The examples shown here
are relatively simple ones. What we discussed are the juicy bits of
SOAP and XPath that let you create a remote query method for XML.
Because the query method is based on SOAP and a standard XML query
language like XPath, it can be reused with almost any XML database.
It can also be used for a range of applications -
everything from a catalog query application to a wireless
customization engine pulling from multiple sources. The W3C is nearing
completion of XQuery, enriching XML query techniques just in time for
the explosion of XML content being driven by Web services.