This story was printed from ZDNet Australia.
XML: Great hope or great hype?
By Stephen Withers, November 16, 2001
URL: http://www.zdnet.com.au/news/business/soa/XML-Great-hope-or-great-hype-/0,139023166,120261855,00.htm
XML has been called everything from the technology that will revolutionise the Web to a nuisance. What's really the story?
XML--Extensible Markup Language--provides a way of marking up a document that reflects the structure of the information it contains. While HTML is concerned primarily with the way the document is presented (eg, that a certain array of information is to be presented in a table with a certain number of rows and columns), it doesn't say anything about the nature of the item in each cell. This makes it difficult to write a robust program to do anything with an HTML document apart from displaying it. If you know that a table contains a list of products and column one holds the description and column two the price, that's fine. But what happens if the organisation producing the document adds a column to flag new items, and description and price end up in columns two and three?
An XML document includes definitions (or references to external definitions) for the data it contains, so there's no requirement for the receiver to know or guess anything about the data. And, as assistant professor John Hurst of Monash University's School of Computer Science and Software Engineering points out, it is relatively easy to restructure data expressed in XML. "It gives the user far more flexibility and choice...in that context, XML beats everything else hands down," he says.

Communication difficulties
Since XML preserves the structure of information, it provides a way of transferring data between different programs. There are advantages to going through one common format rather than having each pair of programs communicate directly with each other: each time you add a program to the pool the only work required is to make it read and write the standard format. The alternative is that the newcomer must be able to read and write the format used by each of the existing programs, and so the effort involved increases with the number of programs.
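The scaling argument above can be put into numbers with a short sketch. The counting model is an assumption made for illustration, not something the article specifies: one "converter" is the code one program needs to read another program's format, or to read or write the common format.

```python
def converters_needed(n_programs: int) -> tuple[int, int]:
    """Return (direct, via_common_format) converter counts.

    Assumed model: with direct exchange, every program must handle the
    format of every other program, giving n * (n - 1) converters in total.
    With a common format, each program needs one reader and one writer,
    giving 2 * n.
    """
    direct = n_programs * (n_programs - 1)
    via_common_format = 2 * n_programs
    return direct, via_common_format


for n in (2, 5, 10):
    print(n, converters_needed(n))
```

Under this counting the common format costs more for a pair of programs, breaks even at three, and pulls ahead from four programs onwards.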
Using XML as a standard connector in this way is really just a special case--though an important one--of the more general idea described above.
Another consideration is that a document marked up with XML can be automatically and fairly easily translated to other markup languages such as HTML for the Web or WML for wireless. A single copy of a document (or the output from a single program) can thus be reused in different circumstances without modifying the original.
This isn't just a question of technical convenience. As a Fuji Xerox Australia white paper "XML for the Marketer" points out: "The separation of the information type from the style gives XML flexibility in its application. Once the data is approved through the internal compliance/legal/marketing departments the same data can be used in multiple formats with reduced risk to data integrity."
So it's not merely that XML can reduce the effort required to repurpose information for different delivery channels; it also helps ensure that identical content is delivered regardless of its format in each channel.
"The advantage is that XML provides separation of form and content...[or] structure from presentation," says Monash's Hurst.

Writing XML
XML is a text format. This makes it relatively easy to write code to generate or interpret such documents, and in a worst-case scenario it's possible for a skilled person to interpret the data.
Two APIs for interpreting XML code--DOM (Document Object Model) and SAX (Simple API for XML)--exist to save developers from reinventing the wheel. While DOM is the W3C-sanctioned API, SAX is widely used. The word "simple" should not be taken as an indication that SAX is easier to use: on the contrary, because the API is simpler, it leaves more work for the programmer.
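As a rough illustration of that trade-off, the sketch below totals the prices in the same small document twice using Python's standard library: once through a DOM tree (xml.dom.minidom) and once through a SAX handler (xml.sax). The invoice document and its element names (invoice, item, price) are invented for the example and do not come from any real schema.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical invoice; the element names are assumptions made for this sketch.
INVOICE = b"""<invoice>
  <item><description>Widget</description><price>2.50</price></item>
  <item><description>Gadget</description><price>4.00</price></item>
</invoice>"""


def total_with_dom(data: bytes) -> float:
    """DOM: the parser builds the whole tree in memory, then we query it."""
    doc = minidom.parseString(data)
    prices = doc.getElementsByTagName("price")
    return sum(float(node.firstChild.data) for node in prices)


class PriceHandler(xml.sax.ContentHandler):
    """SAX: the parser only reports events; keeping track of state is our job."""

    def __init__(self):
        super().__init__()
        self.in_price = False
        self.buffer = []
        self.total = 0.0

    def startElement(self, name, attrs):
        if name == "price":
            self.in_price = True
            self.buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so collect it until the end tag.
        if self.in_price:
            self.buffer.append(content)

    def endElement(self, name):
        if name == "price":
            self.total += float("".join(self.buffer))
            self.in_price = False


def total_with_sax(data: bytes) -> float:
    handler = PriceHandler()
    xml.sax.parseString(data, handler)
    return handler.total
```

Both functions return the same total. The SAX version needs noticeably more code, but it never holds the whole document in memory.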
SAX's popularity is partly due to the fact that it is more efficient than DOM when the application is concerned with extracting data from an XML document (such as an accounting system receiving an invoice delivered electronically in XML format) rather than manipulating the document (as is the case for an XML editor). Parsers from Apache (which inherited IBM's parser), Microsoft and Oracle support DOM and SAX. A new version of DOM (DOM Level 3) is expected either late this year or in early 2002, and will provide new features such as the ability to test whether two objects contain the same content.

A host of standards
There are a number of related standards and works in progress that supplement the XML language itself. Extensible Stylesheet Language (XSL) and the associated XSL Transformations (XSLT, www.w3.org/TR/xslt) define stylesheets for use with XML. XML Linking Language (XLink) [...]
XML Query addresses another aspect of extracting a subset of the data contained in an XML document by providing a query language similar to those used with databases. If you imagine an XML document containing a list of personal information, an appropriate XML Query would let you extract a list of names and phone numbers of those people living in Western Australia.
XML also serves as the basis for other languages, such as Mathematical Markup Language (MathML, www.w3.org/TR/REC-MathML/), Extensible Hypertext Markup Language (XHTML, www.w3.org/TR/xhtml1/) and Synchronised Multimedia Integration Language (SMIL, www.w3.org/TR/smil20/).

More on standards
W3C has also recently issued the Scalable Vector Graphics standard (SVG, www.w3.org/TR/SVG/) as a recommendation. This XML-based format allows 2D graphics to be resized and displayed well on a variety of devices. Interestingly, the first piece of software that allowed an SVG image to be displayed within a Web browser was developed by CSIRO, and CSIRO W3C fellow Dean Jackson is one of the standard's authors.
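Returning to the XML Query example from earlier in this section (names and phone numbers of people living in Western Australia), the idea can be sketched in Python. This is not XML Query itself: it uses the limited XPath subset built into Python's xml.etree.ElementTree as a stand-in, and the document layout (people, person, name, phone, state) is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical personal-information document; all names and numbers are invented.
PEOPLE = """<people>
  <person><name>Alice Example</name><phone>08 5550 0101</phone><state>WA</state></person>
  <person><name>Bob Example</name><phone>03 5550 0102</phone><state>VIC</state></person>
  <person><name>Carol Example</name><phone>08 5550 0103</phone><state>WA</state></person>
</people>"""


def contacts_in_state(xml_text: str, state: str) -> list[tuple[str, str]]:
    """Return (name, phone) pairs for every person in the given state."""
    root = ET.fromstring(xml_text)
    # The predicate [state='WA'] keeps only <person> elements whose
    # <state> child has exactly that text.
    matches = root.findall(f"./person[state='{state}']")
    return [(p.findtext("name"), p.findtext("phone")) for p in matches]


print(contacts_in_state(PEOPLE, "WA"))
```

The query names the data it wants rather than its position in the file, which is the property the article attributes to XML throughout.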
In August, the XML interoperability consortium OASIS (Organization for the Advancement of Structured Information Standards, www.oasis-open.org) began work on Human Markup Language. HumanML will use XML to convey human characteristics such as cultural, social, kinesic, psychological, and intentional features within data. The organisation expects applications in the areas of artificial intelligence, virtual reality, conflict resolution, psychotherapy, art, workflow, advertising, cultural dialogue, agent systems, diplomacy, and business negotiation. "Using HumanML, we can substantially reduce interpersonal and intersocietal conflicts associated with the inadequate conveyance of human traits and expression," says Ranjeeth Kumar Thunga, chair of the OASIS HumanML technical committee. The committee will also work on issues such as messaging, style, alternative schemas, constraint mechanisms, object models, and repository systems--all in the context of representing and amalgamating human information within data. "HumanML extends the use of XML into totally new arenas and offers the potential to affect the way we communicate with one another," claims Karl Best, director of technical operations for OASIS.
Security and authentication
Work is underway to standardise a mechanism that will allow XML documents to be digitally signed--if you intend to receive purchase orders presented as XML, you'll need a way of checking that they were duly authorised by your client. But where an e-mail message is typically signed in its entirety, a user or process will want to sign only that portion of an XML document for which it is responsible. For example, a workflow system may require different people to certify the accuracy of data that they entered or that they actually made the decision at their stage of the process. In a money lending situation, a call centre agent might certify the transcription of the data collected from the applicant while the loans officer would only certify his or her decision to grant or reject the loan.
And where digital signatures are found, encryption is rarely far away: work is in progress to define how data within an XML document can be encrypted. This is clearly important for a variety of applications.

Why use XML?
According to Bill French, chief architect at StarBase, there are four types of application where XML is an appropriate technology:
1. As a mediator between databases
This is relatively easy to achieve, French says. Such a task is primarily a question of mapping one set of tags onto another. Companies such as Tibco offer tools that facilitate this process.
2. To shift the processing load from the Web server to the client
Since XML delivers not just the data but also an indication of its meaning, it simplifies the task of creating robust client-side software--minor changes in format or structure are less likely to be disruptive.
3. Delivering different views of the same data
French says XML is a key technology in having a Web client deliver different views of the same data to different users. The server can deliver a single XML document (thus reducing the processing load), but the client software customises its presentation according to the needs of a particular user. For example, a cinema's Web site might present a table of screening times for a particular movie. If you wanted a text to speech gateway to relay that information over the phone, the task would be much easier if the data was marked up in XML rather than HTML. Another example is that one XML document can be rendered in different ways on different devices (eg, a computer or a PDA) according to XSL style sheets resident on each device.
But Rob Janson, CTO at e-commerce software developer Hubbub Group and chairman of the Melbourne XML Users Group, warns that the combination of XML and XSLT raises scalability issues. In situations where you can't guarantee that the client can perform the transformation, the server must do it instead and as the load increases, the processing required may be greater than needed for other technologies such as ASP or JSP. On the other hand, XSLT is good when you need to make XML documents you've received human readable.
4. To facilitate information discovery
Finally, XML is appropriate for intelligent Web agents that attempt to tailor information discovery to the needs of individual users or arbitrate commercial transactions, as it simplifies the task of identifying information. This characteristic of XML also has implications for people searching the Web for information. Is "mercury" a reference to an element, a planet, an early US space program, a deceased rock star, or a character from a 1960s children's TV show? It's possible that widely used XML tags could help narrow the scope of a Web search. For example, requiring that the data must be located inside a (hypothetical) <lastname> tag would most likely limit the results to references to people's names.
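The "mercury" example can be sketched as follows. This is an illustration under stated assumptions rather than a description of how any search engine works: the articles document, its element names and its content are all invented, with lastname being the hypothetical tag mentioned above.

```python
import xml.etree.ElementTree as ET

# Hypothetical collection of articles; structure and content are invented.
DOC = """<articles>
  <article><title>Mercury levels in fish</title><author><lastname>Nguyen</lastname></author></article>
  <article><title>Queen retrospective</title><author><lastname>Smith</lastname></author><subject><lastname>Mercury</lastname></subject></article>
  <article><title>Mission to Mercury</title><author><lastname>Jones</lastname></author></article>
</articles>"""


def titles_mentioning(xml_text, word):
    """Plain text search: any article whose text contains the word."""
    root = ET.fromstring(xml_text)
    word = word.lower()
    return [a.findtext("title") for a in root.iter("article")
            if word in "".join(a.itertext()).lower()]


def titles_with_lastname(xml_text, word):
    """Tag-scoped search: the word must be the content of a <lastname>."""
    root = ET.fromstring(xml_text)
    word = word.lower()
    return [a.findtext("title") for a in root.iter("article")
            if any((e.text or "").strip().lower() == word
                   for e in a.iter("lastname"))]
```

With this sample data the plain search matches all three articles, while the tag-scoped search matches only the one in which "Mercury" is a person's surname.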
Bandwidth issues
Another issue is that XML files are relatively large--five to 10 times the size of an equivalent flat file, in Janson's experience. This is due to the tags surrounding each data element. Consequently, it is not suited to situations where a large amount of information is transferred in a batch because an error in one item may result in the whole lot being rejected. (A characteristic of XML is that a document must be "well formed"--ie, it complies with the basic XML syntax rules--or it isn't XML and is rejected by the parser.) On the other hand, Janson believes XML is ideal for sending data about individual entities (such as a transaction) in near real-time via the Internet.
Peter Boyle, consulting services manager at IT consultancy Kanbay, took up this theme. When sending data via the Internet, there are good reasons for using compression (to reduce the bandwidth requirement) and encryption (for privacy or security). While it's possible to wrap compressed and encrypted data in XML, there's no advantage--people are sometimes too quick to reject the traditional flat file based exchange of data, he says.
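The well-formedness rule described in this section, and Janson's preference for sending individual transactions, can be illustrated with a small sketch using Python's xml.etree.ElementTree. The txn and batch element names are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical transactions; the second one is deliberately not well formed.
GOOD = "<txn><id>1</id><amount>10.00</amount></txn>"
BAD = "<txn><id>2</id><amount>12.50</txn>"  # <amount> is never closed


def parse_batch(items):
    """Wrap every item in one document: one bad item rejects the lot."""
    batch = "<batch>" + "".join(items) + "</batch>"
    try:
        return len(ET.fromstring(batch))  # number of <txn> children accepted
    except ET.ParseError:
        return 0


def parse_individually(items):
    """One document per item: a bad item only rejects itself."""
    accepted = 0
    for item in items:
        try:
            ET.fromstring(item)
            accepted += 1
        except ET.ParseError:
            pass
    return accepted
```

For the list [GOOD, BAD, GOOD], the batch approach accepts nothing because the parser rejects the whole document, while the per-item approach accepts the two well-formed transactions.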
Microsoft gets in
Microsoft has adopted XML in a big way. According to Peter Moore, director for .NET strategies and development, "many different development teams within Microsoft started to realise the value of XML while they were starting to develop the next generation of Microsoft products some years ago. It was all of a sudden realised that there was a whole lot of common work going on around using XML as a standard for integration and open connectivity [and] that raised the attention of people like Bill Gates." This realisation led to Microsoft's .NET platform for XML Web services. Like other Web services initiatives, .NET is about computers exchanging information via the Web rather than simply receiving and presenting information to a human user.
One of the more pressing reasons for using XML is that one of your partners or prospective partners is already using it and wants or requires you to fall in line. If you are already using EDI, you'll probably continue using it with established partners, but generally speaking its use is confined to larger businesses. "While the value of EDI was recognised as a set of standards that allowed applications to talk to one another, the limited availability and high cost of EDI and the private network charges, meant that it really didn't take off. The ubiquity of the Internet came along and people started to think 'Wow, we could use the Internet,'" says Microsoft's Moore.

XML and e-commerce
ebXML is a set of specifications for doing business over the Internet, jointly sponsored by UN/CEFACT (United Nations Centre for Trade Facilitation and Electronic Business, www.unece.org/cefact/) and OASIS. As its name suggests ebXML uses XML as a way of exchanging structured data. One of the aims is to involve small and medium-sized enterprises in e-commerce, and consequently developers of shrink-wrapped software are being encouraged to support ebXML.
Another aim is to create a single vocabulary that can be used across industries and functions, to avoid the need for individual businesses to support multiple vocabularies. Industry consortia that have already defined XML vocabularies are being encouraged to meet ebXML standards. Ultimately, ebXML may supplant EDI: if current EDI users adopt ebXML in order to widen the range of partners with whom they can perform electronic transactions, it's not obvious why they would want to support two separate interfaces that serve essentially the same purpose.
Are you with us or against us?
In some cases, XML can either compete with or complement other technologies. For example, Greg Wright, senior systems engineer at Borland, suggests that a system running wholly within an intranet could probably be created more quickly using Enterprise JavaBeans and CORBA, but if the same program was being deployed across the Internet, XML could be a better choice as the developer has less control over the entire system.
Kanbay's Boyle took up this theme. "XML is a huge advantage when you have to publish some information and you don't know where it's going to go...all major databases can directly read XML...[but] if you do know the recipient [system] any other method may be suitable."
On the other hand, Damien Bootsma, a Borland system engineer, points out that the company's AppServer application server uses XML to describe the deployment rules for Enterprise JavaBeans. Indeed, a Sun white paper (java.sun.com/xml/b2b.html) states: "XML is fundamental to Sun's plans for the transmission of mission-critical enterprise data and is being used to make Sun's Enterprise JavaBeans technology even more portable."

Not applicable
Although XML-based database management systems--not to be confused with DBMSes that can import or export data in XML format--are starting to appear, not everyone's convinced by the idea. "The whole XML database phenomenon is a furphy," says Hubbub's Janson, who believes SQL Server and Oracle are too well entrenched to be displaced. The latest released versions of systems are already well down the track of growing the required XML integration features, he says. Monash's Hurst agrees: while he uses an XML database for his Christmas card list, he suggests it is less suited for larger applications although the ease with which the data delivered can be served to the Web can give it an advantage.
However, XML as a format "gives the user far more flexibility and choice...in that context, XML beats anything else hands down," he says.

Teething problems
StarBase's French warns that the power of XML is simultaneously good news and bad news. Good because the absence of prescribed use-cases means it can support any interpretation; bad because that means it's open to any interpretation. Good because it's easily integrated into almost any information architecture; bad because it supports anyone's architectural vision, even bad ones.
The appropriate application of a fashionable technology is often an issue. "Just because a system uses XML doesn't mean that it leverages the power of XML," warns Peter Moore, CEO of Cortex eBusiness, arguing that many developers cause more problems than they solve by using XML without understanding why. Just because you can use a particular tool for a job, that doesn't mean it's the best choice. "Everybody wants to use XML," says Kanbay's Boyle, "but it adds overhead to the system so it might be more efficient to use a file-based interface." He also warns that it can make an interface slower and harder to change, though as Dimension Data found, that's not always the case.
Diverging standards
On the e-commerce side, most organisations are falling in behind ebXML, but "few IT professionals in Australia seems to realise Tradegate (www.tradegate.com.au) is the [Australian] government-nominated standards authority," warns Hubbub's Janson. Tradegate ECA is a not-for-profit, non-government, user organisation with the primary role of facilitating the use of electronic commerce techniques for the exchange of information between customers and their suppliers.
Janson would like to see more government attention given to this area, including a marketing effort, as he fears Australian industry might otherwise repeat the mistake it made with EDI and X.12--initially adopting the X.12 standard and subsequently incurring extra costs with the switch to UN/EDIFACT. "Today Australian based companies utilise a mix of X.12 and EDIFACT and custom EDIFACT/X.12 like files," he observes.
Janson is also concerned by vendors pushing their own variations when ebXML is clearly the future: "I don't think the answer is necessarily BizTalk, CommerceOne, Ariba, or WebMethods. What we need is the adoption of standards for implementing E-Business processes." He suggests organisations should start work on ebXML now. Tradegate's plan for a schema repository will play an important part, because "Australian companies seem fixated on their particular business rules and frequently implement them without comparing the cost with the associated return on investment."
Cortex eBusiness' Moore concurs. "Industry standard XML document type definitions...can make a big impact when widely adopted."

Variations played down
Vendors play down questions about variations on standards. Borland's Wright says that because XML is self-defining and dynamic, there's not much scope to achieve user lock-in, and certainly less than there is with Java.
He also pointed out that even though various groups are establishing common sets of tags to facilitate data interchange, there's nothing to stop someone adding their own private tags--if an application doesn't recognise a tag, it simply ignores it. Unlike flat files, the order and length of data items are not critical.
More fundamentally, the standards themselves are changing. "W3C is running as fast as it can to stay in the same place," says Monash's Hurst. Companies want to get products to market as quickly as possible, he said, pointing to the way XSL style sheets showed up in Microsoft's Internet Explorer before the standard was finished. The rapid evolution of standards--for example, XSLT 1.0 was quickly followed by version 1.1 when problems were found with the original specification--and the growth in the number of XML related tools makes purchasing decisions difficult. "What was the right decision one week may be wrong the next week," Hurst says.

Industry-specific standards
A variety of industries and disciplines are developing standardised ways of using XML, sometimes based on ebXML. One of the largest groups is the HR-XML Consortium, which develops and promotes a suite of XML specifications for human resources-related e-commerce and data exchange. Members include recruitment and personnel firms such as Adecco and Kelly Services, and ERP vendors including JD Edwards, PeopleSoft, and SAP.
Other examples include cXML (commerce XML, for procurement applications), XBRL (Extensible Business Reporting Language, for the preparation and exchange of business reports and data such as annual and financial statements--the Institute of Chartered Accountants in Australia is a participant in this effort), and NITF (News Industry Text Format, developed by the International Press Telecommunications Council to define the structure and content of news articles--Australian Associated Press is an IPTC member).
However, standardisation isn't just about the format of the messages that will be exchanged: the exchange mechanism must also be specified. SOAP (Simple Object Access Protocol) was originally a Microsoft specification but with the backing of other big players SOAP 1.2 has reached the W3C working draft stage. Now that it is no longer proprietary, SOAP is used by ebXML.

What's missing?
According to Hubbub's Janson, the task of integrating existing systems with trading partners using XML is hampered by the absence of three elements. The first is a low-cost, easy to use message broker technology; the second is the lack of awareness of ebXML, and the third is a local schema repository (as planned by Tradegate).
Kanbay's Boyle points to inadequate XML support in current browsers. Currently, browser-based XML applications must be kept simple, but he predicts that in a couple of browser generations we'll see cross-platform applications delivered in XML, "but it's a way to go before it's a huge hit."
Rapid change is a feature of IT, but Monash's Hurst thinks some "considered reflection" would help the XML arena. The problem is that "you get saddled with the wrong decisions for so long," he says, pointing to the way that the DTD specification has become "a millstone". Hurst's prediction should strike a chord with IT veterans: "A lot of different approaches [are] emerging. Eventually it will bed down to one...but not necessarily the best."
Gleb Gorov, managing director of IT consulting firm Glot, took up this point, suggesting there's no point in a developer explaining to a large corporate client that it will use the best available XML parser to build a proposed system unless it happens to come from a big name such as Microsoft or IBM. He's not suggesting either company is incapable of producing an industry-leading product, only that many companies are very brand-sensitive.
Perhaps there is still work to be done, but there's so much momentum behind XML that it seems destined to become one of the near-universal information technologies.
Copyright © 2009 CBS Interactive, a CBS Company. All Rights Reserved.