XML Development Tools

Lynn Greiner

Docid: 00018393

Publication Date: 1709

Report Type: TUTORIAL


Not all Web development tools can design XML documents, and even among those with XML
capabilities there are significant differences in features and
usability. This report looks at a selection of currently available XML development tools, discusses
the various capabilities, compatibilities, and functionality,
and offers suggestions to help companies select the most
appropriate tools for their applications.

Executive Summary

XML development tools can be divided into three distinct
categories: editors, parsers, and databases.

An XML editor is a handy but not necessarily mandatory tool, since
XML is text-based and can be edited in a pinch with something as basic
as Windows Notepad. However, specialized XML editors offer features
that make the creation and modification of documents easier. 

Parsers translate XML into its visual elements; without a parser, an XML document
looks like a confusing hodgepodge of tags and text. Parsers show content
creators how the ultimate document looks on the screen, so errors
can be fixed on the spot.

The third component behind many XML pages
is a database. There are several native XML products from which to
choose. An XML document is a database only in the purest sense of
the term. That is, it is a collection of data. In many ways, this
makes it no different from any other file – after all, all files
contain data of some sort.


XML is a standard used to define the structure of data within a document.
Unlike HTML, which is a markup language for presentation, XML is a
language for creating markup languages that describe structured data.
The language has no defined rules for presentation except those developed
to use data stored in an XML document. XML documents have two types
of data:

  • Content – Text, images, data, and video that make up the information
    a document contains.

  • Structure or Metadata – Defines the role of each piece of
    content. For example, it describes the difference between a paragraph
    and a numbered list. It also defines the order in which content
    appears, such as titles before paragraphs and column headings before
    data records.

There are several advantages to using XML in Web applications. XML
is an open standard and can be integrated with almost any application
across multiple platforms and operating systems. The content of a
document is kept separate from its structure, which makes it easy
to write scripts to add, delete, or export data. XML also allows developers
to create their own tags, which contain metadata information. This
flexibility means that XML can be used for a wide variety of purposes,
ranging from automated document creation to data interchange. XML
also has an advantage over a relational database management system (RDBMS) in the way it stores data. An XML
document stores data in a hierarchical structure, which allows individual
elements to be nested.
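
The nesting described above can be illustrated with a short sketch. It uses Python's standard-library ElementTree module; the order document and its element names are hypothetical and were chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical order document; the element and attribute names are
# illustrative assumptions, not part of any standard vocabulary.
ORDER_XML = """
<order id="1001">
  <customer>
    <name>Acme Corp</name>
  </customer>
  <items>
    <item sku="A-1"><qty>2</qty></item>
    <item sku="B-7"><qty>5</qty></item>
  </items>
</order>
"""


def item_quantities(xml_text):
    """Return {sku: quantity} by walking the nested <items>/<item> elements."""
    root = ET.fromstring(xml_text)
    return {
        item.get("sku"): int(item.findtext("qty"))
        for item in root.findall("./items/item")
    }


quantities = item_quantities(ORDER_XML)
print(quantities)
```

Because each item is nested inside the order it belongs to, the relationship between the two is carried by the document structure itself rather than by a separate key.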

Style sheets are used to format XML output. Cascading Style Sheets
(CSS) can be used to format XML as well as HTML documents. The Document
Style Semantics and Specification Language (DSSSL) is used to format
SGML documents and can also be applied to XML documents. In addition,
XML has its own style sheet languages: the Extensible Stylesheet
Language (XSL) and XSL Transformations (XSLT).

An alternative way to view documents on the client side is to set
up the Web server so that it supports XML. One way to do this is to
install the Java Development Kit (JDK) and an XML enabler servlet.
The servlet formats the output of XML documents and sends it to appropriate
client applications. If this type of framework is desired, tools that
support Java development are necessary.

XML documents sometimes refer to Document Type Definitions (DTD).
A DTD is a set of declarations associated with an XML document that
defines which elements may appear in the document and how they may be
nested. DTDs are used to control XML applications and check the
structural validity of documents. Although
it is possible to write a DTD from scratch, using an existing DTD
will save time and effort. The Organization for the Advancement of Structured
Information Standards (OASIS) Web site has links to downloadable DTDs
for various industries.

An alternative to DTDs is the XML Schema Definition (XSD). Like the
DTD, it provides a way of defining the structure of XML documents
and validating them. However, XSDs are more powerful than DTDs because
they are themselves written in XML, so they can be processed with
ordinary XML tools, and they can define a broader range of data types.

When selecting tools, it is important to remember that, although
XML may appear very similar to HTML, there are some major differences
between the two markup languages. HTML browsers tend to be forgiving
of bad syntax, such as missing end tags, changes from upper to lower
case, and superfluous tags. Mistakes like these will cause problems
with XML documents, so it is important for an XML development kit
to include tools that make sure documents are well formed.
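
A minimal sketch of such a well-formedness check follows, using Python's standard-library ElementTree parser. The helper name is_well_formed and the sample strings are assumptions made for illustration; the two faulty samples reproduce the missing end tag and the change of case mentioned above.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return (True, None) if xml_text parses, else (False, error message)."""
    try:
        ET.fromstring(xml_text)
        return True, None
    except ET.ParseError as err:
        return False, str(err)


# Illustrative samples: one correct document and two common HTML-style slips.
samples = {
    "correct": "<p>Hello <b>world</b></p>",
    "missing end tag": "<p>Hello <b>world</p>",
    "case mismatch": "<p>Hello</P>",
}

for label, text in samples.items():
    ok, message = is_well_formed(text)
    print(label, "->", "well formed" if ok else "rejected: " + message)
```

An HTML browser would typically render all three samples, whereas an XML parser rejects the second and third outright.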

A major problem with using XML for business applications is the lack
of standard vocabularies or tag sets. There is no consensus on the definition
of key business terms such as invoice identification. For example,
one company might create an XML tag containing a customer’s name and
account number and use this to identify invoices. Another company
might just use an account number.
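
The practical cost of differing vocabularies can be sketched as follows. Both invoice formats below are hypothetical, as is the invoice_account helper; the point is only that the receiving application needs a separate mapping rule for every vocabulary it accepts.

```python
import xml.etree.ElementTree as ET

# Two hypothetical vocabularies for the same business fact.
COMPANY_A = (
    "<invoice><customer><name>Jane Doe</name>"
    "<account>12345</account></customer></invoice>"
)
COMPANY_B = '<inv acct="12345"/>'


def invoice_account(xml_text):
    """Return the account number from either (assumed) invoice vocabulary."""
    root = ET.fromstring(xml_text)
    if root.tag == "invoice":      # Company A: nested customer element
        return root.findtext("./customer/account")
    if root.tag == "inv":          # Company B: a single attribute
        return root.get("acct")
    raise ValueError("unknown invoice vocabulary: " + root.tag)


print(invoice_account(COMPANY_A), invoice_account(COMPANY_B))
```

Every additional trading partner with its own tag set adds another branch to this kind of code, which is the problem shared vocabularies are meant to remove.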

Current View

XML development can be divided into two broad categories: document
creation and data manipulation. Some development tools are specifically
designed for one of these applications. Other tools can be used for
multiple purposes.

XML Parsers

In order to process XML data, every program or server process needs
an XML parser. A parser is designed to extract the actual data out
of the textual representation and create either events or new data
structures from them. Parsers also check whether documents conform
to the XML standard and have a correct structure. This is essential
for the automatic processing of XML documents.

XML parsers come in two distinct models: validating
and non-validating. Non-validating parsers do not check a document
against any Document Type Definition (DTD). Rather, a non-validating
parser only checks that the document is well formed (that it
is properly marked up according to XML syntax rules). Validating parsers,
in addition to checking that the document is well formed, verify
that it conforms to a specific DTD (either internal or external to
the XML file being parsed).
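
The difference can be seen with a non-validating parser. The sketch below uses Python's standard-library ElementTree parser, which checks well-formedness but does not validate against a DTD; the note document and its internal DTD are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with an internal DTD. The DTD requires <to> followed
# by <body>, but the document contains an undeclared <subject> element and
# has no <body>, so it is well formed but not valid.
DOC = """<!DOCTYPE note [
  <!ELEMENT note (to, body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<note><to>Ops</to><subject>Not declared in the DTD</subject></note>"""

# A non-validating parser accepts the document because the markup is
# syntactically correct; the DTD's content model is not enforced.
root = ET.fromstring(DOC)
child_tags = [child.tag for child in root]
print(child_tags)
```

A validating parser given the same input would be expected to report that the document does not conform to its DTD.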

Xerces2 Java Parser, Xerces Perl, and Xerces-C++. Apache and IBM built a
pure Java parser called "XML4J," as well as a C++ version
dubbed "XML4C." XML4J has evolved into Apache’s Xerces2-Java,
which is based on the Xerces Native Interface (XNI), a modular framework
for building parser components and configurations; XML4C is now the
Xerces-C++ XML Parser, which is written in a portable subset of C++.
These parsers are included in Apache’s Xerces and IBM’s alphaWorks
suites, and both support the DOM Level 1, 2, and 3 Core, DOM Level 3
Load and Save, JAXP 1.4, SAX versions 1 and 2, and Namespaces 1.1
specifications. Xerces Perl is the Perl API to the Apache Xerces XML parser.
Apache XML Commons is a set of components and utilities including a
resolver and a set of APIs.

MSXML. Microsoft packages its parser, Microsoft XML Core Services
(MSXML), with SQL Server 2005 and 2008. It is also available as a Web release.
MSXML 6.0 complies with SAX 2.0, DOM Level 1, Extensible Stylesheet Language
Transformations (XSLT), the XML-Data Reduced (XDR) schema definition
language, XML Schema Definition (XSD), Schema Object Model (SOM), XPath
1.0, and Namespaces. MSXML 5.0 supported XML digital signatures for
Microsoft Office, but that feature was removed in version 6.0.
Microsoft includes XML tools in Visual Studio 2015; XML is at the core
of many features in Visual Studio and the .NET Framework.

JAXP. Sun Microsystems developed a set of pure Java APIs that
allow developers to choose any parser that complies with the XML 1.0
or 1.1 standards. The Java API for XML Processing (JAXP) offers core XML
functionality for reading, writing, and manipulating XML documents. It includes
SAX 2.0 and DOM Level 3 interfaces, XSLT 1.0, Namespaces,
XPath 1.0, XInclude 1.0, and XSD support. JAXP is available in
Project GlassFish and in Java SE 5.0 or higher, and is on Java.net.
A streaming API for XML (StAX) is available in JAXP 1.4. Version 1.6
was the final standalone release; JAXP is no longer available as a
standalone package and has been integrated into OpenJDK.


XML Editors

XML is a text-based language, so it is possible to
edit it with a simple text editor such as Notepad. Nevertheless, the majority
of text editors do not offer the features that make XML development
relatively easy. A typical XML editor should have the following four
features:

  • It should be able to display files in different views. At the
    very least, this should include a structural view and a view of
    the source code.

  • It should be possible to transform a document so it can be viewed
    in a Web browser. To do this, the editor must support one
    or more of the stylesheet specifications, for example XSLT, CSS,
    or DSSSL.

  • It should support APIs for other programming languages such as
    C++ and Java.

  • It should easily integrate with a validating parser.

Other desirable features include the ability to import and export
other formats, for example HTML and word processing files, and Unicode
support for multilingual accessibility.
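
The first feature in the list, a structural view alongside the source view, can be approximated in a few lines. This is only a sketch of the idea, not how any particular editor implements it; the outline helper and the sample document are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def outline(element, depth=0):
    """Return indented lines, one per element, showing the document structure."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


# Hypothetical document used for both views.
SOURCE = (
    "<report><title>XML Tools</title>"
    "<section><para>Text</para></section></report>"
)

print(SOURCE)                                      # source view
print("\n".join(outline(ET.fromstring(SOURCE))))   # structural view
```

The structural view hides the markup syntax and shows only how elements nest, which is what makes it useful for navigating large documents.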

XMLSpy. Altova’s XMLSpy Document Editor can be purchased
either with a suite of XML development tools or as a stand-alone
product. It can be used to validate XML, DTD, XSD, XSLT, XPath, and
CSS files in real time, and supports SOAP, WSDL, and XQuery. XMLSpy
provides five document views: an Enhanced Grid View for structured
editing, a Database View to show repeated elements in a tabular
fashion, a Text View with syntax coloring for low-level work, a
graphical XML Schema Design View, and an integrated Browser View. Each
view has drag-and-drop editing. XMLSpy’s IDE can import text files;
ODBC, ADO, DB2, Sybase, MySQL, PostgreSQL 8 and 9, Oracle 9i, 10g,
11g, and 12c, Informix 11.7, SQLite 3.x, Firebird 2.5, Microsoft Access
2003, 2007, 2010, and 2013, and SQL Server 2005 and higher database
files; and Microsoft Word documents. It can
export XML documents as either text or ODBC files. Unicode support is
also provided. XMLSpy also supports the Office Open XML (OOXML) format used to
store Microsoft Office 2007 and higher data, providing access to data
stored in Word, Excel, and PowerPoint documents. It offers
XBRL validation and taxonomy editing. Its other IDE features include support
for repository interfaces such as WebDAV and Source Safe, Java and
COM-based APIs for creating customized solutions, and tools for creating
and debugging SOAP requests. It integrates with Visual
Studio and Eclipse, and can generate Java, C#, or C++ code from an
XML schema. XMLSpy project files can be bound to
centralized version control software. XMLSpy runs on both 32- and 64-bit
Windows and can be used with MSXML and Apache parsers.

Stylus Studio. Stylus Studio is an advanced XML
Integrated Development Environment (XML IDE) that runs on Windows
2000, XP, Vista, 7, 8, or 10, supporting XQuery, XSLT, XML Schema/DTD,
XPath, SQL/XML, XHTML, DOM, SAX, and Web services. It includes Visual
Studio project support, with Java and C# code generation. It
can convert to and from multiple EDI formats, including the
new Edig EDI dialect. It includes a complete suite of
tools, including editors generating multiple editing views, a parser,
and a validator, and supports many database formats,
including Microsoft SQL Server, Microsoft Access,
DB2, MySQL, Sybase, Informix, PostgreSQL, and Oracle. Stylus Studio
supports gigabyte-sized XML file editing, a query plan utility, an
XQuery mapper, and the ability to group and
join data sources. It also offers an HTML WYSIWYG editor
and autolink for XSLT and XQuery mapping tools. The latest version adds
64-bit support, a generic converter from EDI to XML, a new SQL editor
with built-in connectivity to SQL Server, Oracle, MySQL, and DB2, and
EDGAR filing validation.

MyEclipse XML Editor. MyEclipse XML Editor is part
of MyEclipse, an affordable subscription-based IDE. MyEclipse XML Editor
can be used for stand-alone XML development and for development with
the Struts Modeler and the JavaServer Faces (JSF) Outline editor.
It features DTD-based code assist, automatic real-time validation as you type, manual
validation, and support for XSL and DTD design, as well as syntax highlighting, tag and
attribute content assistance, and source, design, and outline views.

<oXygen/>. The <oXygen/> XML editor
and XSLT debugger is fully cross-platform.
It offers management support for XML databases, a WYSIWYG
mode based on W3C standards, multiple validation engines, and
multiple XSLT processors, and can generate HTML from an XML
schema. It is also available as an Eclipse plug-in.

XML Databases

XML databases are programs that automate XML data exchange. Since
XML separates content from structure, it can be easily adapted to
a variety of client/server functions, including thin-client and portal
systems. It can also describe any data format in any type of database.
Its open standards give it an advantage over proprietary electronic
data interchange (EDI) solutions. Figure 1 depicts XML database data
exchange.

Figure 1. XML Database Data Exchange

Source: Author Design

There are several factors that should be taken into consideration
when selecting an XML database. First, an XML database should integrate
well with existing database management systems and other application
servers as well as the applications themselves. It is important to
consider how the database stores data, and its capacity. Ideally, a
database should be able to access XML data from multiple sources using
different data management methods. It also needs to be scalable and
able to provide load balancing for optimum performance.

webMethods Tamino XML Server. Software AG’s webMethods
Tamino XML Server uses XSD to organize XML data. Tamino supports
several Internet standards including XQuery, WebDAV, .NET, Java, EJB,
SOAP, and UDDI. It is compatible with the WebSphere, Apache, and Sun Java
System Web servers, as well as other middleware. Tamino can also access any
ODBC-compliant data source. It comes with APIs that allow it to be accessed directly without
a Web server. Tamino runs on multiple platforms. It is part of
Software AG’s Integration Platforms.

eXist-db. A native XML open-source database with an
extensible query engine and browser-based IDE, eXist offers index-based XQuery processing.
It supports XQuery 3.1/XPath 2.0, XSLT 1.0 and 2.0, REST, WebDAV,
SOAP, XMLRPC, XMLDB, XUpdate and the Atom Publishing Protocol, and
offers full text indexing based on Apache Lucene. It is platform
independent, and is available by subscription with full support.

Sedna. Sedna is a free native XML database that includes a
W3C XQuery implementation, tight integration with XQuery full-text search, and a
node-level update language. It provides hot backup, indices, ACID transactions,
persistent storage, and security. It is distributed under Apache License 2.0
and runs on Windows, Mac OS X, Linux, FreeBSD, and Solaris. The current
version, 3.5, was released in 2011.

Using Web Application Servers

Another option for those who do not wish to purchase a native XML
database for XML data exchange is to use a Web application server
that supports XML transactions.

IBM’s WebSphere Application Server (WAS) is a Java-based server
available on premises or in the cloud that
supports a wide range of Web technologies, including XML (Schema,
DTD, and XSLT) and SOAP. Load balancing and fail-over features make
WAS a high-performance server. WebSphere can be integrated with third-party
software such as SAP, PeopleSoft, and CICS. It also supports CORBA,
ActiveX, and JDBC database connections. WAS can be purchased with
a suite of XML development tools that includes a parser and an editor.
WebSphere runs on most platforms, including Windows, z/OS, Linux,
HP-UX, AIX, the IBM i family, and Solaris, and is
supported in Docker containers.

Oracle XML DB is a feature of the Oracle Database. It
provides full support for all of the key XML standards, including XML,
Namespaces, DOM, XQuery, SQL/XML, and XSLT, as well as
native XML application development. Its support for the SQL/XML
standard allows SQL-centric development techniques to be used to
publish XML directly from relational data stored in Oracle Database
12c Release 2.

Oracle Berkeley DB XML is an open source, embeddable XML
database with XQuery-based access to documents stored in
containers and indexed based on their content. Oracle
Berkeley DB XML is built on top of Oracle Berkeley DB and
inherits its rich features and attributes. Like Oracle
Berkeley DB, it runs in process with the application with no
need for human administration. Oracle Berkeley DB XML adds a
document parser, XML indexer and XQuery engine on top of
Oracle Berkeley DB to enable the fastest, most efficient
retrieval of data.


XML Parsers

When it comes to choosing between an event-driven and an object-driven
parser, the deciding factor should be the type of processing the parser
will perform. An event-based API provides simple, low-level access
to XML. It is useful in situations where there is a lot of linear
XML processing. Object-based APIs are useful for a range of applications,
but the XML needs to be fully parsed before it is processed. This can
strain system resources, especially if documents are large. Xerces2
and MSXML offer developers a choice of either event-driven (SAX) or
object-driven (DOM) processing. Oracle’s JAXP offers even more flexibility
by allowing parsers to be swapped without changing application code.
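
The two processing styles can be compared side by side. The sketch below uses the SAX and DOM implementations in Python's standard library rather than any of the parsers named in this report; the catalog document, the BookCounter handler, and the helper function names are assumptions made for illustration.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical catalog used by both approaches.
CATALOG = "<catalog><book>XML Basics</book><book>Parsing in Depth</book></catalog>"


class BookCounter(xml.sax.ContentHandler):
    """Event-driven: reacts to each start tag as the parser streams past it."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        if name == "book":
            self.count += 1


def count_with_sax(xml_text):
    """Count <book> elements without building a tree in memory."""
    handler = BookCounter()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.count


def titles_with_dom(xml_text):
    """Object-driven: parse the whole document first, then query the tree."""
    document = minidom.parseString(xml_text)
    return [node.firstChild.data for node in document.getElementsByTagName("book")]


print(count_with_sax(CATALOG))
print(titles_with_dom(CATALOG))
```

The SAX handler keeps only a counter in memory, which suits large, linear documents; the DOM version holds the entire tree but allows elements to be revisited in any order.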

Whereas DTDs can only describe the structure of objects in an XML
document (i.e. the order in which they are nested), XML schemas can
define objects themselves. With schemas it is possible to map element
definitions in XML directly to object classes in programming languages.
Parsers that support schemas can check for errors in content, as well
as structure, of XML documents. Xerces2 and MSXML both have this ability.
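
The idea of mapping an element definition to an object class, and of catching errors in content as well as structure, can be sketched by hand. This is not schema validation: the Item class and the item_from_xml helper are hypothetical stand-ins for what a schema-aware parser or data-binding tool would derive from an XSD.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET


@dataclass
class Item:
    """Hypothetical class mirroring an <item> element with typed content."""
    sku: str
    qty: int


def item_from_xml(xml_text):
    """Build an Item from an <item> element; int() rejects non-numeric qty."""
    element = ET.fromstring(xml_text)
    return Item(sku=element.findtext("sku"), qty=int(element.findtext("qty")))


print(item_from_xml("<item><sku>A-1</sku><qty>2</qty></item>"))

# Well-formed and structurally correct, but the content has the wrong type.
try:
    item_from_xml("<item><sku>A-1</sku><qty>two</qty></item>")
except ValueError as err:
    print("content error:", err)
```

A DTD could confirm that qty is present and in the right place, but only a schema that declares qty as an integer lets the parser reject the second document on its content.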

JAXP and Xerces2 conform to W3C specifications better than
MSXML. Xerces2 has an advantage when it comes to integration. As well as having
Java, C++, and Perl support, Xerces2’s COM support makes it compatible
with Microsoft’s MSXML. Xerces2’s main disadvantage is its lack of support
for XSLT, which forces developers to use cascading style sheets (CSS)
to view their documents; however, it can be used with an
XSLT processor such as Xalan (Xalan-Java includes JAXP). JAXP supports
XML style sheets and also has the most reliable document validation.


XML Editors

XMLSpy’s main advantage is its interface, which is intuitive and
provides an excellent range of editing functions, for example auto
completion, syntax highlighting, and unlimited undo/redo. It has
several features that simplify creation of DTDs and XML schemas. For
example, it enables a DTD to be constructed from an XML file. XMLSpy
provides multiple views and windows for editing documents, so an XML
document and its DTD or schema can be edited simultaneously, and it
integrates with a long list of source control products, both open
source and proprietary. Its lack of cross-platform support and the
fact that it relies on other vendors’ products such as MSXML are
major disadvantages. Altova says it will run on macOS or Linux under
Windows emulation software, but will not support those operating
systems directly.

Stylus Studio has advantages and disadvantages similar to those of
XMLSpy. It does, however, include its own
parser, and, like XMLSpy, includes a tool that allows differences between documents to
be quickly found and flagged. This can be invaluable in team-based
development environments. It also integrates with popular source control
packages. Stylus Studio fully supports XML
document validation based on XML Schema or DTDs, both through its
own integrated XML Schema and DTD validator and by providing
complete integration with popular industry XML parsers and XML validators
(MSXML 4.0 SAX, MSXML, DOM, Microsoft .NET XML Parser, Xerces, XSV,
and so on).

XML Databases

The data management method is an important consideration when selecting
a database. The three most common types are flat files, object databases,
and relational databases. The main advantage of flat files is that
most programs can export data in this format. Flat file systems,
however, do
not provide good scalability or performance and they do not support
the hierarchical structure of XML documents. Relational databases
provide good storage and access, but it can be difficult to map XML
objects to relational database fields. Object data management systems
store data in a way that is more compatible with XML because both
use an object-oriented approach to data storage. Tamino
supports this type of data management.
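
The mapping problem for relational storage can be made concrete with a small sketch. It shreds a nested XML order into two related tables using Python's standard-library sqlite3 and ElementTree modules; the order document, table names, and load_order helper are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical nested order document.
ORDER_XML = """<order id="1001">
  <item sku="A-1"><qty>2</qty></item>
  <item sku="B-7"><qty>5</qty></item>
</order>"""


def load_order(connection, xml_text):
    """Split one nested <order> into a parent row and one row per <item>."""
    root = ET.fromstring(xml_text)
    order_id = int(root.get("id"))
    connection.execute("INSERT INTO orders (id) VALUES (?)", (order_id,))
    connection.executemany(
        "INSERT INTO order_items (order_id, sku, qty) VALUES (?, ?, ?)",
        [
            (order_id, item.get("sku"), int(item.findtext("qty")))
            for item in root.findall("item")
        ],
    )


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY)")
connection.execute(
    "CREATE TABLE order_items ("
    "order_id INTEGER REFERENCES orders(id), sku TEXT, qty INTEGER)"
)
load_order(connection, ORDER_XML)

rows = connection.execute(
    "SELECT sku, qty FROM order_items WHERE order_id = ? ORDER BY sku", (1001,)
).fetchall()
print(rows)
```

Each level of nesting in the document becomes another table and another foreign key, and every change to the document structure means changing both the table definitions and the mapping code.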

Native XML databases are able to store XML data without modification,
and they have database engines that work directly with XML. Web application
servers that include external XML parsers and mapping, such as WebSphere,
take longer to process transactions due to the data translation that
is required. It is also much easier to modify information structures
on the fly in a native XML database. Unlike RDBMSs, the hierarchical
structure of XML databases works well with nested data.

The drawback of Tamino is that it requires a separate
XML database product to be maintained in addition to the RDBMS. Also,
native XML databases do not have the robust load balancing and security
features of the traditional RDBMS. Combining the relational and XML
databases provides the mission-critical and security features of the
former with the flexibility of the latter. Oracle’s XML DB allows
data and documents from disparate sources to be accessed and combined
into a standard data model. The dual XML/SQL features allow more sophisticated
queries to be performed than with Tamino.

About the Author

Lynn Greiner is Vice President, Technical Services
for a division of a multi-national corporation, and also an award-winning
computer industry journalist. She is a member of Faulkner’s Advisory Panel.
