Qore XML Module  1.5

Introduction

XML functionality in the Qore xml module is provided by the libxml2 library, which provides a powerful, stable, clean, and thread-safe basis for XML integration in Qore. XML provides a way to describe hierarchical data, and thanks to libxml2, the xml module allows for easy serialization and deserialization between XML strings and Qore data structures.

This module is released under a choice of two licenses:

  • LGPL 2.1
  • MIT (see COPYING.MIT in the source distribution for more information)

The module is tagged as such in the module's header (meaning it can be loaded unconditionally regardless of how the Qore library was initialized).

To use the module in a Qore script, use the %requires directive as follows:

%requires xml

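Classes provided by this module are listed under "Classes Providing XML Functionality" and "Classes Providing XML-RPC Functionality" below.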

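Also included with the binary xml module are user modules, including SoapClient, SoapHandler, WSDL, and SalesforceSoapClient (see Release Notes).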

Note
XML functionality was included in the main Qore shared library until version 0.8.1, at which time the code was removed to make this module.

Module Option Constants

The following constants give information about the availability of XML functionality (dependent on libxml2 features at compile time):

XML Option Constants

Name | Type | Description
HAVE_PARSEXMLWITHRELAXNG | bool | Indicates if parse_xml_with_relaxng() and Qore::Xml::XmlReader::relaxNGValidate() are available
HAVE_PARSEXMLWITHSCHEMA | bool | Indicates if parse_xml_with_schema() and Qore::Xml::XmlReader::schemaValidate() are available

If either of the above constants is False, then calling any of the dependent functions or methods will result in a run-time exception.
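
A script can test these constants before calling a dependent function. The following is a minimal sketch of that check; the schema and document strings (ExampleXsd, ExampleXml) are hypothetical and exist only for illustration.

```qore
%new-style
%requires xml

# hypothetical XSD and XML document used only for this illustration
const ExampleXsd = '<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="record" type="xs:string"/>
</xs:schema>';
const ExampleXml = '<record>NO-RECORD</record>';

if (HAVE_PARSEXMLWITHSCHEMA) {
    # XSD validation is available: validate while parsing
    *hash h = parse_xml_with_schema(ExampleXml, ExampleXsd);
    printf("validated: %y\n", h);
} else {
    # fall back to plain parsing instead of triggering a run-time exception
    *hash h = parse_xml(ExampleXml);
    printf("not validated: %y\n", h);
}
```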

Automatic XML Serialization and Deserialization

This section describes automatic XML serialization and deserialization; for iterative or user-controlled XML parsing, see XML Classes.

XML serialization (conversion from Qore data structures to XML strings) in the xml module relies on the fact that Qore hashes retain insertion order, which means that conversion to and from Qore data structures and XML strings can be done without data loss and without reordering the XML elements. In general, XML serialization is relatively straightforward, but there are a few issues to be aware of, particularly regarding element attributes and lists. These issues are described in the following paragraphs.

First, a straightforward example:

hash h = (
    "record": (
        "name": (
            "first": "Fred", "last": "Smith",
        ),
    ),
);
printf("%s\n", make_xml(h, XGF_ADD_FORMATTING));

This produces the following result:

<?xml version="1.0" encoding="UTF-8"?>
<record>
  <name>
    <first>Fred</first>
    <last>Smith</last>
  </name>
</record>

Serializing and Deserializing XML Attributes

To set XML attributes, the Qore value must be a hash, and the attributes are stored in another hash under the key "^attributes^". That is, the value of the "^attributes^" key must be a hash, and each member of this hash represents an attribute-value pair.

For example:

hash h = (
    "record": (
        "^attributes^": ("type": "customer"),
        "name": (
            "first": "Fred", "last": "Smith",
        ),
    ),
);
printf("%s\n", make_xml(h, XGF_ADD_FORMATTING));

This produces the following results:

<?xml version="1.0" encoding="UTF-8"?>
<record type="customer">
  <name>
    <first>Fred</first>
    <last>Smith</last>
  </name>
</record>

If we want text rather than child elements under the "record" node, we must set the "^value^" key of the hash along with the "^attributes^" key as follows:

hash h = (
    "record": (
        "^attributes^": ("type": "customer"),
        "^value^": "NO-RECORD",
    ),
);
printf("%s\n", make_xml(h, XGF_ADD_FORMATTING));

Giving the following results:

<?xml version="1.0" encoding="UTF-8"?>
<record type="customer">NO-RECORD</record>
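
Deserialization uses the same special keys. As a small sketch of the reverse direction, the XML string above can be parsed and the attribute and text read back from the resulting hash:

```qore
%new-style
%requires xml

*hash h = parse_xml('<record type="customer">NO-RECORD</record>');

# attributes are stored under the "^attributes^" key, text under "^value^"
printf("type: %s\n", h.record."^attributes^".type);
printf("text: %s\n", h.record."^value^");
```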

Serializing and Deserializing XML Arrays

Arrays are serialized with repeating node names as follows:

hash h = (
    "record": (
        "part": (
            "part-02-05", "part-99-23", "part-34-28",
        ),
    ),
);
printf("%s\n", make_xml(h, XGF_ADD_FORMATTING));

Producing the following results:

<?xml version="1.0" encoding="UTF-8"?>
<record>
  <part>part-02-05</part>
  <part>part-99-23</part>
  <part>part-34-28</part>
</record>
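
Going the other way, repeated elements are expected to come back as a list. The following sketch parses a document with three "part" elements and prints the value found under that key:

```qore
%new-style
%requires xml

string xml = "<record><part>part-02-05</part><part>part-99-23</part><part>part-34-28</part></record>";
*hash h = parse_xml(xml);

# the repeated "part" elements should appear as a single list value
printf("%y\n", h.record.part);
```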

It gets a little trickier when a key should be repeated at the same level in an XML string but other keys come in between. For example, take the following XML string:

<?xml version="1.0" encoding="UTF-8"?>
<para>Keywords: <code>this</code>, <code>that</code>, and <code>the_other</code>.</para>

A list cannot be used here, because text is required between the elements. As described earlier, the "^value^" hash key serializes text in an XML string; in this case we need several text nodes and several code elements interleaved. Because Qore hashes have unique keys (the same key cannot appear twice in one hash), we use a key-naming convention that lets us effectively repeat a key name: append a '^' character to the key name, followed by some unique text. When hash keys are serialized, the '^' character and any text after it are ignored in ordinary element names, and any text after a special prefix such as "^value^" is ignored as well. The keys therefore stay unique in the input hash while the data is serialized to the XML string in the right order, as in the following example:

hash h = (
    "para": (
        "^value^": "Keywords: ",
        "code": "this",
        "^value^1": ", ",
        "code^1": "that",
        "^value^2": ", and ",
        "code^2": "the_other",
        "^value^3": ".",
    ),
);
printf("%s\n", make_xml(h, XGF_ADD_FORMATTING));

Resulting in:

<?xml version="1.0" encoding="UTF-8"?>
<para>Keywords: <code>this</code>, <code>that</code>, and <code>the_other</code>.</para>

Because the serializer ignores the '^' suffix and the text after it (for special keys such as "^value^", the text after the second '^' character), unique keys can be given in the input hash, and the above code serializes to the XML string we want. In general, this convention lets us serialize multiple out-of-order keys without losing data while keeping the hash keys unique.

Note that when deserializing XML strings to Qore data structures, the above rules are applied in reverse. If any out-of-order duplicate keys are detected, Qore automatically generates unique hash key names based on the above rules, but only if the XPF_PRESERVE_ORDER flag is given in the parse_xml(), parse_xml_with_relaxng(), or parse_xml_with_schema() function call.
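
The round trip can be sketched as follows. The exact key names generated by the parser are not shown here; printing the hash makes them visible.

```qore
%new-style
%requires xml

string xml = "<para>Keywords: <code>this</code>, <code>that</code>, and <code>the_other</code>.</para>";

# with XPF_PRESERVE_ORDER, out-of-order duplicate elements get generated unique keys
*hash h = parse_xml(xml, XPF_PRESERVE_ORDER);
printf("%N\n", h);

# serializing the parsed hash again should keep the original element order
printf("%s\n", make_xml(h));
```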

Serializing and Deserializing XML Comments

Comments can be serialized by using the "^comment^" XML element prefix (as with other special element prefixes, arbitrary text can appear after the "^comment^" prefix to make the element name unique) as in the following example:

hash h = (
    "record": (
        "^attributes^": ("type": "customer"),
        "^comment^1": "values correspond to customer reference values",
        "^value^": "NO-RECORD",
        "^comment^2": "see docs for more info",
    ),
);
printf("%s\n", make_xml(h, XGF_ADD_FORMATTING));

Resulting in:

<?xml version="1.0" encoding="UTF-8"?>
<record type="customer"><!--values correspond to customer reference values-->NO-RECORD<!--see docs for more info--></record>

Serializing and Deserializing XML CDATA

CDATA text is generated if a hash key starts with '^cdata'; such text is not processed for escape code substitution. When deserializing XML strings to Qore data structures, CDATA text is placed unmodified under such a hash key as well.
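
As a minimal sketch, the hash below uses the key "^cdata^" (one key that satisfies the prefix rule above; the element name "script" and its text are made up for illustration) to emit text containing characters that would otherwise be escaped:

```qore
%new-style
%requires xml

hash h = (
    "script": (
        # this text contains '<', '>' and '&', which would normally be escaped
        "^cdata^": "if (a < b && b > c) run();",
    ),
);
printf("%s\n", make_xml(h));
```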

Functions For XML Serialization and Deserialization

Function Name | Description
get_xml_value() | Retrieves the value of an XML element
make_xml_fragment() | Serializes a hash into an XML string without an XML header or formatting
make_xml() | Serializes a hash into a complete XML string with an XML header
parse_xml() | Parses an XML string and returns a Qore hash structure
parse_xml_with_dtd() | Parses an XML string, validates it against a DTD string, and returns a Qore hash structure
parse_xml_with_relaxng() | Parses an XML string, validates it against a RelaxNG schema string, and returns a Qore hash structure
parse_xml_with_schema() | Parses an XML string, validates it against an XSD schema string, and returns a Qore hash structure
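
The difference between the two serialization functions, and the reverse step with parse_xml(), can be sketched as follows (only the required hash argument is passed to each call):

```qore
%new-style
%requires xml

hash h = (
    "record": (
        "name": ("first": "Fred", "last": "Smith"),
    ),
);

# complete document: includes the <?xml ...?> header
printf("%s\n", make_xml(h));

# fragment only: no XML header, for embedding in a larger document
printf("%s\n", make_xml_fragment(h));

# parsing the complete document returns the data as a Qore hash again
printf("%y\n", parse_xml(make_xml(h)));
```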

Deprecated Functions For XML Serialization and Deserialization

Classes Providing XML Functionality

Class | Description
FileSaxIterator | An iterator class for file data
InputStreamSaxIterator | An iterator class for input streams
SaxIterator | An iterator class for XML strings
XmlDoc | For analyzing and manipulating XML documents
XmlNode | Gives information about XML data in an XML document
XmlReader | For parsing or iterating through the elements of an XML document
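
As an illustration of iterative parsing, the sketch below uses SaxIterator to visit each "record" element of a small, made-up XML string. It assumes the constructor takes the XML string followed by the name of the element to iterate over.

```qore
%new-style
%requires xml

# hypothetical input document
string xml = "<records><record><id>1</id></record><record><id>2</id></record></records>";

# visit each "record" element in turn instead of building one hash for the whole document
SaxIterator i(xml, "record");
while (i.next()) {
    printf("%y\n", i.getValue());
}
```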

XML-RPC

XML-RPC is a lightweight but powerful XML over HTTP web service protocol. The xml module includes builtin support for this protocol as described here. You can find more information about XML-RPC, including specifications and examples at http://xmlrpc.org.

Information about Qore's XML-RPC serialization can be found below.

XML-RPC Serialization in Qore

Qore Type | XML-RPC Type | Description
string | string | direct conversion to UTF-8 string
integer | i4 or string | If the integer requires more than 32 bits to represent, then it is sent as a string
float | double | direct conversion
number | double | conversion to a double for serialization
boolean | boolean | direct conversion
date | iso8601 | Absolute date/time values will convert to the default time zone for the calling context for the output string if necessary. Note that relative date/time values (durations) will be serialized with the same format as absolute date/time values.
binary | base64 | direct conversion
list | array | direct conversion
hash | struct | direct conversion
all others | n/a | All other types will cause an XML-RPC-SERIALIZATION-ERROR to be raised.
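
To see these conversions in practice, the following sketch serializes a hash holding several Qore types as the single argument of an XML-RPC call and prints the generated XML. The method name "example.lookup" and the data are hypothetical.

```qore
%new-style
%requires xml

hash args = (
    "name": "Fred",           # string  -> string
    "age": 42,                # integer -> i4
    "balance": 10.5,          # float   -> double
    "active": True,           # boolean -> boolean
    "since": 2020-01-15,      # date    -> iso8601
    "parts": ("a", "b"),      # list    -> array
);                            # the hash itself -> struct

string call = make_xmlrpc_call("example.lookup", args);
printf("%s\n", call);
```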

Classes Providing XML-RPC Functionality

Class | Description
Qore::Xml::XmlRpcClient | For communicating with XML-RPC servers
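
A client call might look like the sketch below. The URL and method name are hypothetical, and the second constructor argument is assumed to suppress the immediate connection so that connection errors surface in the call itself.

```qore
%new-style
%requires xml

# hypothetical endpoint; replace with a real XML-RPC server
XmlRpcClient client(("url": "http://localhost:8080/RPC2"), True);

try {
    # hypothetical method name with one string argument
    *hash response = client.call("example.echo", "hello");
    printf("%y\n", response);
} catch (hash ex) {
    # connection and protocol errors are reported as exceptions
    printf("%s: %s\n", ex.err, ex.desc);
}
```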

Functions Providing XML-RPC Functionality

Function Name | Description
make_xmlrpc_call() | Serializes a hash into an XML string formatted for an XML-RPC call
make_xmlrpc_fault() | Serializes a hash into an XML string formatted for an XML-RPC fault response
make_xmlrpc_value() | Serializes a hash into an XML string in XML-RPC value format
make_xmlrpc_response() | Serializes a hash into an XML string formatted for an XML-RPC response
parse_xmlrpc_call() | Deserializes an XML-RPC call string, returning a Qore hash representing the call information
parse_xmlrpc_response() | Deserializes an XML-RPC response string, returning a Qore hash representing the response information
parse_xmlrpc_value() | Deserializes an XML-RPC value tree, returning a Qore hash representing the information
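
These functions can be combined without a server to inspect the XML-RPC message formats. The sketch below builds a response and a fault and parses each back; the data values are made up, and make_xmlrpc_fault() is assumed to take a fault code followed by a message string.

```qore
%new-style
%requires xml

# build a normal response from a hash and parse it back
string resp = make_xmlrpc_response(("status": "OK", "count": 3));
printf("%y\n", parse_xmlrpc_response(resp));

# build a fault response (assumed arguments: code, message) and parse it back
string fault = make_xmlrpc_fault(100, "example failure");
printf("%y\n", parse_xmlrpc_response(fault));
```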

Deprecated Functions For XML-RPC Serialization and Deserialization

XML Constants

All constants (and classes and namespaces) provided by this module are created under the Qore namespace (the main namespace for the Qore Programming Language).

Function and Method Tags

NOOP

Code with this flag makes no calculations, but rather returns a constant value. This flag is given to function and method variants that return a default value depending on the type of argument(s). When variants with this flag are resolved at parse time, a "call-with-type-errors" warning is raised (assuming this warning is enabled), unless PO_REQUIRE_TYPES or PO_STRICT_ARGS is set. If PO_REQUIRE_TYPES or PO_STRICT_ARGS is set, then these variants are inaccessible at parse time; resolving to a variant with this flag set at parse time causes an exception to be thrown.

These variants are included for backwards-compatibility with Qore prior to version 0.8.0 for functions that would ignore type errors in arguments.

This tag is equal to RUNTIME_NOOP, except no runtime effect is caused by resolving a function or method tagged with NOOP at runtime; this tag only affects parse time resolution.

RUNTIME_NOOP

Code with this flag makes no calculations, but rather returns a constant value. This flag is given to function and method variants that return a default value depending on the type of argument(s). When variants with this flag are resolved at parse time, a "call-with-type-errors" warning is raised (assuming this warning is enabled), unless PO_REQUIRE_TYPES or PO_STRICT_ARGS is set. If PO_REQUIRE_TYPES or PO_STRICT_ARGS is set, then these variants are inaccessible; resolving to a variant with this flag set at parse time or run time causes an exception to be thrown.

These variants are included for backwards-compatibility with Qore prior to version 0.8.0 for functions that would ignore type errors in arguments.

This tag is equal to NOOP, except that RUNTIME_NOOP is also enforced at runtime.

RET_VALUE_ONLY

This flag indicates that the function or method has no side effects; it only returns a value.

This tag is identical to CONSTANT except that functions or methods tagged with RET_VALUE_ONLY could throw exceptions.

CONSTANT

This flag indicates that the function or method has no side effects and does not throw any exceptions.

This tag is identical to RET_VALUE_ONLY except that functions or methods tagged with CONSTANT do not throw exceptions.

DEPRECATED

Code with this flag is deprecated and may be removed in a future version of this module; if a variant with this flag is resolved at parse time, a "deprecated" warning is raised (assuming this warning is enabled).

Release Notes

xml Module Version 1.5

Changes and Bug Fixes in This Release

xml Module Version 1.4.2

Changes and Bug Fixes in This Release

  • SoapClient module changes:
    • fixed bugs handling HTTP header values in SOAP calls (issue 2873)
  • SoapHandler module changes:
    • fixed a bug where HTTP request headers were not reported correctly in log callbacks (issue 2874)
    • fixed a bug where duplicate soapAction values would cause an invalid error to be thrown even if there were unique paths for each operation (issue 2871)
  • WSDL module changes:
    • fixed output namespace prefix generation handling with complex WSDLs and XML schemas (issue 3194)
    • fixed a bug handling SOAP encoding and namespace handling with certain WSDLs (issue 3029)
    • fixed a bug handling multiRefs in SOAP responses (issue 2902)
    • fixed a bug handling complexTypes as part types (issue 2900)
    • fixed bugs handling array types (issue 2899)
    • fixed bugs handling HTTP header values in SOAP calls (issue 2873)
    • added support for token and normalizedString types (issue 2859)
    • fixed handling namespace declarations in the SOAP binding element (issue 2858)
    • fixed WSDLLib::getWSDL() when passed a WSDL string with no XML preamble (issue 2857)
    • fixed handling of XSD attribute names with non-word characters (issue 2856)
    • fixed handling of simpleType definitions with union members (issue 2855)
    • fixed parsing string values with an xmlns attribute and no value (issue 3367)

xml Module Version 1.4.1

  • SoapHandler module changes:
    • fixed a bug where the URI path was not respected when resolving SOAP calls (issue 2783)
    • implemented support for handling SOAP faults based on the exception err string (must correspond to the fault name) (issue 2804)
  • WSDL module changes:
    • implemented support for handling SOAP faults in response messages with SOAP bindings (issue 2804)
    • fixed a bug resolving namespaces in nested schemas with late resolution with overlapping namespace prefixes (issue 2786)
    • fixed a type error in message generation (issue 2783)
    • implemented the wsdl_set_global_compat_empty_string_is_nothing() function and the "compat_empty_string_is_nothing" option for the WebService class for backwards compatibility with older versions of the WSDL module (issue 2754)
    • implemented the wsdl_set_global_compat_allow_any_header function and the "compat_allow_any_header" option for the WebService class for backwards compatibility with older versions of the WSDL module (issue 2765)
    • fixed types when deserializing to eliminate performance penalties stripping types in large data structures (issue 2766)
  • fixed soaputil to import XSDs automatically when parsing WSDLs (issue 2784)

xml Module Version 1.4

Changes and Bug Fixes in This Release

xml Module Version 1.3.2

Changes and Bug Fixes in This Release

xml Module Version 1.3.1

Changes and Bug Fixes in This Release

  • fixed a memory leak in XML-RPC parsing (issue 1214)
  • WSDL fixes and enhancements:
    • suppress emitting a SOAPAction header in requests if the binding gives an empty string (issue 1226)
    • updated WSOperation::serializeRequest() to allow the SOAPAction header to be overridden in each request (issue 1226)
    • respect XML generation flags in request generation
    • fixed parsing empty base64Binary and hexBinary elements (issue 1227)
  • SoapClient fixes and enhancements:
    • added the SoapClient::callOperation() method (issue 1226)
    • updated SOAP response processing to throw an exception when the server responds with an error code (issue 1228)
    • content-type in exceptional cases follows Soap version (issue 1244)
    • fixed a typo in a debug logging statement (issue 1358)
    • fixed and documented the "info" output hash format (issue 1359)
    • fixed a bug in the SoapClient::constructor() where a WebService object was not supported (issue 1424)
  • added SalesforceSoapClient user module
  • added Salesforce.com.qtest and accompanying WSDLs

xml Module Version 1.3

Changes and Bug Fixes in This Release

xml Module Version 1.2

Changes and Bug Fixes in This Release

xml Module Version 1.1

Changes and Bug Fixes in This Release

  • fixed internal XML to Qore processing for extreme cases
  • fixed makeFormattedXMLString() to really format the output
  • fixed XML-RPC parsing with empty elements in structures
  • added additional validation to XML-RPC parsing (checking close elements)
  • set the character encoding correctly in the outgoing "Content-Type" request header
  • added support for the new arbitrary-precision numeric type introduced with Qore 0.8.6