Handling Special XML Characters

Depending on the context, InterSystems IRIS XML support escapes the ampersand character (&) and certain other characters when it finds them within a property of type string or character stream.

Note:

The ESCAPE property parameter controls which characters are recognized as special. This parameter is either "XML" (the default) or "HTML" (not discussed in the documentation).

For these special characters, you can control how the escaping is performed by setting the CONTENT property parameter. The details differ between literal and SOAP-encoded formats, as follows:

Form of Escaping for Literal and SOAP-encoded Formats

Value of CONTENT (case-insensitive) | XML Document in Literal Format | XML Document in SOAP-encoded Format
"STRING" (the default)              | CData                          | CData
"ESCAPE"                            | XML entity                     | XML entity
"ESCAPE-C14N"                       | XML entity*                    | XML entity*
"MIXED"                             | No escaping is done            | CData

*For "ESCAPE-C14N", the escaping is done in accordance with the XML Canonicalization standard. The main difference is that a carriage return is escaped as a character reference (written as &#xD; in the Canonical XML specification).

Examples

Consider the following class:

Class ResearchXForms.CONTENT Extends (%RegisteredObject, %XML.Adaptor)
{

Parameter XMLNAME = "Demo";

Property String1 As %String;

Property String2 As %String(CONTENT = "STRING");

Property String3 As %String(CONTENT = "ESCAPE");

Property String4 As %String(CONTENT = "MIXED");

}

String2 and String1 are always treated in the same way, because String2 uses the default value for CONTENT.

Literal XML output for this class might look like the following:

<?xml version="1.0" encoding="UTF-8"?>
<Demo>
  <String1><![CDATA[value 1 & value 2]]></String1>
  <String2><![CDATA[value 1 & value 2]]></String2>
  <String3>value 1 &amp; value 2</String3>
  <String4>value 1 & value 2</String4>
</Demo>

SOAP-encoded XML output would be as follows instead:

<?xml version="1.0" encoding="UTF-8"?>
<CONTENT xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/" 
xmlns:s="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <String1><![CDATA[value 1 & value 2]]></String1>
  <String2><![CDATA[value 1 & value 2]]></String2>
  <String3>value 1 &amp; value 2</String3>
  <String4><![CDATA[value 1 & value 2]]></String4>
</CONTENT>
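
Output like the above can be generated with %XML.Writer. The following is a minimal sketch, not the documented procedure. The class ResearchXForms.ContentDemo and its method WriteDemo are hypothetical names introduced for illustration, and the sketch assumes that the ResearchXForms.CONTENT class shown earlier is compiled in the current namespace. It puts the same value in every property, so that only the CONTENT parameter affects the result, and writes the object to the current device.

```objectscript
/// Hypothetical helper class for illustration; not part of the original example.
Class ResearchXForms.ContentDemo Extends %RegisteredObject
{

/// Writes a ResearchXForms.CONTENT instance to the current device.
/// format is "literal" (the default) or "encoded" (SOAP-encoded).
ClassMethod WriteDemo(format As %String = "literal") As %Status
{
    set demo = ##class(ResearchXForms.CONTENT).%New()
    // Same value in every property, so that only CONTENT changes the output
    set demo.String1 = "value 1 & value 2"
    set demo.String2 = "value 1 & value 2"
    set demo.String3 = "value 1 & value 2"
    set demo.String4 = "value 1 & value 2"

    set writer = ##class(%XML.Writer).%New()
    // Put each element on its own line, indented
    set writer.Indent = 1
    // Select literal or SOAP-encoded output
    set writer.Format = format
    quit writer.RootObject(demo)
}

}
```

From the Terminal, calling do ##class(ResearchXForms.ContentDemo).WriteDemo() would write the literal form, and passing "encoded" as the argument would write the SOAP-encoded form.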

Alternative Way to Prevent the Escaping

There is another way to prevent the escaping of special XML characters. You can define the property as one of the special XML types: %XML.String, %XML.FileCharacterStream, or %XML.GlobalCharacterStream. For these data type classes, CONTENT is "MIXED".

Note that your application is responsible for ensuring that the property value is valid for the scenario in which it will be used; %XML.String and the other classes do not provide this validation.
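
As a sketch of this alternative, the following class declares one ordinary string property and one %XML.String property. The class name ResearchXForms.Unescaped and the property names Plain and Fragment are hypothetical and are used only for illustration.

```objectscript
/// Hypothetical class for illustration; not part of the original example.
Class ResearchXForms.Unescaped Extends (%RegisteredObject, %XML.Adaptor)
{

Parameter XMLNAME = "Demo";

/// Ordinary string: uses the default CONTENT = "STRING"
Property Plain As %String;

/// Special XML type: CONTENT is "MIXED", so the application must supply a valid value
Property Fragment As %XML.String;

}
```

Because CONTENT is "MIXED" for Fragment, a value such as value 1 & value 2 would be written unchanged in literal format, and the resulting document would not be well-formed. Storing value 1 &amp; value 2 instead is one way for the application to keep the output valid.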
