Specifying DOM-style Paths for XML Virtual Documents
This topic describes how to specify DOM-style XML virtual property paths (property paths for XML virtual documents).
You can use these paths to access values and to set values (with noted exceptions).
Most of the following sections assume that the document does not use any XML namespaces. The last section gives information on adapting these paths for a document that does use XML namespaces.
The examples in this topic use the schema shown in Overview of XML Virtual Property Paths.
Getting or Setting Nodes (Basic Paths)
In an XML virtual document, there are five kinds of nodes: the root node, elements, text nodes, comments, and processing instructions. The root node and any element can have child nodes of any type. The other kinds of nodes cannot have child nodes. Attributes are not nodes.
The following table lists basic DOM-style paths to get or set many of the nodes of an XML virtual document. When there are multiple nodes of the same type or with the same name, and when you do not want the first one, see the next section.
You also use these paths when you create more complex DOM-style paths as discussed in later subsections.
Syntax | Refers to |
---|---|
/ | Contents of the root node. You can also use "", if the context makes it clear that you are using a DOM-style path (that is, if no schema is loaded). |
/root_element_name | Contents of the root element, whose name is root_element_name. |
parent/element_name | Contents of the first element of the given name (element_name), within the given parent. Here parent is the full path to its parent element, including (as always) the initial slash. |
element_reference/text() | First text node in the element indicated by element_reference. |
element_reference/comment() | First comment in the element indicated by element_reference. The value returned does not include the opening syntax (<!--) or the closing syntax (-->). Similarly, do not include the opening or closing syntax when setting the value. InterSystems IRIS® removes all comments when it reads in XML files. The only comments that can be present are comments that you add. (To add them, use SetValueAt() with a path like the one shown here.) |
element_reference/instruction() | First processing instruction in the element indicated by element_reference. The value returned does not include the opening syntax (<?) or the closing syntax (?>). Similarly, do not include the opening or closing syntax when setting the value. InterSystems IRIS removes all processing instructions when it reads in XML files. The only instructions that can be present are instructions that you add. (To add them, use SetValueAt() with a path like the one shown here.) |
Consider the following XML document:
<?xml version="1.0" ?>
<Patient xmlns='http://myapp.com'>Sample text node
<!--Sample comment-->
<!--Another comment-->
<Name>Jane Doe</Name>
<Address>
<Street>100 Blank Way</Street>
</Address>
</Patient>
The following table shows some example paths for this document:
Example Path | Current Path Value |
---|---|
/Patient/Name | Jane Doe |
/Patient/Address | <Street>100 Blank Way</Street> In this case, the referenced element contains a child element (in contrast to the previous example). Note that InterSystems IRIS ignores whitespace when comparing DOM-style paths to values. That is, the value here matches the given path whether or not the document contains line breaks and indentation. |
/Patient/Address/Street | 100 Blank Way |
/Patient/text() | Sample text node |
/Patient/comment() | Sample comment |
Suppose that we now use a data transformation that contains only the following code:
set status=target.SetValueAt("892 Broadway","/Patient/Address/Street")
if 'status {do $system.Status.DisplayError(status) quit}
set status=target.SetValueAt("Dr. Badge","/Patient/Doctor/Name")
if 'status {do $system.Status.DisplayError(status) quit}
Notice that one of these paths already exists and the other does not; both paths are valid. After we use this transformation, the new document would then look like the following:
<?xml version="1.0" ?>
<Patient xmlns='http://myapp.com'>Sample text node
<!--Sample comment-->
<!--Another comment-->
<Name>Jane Doe</Name>
<Address>
<Street>892 Broadway</Street>
</Address>
<Doctor>
<Name>Dr. Badge</Name>
</Doctor>
</Patient>
Using Mixed Content When Setting DOM-style Paths
You can set paths to values that include both element and text nodes, for example:
set mixed="SOME TEXT<HOMETOWN>BELMONT</HOMETOWN>"
set status=target.SetValueAt(mixed,"/Patient/Address/Street")
A combination of element and text nodes is called mixed content.
For DOM-style paths, InterSystems IRIS determines that a value is mixed content if it contains a left angle bracket (<) character. Consequently, if you must set a DOM-style path to a value that includes a left angle bracket and is not valid XML, use the character entity reference (&lt;) in place of the left angle bracket to avoid an error.
The following table describes how InterSystems IRIS handles mixed content for different types of nodes:
Node Type | How InterSystems IRIS Handles Mixed Content Provided for the Node Value |
---|---|
root | Not supported |
element or comment | InterSystems IRIS replaces the current contents of the node with the given mixed content |
text node or instruction | InterSystems IRIS escapes the XML special characters and then replaces the current contents of the given node |
Attributes are not nodes.
For information about mixed content in schema-dependent paths, see Using Mixed Content When Setting Schema-dependent Paths.
Using the Basic Path Modifiers
You can add the following basic path modifiers to the end of basic paths (listed in the previous section), with noted exceptions. You can use the resulting paths in the same way that you use any of the basic paths.
[n]
Refers to an item by item position. Only instances of that item are counted; items of other types are ignored.
- When you get a value, this syntax returns the nth instance of the item to which the basic path refers, or an empty string if there is no such instance.
- When you set a value, this syntax either overwrites or creates the nth instance of the item to which the basic path refers.
You can substitute a hyphen (-) to access the last instance. You can also omit the square brackets.
/[n]
Refers to a child element by child element position.
You can substitute a hyphen (-) to access the last child. You can also omit the square brackets.
Restrictions:
- You can use this only with a basic path that refers to an element; that is, you cannot use it with functions such as comment().
- You can use this syntax only when getting a value, not when setting a value.
You can combine this path modifier with the other path modifiers, if you use the /[n] modifier as the last modifier.
[$n]
Refers to an item by node position.
- When you get a value, this syntax returns the nth node, if that node is an instance of the item to which the basic path refers. Otherwise the path is invalid, and an error is returned.
- When you set a value, this syntax overwrites the nth node, if that node is an instance of the item to which the basic path refers. Otherwise the path is invalid, and an error is returned.
Different path modifiers, listed in a later section, enable you to insert or append nodes. (Also see Summary of Path Modifiers.)
Consider the following XML document:
<?xml version="1.0" ?>
<Patient xmlns='http://myapp.com'>
<!--Sample comment-->
<!--Another comment-->
Sample text node
<Name>Fred Williams</Name>
<FavoriteColors>
<FavoriteColor>Red</FavoriteColor>
<FavoriteColor>Green</FavoriteColor>
</FavoriteColors>
<Doctor>
<Name>Dr. Arnold</Name>
</Doctor>
</Patient>
The following table shows some example paths for this document:
Example Path | Current Path Value | Notes |
---|---|---|
/Patient/Name | Fred Williams | |
/[1]/[1] | Fred Williams | This path accesses the first child element within the first element of the document (which is the only element in the document, according to the XML standard). The square brackets are optional here. |
/Patient/FavoriteColors/[1] | Red | The square brackets are optional here. |
/Patient/FavoriteColors/[2] | Green | The square brackets are optional here. |
/[1]/[2]/[1] | Red | The square brackets are optional here. |
/[1]/[2]/[2] | Green | The square brackets are optional here. |
/Patient/Name[$1] | An empty string | This path is invalid. The first node within <Patient> is not a <Name> element. |
/Patient/Name[$4] | Fred Williams | |
/Patient/Doctor[$6] | <Name xmlns='http://myapp.com'>Dr. Arnold</Name> | |
/Patient/4 | An empty string | This path is invalid. <Patient> does not have a fourth element. |
/Patient/comment()[1] | Sample comment | The square brackets are required, because without square brackets, this path would be interpreted as an element name. |
/Patient/comment()[2] | Another comment | The square brackets are required, because without square brackets, this path would be interpreted as an element name. |
/Patient/comment()[$2] | Another comment | The square brackets are required, because without square brackets, this path would be interpreted as an element name. |
/Patient/comment()[-] | Another comment | The square brackets are required, because without square brackets, this path would be interpreted as an element name. |
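To try a few of these positional paths outside a production, a test method along the following lines could be used. This is only a sketch: the method name TestPositionPaths is hypothetical, and the sample document is written on fewer lines than shown above. It relies only on the ImportFromString() and GetValueAt() calls that appear elsewhere in this topic.

```objectscript
/// Hypothetical test method; the name is an assumption made for illustration.
ClassMethod TestPositionPaths()
{
    // Build the sample document piece by piece
    set xml="<Patient xmlns='http://myapp.com'>"
    set xml=xml_"<!--Sample comment--><!--Another comment-->Sample text node"
    set xml=xml_"<Name>Fred Williams</Name>"
    set xml=xml_"<FavoriteColors><FavoriteColor>Red</FavoriteColor>"
    set xml=xml_"<FavoriteColor>Green</FavoriteColor></FavoriteColors>"
    set xml=xml_"<Doctor><Name>Dr. Arnold</Name></Doctor></Patient>"
    set doc=##class(EnsLib.EDI.XML.Document).ImportFromString(xml,.status)
    if 'status {do $system.Status.DisplayError(status) quit}
    // Paths taken from the table above; the loop stops at the first failure
    for path="/Patient/FavoriteColors/[2]","/Patient/Name[$4]","/Patient/comment()[-]" {
        set value=doc.GetValueAt(path,,.status)
        if 'status {do $system.Status.DisplayError(status) quit}
        write path," = ",value,!
    }
}
```

Compare each line written by the method against the Current Path Value column of the table.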
Using the full() Function
For a path that refers to an element (either a basic path or a path that uses basic modifiers), you can also obtain the opening and closing tags of the element. To do so, add full() to the end of the path.
You can use the full() function when you are setting a value. Within DTL, this is permitted only within a data transformation that uses the append action; see Assignment Actions for XML Virtual Documents.
Consider the following XML document:
<?xml version="1.0" ?>
<Patient xmlns='http://myapp.com'>
<Name>Jack Brown</Name>
<Address>
<Street>233 Main St</Street>
</Address>
</Patient>
The following table shows some example paths for this document:
Example Path | Current Path Value |
---|---|
/Patient/Name/full() | <Name xmlns='http://myapp.com'>Jack Brown</Name> |
/Patient/Address/full() | <Address xmlns='http://myapp.com'><Street>233 Main St</Street></Address> |
/Patient/Address/Street/full() | <Street xmlns='http://myapp.com'>233 Main St</Street> |
For the root node, use of the full() function is implied. That is, the following two paths are equivalent:
/
/full()
If you use GetValueAt(), you can also specify an additional format argument (f) that retrieves the full element. For details, see The pFormat Argument.
Getting or Setting the Value of an XML Attribute
To access the value of an attribute, you can use one of the following DOM-style paths. Here (and in the rest of this section), element_reference is a complete DOM-style path to an element.
Syntax | Refers to |
---|---|
element_reference/@attribute_name | Value of the given attribute of the given element. |
element_reference/@[n] | (For use only when retrieving values) Value of the nth attribute (in alphabetical order) of the given element. |
element_reference/@[-] | Value of the last attribute of the given element. |
You can omit the square brackets.
For example, consider the following XML document:
<?xml version="1.0" ?>
<Patient MRN='000111222' DL='123-45-6789' xmlns='http://myapp.com'>
<Name>Liz Jones</Name>
</Patient>
The following table shows some example paths for this document:
Example Path | Current Path Value |
---|---|
/Patient/@MRN | 000111222 |
/Patient/@[1] | 123-45-6789 |
/Patient/@2 | 000111222 |
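As a rough illustration, attribute paths can be read with GetValueAt() in the same way as element paths. The method name TestAttributePaths below is hypothetical; everything else uses only calls shown in this topic.

```objectscript
/// Hypothetical test method; the name is an assumption made for illustration.
ClassMethod TestAttributePaths()
{
    set xml="<Patient MRN='000111222' DL='123-45-6789' xmlns='http://myapp.com'>"
    set xml=xml_"<Name>Liz Jones</Name></Patient>"
    set doc=##class(EnsLib.EDI.XML.Document).ImportFromString(xml,.status)
    if 'status {do $system.Status.DisplayError(status) quit}
    // Read an attribute by name
    set mrn=doc.GetValueAt("/Patient/@MRN",,.status)
    if 'status {do $system.Status.DisplayError(status) quit}
    write "MRN: ",mrn,!
    // Read the last attribute by position (attributes are counted in alphabetical order)
    set last=doc.GetValueAt("/Patient/@[-]",,.status)
    if 'status {do $system.Status.DisplayError(status) quit}
    write "Last attribute: ",last,!
}
```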
Using Path Modifiers to Insert or Append Nodes
To insert or append nodes, add the following path modifiers to the end of basic paths. Use the path modifiers listed here only when you are setting a value.
Also see the next section for a couple of additional options.
[~n]
Inserts an instance of the item to which the basic path refers, right before the nth instance of that item, in the given context. Nothing is overwritten. See the following table for details.
Here and in the rest of this subsection, n is an integer.
Example Path | Behavior |
---|---|
/Patient/Episode[~5] | Inserts a new <Episode> element within <Patient>, before the existing fifth <Episode> element. If <Patient> does not include five <Episode> elements, InterSystems IRIS performs padding; it creates empty <Episode> elements so that the inserted <Episode> is the fifth <Episode>. All the newly inserted elements are at the end of the <Patient> element. If the path refers to intermediate, nonexistent elements, InterSystems IRIS creates those. |
/Patient/element(Episode)[~5] | Inserts an <Episode> element within <Patient>, before the existing fifth element. If <Patient> does not include five elements (of any type), this path is invalid. The element() function does not generate empty elements for padding. |
/Patient/[~5] | Not allowed, because there is no information about the kind of element to insert. |
/Patient/element()[~5] |
For example, consider the following XML document:
<?xml version="1.0" ?>
<Patient xmlns='http://myapp.com'>
<Name>Betty Hodgkins</Name>
<FavoriteColors>
<FavoriteColor>Purple</FavoriteColor>
</FavoriteColors>
</Patient>
Also consider the following code from within a data transformation:
set status=target.SetValueAt("INSERTED COLOR","/Patient/FavoriteColors/FavoriteColor[~4]")
if 'status {do $system.Status.DisplayError(status) quit}
This line of code transforms the original document into the following:
<?xml version="1.0" ?>
<Patient>
<Name>Betty Hodgkins</Name>
<FavoriteColors>
<FavoriteColor>Purple</FavoriteColor>
<FavoriteColor/>
<FavoriteColor/>
<FavoriteColor>INSERTED COLOR</FavoriteColor>
</FavoriteColors>
</Patient>
For another example, consider the following XML document:
<Patient xmlns='http://myapp.com'>
<Name>Colin McMasters</Name>
<Address>
<Street>102 Windermere Lane</Street>
</Address>
</Patient>
Also consider the following code from within a data transformation:
set status=target.SetValueAt("INSERTED ADDRESS","/Patient/Address/Street[~2]")
if 'status {do $system.Status.DisplayError(status) quit}
This line of code transforms the original document into the following:
<?xml version="1.0" ?>
<Patient>
<Name>Colin McMasters</Name>
<Address>
<Street>102 Windermere Lane</Street>
<Street>INSERTED ADDRESS</Street>
</Address>
</Patient>
[~$n]
Inserts an instance of the item to which the basic path refers, right before the nth node in the given parent. Nothing is overwritten. The path is invalid if the parent does not contain at least n nodes.
Example Path | Behavior |
---|---|
/Patient/Episode[~$3] | Inserts a new <Episode> element within <Patient>, before the existing third node in that parent. The path is invalid if the parent does not have three nodes. |
/Patient/element(Episode)[~$3] | Not allowed. The element() function works only with element positions. |
/Patient/[~3] | Not allowed, because there is no information about the kind of element to insert. |
/Patient/element()[~3] | Not allowed for multiple reasons; see above items. |
[~]
Appends an instance of the item to which the basic path refers, as the (new) last node of the given parent. Nothing is overwritten.
Example Path | Behavior |
---|---|
/Patient/Episode[~] | Appends a new <Episode> element within <Patient>, as the last node in that parent. If the path refers to intermediate, nonexistent elements, InterSystems IRIS creates those. |
/Patient/element(Episode)[~] | Appends an <Episode> element within <Patient>, as the last node in that parent. If the path refers to intermediate, nonexistent elements, the path is invalid. |
/Patient/[~] | Not allowed, because there is no information about the kind of element to append. |
/Patient/element()[~] | Not allowed, because there is no information about the kind of element to append. |
For example, the following shows part of a code element in a data transformation:
set status=target.SetValueAt("orange","/Patient/FavoriteColors/Color[~]")
if 'status {do $system.Status.DisplayError(status) quit}
set status=target.SetValueAt("pink","/Patient/FavoriteColors/Color[~]")
if 'status {do $system.Status.DisplayError(status) quit}
This adds two new <Color> children to the <FavoriteColors> element. If the <FavoriteColors> element does not exist, InterSystems IRIS creates it.
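One way to check the effect of an append outside a data transformation is to set the value on a document imported from a string and then read the parent element back with full(). The following is a sketch under that assumption; the method name TestAppendPath is hypothetical.

```objectscript
/// Hypothetical test method; the name is an assumption made for illustration.
ClassMethod TestAppendPath()
{
    set xml="<Patient xmlns='http://myapp.com'><Name>Betty Hodgkins</Name>"
    set xml=xml_"<FavoriteColors><FavoriteColor>Purple</FavoriteColor></FavoriteColors></Patient>"
    set target=##class(EnsLib.EDI.XML.Document).ImportFromString(xml,.status)
    if 'status {do $system.Status.DisplayError(status) quit}
    // Append a new <Color> element as the last node of <FavoriteColors>
    set status=target.SetValueAt("orange","/Patient/FavoriteColors/Color[~]")
    if 'status {do $system.Status.DisplayError(status) quit}
    // Read the parent back, tags included, to inspect the result
    set result=target.GetValueAt("/Patient/FavoriteColors/full()",,.status)
    if 'status {do $system.Status.DisplayError(status) quit}
    write result,!
}
```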
Also see Summary of Path Modifiers.
Using the element() Function
You can use the element() function when getting or setting values.
Using element() When Getting a Value
Syntax | Behavior |
---|---|
element_reference/element() | Returns the first child element of the given element. |
element_reference/element()[n] | Returns the nth child element of the given element. |
element_reference/element()[-] | Returns the last child element of the given element. |
Using element() When Setting a Value
Syntax | Behavior |
---|---|
parent_element/element(element_name)[~n] | Inserts the specified element (given by the element_name argument) right before the nth child element of the given parent. This path is invalid if the given element does not have at least n child elements. |
parent_element/element(element_name)[~] | Appends the specified element (given by the element_name argument) as the last node in the given parent. |
Getting Positions of Elements
You can use the following syntaxes to get positions of elements.
Syntax | Returns |
---|---|
element_reference/position() | Element position of the given element within its parent. |
element_reference/node-position() | Node position of the given element within its parent. For node position, InterSystems IRIS considers all kinds of nodes, not just elements. |
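The difference between the two functions is easiest to see on an element that is preceded by a non-element node. The following sketch assumes a small sample document and a hypothetical method name, TestPositions; it uses only calls shown elsewhere in this topic.

```objectscript
/// Hypothetical test method; the name and the sample document are assumptions.
ClassMethod TestPositions()
{
    set xml="<Patient xmlns='http://myapp.com'><!--Sample comment-->"
    set xml=xml_"<Name>Fred Williams</Name>"
    set xml=xml_"<Doctor><Name>Dr. Arnold</Name></Doctor></Patient>"
    set doc=##class(EnsLib.EDI.XML.Document).ImportFromString(xml,.status)
    if 'status {do $system.Status.DisplayError(status) quit}
    // Position of <Doctor> among the child elements of <Patient>
    set pos=doc.GetValueAt("/Patient/Doctor/position()",,.status)
    if 'status {do $system.Status.DisplayError(status) quit}
    // Position of <Doctor> among all child nodes of <Patient>, including the comment
    set nodepos=doc.GetValueAt("/Patient/Doctor/node-position()",,.status)
    if 'status {do $system.Status.DisplayError(status) quit}
    write "Element position: ",pos,!
    write "Node position: ",nodepos,!
}
```

Going by the definitions in the table, the two results should differ here, because the comment counts as a node but not as an element.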
Getting Counts of Elements
You can use the following syntaxes to get counts of elements.
Syntax | Returns |
---|---|
|
Count of child elements within the given parent. |
|
Count of elements of the given name, within the given parent. Notice that there is no slash after the name of the element (in contrast with the previous set of paths). |
|
Count of child nodes of the given element. |
|
Count of attributes of the given element. |
You can omit the square brackets in all cases except for /[*]. Note that InterSystems IRIS also supports the last() function (equivalent to count()) and the node-last() function (equivalent to node-count()); you might prefer to use last() and node-last() if you are familiar with XPath, which has a similar last() function.
Accessing Other Metadata
You can use the following functions to access other metadata of the XML virtual document. You can use these functions only at the end of a path.
Function | Returns |
---|---|
/node-type() | Type of the given node. This function returns one of the following values: |
/name() | Full name of the given node. For example: s01:Patient |
/local-name() | Local name of the given node. For example: Patient |
/prefix() | Namespace prefix of the given node. For example: s01 |
/namespace-uri() | URI of the namespace to which the given node belongs. For example: www.myapp.org |
/prefixes() | All the namespace prefixes and their corresponding URIs, in the scope of the given element. This information is returned as a comma-separated list. Each list item consists of the namespace prefix, followed by an equal sign (=), followed by the URI. The default namespace URI is listed first without a prefix. For example: =http://tempuri.org,s01=http://myns.com |
Summary of Path Modifiers
The following table summarizes the path modifiers for DOM-style paths:
Path modifier | Uses | Methods that can use paths that contain this modifier | Provides padding (as needed) when used with SetValueAt()? |
---|---|---|---|
[n] | Getting or setting nth instance | GetValueAt() and SetValueAt() | Yes |
/[n] | Getting nth child element | GetValueAt() | Not applicable |
[~n] | Inserting nth instance | SetValueAt() | Yes |
[~] | Appending instance | SetValueAt() | No |
[$n] | Getting or setting instance at nth node position | GetValueAt() and SetValueAt() | No |
[~$n] | Inserting instance at nth node position | SetValueAt() | No |
Variations for Documents That Use Namespaces
If the document uses XML namespaces, for each element or attribute that is in a namespace, you must modify that section of the path to include a namespace prefix, followed by a colon (:). A namespace prefix is one of the following:
- If you have loaded the corresponding XML schema, use a namespace token as described in XML Namespace Tokens. For example: use $2:element_name rather than element_name
- If you have not loaded the XML schema, use the namespace prefix exactly as it appears in the document. For example: s01:Patient
- Use the wildcard * to ignore the namespace. For example: *:Patient
Another option is to ignore all namespaces in the document. To do this, start the path with the wildcard *:/ rather than /
For example: *:/Patient/@MRN
You cannot use any wildcards in a path when you are setting the value for that path.
The output document of a DTL does not necessarily use the same namespace prefixes as the input document. The namespaces are the same, but the prefixes are generated. According to the XML standard, there is no significance to the choice of prefix.
Testing DOM-style Paths in the Terminal
It can be useful to test virtual document property paths in the Terminal before using them in business processes, data transformations, and so on, particularly when you are getting familiar with the syntax. To do so for DOM-style XML paths, do the following in the Terminal or in test code:
- Create a string that contains the text of a suitable XML document.
- Use the ImportFromString() method of EnsLib.EDI.XML.Document to create an instance of an XML virtual document from this string.
- Use the GetValueAt() and SetValueAt() methods of this instance.
The following method demonstrates these steps:
ClassMethod TestDOMPath()
{
set string="<Patient xmlns='http://myapp.com'>"
_"<Name>Jolene Bennett</Name>"
_"<Address><Street>899 Pandora Boulevard</Street></Address>"
_"</Patient>"
set target=##class(EnsLib.EDI.XML.Document).ImportFromString(string,.status)
if 'status {do $system.Status.DisplayError(status) quit}
set pathvalue=target.GetValueAt("/Patient/Name",,.status)
if 'status {do $system.Status.DisplayError(status) quit}
write pathvalue
}
The following shows output from this method:
SAMPLES>d ##class(Demo.CheckPaths).TestDOMPath()
Jolene Bennett
For additional options for GetValueAt(), see The pFormat Argument.