CopperSpice API  1.9.1
Introduction to XQuery

XQuery is a language for querying XML data or non-XML data which can be modeled as XML. C++ and Java are statement-based languages, while XQuery is expression-based. For information about the XQuery specification, refer to the documentation from the W3C.

The simplest XQuery expression is an XML element constructor. The recipe element shown below is an XQuery expression which forms a complete XQuery. This element does not actually query anything. It creates an empty recipe element in the output.
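
The query described above is presumably just the element constructor on its own. As a minimal sketch, the entire XQuery is one line.

```xquery
<recipe/>
```

Evaluating it produces a single empty recipe element. No input document is read.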


An XQuery expression can also be enclosed in curly braces and embedded in another XQuery expression. This XQuery has a document expression embedded in a node expression. This example creates a new html element in the output and sets its id attribute to be the id attribute from an html element in the file "other.html".

<html xmlns="http://www.w3.org/1999/xhtml"
      id="{doc("other.html")/html/@id}"/>

Using Path Expressions To Match And Select Items

In C++ or Java you would write nested for() loops and recursive functions to traverse XML trees in search of elements of interest. In XQuery, iterative and recursive algorithms are replaced by path expressions.

A path expression looks similar to a pathname for locating a file in a hierarchical file system. It is a sequence of one or more steps separated by slash "/" or double slash "//". Path expressions are used for traversing XML trees, not file systems. However, in CsXmlPatterns we can model a file system to look like an XML tree and then use an XQuery to traverse it.

You can think of a path expression as an algorithm for traversing an XML tree to find and collect items of interest. This algorithm is evaluated by evaluating each step moving from left to right through the sequence. A step is evaluated with a set of input items (nodes and atomic values), sometimes called the focus. The step is evaluated for each item in the focus. These evaluations produce a new set of items called the result which becomes the focus which is passed to the next step. Evaluation of the final step produces the final result, which is the result of the XQuery. The items in the result set are presented in document order and without duplicates.
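
The step-by-step evaluation described above can be sketched by binding the result of each step to a variable. The inline cookbook element and the variable names are made up for this illustration. They are not the "cookbook.xml" file used elsewhere in this document.

```xquery
let $cookbook :=
  <cookbook>
    <recipe><title>Soup</title></recipe>
    <recipe><title>Toast</title></recipe>
  </cookbook>
(: step 1: the focus is the single cookbook element :)
let $recipes := $cookbook/recipe
(: step 2: evaluated once for each recipe in the new focus :)
let $titles := $recipes/title
(: both recipes share one parent, which appears only once in the result :)
let $parents := $recipes/..
return (count($recipes), count($titles), count($parents))
```

The query evaluates to 2 2 1. The last value shows the duplicate removal. The parent step is evaluated for two recipes but the shared cookbook element is collected only once.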

With CsXmlPatterns, a standard way to present the initial focus to a query is to call QXmlQuery::setFocus(). Another common way is to let the XQuery itself create the initial focus by using the first step of the path expression to call the XQuery doc() function. The doc() function loads an XML document and returns the document node.

The document node is not the same as the document element. A document node is a node constructed in memory when the document is loaded. It represents the entire XML document, not the document element. The document element is the single top level XML element in the file. The doc() function returns the document node, which becomes the singleton node in the initial focus set. The document node will have one child element node, and that child node will represent the document element.

Example 1

Consider the following XQuery where the doc() function loads the "cookbook.xml" file and returns the document node. The document node then becomes the focus for the next step "//recipe". The double slash means select all recipe elements found below the document node, regardless of where they appear in the document tree. The query selects all recipe elements in the cookbook.
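
Based on the description above, the query presumably looks like the following sketch. It assumes "cookbook.xml" can be resolved relative to the base URI of the query and that its elements are not in a namespace.

```xquery
doc('cookbook.xml')//recipe
```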


Example 2

Consider the following XQuery which builds on the previous one.
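
Going by the three steps listed below, the query is presumably the following one. As with Example 1, this sketch assumes "cookbook.xml" is available and does not use a namespace.

```xquery
doc('cookbook.xml')//recipe/title
```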


The previous XQuery is a single path expression composed of three steps.

  1. creates the initial focus by calling doc()
  2. find every descendant node which is a recipe element
  3. collect the child nodes which are title elements

The single slash before the title element selects only those title elements which are child elements of a recipe element, not grandchildren. The XQuery evaluates to a final result set containing the title element of each recipe element in the cookbook.

Axis Steps

The most common kind of path step is called an axis step, which tells the query engine which way to navigate from the context node, and which test to perform when it encounters nodes along the way. An axis step has two parts, an axis specifier, and a node test. For each node in the focus set, the query engine navigates out from the node along the specified axis and applies the node test to each node it encounters. The nodes selected by the node test are collected in the result set, which becomes the focus set for the next step.

In the example XQuery above the second and third steps are both axis steps. Both apply the element(name) node test to nodes encountered while traversing along some axis. But in this example, the two axis steps are written in a shorthand form, where the axis specifier and the node test are not written explicitly but are implied. XQueries are normally written in this shorthand form, but they can also be written in the longhand form. If we rewrite the XQuery in the longhand form it will expand to the following code.
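
Using the axis specifiers and node tests named in this section, the longhand form can be sketched as follows. It assumes the same "cookbook.xml" file as the earlier examples.

```xquery
doc('cookbook.xml')/descendant-or-self::element(recipe)/child::element(title)
```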


The two axis steps have been expanded. The second step, //recipe, has been rewritten as /descendant-or-self::element(recipe), where descendant-or-self:: is the axis specifier and element(recipe) is the node test. The third step, /title, has been rewritten as /child::element(title), where child:: is the axis specifier and element(title) is the node test.

To create an axis step, concatenate an axis specifier and a node test. The output of the expanded XQuery will be exactly the same as the output of the shorthand form. The following sections list the axis specifiers and node tests which are available.

Axis Specifiers

An axis specifier defines the direction you want the query engine to take, when it navigates away from the context node. CsXmlPatterns supports the following axes.

Axis Specifier         Contents
self::                 the context node itself
attribute::            all attribute nodes of the context node
child::                all child nodes of the context node (not attributes)
descendant::           all descendants of the context node (children, grandchildren, etc)
descendant-or-self::   all nodes in descendant + self
parent::               the parent node of the context node, or empty if there is no parent
ancestor::             all ancestors of the context node (parent, grandparent, etc)
ancestor-or-self::     all nodes in ancestor + self
following::            all nodes in the tree containing the context node which follow the context node in document order, not including its descendants
preceding::            all nodes in the tree containing the context node which precede the context node in document order, not including its ancestors
following-sibling::    all children of the context node's parent which follow the context node
preceding-sibling::    all children of the context node's parent which precede the context node
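
As a small illustration of several of these axes, the following sketch navigates away from the second item of an inline list. The list element and the variable names are invented for this example.

```xquery
let $list := <list><item>a</item><item>b</item><item>c</item></list>
let $b := $list/item[2]
return (
  string($b/self::item),                          (: b :)
  string-join($b/preceding-sibling::item, ","),   (: a :)
  string-join($b/following-sibling::item, ","),   (: c :)
  name($b/parent::node()),                        (: list :)
  count($list/descendant::node()),                (: 6, three items and three text nodes :)
  count($list/descendant-or-self::node())         (: 7, the descendants plus the list itself :)
)
```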

Node Testing

A node test is a conditional expression which must be true for a node to be selected. The conditional expression can test the kind of node or the kind and the name. The kind can be one of six different values.

  • element
  • attribute
  • text
  • document
  • comment
  • processing instruction

The XQuery specification for node tests also defines a third condition, which is used to test the type annotation of a node. This is not currently supported in CsXmlPatterns. The following table lists the supported node tests. For more information about how to use them, refer to the section on Name Testing.

Node Test                       Matches
node()                          nodes of any kind and any name
text()                          nodes with the kind text and any name
comment()                       nodes with the kind comment and any name
element()                       nodes with the kind element and any name
element(name)                   nodes with the kind element and the given name
attribute()                     nodes with the kind attribute and any name
attribute(name)                 nodes with the kind attribute and the given name
processing-instruction()        nodes with the kind processing instruction
processing-instruction(name)    nodes with the kind processing instruction with the given name
document-node()                 nodes with the kind document and any name (there is only one)
document-node(element(name))    nodes with the kind document and the given document element name
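
The following sketch applies several of these node tests to one inline element and counts what each test selects. The element e and its content are invented for this illustration.

```xquery
let $e := <e id="1">text<!-- note --><child/><?pi data?></e>
return (
  count($e/node()),                     (: 4, attributes are not child nodes :)
  count($e/text()),                     (: 1 :)
  count($e/comment()),                  (: 1 :)
  count($e/element()),                  (: 1 :)
  count($e/processing-instruction()),   (: 1 :)
  count($e/attribute())                 (: 1, the id attribute :)
)
```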

Shorthand Form

Writing axis steps using the longhand form with axis specifiers and node tests is semantically clear but syntactically verbose. The shorthand form is easy to learn and, once you learn it, just as easy to read. In the shorthand form, the axis specifier and node test are implied by the syntax. XQueries are normally written in the shorthand form. The following table shows some of the frequently used shorthand forms.

Shorthand Syntax   Expands To                     Matches
name               child::element(name)           child nodes that are name elements
*                  child::element()               child nodes that are elements
..                 parent::node()                 parent nodes
@*                 attribute::attribute()         attribute nodes
@name              attribute::attribute(name)     name attributes
//                 descendant-or-self::node()     descendant nodes when used instead of '/'
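
To check that a shorthand step and its expansion select the same nodes, the two forms can be compared side by side. This is only a sketch. The inline recipe element and its attributes are invented for the comparison.

```xquery
let $r := <recipe id="r1"><title>Soup</title><time unit="minutes">3</time></recipe>
return (
  count($r/title) = count($r/child::element(title)),
  count($r/*)     = count($r/child::element()),
  count($r/@id)   = count($r/attribute::attribute(id)),
  count($r//@*)   = count($r/descendant-or-self::node()/attribute::attribute()),
  $r/title/..     is $r
)
```

Each comparison evaluates to true.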

The XQuery language specification has a detailed section on the shorthand form, which it refers to as the abbreviated syntax. It also contains more examples of path expressions written in the shorthand form and a section with examples written in the longhand form.

Language Specification
Abbreviated Syntax
Longhand Form

Name Testing

Any node test with a name parameter is called a name test. When using a name test the name of the node must match the name parameter. The kind of the node will match based on which node test is being called.

In the path expression //recipe/title used in Example 2, both recipe and title are name tests written in the shorthand form. An XQuery name is resolved to its expanded form using namespace declarations. Resolving a name to the expanded form involves replacing the namespace prefix, if one exists, with a namespace URI. The expanded name then consists of the namespace URI and the local name.

The names recipe and title in that path expression do not have namespace prefixes because no namespace was declared in the "cookbook.xml" file. To add a default namespace, change the document element as shown below.

<cookbook xmlns="http://cookbook/namespace">

The xmlns attribute is called a default namespace declaration because it does not include a namespace prefix. By including this default namespace declaration any element in the document without a prefix is automatically in the default namespace.

Any attribute without a prefix is not affected by the default namespace declaration. Such attributes are never considered to be in a namespace. The URI used for a namespace does not need to be the address of a valid web page.

After this change, running the previous XQuery example produces no output. With the addition of the default namespace, our query no longer matches any of the elements in the cookbook file. The namespace must also be declared in the XQuery, and there are two ways to do this.

The first way is to supply a namespace prefix and then adjust the node name test.

declare namespace c = "http://cookbook/namespace";
doc("cookbook.xml")//c:recipe/c:title

The second way is to declare the namespace as the default element namespace.

declare default element namespace "http://cookbook/namespace";
doc("cookbook.xml")//recipe/title

Both of these will produce the same output, which is shown below.

<title xmlns="http://cookbook/namespace">Quick and Easy Mushroom Soup</title>
<title xmlns="http://cookbook/namespace">Cheese on Toast</title>
<title xmlns="http://cookbook/namespace">Hard-Boiled Eggs</title>

This output is slightly different from the output produced before we added the default namespace declaration to the cookbook file. CsXmlPatterns automatically includes the correct namespace attribute in each <title> element in the output. Refer to the QXmlName documentation for further details.

Wildcards in Name Tests

The wildcard '*' can be used in a name test. To find all the attributes in the cookbook but select only the ones in the xml namespace, use the xml: namespace prefix but replace the local name (the attribute name) with the wildcard.
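
The query described above can be sketched as follows. The same path expression appears later in this document inside a for clause. The xml prefix is predeclared, so no namespace declaration is needed.

```xquery
doc("cookbook.xml")//@xml:*
```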


If you save this XQuery in "file.xq" and run it, it will not work. An error message similar to the following will be reported.

Error SENR0001 in file:
Attribute xml:id can not be serialized because it appears at the top level.

The XQuery actually ran correctly. It selected several xml:id attributes and put them in the result set. However, when the result set is serialized it is not well formed XML, because an attribute can not appear outside of an element.

XQuery can do more than just find and select elements and attributes. It can also construct new ones on the fly, which is what we need if we want CsXmlPatterns to show the selected attributes. Refer to the section on Constructing Elements.

To find all the name attributes in the cookbook no matter what namespace they are in, replace the namespace prefix with a wildcard.
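
A sketch of the query with a wildcard in place of the namespace prefix is shown here. The same path expression appears later in this document inside a for clause.

```xquery
doc("cookbook.xml")//@*:name
```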


To find and select all the attributes of the document element in the cookbook, replace the entire name test with the wildcard.

declare default element namespace "http://cookbook/namespace";
doc("cookbook.xml")/cookbook/@*

Using Predicates In Path Expressions

Predicates can be used to further filter the nodes selected by a path expression. A predicate is an expression in square brackets ('[' and ']') that either returns a boolean value or a number. A predicate can appear at the end of any path step in a path expression. The predicate is applied to each node in the focus set. If a node passes the filter, the node is included in the result set. The query below selects the recipe element that has the <title> element "Hard-Boiled Eggs".

declare default element namespace "http://cookbook/namespace";
doc("cookbook.xml")/cookbook/recipe[title = "Hard-Boiled Eggs"]

The dot expression ('.') can be used in predicates and path expressions to refer to the current context node. The following query uses the dot expression to refer to the current <method> element. The query selects the empty <method> elements from the cookbook.

declare default element namespace "http://cookbook/namespace";
doc('cookbook.xml')//method[string-length(.) = 0]

Passing the dot expression to the string-length() function is optional. When this function is called with no parameter, the context node is assumed, as in the following query.

declare default element namespace "http://cookbook/namespace";
doc('cookbook.xml')//method[string-length() = 0]

Selecting an empty <method> element might not be very useful by itself. It does not tell you which recipe has the empty method.

<method xmlns="http://cookbook/namespace"/>

Instead, what you probably want is the <recipe> elements which have empty <method> elements as shown in the following query.

declare default element namespace "http://cookbook/namespace";
doc('cookbook.xml')//recipe[string-length(method) = 0]

The predicate uses the string-length() function to test the length of each <method> element in each <recipe> element found by the node test. If a <method> contains no text, the predicate evaluates to true and the <recipe> element is selected. If the method contains some text, the predicate evaluates to false, and the <recipe> element is discarded. The output is the entire recipe which has no instructions for preparation.

<recipe xmlns="http://cookbook/namespace" xml:id="HardBoiledEggs">
<title>Hard-Boiled Eggs</title>
<ingredient name="Eggs" quantity="3" unit="eggs"/>
<time quantity="3" unit="minutes"/>
<method/>
</recipe>

This example shows that using string-length() to find an empty element is unreliable. It works in this case because the method element is written as <method/>, guaranteeing its string length will be 0. The query still works if the method element is written as <method></method>. However, it will fail if there is any whitespace between the opening and ending <method> tags. A more robust way to find the recipes with empty methods is presented in the section on Boolean Predicates.

There are many more functions and operators defined for XQuery and XPath. They are all documented in the specification.

Positional Predicates

Predicates are often used to filter items based on their position in a sequence. For path expressions processing items loaded from XML documents, the normal sequence is document order. This query returns the second <recipe> element in the "cookbook.xml" file.

declare default element namespace "http://cookbook/namespace";
doc('cookbook.xml')/cookbook/recipe[2]

The other frequently used positional function is last(), which returns the numeric position of the last item in the focus set. Stated another way, last() returns the size of the focus set. This query returns the last recipe in the cookbook.

declare default element namespace "http://cookbook/namespace";
doc('cookbook.xml')/cookbook/recipe[last()]

This query returns the next to last <recipe>.

declare default element namespace "http://cookbook/namespace";
doc('cookbook.xml')/cookbook/recipe[last() - 1]

Boolean Predicates

The other kind of predicate evaluates to true or false. A boolean predicate takes the value of its expression and determines its effective boolean value according to the following rules.

  • An expression which evaluates to an empty sequence is false
  • An expression which evaluates to a sequence whose first item is a node is true
  • An expression which evaluates to a string is false if the string is empty, it is true if the string is not empty
  • An expression which evaluates to a boolean value is used directly
  • If the expression evaluates to anything else it is an error
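
These rules can be tried out with the standard boolean() function, which returns the effective boolean value of its argument. The following is a minimal sketch. The element a is invented for the example.

```xquery
(
  boolean(()),         (: an empty sequence is false :)
  boolean(<a/>),       (: a node is true :)
  boolean(""),         (: an empty string is false :)
  boolean("false"),    (: any non-empty string is true, including "false" :)
  boolean(false())     (: a boolean value is used directly :)
)
```

The query evaluates to false true false true false.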

We have already seen some boolean predicates in use. Earlier, we showed a query which did not detect empty elements in a reliable way. The [string-length(method) = 0] is a boolean predicate which would fail in the example if the empty method element was written with both opening and closing tags and there was whitespace between the tags. Here is a more robust way that uses a different boolean predicate.

declare default element namespace "http://cookbook/namespace";
doc('cookbook.xml')//recipe[method[empty(step)]]

This example uses the empty() function to test whether the method contains any steps. If the method contains no steps, then empty(step) returns true and the predicate evaluates to true. But even this version is not sufficient. Suppose the method does contain steps, but all the steps themselves are empty. This is still a case where a recipe with no instructions will not be detected. There is a better way as shown below.

declare default element namespace "http://cookbook/namespace";
doc('cookbook.xml')//recipe[not(normalize-space(method))]

This query uses the not() and normalize-space() functions. The call normalize-space(method) returns the contents of the method element as a string, with all the whitespace normalized. The string value of the method is the concatenation of the string values of its <step> elements. Leading and trailing whitespace is then removed and each run of whitespace is collapsed to a single space. If the resulting string is empty then not() returns true and the predicate is true.
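
To compare the three predicates discussed in this section, the following sketch runs them against an inline cookbook. The recipes A to D are invented for this comparison. The whitespace is written as an enclosed string so that it is not removed as boundary whitespace when the elements are constructed.

```xquery
let $cookbook :=
  <cookbook>
    <recipe><title>A</title><method/></recipe>
    <recipe><title>B</title><method>{"  "}</method></recipe>
    <recipe><title>C</title><method><step>{" "}</step></method></recipe>
    <recipe><title>D</title><method><step>Boil.</step></method></recipe>
  </cookbook>
return (
  (: only A, whitespace gives B and C a non-zero string length :)
  string-join($cookbook/recipe[string-length(method) = 0]/title, ","),
  (: A and B, the method in C has a step even though the step is blank :)
  string-join($cookbook/recipe[method[empty(step)]]/title, ","),
  (: A, B and C, every method without real text is detected :)
  string-join($cookbook/recipe[not(normalize-space(method))]/title, ",")
)
```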

We can also use the position() function in a comparison to inspect positions with conditional logic. The position() function returns the position index of the current context item in the sequence of items.

declare default element namespace "http://cookbook/namespace";
doc('cookbook.xml')/cookbook/recipe[position() = 2]

The first position in the sequence is position 1, not 0. We can also select all the recipes after the first one using the following query.

declare default element namespace "http://cookbook/namespace";
doc('cookbook.xml')/cookbook/recipe[position() > 1]

Constructing Elements

In the documentation about using wildcards in name testing, there were three examples. Each one selected a different list of XML attributes from the cookbook. Running these queries using CsXmlPatterns will produce an error because the result set can not be serialized since it does not contain well formed XML.

An attribute must be attached to an element. For each attribute in the result set we need to create an XML element. We can do this using a for clause with a bound variable, and a return clause with an element constructor.

for $i in doc("cookbook.xml")//@xml:*
return <p>{$i}</p>

The for clause produces a sequence of attribute nodes from the result of the path expression. Each attribute node in the sequence is bound to the variable $i. The return clause then constructs a <p> element around the attribute node.

Running this query produces the following output.

<p xml:id="MushroomSoup"/>
<p xml:id="CheeseOnToast"/>
<p xml:id="HardBoiledEggs"/>

The output contains one <p> element for each xml:id attribute in the cookbook. XQuery puts each attribute in the right place in its <p> element, despite the fact that in the return clause, the $i variable is positioned as if it is meant to become <p> element content.

The other two examples from the wildcard section can be rewritten the same way. Here is the XQuery that selects all the name attributes, regardless of namespace.

for $i in doc("cookbook.xml")//@*:name
return <p>{$i}</p>

This is the output.

<p name="Fresh mushrooms"/>
<p name="Garlic"/>
<p name="Olive oil"/>
<p name="Milk"/>
<p name="Water"/>
<p name="Cream"/>
<p name="Vegetable stock"/>
<p name="Ground black pepper"/>
<p name="Dried parsley"/>
<p name="Bread"/>
<p name="Cheese"/>
<p name="Eggs"/>

The following XQuery shows how to select all the attributes from the document element.

declare default element namespace "http://cookbook/namespace";
for $i in doc("cookbook.xml")/cookbook/@*
return <p>{$i}</p>

This is the output produced by running the prior query.

<p xmlns="http://cookbook/namespace" count="3"/>

Element Constructors are Expressions

Since node constructors are expressions they can be used in an XQuery wherever expressions are allowed.

declare default element namespace "http://cookbook/namespace";
let $docURI := 'cookbook.xml'
return if(doc-available($docURI))
then doc($docURI)//recipe/<output>{./node()}</output>
else <output>Failed to load {$docURI}</output>

If the "cookbook.xml" file is loaded without an error, then one <output> element is constructed for each <recipe> element in the cookbook. The child nodes of the <recipe> are copied into the <output> element. If the cookbook document does not exist or does not contain well formed XML, a single <output> element is constructed containing an error message.

Constructing Atomic Values

XQuery also has atomic values. An atomic value is a value in the value space of one of the built-in data types in the XML Schema language. These atomic types have built-in operators for doing arithmetic, comparisons, and for converting values to other atomic types. Refer to the Built-in Datatype Hierarchy for the entire tree of built in, primitive, and derived atomic types.

To construct an atomic value as element content, enclose an expression in curly braces and embed it in the element constructor.

<e>{sum((1, 2, 3))}</e>

Running this XQuery produces the following result.

<e>6</e>


To compute the value of an attribute enclose the expression in curly braces and embed it in the attribute value.

declare variable $insertion := "example";
<p class="important {$insertion} obsolete"/>

Running this XQuery produces the following output.

<p class="important example obsolete"/>

FAQ - Why did my path expression not match anything?

The most common cause of this issue is a failure to declare one or more namespaces in your XQuery. Consider the following query for selecting all the examples in an XHTML document.


It will not match anything because "index.html" is an XHTML file and all XHTML files declare the default namespace "http://www.w3.org/1999/xhtml" in their top <html> element. However, the query does not declare this namespace, so the path expression expands "html" to "{}html" and tries to match that expanded name. The actual expanded name is "{http://www.w3.org/1999/xhtml}html".

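One possible solution is to declare the correct namespace in the XQuery with a prefix, as shown below, and to use that prefix in the name tests of the path expression.

declare namespace x = "http://www.w3.org/1999/xhtml";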
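
The effect of the missing declaration can be reproduced without an input file. In this sketch the inline page and the class value are invented. The namespace URI is the standard XHTML namespace.

```xquery
declare namespace x = "http://www.w3.org/1999/xhtml";
let $page :=
  <html xmlns="http://www.w3.org/1999/xhtml">
    <body><p class="example">Hello</p></body>
  </html>
return (
  count($page/body/p),       (: 0, the names expand to {}body and {}p :)
  count($page/x:body/x:p)    (: 1, the names are in the XHTML namespace :)
)
```

The xmlns attribute on the constructed element only applies inside that element constructor. The name tests in the return clause are resolved using the declarations in the query prolog.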

Another common cause of this issue is confusing the document node with the top element node. They are different. This query will not match anything.


The doc() function returns the document node not the top element node <html>. Do not forget to match the top element node in the path expression.
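
The difference can be seen with a document node built inside the query. This is a sketch using a computed document constructor in place of doc(). The page content is invented.

```xquery
let $doc := document { <html><body><p>Hello</p></body></html> }
return (
  count($doc/body),         (: 0, body is not a child of the document node :)
  count($doc/html/body),    (: 1, the path first matches the top element :)
  count($doc/*)             (: 1, the document element :)
)
```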


FAQ - What if my input namespace is different from my output namespace?

Remember to declare both namespaces in your XQuery. Consider the following query which is meant to generate XHTML output from XML input.

The html, body, and p nodes in the output should be in the standard XHTML namespace, so the default element namespace is declared as "http://www.w3.org/1999/xhtml". This is the correct namespace for the output, however the same default namespace is also applied to the node names in the path expression. The path expression will not match the nodes from the input file since the namespaces are not the same.

declare default element namespace "http://www.w3.org/1999/xhtml";
<html><body>{
for $i in doc("testResult.xml")/tests/test[@status = "failure"]
order by $i/@name
return <p>{$i/@name}</p>
}</body></html>

To correctly match the input we must declare the output namespace with a namespace prefix and use the prefix with the element names in the element constructors. The node names in the path expression are left without a prefix.

declare namespace x = "http://www.w3.org/1999/xhtml";
<x:html><x:body>{
for $i in doc("testResult.xml")/tests/test[@status = "failure"]
order by $i/@name
return <x:p>{$i/@name}</x:p>
}</x:body></x:html>

FAQ - Why does my return clause not work?

Recall that XQuery is an expression-based language and not statement-based. Because an XQuery contains several expressions, understanding XQuery expression precedence is very important. Consider the following query.

for $i in (reverse(1 to 10)),
$d in xs:integer(doc("numbers.xml")/numbers/number)
return $i, $d

The prior query looks reasonable, however there is an issue. It is supposed to be a single FLWOR expression containing a for clause and a return clause which returns both $i and $d. The comma operator has a lower precedence than the FLWOR expression, so without parentheses the return clause only returns $i and the trailing $d is a separate expression which follows the FLWOR. Since the scope of the variable $d ends after the return clause, a "variable out of scope" error will be reported.

This is the corrected query.

for $i in (reverse(1 to 10)),
$d in xs:integer(doc("numbers.xml")/numbers/number)
return ($i, $d)
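
Parentheses are required whenever the return clause is meant to produce a sequence built with the comma operator. The following sketch needs no input file. The values are arbitrary.

```xquery
for $i in (1, 2)
return ($i, $i * 10)
```

This evaluates to 1 10 2 20. Written as "return $i, $i * 10" the FLWOR would end after the first $i, and the second $i would be outside the scope of the for clause.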

FAQ - Why did my expression not get evaluated?

You probably misplaced some curly braces. When you want an expression evaluated inside an element constructor, enclose the expression in curly braces. Without the curly braces, the expression will be interpreted as text. This example uses the sum() expression in an <e> element. The table below shows cases where the curly braces are missing or incorrectly located.

Element constructor with expression   Evaluates to
<e>sum((1, 2, 3))</e>                 <e>sum((1, 2, 3))</e>
<e>sum({(1, 2, 3)})</e>               <e>sum(1 2 3)</e>
<e>{sum((1, 2, 3))}</e>               <e>6</e>

FAQ - My predicate is correct, so why does it not select the right items?

Either the predicate is in the wrong place in your path expression or parentheses are missing. Consider the following "doc.txt" input file.


Suppose you want the first <span> element of every <p> element. Apply a position filter "[1]" to the /span path step.

let $doc := doc('doc.txt')
return $doc/doc/p/span[1]

Applying the "[1]" filter to the /span step returns the first <span> element of each <p> element.


You can write the same query using the following code.

for $a in doc('doc.txt')/doc/p/span[1]
return $a

Or you can reduce it right down to this code.

doc('doc.txt')/doc/p/span[1]


Suppose you really want only one <span> element which is the first one in the document. Then you have to do more filtering and there are two ways you can do it. You can apply the "[1]" filter in the same place as above but enclose the path expression in parentheses.

let $doc := doc('doc.txt')
return ($doc/doc/p/span)[1]

Or you can apply a second position filter "[1]" to the /p path step.

let $doc := doc('doc.txt')
return $doc/doc/p[1]/span[1]

Either way, the query will return only the first <span> element in the document.
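
The three placements of the position filter can be compared in one self-contained sketch. The inline doc element stands in for the "doc.txt" file. Its content is invented and is shorter than a real input file would be.

```xquery
let $doc :=
  <doc>
    <p><span>1</span><span>2</span></p>
    <p><span>3</span><span>4</span></p>
  </doc>
return (
  string-join($doc/p/span[1], " "),       (: "1 3", the first span of every p :)
  string-join(($doc/p/span)[1], " "),     (: "1", the first span overall :)
  string-join($doc/p[1]/span[1], " ")     (: "1", the first span of the first p :)
)
```

Since $doc is bound to the doc element itself and not to a document node, the paths start at /p.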


FAQ - Why does my FLWOR not work as expected?

Some programmers expect an XQuery FLWOR to behave like a C++ for() loop. These constructs are not the same and do not work in the same way.

The following query evaluates to 4 -4 -2 2 -8 8. The for clause sets up a loop iteration which evaluates the FLWOR multiple times. On each loop the variable $a takes on the next value returned by the "in" expression.

for $a in (8, -4, 2)
let $b := ($a * -1, $a)
order by $a
return $b

In C++ a return statement will break out of a for() loop. This does not occur in XQuery. Instead, the return clause is simply the last clause of the FLWOR. It means, "Append the return value to the result list and then begin the next iteration of the FLWOR".

The let clause does not set up an iteration through a sequence of values. It is a variable binding. On each iteration, it binds the entire sequence of values on the right to the variable on the left. In the example above, it binds (-8 8) to $b on the first iteration, (4 -4) on the second iteration, and (-2 2) on the third iteration. The results are then sorted using the order by clause, which is why the query evaluates to 4 -4 -2 2 -8 8.

The order by clause does not do sorting on each iteration of the FLWOR. It simply evaluates its expression to get an ordering value. These ordering values are kept in a parallel list. The result list is sorted at the end using the parallel list of ordering values.

The following query has no for clause so it does not iterate through anything and does not do any ordering. It binds the entire sequence (2, 3, 1) to $i one time only. The order by clause only has one thing to order and does nothing and the query evaluates to 2 3 1. We did not include a where clause in the example since no filtering was needed.

let $i := (2, 3, 1)
order by $i[1]
return $i
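
For comparison, here is a sketch of the same values bound with a for clause. Now there is one iteration for each value, so the order by clause has three ordering values to sort.

```xquery
for $i in (2, 3, 1)
order by $i
return $i
```

This version evaluates to 1 2 3.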

FAQ - Why are my elements created in the wrong order?

The short answer is that your elements are not created in the wrong order. When element constructors appear as operands of a path expression, the relative order of the constructed elements is not defined, so there is no correct order. Consider the following query, which uses the input file "doc.txt".
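
Going by the description in this section and the corrected version shown further down, the query presumably has the following shape. This is a sketch. It assumes "doc.txt" contains <p> elements with <span> children, as in the previous FAQ.

```xquery
doc('doc.txt')//p/<p>{span/node()}</p>
```

The element constructor is the last step of the path expression, so each constructed <p> element is a new node which does not belong to the input document.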


The query finds all the <p> elements in the file. For each <p> element, it builds a <p> element in the output containing the concatenated contents of all the <p> element's child <span> elements. Running the query might produce the following output, which is not sorted in the expected order.


You can use a for loop to ensure the order of the result set corresponds to the order of the input sequence:

for $a in doc('doc.txt')//p
return <p>{$a/span/node()}</p>

This version produces the same result set but in the expected order:


FAQ - Why can I not use true and false in my XQuery?

It is possible, however you can not use the names true and false directly even though they look like boolean constants. The simple way to create the boolean values is to use the built in functions true() and false() wherever you want to use true and false. The other way is to invoke the boolean constructor.
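
Both approaches are shown in the following sketch, where xs:boolean() is the boolean constructor.

```xquery
(true(), false(), xs:boolean("true"), xs:boolean("0"))
```

This evaluates to true false true false. A bare name such as true is parsed as a name test for a child element called true and is not treated as a constant.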