XPath

Post by KBleivik »

XPath is a very important member of the XML family. XPath is in a sense comparable to SQL in a database. Without XPath, you cannot control XSLT (or other XML technologies) with any kind of granularity. To mention a few, XPath is used by other members of the family such as XSLT, XPointer, XLink, XML Schema and XML Query, as well as by APIs like DOM and PHP XML parsers such as SimpleXML and XMLReader. Just like SQL, XPath is a query language, but its syntax is more closely related to file paths.

Background:
XPath points into XML documents, regards them as node trees and provides a mechanism for defining fragment identifiers. XPath works by applying a well-known metaphor, the path through a hierarchy of directories / folders in a file system, to the hierarchical structure of an XML document.
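The file path metaphor can be sketched with Python's standard xml.etree.ElementTree module, which implements only a subset of XPath. The document and its element names are invented for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document, invented for this sketch.
site = ET.fromstring(
    "<site><docs><xml><xpath>location paths</xpath></xml></docs></site>"
)

# A file path such as docs/xml/xpath walks one directory level per step.
# The same string, read as a location path, walks one child element per step,
# starting from the <site> element.
node = site.find("docs/xml/xpath")
print(node.text)
```

Each slash-separated name is one step down the tree, just as each name in a file path is one folder down the file system.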

The most important construct of XPath is the location path (not to be confused with XPointer's locations, location types and location sets. See the XPointer sticky post). A location path is used to address a certain node-set of a document. That is achieved by concatenating multiple steps, which describe with increasing specificity which part of the document should be addressed, into one location path. Each location step consists of three distinct parts: an axis, a node test and zero or more predicates.
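As a rough sketch of how steps and predicates narrow a selection, the example below again uses Python's xml.etree.ElementTree. That module accepts only the abbreviated syntax and a small set of predicates, so the unabbreviated XPath 1.0 forms appear in comments only. The catalog document is made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog, invented for this sketch.
catalog = ET.fromstring("""
<catalog>
  <book lang="en"><title>XPath Basics</title></book>
  <book lang="no"><title>XPath for nybegynnere</title></book>
  <book lang="en"><title>XSLT Basics</title></book>
</catalog>
""")

# Two steps. Unabbreviated XPath 1.0 form:
#   child::book[attribute::lang='en']/child::title
# Step 1: axis child, node test book, predicate [@lang='en'].
# Step 2: axis child, node test title, no predicate.
english = [t.text for t in catalog.findall("book[@lang='en']/title")]
print(english)  # ['XPath Basics', 'XSLT Basics']

# A positional predicate: the second book child (positions start at 1).
# Unabbreviated form: child::book[position()=2]/child::title
second = [t.text for t in catalog.findall("book[2]/title")]
print(second)  # ['XPath for nybegynnere']
```

When no axis is written, the child axis is assumed, which is why the abbreviated paths look so much like file paths.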

An XML document can be viewed as several different trees: the XML Information Set tree, the DOM tree and the XPath tree. The XPath tree is derived from the information set tree, but the two are not identical. The top of the tree is the root node, which by definition is not an element node. Conceptually you can consider the root node to be the document itself. The root node has exactly one element child, the document element, so do not confuse the root node with the document element. Besides the document element, the children of the root node can be comments and processing instructions (PIs). So <MyElement /> is a well-formed minimal XML document with an invisible conceptual root node and one child, the document element MyElement. The root node, being a conceptual node in the tree (a "placeholder"), cannot be assigned a name, aside from "root", which is used in many contexts.

As in file paths, the slash (/) on its own represents the root of the document structure. You can always grab the document element using /*, where the asterisk (*) is a wildcard that matches any element at that location, in this case the single element child of the root node. The location path //* selects every element in the document. It is an abbreviated form of /descendant-or-self::node()/*. As a last example, //title is more specific and locates any title element anywhere in the document.
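The difference between the document element and its descendants can be tried out with Python's xml.etree.ElementTree, with two caveats: the module has no object for the XPath root node, and it evaluates paths relative to an element, so .//title stands in for //title here. The library document is invented for this sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical document. The leading comment is a child of the conceptual
# root node, not of the document element <library>.
DOC = """<!-- catalogue export -->
<library>
  <title>Library catalogue</title>
  <book><title>XPath Basics</title></book>
  <book><title>XSLT Basics</title></book>
</library>"""

# fromstring() returns the document element, the node that /* would select.
doc_element = ET.fromstring(DOC)
print(doc_element.tag)  # library

# .//* is evaluated relative to <library>, so it returns the elements below
# it. In full XPath, //* would also include <library> itself.
below = [e.tag for e in doc_element.findall(".//*")]
print(below)  # ['title', 'book', 'title', 'book', 'title']

# .//title finds title elements at any depth, in document order.
titles = [t.text for t in doc_element.findall(".//title")]
print(titles)  # ['Library catalogue', 'XPath Basics', 'XSLT Basics']
```

A full XPath 1.0 processor is needed to evaluate /, /* and the unabbreviated axes exactly as described in the text.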

General Model:
Most generally, an XPath expression is evaluated to yield an object. XPath 1.0 knows four object types: node-set, number, string and boolean. In what context does this evaluation take place? Each expression is evaluated within a given context, which in XPath 1.0 includes the context node together with its position and size.

Functions:
XQuery 1.0 and XPath 2.0 Functions and Operators

Status:
XPath version 1.0 is stable. XSLT, with its ability to transform XML documents, is one of the key components of an XML-based infrastructure, and XSLT 2.0 is based on XPath 2.0.

Tutorials:
http://www.w3schools.com/xpath/default.asp

Related links:
What's New in XPath 2.0

Why XSLT 2.0 and XPath 2.0?

XML Information Set
Kjell Gunnar Bleivik
Make it simple, as simple as possible but no simpler: | DigitalPunkt.no |
