Semantic Technologies
XML Example Document File:
https://drive.google.com/file/d/1A1147lStUqV4D6GM6Hqp0CQDPT48L9nX/view?usp=sharing
<books>
<book id="1">
<title>Java How to Program</title>
<translation edition="1">Spanish</translation>
<translation edition="1">Chinese</translation>
<translation edition="1">Japanese</translation>
<translation edition="2">French</translation>
<price>49.99</price>
</book>
<book id="2">
<title>C++ How to Program</title>
<translation edition="1">Korean</translation>
<translation edition="1">French</translation>
<translation edition="1">Spanish</translation>
<price>35.99</price>
</book>
<book id="3">
<title>Python How to Program</title>
<translation edition="3">Bengali</translation>
<translation edition="3">Russian</translation>
<translation edition="3">Hebrew</translation>
<price>29.99</price>
</book>
</books>
XPath Tester Online Tool:
https://www.freeformatter.com/xpath-tester.html
Open the URL above, then paste the XML document into the XML input field (or upload your XML file).
Now let's run some XPath expressions in the XPath field and look at the output.
1. Return all the books in the XML file:
XPath expression: /books/book
This will return 3 book elements:
Element='<book id="1">
<title>Java How to Program</title>
<translation edition="1">Spanish</translation>
<translation edition="1">Chinese</translation>
<translation edition="1">Japanese</translation>
<translation edition="2">French</translation>
<price>49.99</price>
</book>'
Element='<book id="2">
<title>C++ How to Program</title>
<translation edition="1">Korean</translation>
<translation edition="1">French</translation>
<translation edition="1">Spanish</translation>
<price>35.99</price>
</book>'
Element='<book id="3">
<title>Python How to Program</title>
<translation edition="3">Bengali</translation>
<translation edition="3">Russian</translation>
<translation edition="3">Hebrew</translation>
<price>29.99</price>
</book>'
2. Return all the book titles:
XPath expression: /books/book/title
This will return 3 book titles:
Element='<title>Java How to Program</title>'
Element='<title>C++ How to Program</title>'
Element='<title>Python How to Program</title>'
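The same two queries can be sketched in Python. This is a minimal sketch, assuming the standard-library xml.etree.ElementTree module, which supports only a subset of XPath. Its paths are evaluated relative to the element they are called on, so the absolute path /books/book becomes the relative path book when called on the root books element. The variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# The example document from above, embedded as a string.
XML = """
<books>
  <book id="1">
    <title>Java How to Program</title>
    <translation edition="1">Spanish</translation>
    <translation edition="1">Chinese</translation>
    <translation edition="1">Japanese</translation>
    <translation edition="2">French</translation>
    <price>49.99</price>
  </book>
  <book id="2">
    <title>C++ How to Program</title>
    <translation edition="1">Korean</translation>
    <translation edition="1">French</translation>
    <translation edition="1">Spanish</translation>
    <price>35.99</price>
  </book>
  <book id="3">
    <title>Python How to Program</title>
    <translation edition="3">Bengali</translation>
    <translation edition="3">Russian</translation>
    <translation edition="3">Hebrew</translation>
    <price>29.99</price>
  </book>
</books>
"""

root = ET.fromstring(XML)  # root is the <books> element

# Counterpart of /books/book, written relative to the root element.
books = root.findall("book")

# Counterpart of /books/book/title; keep only the text of each title.
titles = [title.text for title in root.findall("book/title")]

print(len(books))
print(titles)
```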
3. Return all the book titles and book prices:
XPath expression: /books/book/title | /books/book/price
This will return 3 book titles and 3 book prices:
Element='<title>Java How to Program</title>'
Element='<price>49.99</price>'
Element='<title>C++ How to Program</title>'
Element='<price>35.99</price>'
Element='<title>Python How to Program</title>'
Element='<price>29.99</price>'
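Not every XPath implementation offers the | operator. As a rough sketch, assuming Python's standard-library xml.etree.ElementTree (which does not support |), the union can be emulated by walking each book and keeping its title and price children in document order. The names selected and pairs are illustrative.

```python
import xml.etree.ElementTree as ET

XML = """
<books>
  <book id="1">
    <title>Java How to Program</title>
    <translation edition="1">Spanish</translation>
    <translation edition="1">Chinese</translation>
    <translation edition="1">Japanese</translation>
    <translation edition="2">French</translation>
    <price>49.99</price>
  </book>
  <book id="2">
    <title>C++ How to Program</title>
    <translation edition="1">Korean</translation>
    <translation edition="1">French</translation>
    <translation edition="1">Spanish</translation>
    <price>35.99</price>
  </book>
  <book id="3">
    <title>Python How to Program</title>
    <translation edition="3">Bengali</translation>
    <translation edition="3">Russian</translation>
    <translation edition="3">Hebrew</translation>
    <price>29.99</price>
  </book>
</books>
"""

root = ET.fromstring(XML)

# Emulate "/books/book/title | /books/book/price": iterate over the children
# of every book and keep only title and price elements, in document order.
selected = [
    child
    for book in root.findall("book")
    for child in book
    if child.tag in ("title", "price")
]

pairs = [(element.tag, element.text) for element in selected]
for tag, text in pairs:
    print(tag, text)
```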
4. Return the book element with id=2:
XPath expression: //book[@id=2]
or
XPath expression: /books/book[@id=2]
Both return the book element with id=2. The difference is that the first expression searches the whole document for book elements at any depth, while the second only matches book elements that are direct children of the root books element:
Element='<book id="2">
<title>C++ How to Program</title>
<translation edition="1">Korean</translation>
<translation edition="1">French</translation>
<translation edition="1">Spanish</translation>
<price>35.99</price>
</book>'
5. Return the title of the book with id=2:
XPath expression: //book[@id=2]/title
or
XPath expression: /books/book[@id=2]/title
Both return the title of the book element with id=2. As before, the first expression searches the whole document for book elements, while the second follows the absolute path from the root:
Element='<title>C++ How to Program</title>'
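An attribute predicate works much the same way from code. The sketch below assumes Python's standard-library xml.etree.ElementTree. Unlike the online tester, it needs the attribute value quoted ([@id='2']), and the leading // is written as .// because paths are relative to the element being searched. A shortened copy of the example document keeps the sketch compact.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the example document (translations omitted).
XML = """
<books>
  <book id="1">
    <title>Java How to Program</title>
    <price>49.99</price>
  </book>
  <book id="2">
    <title>C++ How to Program</title>
    <price>35.99</price>
  </book>
  <book id="3">
    <title>Python How to Program</title>
    <price>29.99</price>
  </book>
</books>
"""

root = ET.fromstring(XML)

# Counterpart of //book[@id=2]: search at any depth below the root.
book = root.find(".//book[@id='2']")

# Counterpart of /books/book[@id=2]/title: only direct children of <books>.
title = root.findtext("book[@id='2']/title")

print(book.get("id"), title)
```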
6. Find the books with a Japanese translation.
XPath expression: //book/translation[text()="Japanese"]
Note that this returns the matching translation element itself, not the enclosing book element:
Element='<translation edition="1">Japanese</translation>'
7. Return the titles of the books with a Japanese translation.
XPath expression: //book/translation[text()="Japanese"]/../title
This will return the title of the book that has a Japanese translation:
Element='<title>Java How to Program</title>'
The /../ step goes up one level, from the translation element to its parent book element, which is what lets us reach the title element. If we used the expression //book/translation[text()="Japanese"]/title instead, the tester would report "NO MATCH", because title is not a child of translation.
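The parent step can also be tried from Python. This is a sketch under the assumption that the standard-library xml.etree.ElementTree is used, on Python 3.7 or later. That module has no text() test, so [.='Japanese'] stands in for [text()="Japanese"]. The last query shows an alternative that puts the condition on the book element and needs no parent step.

```python
import xml.etree.ElementTree as ET

XML = """
<books>
  <book id="1">
    <title>Java How to Program</title>
    <translation edition="1">Spanish</translation>
    <translation edition="1">Chinese</translation>
    <translation edition="1">Japanese</translation>
    <translation edition="2">French</translation>
    <price>49.99</price>
  </book>
  <book id="2">
    <title>C++ How to Program</title>
    <translation edition="1">Korean</translation>
    <translation edition="1">French</translation>
    <translation edition="1">Spanish</translation>
    <price>35.99</price>
  </book>
</books>
"""

root = ET.fromstring(XML)

# Go down to the matching translation, up to its parent book, then to title.
via_parent = [t.text for t in root.findall(".//translation[.='Japanese']/../title")]

# Without the parent step nothing matches: title is not a child of translation.
no_match = root.findall(".//translation[.='Japanese']/title")

# Alternative: filter the book elements directly by a child's text.
via_predicate = [t.text for t in root.findall(".//book[translation='Japanese']/title")]

print(via_parent, no_match, via_predicate)
```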
8. Return the titles and prices of the books with a Japanese translation.
XPath expression: //book/translation[text()="Japanese"]/../title | //book/translation[text()="Japanese"]/../price
This will return the title and the price of the book that has a Japanese translation:
Element='<title>Java How to Program</title>'
Element='<price>49.99</price>'
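In code it is often simpler to select the matching book elements first and then read the fields you need from each one. The sketch below assumes Python's standard-library xml.etree.ElementTree. The dictionary name prices_by_title is illustrative, and converting the price text to float is a choice made for this example.

```python
import xml.etree.ElementTree as ET

XML = """
<books>
  <book id="1">
    <title>Java How to Program</title>
    <translation edition="1">Spanish</translation>
    <translation edition="1">Chinese</translation>
    <translation edition="1">Japanese</translation>
    <translation edition="2">French</translation>
    <price>49.99</price>
  </book>
  <book id="2">
    <title>C++ How to Program</title>
    <translation edition="1">Korean</translation>
    <translation edition="1">French</translation>
    <translation edition="1">Spanish</translation>
    <price>35.99</price>
  </book>
  <book id="3">
    <title>Python How to Program</title>
    <translation edition="3">Bengali</translation>
    <translation edition="3">Russian</translation>
    <translation edition="3">Hebrew</translation>
    <price>29.99</price>
  </book>
</books>
"""

root = ET.fromstring(XML)

# Select every book that has a translation child whose text is "Japanese",
# then read the title and price of each selected book.
prices_by_title = {
    book.findtext("title"): float(book.findtext("price"))
    for book in root.findall("book[translation='Japanese']")
}

print(prices_by_title)
```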
This is a Python program that uses XPath expressions to search a webpage for a specific variable.
Download the Python web crawler to search and retrieve data from any web page from my GitHub repo:
XPath uses path expressions to select nodes or node-sets in an XML document.
These path expressions look very much like the path expressions you use with traditional computer file systems.
Recent versions of XPath include over 200 built-in functions (XPath 1.0 defines a much smaller core library).
There are functions for string values, numeric values, booleans, date and time comparison, node manipulation, sequence manipulation, and much more.
Today XPath expressions can also be used in JavaScript, Java, XML Schema, PHP, Python, C and C++, and lots of other languages.
In XPath, there are seven kinds of nodes: element, attribute, text, namespace, processing-instruction, comment, and document nodes.
XML documents are treated as trees of nodes. The topmost element of the tree is called the root element.
XPath uses path expressions to select nodes in an XML document. The node is selected by following a path or steps. The most useful path expressions are listed below:
Expression   Description
nodename     Selects all nodes with the name "nodename"
/            Selects from the root node
//           Selects nodes in the document from the current node that match the selection no matter where they are
.            Selects the current node
..           Selects the parent of the current node
@            Selects attributes
In the table below we have listed some path expressions and the result of the expressions. These examples use a sample document whose root element is bookstore, not the books document shown earlier:
Path Expression    Result
bookstore          Selects all nodes with the name "bookstore"
/bookstore         Selects the root element bookstore
                   Note: If the path starts with a slash ( / ) it always represents an absolute path to an element!
bookstore/book     Selects all book elements that are children of bookstore
//book             Selects all book elements no matter where they are in the document
bookstore//book    Selects all book elements that are descendants of the bookstore element, no matter where they are under the bookstore element
//@lang            Selects all attributes that are named lang
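The difference between child paths, descendant paths, and attribute selection can be checked with a short script. This sketch assumes Python's standard-library xml.etree.ElementTree and reuses the books example document from earlier rather than a bookstore document. ElementTree cannot return attribute nodes, so the counterpart of //@edition is emulated by reading the attribute from each element.

```python
import xml.etree.ElementTree as ET

XML = """
<books>
  <book id="1">
    <title>Java How to Program</title>
    <translation edition="1">Spanish</translation>
    <translation edition="1">Chinese</translation>
    <translation edition="1">Japanese</translation>
    <translation edition="2">French</translation>
    <price>49.99</price>
  </book>
  <book id="2">
    <title>C++ How to Program</title>
    <translation edition="1">Korean</translation>
    <translation edition="1">French</translation>
    <translation edition="1">Spanish</translation>
    <price>35.99</price>
  </book>
  <book id="3">
    <title>Python How to Program</title>
    <translation edition="3">Bengali</translation>
    <translation edition="3">Russian</translation>
    <translation edition="3">Hebrew</translation>
    <price>29.99</price>
  </book>
</books>
"""

root = ET.fromstring(XML)

# Child path (like books/book): only direct children of the root.
child_books = root.findall("book")

# translation is a grandchild of the root, so a child path finds nothing.
direct_translations = root.findall("translation")

# Descendant path (like //translation): matches at any depth.
all_translations = root.findall(".//translation")

# Emulation of //@edition: read the attribute from every element that has it.
editions = [e.get("edition") for e in root.iter() if "edition" in e.attrib]

print(len(child_books), len(direct_translations), len(all_translations))
print(sorted(set(editions)))
```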
Predicates are used to find a specific node or a node that contains a specific value.
Predicates are always embedded in square brackets.
In the table below we have listed some path expressions with predicates and the result of the expressions:
Path Expression                       Result
/bookstore/book[1]                    Selects the first book element that is the child of the bookstore element.
                                      Note: In IE 5, 6, 7, 8, and 9 the first node is [0], but according to W3C, it is [1]. To solve this problem in IE, set the SelectionLanguage to XPath:
                                      In JavaScript: xml.setProperty("SelectionLanguage","XPath");
/bookstore/book[last()]               Selects the last book element that is the child of the bookstore element
/bookstore/book[last()-1]             Selects the last but one book element that is the child of the bookstore element
/bookstore/book[position()<3]         Selects the first two book elements that are children of the bookstore element
//title[@lang]                        Selects all the title elements that have an attribute named lang
//title[@lang='en']                   Selects all the title elements that have a "lang" attribute with a value of "en"
/bookstore/book[price>35.00]          Selects all the book elements of the bookstore element that have a price element with a value greater than 35.00
/bookstore/book[price>35.00]/title    Selects all the title elements of the book elements of the bookstore element that have a price element with a value greater than 35.00
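Several of these predicates can be tried against the books example document from earlier. The sketch assumes Python's standard-library xml.etree.ElementTree, which supports positional predicates such as [1], [last()], and [last()-1] but not position() or numeric comparisons. Those two cases are therefore emulated with list slicing and a plain Python filter.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the example document (only one translation kept).
XML = """
<books>
  <book id="1">
    <title>Java How to Program</title>
    <translation edition="2">French</translation>
    <price>49.99</price>
  </book>
  <book id="2">
    <title>C++ How to Program</title>
    <price>35.99</price>
  </book>
  <book id="3">
    <title>Python How to Program</title>
    <price>29.99</price>
  </book>
</books>
"""

root = ET.fromstring(XML)

first = root.find("book[1]")                # positions start at 1, not 0
last = root.find("book[last()]")
second_last = root.find("book[last()-1]")   # no spaces around the minus sign

# Emulation of book[position()<3]: take the first two matches.
first_two = root.findall("book")[:2]

# Emulation of book[price>35.00]/title: compare the price in Python.
expensive_titles = [
    book.findtext("title")
    for book in root.findall("book")
    if float(book.findtext("price")) > 35.00
]

# Attribute predicates: has the attribute, and has it with a given value.
with_edition = root.findall(".//translation[@edition]")
edition_two = root.findall(".//translation[@edition='2']")

print(first.get("id"), last.get("id"), second_last.get("id"))
print(expensive_titles)
```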
XPath wildcards can be used to select unknown XML nodes.
Wildcard   Description
*          Matches any element node
@*         Matches any attribute node
node()     Matches any node of any kind
In the table below we have listed some path expressions and the result of the expressions:
Path Expression   Result
/bookstore/*      Selects all the child element nodes of the bookstore element
//*               Selects all elements in the document
//title[@*]       Selects all title elements which have at least one attribute of any kind
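Wildcards behave similarly in code. The sketch below assumes Python's standard-library xml.etree.ElementTree and the books example document from earlier. That module supports * but not @*, so "has at least one attribute" is emulated by checking each element's attrib dictionary. Note also that .//* starts below the root element, whereas //* in full XPath includes the root.

```python
import xml.etree.ElementTree as ET

XML = """
<books>
  <book id="1">
    <title>Java How to Program</title>
    <translation edition="1">Spanish</translation>
    <translation edition="1">Chinese</translation>
    <translation edition="1">Japanese</translation>
    <translation edition="2">French</translation>
    <price>49.99</price>
  </book>
  <book id="2">
    <title>C++ How to Program</title>
    <translation edition="1">Korean</translation>
    <translation edition="1">French</translation>
    <translation edition="1">Spanish</translation>
    <price>35.99</price>
  </book>
  <book id="3">
    <title>Python How to Program</title>
    <translation edition="3">Bengali</translation>
    <translation edition="3">Russian</translation>
    <translation edition="3">Hebrew</translation>
    <price>29.99</price>
  </book>
</books>
"""

root = ET.fromstring(XML)

# Counterpart of /books/*: every child element of the root.
child_elements = root.findall("*")

# .//* matches every element below the root (the root itself is excluded).
descendants = root.findall(".//*")

# root.iter() walks the root and all of its descendants, like //* in XPath.
all_elements = list(root.iter())

# Emulation of //*[@*]: elements that carry at least one attribute.
with_any_attribute = [e for e in root.iter() if e.attrib]

print(len(child_elements), len(descendants), len(all_elements), len(with_any_attribute))
```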
By using the | operator in an XPath expression you can select several paths.
In the table below we have listed some path expressions and the result of the expressions:
Path Expression                   Result
//book/title | //book/price       Selects all the title AND price elements of all book elements
//title | //price                 Selects all the title AND price elements in the document
/bookstore/book/title | //price   Selects all the title elements of the book element of the bookstore element AND all the price elements in the document