.NET Programming

Using the System.Xml.XmlDocument
Published: 3/20/2006

Overview
The fastest way to read or write Xml documents in a linear fashion is to use the XmlTextReader and XmlTextWriter classes in the System.Xml namespace. If forward-only access that is either read-only or write-only is too limiting, however, the System.Xml.XmlDocument class loads the whole document into memory, which makes it easy to modify existing documents (adding, removing, or changing elements and attributes) and to search them with XPath.

Getting Started

The XmlDocument class can load Xml from a string in memory, from a URL, or from a file path. Two methods provide this functionality: Load and LoadXml. To load a document from a URL or file path, use the Load method; to load Xml text that is already in memory, use the LoadXml method. The following example uses LoadXml. (NOTE: we will be working with this particular document throughout this article.)

Dim sXML As String = "<?xml version=""1.0"" encoding=""utf-8"" ?>" 

'build document... 
sXML = String.Concat(sXML, ControlChars.CrLf, "<Rootnode>") 
sXML = String.Concat(sXML, ControlChars.CrLf, ControlChars.Tab, "<ChildElement1 />") 
sXML = String.Concat(sXML, ControlChars.CrLf, ControlChars.Tab, "<ChildElement1 />") 
sXML = String.Concat(sXML, ControlChars.CrLf, ControlChars.Tab, "<ChildElement1 />") 
sXML = String.Concat(sXML, ControlChars.CrLf, ControlChars.Tab, "<ChildElement2 />") 
sXML = String.Concat(sXML, ControlChars.CrLf, ControlChars.Tab, "<ChildElement2 />") 
sXML = String.Concat(sXML, ControlChars.CrLf, ControlChars.Tab, "<ChildElement2 />") 
sXML = String.Concat(sXML, ControlChars.CrLf, ControlChars.Tab, "<ChildElement3 />") 
sXML = String.Concat(sXML, ControlChars.CrLf, ControlChars.Tab, "<ChildElement3 />") 
sXML = String.Concat(sXML, ControlChars.CrLf, ControlChars.Tab, "<ChildElement3 />") 
sXML = String.Concat(sXML, ControlChars.CrLf, "</Rootnode>") 

'load XML into XMLDocument class... 
Dim clsDocument As New System.Xml.XmlDocument 
clsDocument.PreserveWhitespace = True 
clsDocument.LoadXml(sXML) 

NOTE: The PreserveWhitespace property tells the XmlDocument class to retain our whitespace formatting, in this case the tabs and line feeds that we entered to keep the document readable.
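
For comparison, here is a minimal sketch of the Load method. It assumes a throwaway file in the temp folder (the file name is hypothetical and the file is created by the example itself), and it uses Console.WriteLine rather than Debug.WriteLine so that it can run as a standalone console program. The DocumentElement property used here is a shortcut to the root element of the document.

```vbnet
Imports System.IO
Imports System.Xml

Module LoadFromFileExample
    Sub Main()
        'write a small document to a temporary file so the example is self-contained...
        '(the file name is an assumption made for this sketch)
        Dim sPath As String = Path.Combine(Path.GetTempPath(), "xmldocument-load-example.xml")
        File.WriteAllText(sPath, "<Rootnode><ChildElement1 /></Rootnode>")

        'Load accepts a file path or URL; LoadXml accepts the Xml text itself...
        Dim clsFileDocument As New XmlDocument()
        clsFileDocument.Load(sPath)

        'DocumentElement is the root element of the loaded document...
        Console.WriteLine(clsFileDocument.DocumentElement.Name)

        'clean up the temporary file...
        File.Delete(sPath)
    End Sub
End Module
```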

Navigating Elements

Now that we have a document loaded into our XmlDocument object, we can navigate through the elements much like we would nodes in a TreeView. The entry point for the elements in the document is the ChildNodes property of the XmlDocument object, which returns a collection of nodes of type System.Xml.XmlNodeList.

'navigate through the elements... 
For Each clsRootNode As System.Xml.XmlNode In clsDocument.ChildNodes 
	Debug.WriteLine(clsRootNode.Name) 
Next 

The output from this code block is as follows:

xml 
#whitespace 
Rootnode

The top-level nodes in our document are the xml declaration, some whitespace (because PreserveWhitespace is True, the parser keeps that whitespace as a node of its own) and our "Rootnode" element. To exclude the declaration and whitespace nodes, check the NodeType property of each node, as follows:

For Each clsRootNode As System.Xml.XmlNode In clsDocument.ChildNodes 
	If clsRootNode.NodeType = Xml.XmlNodeType.Element Then Debug.WriteLine(clsRootNode.Name) 
Next 

This navigation is hierarchical, as each node exposes its own ChildNodes property. In this way, you can navigate through collections of nodes and subcollections within each node, as follows:

For Each clsRootNode As System.Xml.XmlNode In clsDocument.ChildNodes 
	If clsRootNode.NodeType = Xml.XmlNodeType.Element Then 
		For Each clsChildNode As System.Xml.XmlNode In clsRootNode.ChildNodes 
			Debug.WriteLine(clsChildNode.Name) 
		Next 
	End If 
Next 
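
The nested loops above stop at a fixed depth. One way to walk a document of any depth is a recursive routine, sketched below. The WriteElementNames routine and the "GrandChild" element are hypothetical names introduced only for this illustration, and Console.WriteLine stands in for Debug.WriteLine so the sketch runs as a console program.

```vbnet
Imports Microsoft.VisualBasic
Imports System.Xml

Module WalkExample
    'print every element beneath clsNode, indenting one tab per level...
    Sub WriteElementNames(ByVal clsNode As XmlNode, ByVal iDepth As Integer)
        For Each clsChildNode As XmlNode In clsNode.ChildNodes
            'skip declarations, whitespace, text and other non-element nodes...
            If clsChildNode.NodeType = XmlNodeType.Element Then
                Console.WriteLine(New String(ControlChars.Tab, iDepth) & clsChildNode.Name)
                'descend into this element's own ChildNodes collection...
                WriteElementNames(clsChildNode, iDepth + 1)
            End If
        Next
    End Sub

    Sub Main()
        Dim clsDocument As New XmlDocument()
        clsDocument.LoadXml("<Rootnode><ChildElement1><GrandChild /></ChildElement1><ChildElement2 /></Rootnode>")
        'the XmlDocument is itself a node, so it can be the starting point...
        WriteElementNames(clsDocument, 0)
    End Sub
End Module
```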

Modifying the Document

The XmlDocument class also provides methods for editing, adding and removing elements and attributes within the document. Modifying the text within the boundaries of an element is accomplished by using the InnerText property of that element. For example, we can use the previous example to write some text to the inner text property of certain nodes as we loop through them, like this:

For Each clsRootNode As System.Xml.XmlNode In clsDocument.ChildNodes 
	If clsRootNode.NodeType = Xml.XmlNodeType.Element Then 
		For Each clsChildNode As System.Xml.XmlNode In clsRootNode.ChildNodes 
			If clsChildNode.Name = "ChildElement1" Then 
				clsChildNode.InnerText = "Hello World!" 
			End If 
		Next 
	End If 
Next 

New elements can also be added to the document via the AppendChild method of any node. To add a new element to the root-level element of our document, we might use the following code:

For Each clsRootNode As System.Xml.XmlNode In clsDocument.ChildNodes 
	If clsRootNode.Name = "Rootnode" Then 
		'create a new child element... 
		Dim clsChildNode As System.Xml.XmlNode = clsDocument.CreateElement("NewElement") 
		'append to the previously-selected element... 
		clsRootNode.AppendChild(clsChildNode) 
	End If 
Next 

What's important to note about this example is that nodes (and attributes) cannot be instantiated directly with New. They must be created by the XmlDocument that will contain them, which ties each node to its owning document (exposed through the node's OwnerDocument property). To create an element, call the CreateElement method on the XmlDocument object to which you want to add it. To reuse a node that belongs to a different XmlDocument, copy it into the target document first with the ImportNode method.
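
The relationship between a node and the document that created it can be sketched as follows. This is an illustration only: the two small documents and the variable names are assumptions made for the example, and it relies on the OwnerDocument property and the ImportNode method of the framework.

```vbnet
Imports System.Xml

Module OwnerDocumentExample
    Sub Main()
        Dim clsSource As New XmlDocument()
        clsSource.LoadXml("<Source><Item>Hello</Item></Source>")

        Dim clsTarget As New XmlDocument()
        clsTarget.LoadXml("<Rootnode />")

        'a node created by a document reports that document as its owner...
        Dim clsNewNode As XmlNode = clsTarget.CreateElement("NewElement")
        Console.WriteLine(clsNewNode.OwnerDocument Is clsTarget)

        'a node owned by clsSource cannot be appended to clsTarget directly;
        'ImportNode makes a copy owned by clsTarget (True = copy descendants too)...
        Dim clsItem As XmlNode = clsSource.DocumentElement.FirstChild
        Dim clsCopy As XmlNode = clsTarget.ImportNode(clsItem, True)
        clsTarget.DocumentElement.AppendChild(clsCopy)

        Console.WriteLine(clsTarget.OuterXml)
    End Sub
End Module
```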

Similarly, Attributes can be added as follows:

For Each clsRootNode As System.Xml.XmlNode In clsDocument.ChildNodes 
	If clsRootNode.Name = "Rootnode" Then 
		'create a new child element... 
		Dim clsChildNode As System.Xml.XmlNode = clsDocument.CreateElement("NewElement") 
		'append to the previously-selected element... 
		clsRootNode.AppendChild(clsChildNode) 
		'create a new attribute to append to this element... 
		Dim clsAttribute As System.Xml.XmlAttribute = clsDocument.CreateAttribute("NewAttr1") 
		clsAttribute.Value = "Test" 
		'append our new attribute to the previously-selected element... 
		clsChildNode.Attributes.Append(clsAttribute) 
	End If 
Next 

Attributes can be enumerated or accessed by name using the Attributes property of the Element object, as follows:

clsChildNode.Attributes("NewAttr1").Value = "Hello World!" 

Removing an element from a document is simple: call the RemoveChild method of the element's parent:

clsChildNode.ParentNode.RemoveChild(clsChildNode) 
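
When several elements need to be removed, a cautious approach is to collect them first and remove them in a second pass, so that the collection being looped over is not changed mid-loop. The following is a sketch under that assumption, using a shortened version of our document:

```vbnet
Imports System.Collections.Generic
Imports System.Xml

Module RemoveExample
    Sub Main()
        Dim clsDocument As New XmlDocument()
        clsDocument.LoadXml("<Rootnode><ChildElement1 /><ChildElement2 /><ChildElement2 /><ChildElement3 /></Rootnode>")

        'copy the matches into a separate list first, so that removing nodes
        'does not disturb the collection we are looping over...
        Dim clsMatches As New List(Of XmlNode)
        For Each clsChildNode As XmlNode In clsDocument.DocumentElement.ChildNodes
            If clsChildNode.Name = "ChildElement2" Then clsMatches.Add(clsChildNode)
        Next

        'remove each collected node via its parent...
        For Each clsMatch As XmlNode In clsMatches
            clsMatch.ParentNode.RemoveChild(clsMatch)
        Next

        'only ChildElement1 and ChildElement3 remain beneath Rootnode...
        Console.WriteLine(clsDocument.OuterXml)
    End Sub
End Module
```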

Simple Queries

The XmlDocument does not limit you to enumerated access to the node collections as outlined above; it also supports simple queries and XPath searching (more on XPath later). The simplest way to retrieve elements from any location in the document is to use the GetElementsByTagName method of the XmlDocument object. This method returns an XmlNodeList collection of all of the nodes in the document with the specified name. To access the first node in our example document called "ChildElement3", use the following code:

'find the first element called "ChildElement3"... 
Dim clsNode As System.Xml.XmlNode = clsDocument.GetElementsByTagName("ChildElement3").Item(0)

Advanced Queries - XPath

While the details of writing XPath queries are beyond the scope of this article (I recommend W3Schools.com for a primer on XPath), some examples of how powerful it can be are certainly in order. XPath is a query language (like SQL) specific to Xml that enables the retrieval of specific elements within a document by location, relative location, or other criteria. XPath queries against an XmlDocument object generally return a collection of nodes (again, as an XmlNodeList) that match the criteria specified in the query. Here's an example of an XPath query that collects all of the elements in the document with the name "ChildElement3":

Dim clsNodes As System.Xml.XmlNodeList = clsDocument.SelectNodes("//Rootnode/ChildElement3") 
Debug.WriteLine(String.Format("There are {0} nodes called ""ChildElement3"" underneath the root element ""Rootnode"".", clsNodes.Count)) 

In our document, this is effectively identical to the previous example in which we used GetElementsByTagName to retrieve the collection of elements named "ChildElement3". The important distinction is that our XPath query is a bit more specific. It asks for only those elements named "ChildElement3" that are directly beneath any element called "Rootnode". Again, because of the design of our document this makes no difference, but you can see how in complex documents XPath can provide much more specific filtering capabilities than GetElementsByTagName.
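
To make the distinction concrete, here is a small sketch using a different document in which "ChildElement3" appears at two depths. The "Group" element is a hypothetical addition for this illustration and is not part of our example document.

```vbnet
Imports System.Xml

Module QueryComparisonExample
    Sub Main()
        Dim clsDocument As New XmlDocument()
        clsDocument.LoadXml("<Rootnode><ChildElement3 /><Group><ChildElement3 /></Group></Rootnode>")

        'matches every ChildElement3 at any depth, so both are counted...
        Console.WriteLine(clsDocument.GetElementsByTagName("ChildElement3").Count)

        'matches only ChildElement3 elements whose parent is a Rootnode element,
        'so the one nested inside Group is left out...
        Console.WriteLine(clsDocument.SelectNodes("//Rootnode/ChildElement3").Count)
    End Sub
End Module
```

Run as a console program, this should print 2 for GetElementsByTagName and 1 for the XPath query.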

One of the ways that XPath provides enhanced querying abilities is by allowing us to search for nodes that have specific Attribute values (or have specific Attributes at all, regardless of value). Here is an example of using XPath to find an Element with a specific Attribute value:

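Dim clsAttrNode As System.Xml.XmlNode = clsDocument.SelectSingleNode("//NewElement[@NewAttr1='Hello World!']") 
'SelectSingleNode returns Nothing when no node matches the query... 
If Not clsAttrNode Is Nothing Then Debug.WriteLine(clsAttrNode.OuterXml) 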
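
The other case mentioned above, matching on the presence of an Attribute regardless of its value, can be sketched like this. The small document and the "NoSuchAttr" attribute name are assumptions made for the example.

```vbnet
Imports System.Xml

Module XPathAttributeExample
    Sub Main()
        Dim clsDocument As New XmlDocument()
        clsDocument.LoadXml("<Rootnode><NewElement NewAttr1=""Hello World!"" /><NewElement NewAttr1=""Test"" /><NewElement /></Rootnode>")

        '[@NewAttr1] matches on the presence of the attribute, whatever its value...
        Dim clsWithAttribute As XmlNodeList = clsDocument.SelectNodes("//NewElement[@NewAttr1]")
        Console.WriteLine(clsWithAttribute.Count)

        '[@NewAttr1='Test'] matches only when the value is exactly "Test"...
        Dim clsWithValue As XmlNodeList = clsDocument.SelectNodes("//NewElement[@NewAttr1='Test']")
        Console.WriteLine(clsWithValue.Count)

        'SelectSingleNode returns Nothing when there is no match, so check before use...
        Dim clsMissing As XmlNode = clsDocument.SelectSingleNode("//NewElement[@NoSuchAttr]")
        Console.WriteLine(clsMissing Is Nothing)
    End Sub
End Module
```

Here the first query should count two elements, the second one element, and the last line should print True.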

Committing Changes

Now that we've modified our Xml document, we probably want to save it somewhere. This can be accomplished quite easily via the Save method of the XmlDocument object, or by reading the OuterXml property:

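'named sOutput because sXML is already declared above (Visual Basic names are not case-sensitive)... 
Dim sOutput As String = clsDocument.OuterXml 
clsDocument.Save("c:\test.xml") 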
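
As a quick check that a saved document can be read back, the following sketch saves to a temporary file and reloads it with the Load method. The file name is hypothetical, and the temp folder is used only so that the example does not depend on write access to c:\.

```vbnet
Imports System.IO
Imports System.Xml

Module SaveExample
    Sub Main()
        Dim clsDocument As New XmlDocument()
        clsDocument.LoadXml("<Rootnode><ChildElement1>Hello World!</ChildElement1></Rootnode>")

        'save to a temporary file (the file name is an assumption for this sketch)...
        Dim sPath As String = Path.Combine(Path.GetTempPath(), "xmldocument-save-example.xml")
        clsDocument.Save(sPath)

        'reload the saved file into a second XmlDocument and read a value back...
        Dim clsReloaded As New XmlDocument()
        clsReloaded.Load(sPath)
        Console.WriteLine(clsReloaded.DocumentElement.FirstChild.InnerText)

        'clean up the temporary file...
        File.Delete(sPath)
    End Sub
End Module
```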



Written by Scott Waletzko for Skystone Software.
Copyright 2005-2007 by Echosoft Design Studios, LLC, All Rights Reserved.