Skystone Software

http://www.SkystoneSoftware.com

Validating Xml Documents (VS 2005)
Published: 9/9/2006

Overview
Xml has become a pervasive format for storing and exchanging data, driven not only by its cross-platform appeal but also by the many accompanying technologies that extend Xml well beyond what was originally intended. One of its most appealing attributes is its flexibility: it can be used to describe virtually any set of data, regardless of complexity. Many developers (myself included) typically create Xml documents according to an assumed schema, because Xml is largely self-describing. This is fine for smaller projects, but when many individuals write code against the same Xml document, or the document has a large number of elements and attributes with different type restrictions, a published schema for the data set becomes necessary. Fortunately, Xml Schema (XSD) and Document Type Definitions (DTD) exist for this purpose: both let you validate an Xml document against a formal definition to ensure that the Xml adheres to a set of design rules.

Prerequisites

This article covers how to use the .NET 2.0 Framework to validate existing Xml documents against existing Xsd schema documents. Prior knowledge of both Xml and Xsd is assumed. I recommend the following tutorials on Xml and Xsd if you need to brush up:

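NOTE: The following examples also apply to validation against a DTD instead of an XSD. Use Xml.ValidationType.DTD instead of Xml.ValidationType.Schema, and set the ProhibitDtd property of the XmlReaderSettings object to False, because the reader rejects DTDs by default.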
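
As a rough illustration of the DTD variant, the sketch below validates a document against a DTD declared inline in its DOCTYPE. The module name, the helper IsValidAgainstDtd and the sample DTD are hypothetical and not part of the original example; it also assumes the .NET 2.0 ProhibitDtd property, which defaults to True and must be switched off before a DTD can be read.

```vbnet
Imports System
Imports System.IO
Imports System.Xml
Imports System.Xml.Schema

Module DtdValidationSketch

    ' Returns True when the document satisfies the DTD it declares.
    Function IsValidAgainstDtd(ByVal xmlText As String) As Boolean

        Dim settings As New XmlReaderSettings()
        settings.ProhibitDtd = False                  ' DTDs are rejected by default
        settings.ValidationType = ValidationType.DTD

        Try
            Using reader As XmlReader = XmlReader.Create(New StringReader(xmlText), settings)
                ' reading to the end performs the validation...
                Do While reader.Read()
                Loop
            End Using
            Return True
        Catch ex As XmlSchemaException
            ' DTD violations surface as XmlSchemaException as well...
            Return False
        End Try

    End Function

    Sub Main()
        ' hypothetical inline DTD: Data holds one or more empty Item elements with a required ID
        Dim dtd As String = "<!DOCTYPE Data [" & _
            "<!ELEMENT Data (Item+)>" & _
            "<!ELEMENT Item EMPTY>" & _
            "<!ATTLIST Item ID CDATA #REQUIRED>]>"

        ' a document that satisfies the DTD...
        Console.WriteLine(IsValidAgainstDtd(dtd & "<Data><Item ID=""1"" /></Data>"))

        ' a document that omits the required ID attribute...
        Console.WriteLine(IsValidAgainstDtd(dtd & "<Data><Item /></Data>"))
    End Sub

End Module
```

With a DTD there is no schema object to add to the settings; the reader picks the definition up from the DOCTYPE declaration of the document itself.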

Xsd in .NET 2.0

The System.Xml.Schema namespace in the .NET Framework exists to facilitate usage of Xml Schema (Xsd) documents within .NET programs. Significant changes were made to the .NET Framework between versions 1.1 and 2.0, which means that the method for validating Xml documents against schemas has changed somewhat. Specifically, the XmlValidatingReader class has become obsolete in the .NET 2.0 framework, with its functionality moved to the base System.Xml.XmlReader class, which is configured for validation through an XmlReaderSettings object.

Getting Started

To begin, we will start with the Xml document that we will be validating. The following document contains valid Xml according to the schema we have developed:

<?xml version="1.0" encoding="utf-8" ?> 
<Data xmlns="http://www.skystonesoftware.com/XmlValidationExample"> 
    <Item ID="1" Name="Item #1" URL="http://www.vbcity.com/" /> 
    <Item ID="2" Name="Item #2" URL="http://www.vbcity.com/" /> 
    <Item ID="3" Name="Item #3" URL="http://www.vbcity.com/" /> 
    <Item ID="4" Name="Item #4" URL="http://www.vbcity.com/" /> 
</Data> 

The following schema defines the design of this document:

<?xml version="1.0"?> 
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.skystonesoftware.com/XmlValidationExample" xmlns="http://www.skystonesoftware.com/XmlValidationExample" elementFormDefault="qualified"> 
    <xs:element name="Data"> 
        <xs:complexType> 
            <xs:sequence> 
                <xs:element name="Item" minOccurs="1" maxOccurs="unbounded"> 
                    <xs:complexType> 
                        <xs:sequence /> 
                        <xs:attribute name="ID"> 
                            <xs:simpleType> 
                                <xs:restriction base="xs:int" /> 
                            </xs:simpleType> 
                        </xs:attribute> 
                        <xs:attribute name="Name"> 
                            <xs:simpleType> 
                                <xs:restriction base="xs:string" /> 
                            </xs:simpleType> 
                        </xs:attribute> 
                        <xs:attribute name="URL"> 
                            <xs:simpleType> 
                                <xs:restriction base="xs:anyURI" /> 
                            </xs:simpleType> 
                        </xs:attribute> 
                    </xs:complexType> 
                </xs:element> 
            </xs:sequence> 
        </xs:complexType> 
    </xs:element> 
</xs:schema> 

Assuming that the above schema definition has been typed into a TextBox on the current form called "txtXsd", we load the schema definition into an XmlSchema object as follows:

' read the schema into a stream (UTF-8 preserves any non-ASCII characters)... 
Dim clsStream As New System.IO.MemoryStream() 
Dim bSchema As Byte() = System.Text.Encoding.UTF8.GetBytes(Me.txtXsd.Text) 
clsStream.Write(bSchema, 0, bSchema.Length) 
clsStream.Position = 0 

' load the schema into a schema object... 
Dim clsSchema As System.Xml.Schema.XmlSchema = System.Xml.Schema.XmlSchema.Read(clsStream, Nothing) 

' closing a MemoryStream also disposes it... 
clsStream.Close() 

Once the schema has been loaded, an XmlReader object is created, and configured to use the schema for validation as it reads the document:

' configure the reader to use validation, and add the schema we just loaded... 
Dim clsReaderSettings As New System.Xml.XmlReaderSettings() 
clsReaderSettings.ValidationType = Xml.ValidationType.Schema 
clsReaderSettings.Schemas.Add(clsSchema) 

' read xml into a stream, encoded as UTF-8 to match the document's declared encoding... 
clsStream = New System.IO.MemoryStream() 
Dim bXml As Byte() = System.Text.Encoding.UTF8.GetBytes(Me.txtXml.Text) 
clsStream.Write(bXml, 0, bXml.Length) 
clsStream.Position = 0 

' create a reader that will read the document and validate it against the XSD... 
Dim clsReader As System.Xml.XmlReader = System.Xml.XmlReader.Create(clsStream, clsReaderSettings) 

Again, this code assumes that the Xml document has been typed into a TextBox called "txtXml" on the same form that the executing code resides on.
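
The same setup can be written more compactly by reading both strings through a StringReader, which avoids the byte conversion altogether. The following is only a sketch of that idea: the module name, the function CreateValidatingReader and its parameter names are hypothetical, with xmlText and xsdText standing in for Me.txtXml.Text and Me.txtXsd.Text.

```vbnet
Imports System.IO
Imports System.Xml
Imports System.Xml.Schema

Module ReaderSetupSketch

    ' Builds a validating XmlReader over xmlText using the schema in xsdText.
    Function CreateValidatingReader(ByVal xmlText As String, ByVal xsdText As String) As XmlReader

        ' parse the schema straight from the string...
        Dim schema As XmlSchema
        Using schemaSource As New StringReader(xsdText)
            schema = XmlSchema.Read(schemaSource, Nothing)
        End Using

        ' same validation settings as in the walk-through above...
        Dim settings As New XmlReaderSettings()
        settings.ValidationType = ValidationType.Schema
        settings.Schemas.Add(schema)

        ' let the XmlReader close the underlying StringReader when it is closed...
        settings.CloseInput = True

        Return XmlReader.Create(New StringReader(xmlText), settings)

    End Function

End Module
```

A caller would use it as Dim clsReader As XmlReader = CreateValidatingReader(Me.txtXml.Text, Me.txtXsd.Text) and then read the document exactly as described in the next section.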

Validating the Xml

Now that we have created our XmlReader object and configured it to use our schema to validate the document loaded into it, validating the document is as simple as reading it node by node and trapping any errors:

Try 

    ' validate document (each Read() validates the next node)... 
    Do While clsReader.Read() 
    Loop 

    ' notify... 
    MessageBox.Show(Me, "The document is valid according to the schema!") 

Catch exXml As System.Xml.XmlException 
    ' the document is not well-formed Xml... 
    MessageBox.Show(Me, String.Concat("Xml Error: ", exXml.Message)) 
Catch exXsd As System.Xml.Schema.XmlSchemaException 
    ' the document is well-formed but violates the schema... 
    MessageBox.Show(Me, String.Concat("Xml Validation Error: ", exXsd.Message)) 
Catch ex As Exception 
    MessageBox.Show(Me, String.Concat("General Error: ", ex.Message)) 
Finally 

    ' release the reader and stream whether or not validation succeeded... 
    clsReader.Close() 
    clsStream.Close() 

End Try 

If the XmlReader gets through the document without error, then the document is valid according to the Xml Schema. Otherwise, an error will be raised: an XmlException if the document is not well-formed Xml, or an XmlSchemaException if it violates the schema.
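
Because the exception ends the read loop, the approach above stops at the first schema violation. If a handler is attached to the ValidationEventHandler event of the XmlReaderSettings object, the reader reports each violation to the handler and keeps reading instead. The sketch below illustrates this under a few assumptions: the module, the function CollectValidationErrors, the condensed schema and the sample document are all hypothetical and are not the article's originals.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Xml
Imports System.Xml.Schema

Module CollectErrorsSketch

    ' messages gathered during the current validation run
    Private ReadOnly Messages As New List(Of String)

    ' called once per validation problem instead of an exception being thrown
    Private Sub OnValidationEvent(ByVal sender As Object, ByVal e As ValidationEventArgs)
        Messages.Add(String.Format("{0}: {1}", e.Severity, e.Message))
    End Sub

    ' Validates xmlText against xsdText and returns every reported problem.
    Function CollectValidationErrors(ByVal xmlText As String, ByVal xsdText As String) As List(Of String)
        Messages.Clear()

        Dim settings As New XmlReaderSettings()
        settings.ValidationType = ValidationType.Schema
        ' Nothing = use the target namespace declared in the schema itself
        settings.Schemas.Add(Nothing, XmlReader.Create(New StringReader(xsdText)))
        AddHandler settings.ValidationEventHandler, AddressOf OnValidationEvent

        Using reader As XmlReader = XmlReader.Create(New StringReader(xmlText), settings)
            Do While reader.Read()
            Loop
        End Using

        Return New List(Of String)(Messages)
    End Function

    Sub Main()
        ' condensed, hypothetical schema: Data holds Item elements with an integer ID
        Dim xsd As String = _
            "<xs:schema xmlns:xs=""http://www.w3.org/2001/XMLSchema"">" & _
            "<xs:element name=""Data""><xs:complexType><xs:sequence>" & _
            "<xs:element name=""Item"" maxOccurs=""unbounded""><xs:complexType>" & _
            "<xs:attribute name=""ID"" type=""xs:int"" />" & _
            "</xs:complexType></xs:element>" & _
            "</xs:sequence></xs:complexType></xs:element></xs:schema>"

        ' both ID values violate xs:int
        Dim xml As String = "<Data><Item ID=""one"" /><Item ID=""two"" /></Data>"

        For Each message As String In CollectValidationErrors(xml, xsd)
            Console.WriteLine(message)
        Next
    End Sub

End Module
```

Only schema violations are routed to the handler. A document that is not well-formed still raises an XmlException, so the Try/Catch shown earlier remains useful around the read loop.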

Alternate Methods

For more control over the validation of the document, you can use the XmlSchemaValidator class, which is new to the .NET 2.0 framework. This method requires more coding and explicit knowledge of the schema itself, but provides much lower-level access to element and attribute validation, allowing you to validate the entire document and build a list of errors rather than stopping at the first one. The following link contains detailed code samples for this process:

http://msdn2.microsoft.com/en-us/library/system.xml.schema.xmlschemavalidator.aspx
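
To give a feel for the push-style API, here is a minimal sketch that feeds a single <Data><Item ID="..." /></Data> structure through an XmlSchemaValidator by hand. It is an illustration under assumptions rather than a complete implementation: the module, the function ValidateSingleItem and its parameters are hypothetical, xsdText is assumed to hold the schema from this article, and Nothing is passed wherever an optional XmlSchemaInfo object could be supplied.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Xml
Imports System.Xml.Schema

Module SchemaValidatorSketch

    ' target namespace of the schema shown earlier in this article
    Private Const Ns As String = "http://www.skystonesoftware.com/XmlValidationExample"

    Private ReadOnly Problems As New List(Of String)

    Private Sub OnValidationEvent(ByVal sender As Object, ByVal e As ValidationEventArgs)
        Problems.Add(e.Message)
    End Sub

    ' Pushes <Data><Item ID="idValue" /></Data> through the validator, one call per part.
    Function ValidateSingleItem(ByVal xsdText As String, ByVal idValue As String) As List(Of String)
        Problems.Clear()

        Dim schemas As New XmlSchemaSet()
        schemas.Add(Nothing, XmlReader.Create(New StringReader(xsdText)))
        schemas.Compile()

        Dim names As New NameTable()
        Dim validator As New XmlSchemaValidator(names, schemas, _
            New XmlNamespaceManager(names), XmlSchemaValidationFlags.None)
        AddHandler validator.ValidationEventHandler, AddressOf OnValidationEvent

        validator.Initialize()
        validator.ValidateElement("Data", Ns, Nothing)                     ' <Data>
        validator.ValidateEndOfAttributes(Nothing)
        validator.ValidateElement("Item", Ns, Nothing)                     ' <Item
        validator.ValidateAttribute("ID", String.Empty, idValue, Nothing)  '   ID="..."
        validator.ValidateEndOfAttributes(Nothing)                         ' >
        validator.ValidateEndElement(Nothing)                              ' </Item>
        validator.ValidateEndElement(Nothing)                              ' </Data>
        validator.EndValidation()

        Return New List(Of String)(Problems)
    End Function

End Module
```

Calling ValidateSingleItem(Me.txtXsd.Text, "abc") should return a list containing a message about the ID attribute, since "abc" is not a valid xs:int, while a numeric value should return an empty list.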

Summary

Use the XmlReader and XmlSchema classes to validate an Xml document against an associated Xsd schema in order to determine whether or not the document is valid. For more detailed access to the validation process (and the ability to itemize errors rather than abort on the initial error), use the XmlSchemaValidator class instead.



Written by Scott Waletzko for Skystone Software.
Copyright 2005-2007 by Echosoft Design Studios, LLC, All Rights Reserved.