Webucator's Free XML Tutorial

Lesson: XML Schema Basics

Welcome to our free XML tutorial. This tutorial is based on Webucator's Introduction to XML Training course.

In this lesson you will learn about XML Schemas and how they compare to DTDs.

Lesson Goals

  • Learn the purpose of XML Schema.
  • Learn the limitations of DTDs.
  • Learn the power of XML Schema.
  • Learn how to validate an XML Instance with an XML schema.

What is an XML Schema?

XML Schema is an XML-based language used to create other XML-based languages and data models. An XML schema defines element and attribute names for a class of XML documents. The schema also specifies the structure that those documents must adhere to and the type of content that each element can hold.

You might be thinking, isn't this what DTDs are for? That's right! But XML Schema is more powerful than DTDs (more on that later in the lesson).

XML documents that attempt to adhere to an XML schema are said to be instances of that schema. If they correctly adhere to the schema, then they are valid instances. This is not the same as being well formed. A well-formed XML document follows all the syntax rules of XML, but it does not necessarily adhere to any particular schema. So, again, an XML document can be well formed without being valid, but it cannot be valid unless it is well formed.

XML Schemas vs. DTDs

DTDs are similar to XML schemas in that they are used to create classes of XML documents. DTDs were around long before the advent of XML. They were originally created to define languages based on SGML, the parent of XML. Although DTDs are still common, XML Schema is a much more powerful language.

To understand the power of XML Schema, let's look at the limitations of DTDs.

  1. DTDs do not have built-in datatypes.
  2. DTDs do not support user-derived datatypes.
  3. DTDs allow only limited control over cardinality (the number of occurrences of an element within its parent).
  4. DTDs do not support Namespaces or any simple way of reusing or importing other schemas.
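
To make the first three limitations concrete, here is a minimal sketch of a schema that uses built-in datatypes and explicit occurrence limits. The Book element and its children are hypothetical names invented for this illustration; they are not part of the course files.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical schema for illustration only -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Book">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Title" type="xs:string" />
        <!-- Built-in datatypes: a DTD could only declare these as #PCDATA -->
        <xs:element name="PublishDate" type="xs:date" />
        <xs:element name="Pages" type="xs:positiveInteger" />
        <!-- Explicit cardinality: one to three authors.
             A DTD offers only the ?, * and + occurrence indicators. -->
        <xs:element name="AuthorName" type="xs:string"
                    minOccurs="1" maxOccurs="3" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

With this schema, a validator would reject a PublishDate of "last Tuesday" or a Book with four AuthorName elements. A DTD has no way to express the first constraint and could express the second only by spelling out each allowed occurrence.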

A First Look

An XML schema describes the structure of an XML instance document by defining what each element must or may contain. An element is limited by its type. For example, an element of complex type can contain child elements and attributes, whereas a simple-type element can only contain text. The diagram below gives a first look at the types of XML Schema elements.

Note: we will review this in the next presentation.

[Diagram: Schema Elements]

Schema authors can define their own types or use the built-in types. Throughout this course, we will refer back to this diagram as we learn to define elements. You may want to save this diagram (right-click the image and select "Save Image As..."), so that you can easily reference it.

The following is a high-level overview of schema types.

  1. Elements can be of simple type or complex type.
  2. Simple-type elements can only contain text. They cannot have child elements or attributes.
  3. All the built-in types are simple types (e.g., xs:string).
  4. Schema authors can derive simple types by restricting another simple type. For example, an email type could be derived by limiting a string to a specific pattern.
  5. Simple types can be atomic (e.g., strings and integers) or non-atomic (e.g., lists).
  6. Complex-type elements can contain child elements and attributes as well as text.
  7. By default, complex-type elements have complex content, meaning that they have child elements.
  8. Complex-type elements can be limited to having simple content, meaning they only contain text. They differ from simple-type elements in that they can have attributes.
  9. Complex types can be limited to having no content, meaning they are empty, but they may have attributes.
  10. Complex types may have mixed content - a combination of text and child elements.
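
The overview above can be sketched in a single schema. This is an illustrative example only: every type and element name in it (EmailType, IntegerList, Email, Price, Image, Paragraph, Emphasis) is hypothetical, and the email pattern is deliberately loose rather than a complete address check.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical schema for illustration only -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Item 4: a simple type derived by restricting xs:string to a pattern -->
  <xs:simpleType name="EmailType">
    <xs:restriction base="xs:string">
      <xs:pattern value="[^@]+@[^@]+\.[^@]+" />
    </xs:restriction>
  </xs:simpleType>

  <!-- Item 5: a non-atomic simple type (a whitespace-separated list) -->
  <xs:simpleType name="IntegerList">
    <xs:list itemType="xs:integer" />
  </xs:simpleType>

  <!-- Items 2 and 3: simple-type elements, one derived and one built-in based -->
  <xs:element name="Email" type="EmailType" />
  <xs:element name="Scores" type="IntegerList" />

  <!-- Item 8: complex type with simple content (text plus an attribute) -->
  <xs:element name="Price">
    <xs:complexType>
      <xs:simpleContent>
        <xs:extension base="xs:decimal">
          <xs:attribute name="currency" type="xs:string" />
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>

  <!-- Item 9: complex type with no content, only an attribute -->
  <xs:element name="Image">
    <xs:complexType>
      <xs:attribute name="src" type="xs:anyURI" use="required" />
    </xs:complexType>
  </xs:element>

  <!-- Item 10: mixed content (text interleaved with child elements) -->
  <xs:element name="Paragraph">
    <xs:complexType mixed="true">
      <xs:sequence>
        <xs:element name="Emphasis" type="xs:string"
                    minOccurs="0" maxOccurs="unbounded" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

Under this sketch, `<Price currency="USD">9.99</Price>` and `<Image src="logo.png" />` would both be valid, while `<Image>logo.png</Image>` would not, because the Image type allows no content and requires the src attribute.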

A Simple XML Schema

Let's take a look at a simple XML schema, which is made up of one complex-type element with two child simple-type elements.

Code Sample:

SchemaBasics/Demos/Author.xsd
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Author">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="FirstName" type="xs:string" />
        <xs:element name="LastName" type="xs:string" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

As you can see, an XML schema is an XML document and must follow all the syntax rules of any other XML document; that is, it must be well formed. XML schemas also have to follow the rules defined in the "Schema of schemas," which defines, among other things, the structure of an XML schema and the element and attribute names that may appear in it.

Although it is not required, it is a common practice to use the xs qualifier to identify schema elements and types.
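
As a small illustration of that point, here is a sketch of the same Author schema written with a different qualifier. The prefix is only a label bound to the namespace URI, so xsd (an arbitrary choice for this example; the course files use xs) works equally well as long as it is declared and used consistently.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Same structure as Author.xsd, but with the prefix xsd instead of xs -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="Author">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="FirstName" type="xsd:string" />
        <xsd:element name="LastName" type="xsd:string" />
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```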

The document element of XML schemas is xs:schema. It takes the attribute xmlns:xs with the value of http://www.w3.org/2001/XMLSchema, indicating that the document should follow the rules of XML Schema. This will be clearer after you learn about namespaces.

In this XML schema, we see an xs:element element within the xs:schema element. xs:element is used to define an element. In this case it defines the element Author as a complex-type element, which contains a sequence of two elements: FirstName and LastName, both of which are of the simple type xs:string.

Validating an XML Instance Document

In the last section, you saw an example of a simple XML schema, which defined the structure of an Author element. The code sample below shows a valid XML instance of this XML schema.

Code Sample:

SchemaBasics/Demos/MarkTwain.xml
<?xml version="1.0"?>
<Author xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
	xsi:noNamespaceSchemaLocation="Author.xsd">
    <FirstName>Mark</FirstName>
    <LastName>Twain</LastName>
</Author>

This is a simple XML document. Its document element is Author, which contains two child elements: FirstName and LastName, just as the associated XML schema requires.

The xmlns:xsi attribute of the document element declares the XML Schema instance namespace, which makes the xsi-prefixed attributes available. The xsi:noNamespaceSchemaLocation attribute then ties the document to a specific XML schema, in this case Author.xsd.
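
For contrast, the following sketch shows an instance that is well formed but not valid. It is a hypothetical variation on MarkTwain.xml, not one of the course files. Because Author.xsd declares the children inside xs:sequence, they must appear in the declared order.

```xml
<?xml version="1.0"?>
<!-- Well formed, but NOT valid against Author.xsd:
     xs:sequence requires FirstName to come before LastName. -->
<Author xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
	xsi:noNamespaceSchemaLocation="Author.xsd">
    <LastName>Twain</LastName>
    <FirstName>Mark</FirstName>
</Author>
```

An XML parser that only checks well-formedness accepts this document. A schema-aware validator should report an error at the LastName element, since FirstName is expected first.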

There are many ways to validate the XML instance. If you are using an XML authoring tool, it is very likely able to perform the validation for you. Alternatively, you can use an online XML Schema validator.

Creating an XML Schema

Duration: 60 to 90 minutes.

In this exercise, you will write an XML schema for the business letter shown below. You will then give your schema to another student, who will mark up the business letter as a valid XML document according to your schema. Likewise, you will mark up the business letter according to someone else's schema. Make sure that the XML file contains the xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" and xsi:noNamespaceSchemaLocation="your-schema-file-name.xsd" attributes in the document element.

Both documents should be saved in the SchemaBasics/Exercises folder. To test whether the XML file is valid, you can use the XMLSpy editor.

Business Letter:

SchemaBasics/Exercises/BusinessLetter.txt
November 29, 2011

Joshua Lockwood
Lockwood & Lockwood
291 Broadway Ave.
New York, NY 10007
United States

Dear Mr. Lockwood:

Along with this letter, I have enclosed the following items:

	- two original, execution copies of the Webucator Master Services Agreement
	- two original, execution copies of the Webucator Premier Support for 
		Developers Services Description between 
		Lockwood & Lockwood and Webucator, Inc.
	
Please sign and return all four original, execution copies to me at your
earliest convenience.  Upon receipt of the executed copies, we will 
immediately return a fully executed, original copy of both agreements to you.

Please send all four original, execution copies to my attention as follows:

	Webucator, Inc.
	4933 Jamesville Rd.
	Jamesville, NY 13078  USA
	Attn: Bill Smith
	
If you have any questions, feel free to call me at 800-555-1000 x123 
or e-mail me at bsmith@webucator.com.

Best regards,

Bill Smith
VP, Operations