XML Schema Definition (XSD)
What is XML Schema Definition (XSD)?
XML Schema Definition (XSD) is a World Wide Web Consortium (W3C) recommendation for describing and validating the structure and content of an XML document. It is primarily used to define the elements, attributes and data types the document can contain. A validator then uses the XSD to check whether each element and attribute in the document matches its declaration.
An XSD is similar to earlier XML schema languages, such as Document Type Definition (DTD), but it is a more powerful alternative as it provides greater control over the XML structure.
XML schema details
In general, a schema is an abstract representation of an object's characteristics and relationship to other objects in a document. An XML schema represents the relationships between the attributes and elements of an XML object.
The process of creating a schema involves analyzing the document's structure and defining each structural element encountered. For example, a schema for a document describing a website would define a website element, a webpage element and other elements that describe possible content divisions within any page on that site. As in HTML, each of these elements is written in XML as a set of tags.
As the language of the XML schema, XSD is similar to a database schema that describes the data within a database. It defines the building blocks of an XML document, including:
- its attributes and elements;
- the number and order of child elements;
- the data types of elements and attributes; and
- the default and fixed values for elements and attributes.
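The building blocks listed above can be sketched in one small schema. This is a minimal illustration based on the website example; the names title, heading, language, path and schemaVersion are assumptions chosen for the sketch, not part of any standard.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Root element of the hypothetical website document -->
  <xs:element name="website">
    <xs:complexType>
      <!-- Number and order of children: one title, then one or more webpages -->
      <xs:sequence>
        <xs:element name="title" type="xs:string"/>
        <xs:element name="webpage" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="heading" type="xs:string"/>
              <!-- Default value: applies when the element is present but empty -->
              <xs:element name="language" type="xs:language" default="en"/>
            </xs:sequence>
            <!-- Attribute with a data type; it must be present -->
            <xs:attribute name="path" type="xs:anyURI" use="required"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
      <!-- Fixed value: if the attribute appears, it must equal "1.0" -->
      <xs:attribute name="schemaVersion" type="xs:string" fixed="1.0"/>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

A document that places heading before title, or omits the path attribute on a webpage, would fail validation against this sketch.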
The need for XML schemas and XSD
There are hundreds of standardized XML formats that are used globally. Most of these XML standards can be defined and understood with XML schemas.
The key benefits of XML schemas, and therefore of XSD, are as follows:
- They support data types, making it easier to validate data correctness, define data facets and restrictions and convert data between different data types.
- They are written in XML, making them easier to extend and edit.
- Because schemas are themselves XML, they can be parsed with a standard XML parser, modified through the XML document object model (DOM) and transformed with Extensible Stylesheet Language Transformations (XSLT).
- Schemas help create a common language for XML content, allowing senders to describe information (e.g., date) in ways that all receivers will understand so that the data is maintained consistently.
In addition, XSD makes it easy to reuse a schema in other schemas, reference multiple schemas in the same document and create new data types derived from the standard data types.
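As a rough sketch of data types, facets and derived types, the schema below restricts two built-in types and uses xs:date for a date value. The type and element names (PercentType, PostalCodeType, discount, postalCode, shipDate) are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- New simple type derived by restriction from the built-in xs:integer -->
  <xs:simpleType name="PercentType">
    <xs:restriction base="xs:integer">
      <xs:minInclusive value="0"/>
      <xs:maxInclusive value="100"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Pattern facet: exactly five digits -->
  <xs:simpleType name="PostalCodeType">
    <xs:restriction base="xs:string">
      <xs:pattern value="[0-9]{5}"/>
    </xs:restriction>
  </xs:simpleType>

  <xs:element name="discount" type="PercentType"/>
  <xs:element name="postalCode" type="PostalCodeType"/>
  <!-- xs:date accepts values such as 2024-03-11, but not 03/11/2024 -->
  <xs:element name="shipDate" type="xs:date"/>

</xs:schema>
```

Because xs:date fixes the lexical form to year, month, day, a sender and a receiver that share this schema agree on how a date is written.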
XML Schema Definition (XSD) elements
An element is the building block of an XML document and is declared within the XSD. XSD elements are commonly grouped into three kinds:
- Simple
- Complex
- Global
Simple and complex describe an element's content: a leaf element that holds only text is simple, while a parent element is complex. Global describes where the element is declared. An element is declared in the XSD as <xs:element name = "x" type = "y"/>.
A simple element contains only text and cannot have attributes or child elements. Its type is a simple type, such as one of these built-in types:
- xs:integer
- xs:boolean
- xs:string
- xs:date
For example: <xs:element name = "phone_number" type = "xs:int" />
By contrast, an element with a complexType can contain text, child elements and attributes. It can be a parent to all the elements and attributes within it, which is what gives an XML document its hierarchy.
A global element or type is declared once at the top level of the schema and can then be referenced wherever it is needed, which simplifies maintenance.
Elements declared as direct children of the xs:schema element are considered global elements and can be used throughout the schema. However, elements declared within a complexType are considered local elements and cannot be used elsewhere in the schema.
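The difference between simple, complex, global and local declarations can be sketched as follows. The element names (email, contact, name, id) are assumptions made for this illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Global element with a simple type: text only, no attributes or children -->
  <xs:element name="email" type="xs:string"/>

  <!-- Global element with a complex type: may hold children and attributes -->
  <xs:element name="contact">
    <xs:complexType>
      <xs:sequence>
        <!-- Local element: declared inside the complexType, not reusable elsewhere -->
        <xs:element name="name" type="xs:string"/>
        <!-- Reference to the global element declared above -->
        <xs:element ref="email"/>
      </xs:sequence>
      <xs:attribute name="id" type="xs:ID"/>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

Here email can be referenced from any complex type in the schema through ref, whereas name exists only inside contact.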
XML Schema Definition (XSD) attributes
XSD attributes provide additional information within an element. An attribute declaration has two main properties: name and type. The syntax is written as <xs:attribute name = "x" type = "y"/>.
Unlike an XSD element, an XSD attribute always has a simple type. It can have a fixed value or a default value. Further, an attribute group, which associates a name with a set of attribute declarations, can be reused in complexType definitions.
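In XSD, an element can be declared with nillable set to true. Such an element may then appear in a document with the attribute xsi:nil set to true and no content, which represents a null value, for example one being sent to or from a relational database.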
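A short sketch of these attribute features, including a nillable element, might look like this. All names (auditAttributes, createdBy, createdOn, order, shippedOn, currency, version) are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Attribute group: a named set of attribute declarations for reuse -->
  <xs:attributeGroup name="auditAttributes">
    <xs:attribute name="createdBy" type="xs:string"/>
    <xs:attribute name="createdOn" type="xs:date"/>
  </xs:attributeGroup>

  <xs:element name="order">
    <xs:complexType>
      <xs:sequence>
        <!-- nillable: an instance may write <shippedOn xsi:nil="true"/> -->
        <xs:element name="shippedOn" type="xs:date" nillable="true"/>
      </xs:sequence>
      <!-- Default value: supplied when the attribute is absent -->
      <xs:attribute name="currency" type="xs:string" default="USD"/>
      <!-- Fixed value: if present, the attribute must equal "1" -->
      <xs:attribute name="version" type="xs:string" fixed="1"/>
      <!-- Reuse of the attribute group defined above -->
      <xs:attributeGroup ref="auditAttributes"/>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

To use xsi:nil, the instance document needs to declare the xsi prefix for the namespace http://www.w3.org/2001/XMLSchema-instance.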
XSD type definitions
An XSD type definition creates a new simple or complex data type. The data type can be either named or anonymous. A named data type is always defined globally and can be referenced by name, whereas an anonymous type has no name and can be used only where it is defined. A base type definition is the type from which a new definition is derived.
- simpleType data type. It is derived from one of XML Schema's built-in data types.
- complexType data type. It includes element declarations, element references and attribute declarations. XSD itself does not impose a declaration order, although defining the elements and attributes first can make the type easier to write.
- anyType data type. All complexTypes and simpleTypes are derived from this data type. It does not constrain its content in any way, and an anyType element can be mapped only to an element of the same data type during transformations.
- anySimpleType data type. This is the base of all simple types. It is also mapped only to elements of the same data type during transformations.
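The following sketch shows a named type, a type derived from it, an anonymous type and an element of type xs:anyType. The names used (AddressType, InternationalAddressType, billingAddress, shippingAddress, rating, payload) are assumptions for illustration only.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Named complex type: defined globally, reusable by name -->
  <xs:complexType name="AddressType">
    <xs:sequence>
      <xs:element name="street" type="xs:string"/>
      <xs:element name="city" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Derived type: AddressType is the base type, extended with one more element -->
  <xs:complexType name="InternationalAddressType">
    <xs:complexContent>
      <xs:extension base="AddressType">
        <xs:sequence>
          <xs:element name="country" type="xs:string"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <xs:element name="billingAddress" type="AddressType"/>
  <xs:element name="shippingAddress" type="InternationalAddressType"/>

  <!-- Anonymous simple type: no name, so it cannot be referenced elsewhere -->
  <xs:element name="rating">
    <xs:simpleType>
      <xs:restriction base="xs:integer">
        <xs:minInclusive value="1"/>
        <xs:maxInclusive value="5"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>

  <!-- xs:anyType places no constraint on the element's content -->
  <xs:element name="payload" type="xs:anyType"/>

</xs:schema>
```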
XSD content models
XSD content models define which child elements may appear in a group and in what order.
- Choice group. This content model describes a choice between multiple elements of a group and can include other groups and elements.
- Sequence group. This model specifies that the elements must appear in the XML document in the order in which they are declared. Each element appears once unless minOccurs or maxOccurs says otherwise. Similar to the choice group, it can also include other groups and elements.
- All group. This model specifies that the elements need to appear once in the XML document, in any order. This group cannot contain other groups.
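The three content models can be compared in a small sketch. The element names (payment, amount, card, bankTransfer, person, firstName, lastName) are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:element name="payment">
    <xs:complexType>
      <!-- Sequence: amount must come first, then the payment method -->
      <xs:sequence>
        <xs:element name="amount" type="xs:decimal"/>
        <!-- Choice nested inside the sequence: exactly one of the two elements -->
        <xs:choice>
          <xs:element name="card" type="xs:string"/>
          <xs:element name="bankTransfer" type="xs:string"/>
        </xs:choice>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <xs:element name="person">
    <xs:complexType>
      <!-- All: each child appears once, in any order -->
      <xs:all>
        <xs:element name="firstName" type="xs:string"/>
        <xs:element name="lastName" type="xs:string"/>
      </xs:all>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

Under this sketch, a person may list lastName before firstName, but a payment that puts card before amount is invalid.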
Advantages of XSD over DTD
XSD offers several advantages over DTD. For one, XSD is written in XML, so a schema can be read by the same XML parser that reads the document it describes. DTD has its own non-XML syntax, so a processor needs separate parsing logic for XML and for DTD.
XSD also offers self-documentation, automatic schema creation and the ability to be queried through XSLT.
Other advantages of XSD over DTD are:
- XSD is extensible while DTD is not. This makes it easier to derive new elements from existing elements in XSD.
- XSD supports data types, so the content of an element can be restricted. DTD cannot restrict the content of an element in this way because it does not support data types.
- XSD supports default values for elements, whereas DTD does not.
- An XML schema can include or import other XML schemas by using xs:include and xs:import. DTD has no equivalent namespace-aware mechanism.
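As a sketch of schema composition, the fragment below includes one schema and imports another. The namespaces, the file names order-types.xsd and common.xsd, the type OrderItemType and the element c:address are all hypothetical and are assumed to be defined in those external files, so this fragment does not validate on its own.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns="http://example.com/orders"
           xmlns:c="http://example.com/common"
           targetNamespace="http://example.com/orders"
           elementFormDefault="qualified">

  <!-- include: pulls in definitions that share this schema's target namespace -->
  <xs:include schemaLocation="order-types.xsd"/>

  <!-- import: makes definitions from a different namespace available -->
  <xs:import namespace="http://example.com/common" schemaLocation="common.xsd"/>

  <xs:element name="order">
    <xs:complexType>
      <xs:sequence>
        <!-- OrderItemType is assumed to come from order-types.xsd -->
        <xs:element name="item" type="OrderItemType" maxOccurs="unbounded"/>
        <!-- c:address is assumed to be a global element in common.xsd -->
        <xs:element ref="c:address"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
```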
Of course, there are some challenges and limitations of XSD as well. It can be unnecessarily complex, and it lacks a formal mathematical description. It also provides limited support for unordered content.