XML DTD: Document Type Definition
Posted on: September 18, 2010
XML stands for eXtensible Markup Language.
XML is a markup language much like HTML.
XML was designed to carry data, not to display data.
XML tags are not predefined.
XML is designed to be self-descriptive, and has user defined tags.
XML is a W3C Recommendation.
XML was designed to transport and store data.
XML is a software- and hardware-independent tool for carrying information.
XML Separates Data from HTML
XML is Used to Create New Internet Languages
• XHTML, a reformulation of HTML as XML
• WSDL for describing available web services
• WML, the markup language used with WAP on handheld devices
• RSS for news feeds
• SMIL for describing multimedia for the web
XML documents are characterized by two distinct properties: well-formedness and validity.
XML with correct syntax is “Well Formed” XML.
XML validated against a DTD or XML Schema is “Valid” XML.
Well Formed XML Documents
A “Well Formed” XML document has correct XML syntax:
- XML documents must have a root element
- XML elements must have a closing tag
- XML tags are case sensitive
- XML elements must be properly nested
- XML attribute values must be quoted
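The rules above can be tried out with any XML parser. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the helper name is_well_formed and the sample strings are made up for illustration. It checks syntax only and says nothing about validity.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if text parses as XML, False on a syntax error."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# One sample per rule in the list above (all samples are illustrative).
samples = {
    "ok": '<note lang="en"><to>Ann</to></note>',
    "two roots": "<a/><b/>",
    "missing closing tag": "<note><to>Ann</note>",
    "case mismatch": "<Note></note>",
    "bad nesting": "<b><i>text</b></i>",
    "unquoted attribute": "<note lang=en/>",
}

results = {label: is_well_formed(xml) for label, xml in samples.items()}
for label, ok in results.items():
    print(f"{label}: {'well formed' if ok else 'not well formed'}")
```

Only the first sample should be reported as well formed; each of the others breaks exactly one of the rules.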
A “Valid” XML document is a “Well Formed” XML document which also conforms to the rules of a Document Type Definition (DTD).
The DTD defines the document structure with a list of legal elements and attributes.
A DTD can be declared inline inside an XML document, or as an external reference.
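To make the two forms concrete, here is a small sketch that feeds one document with an inline DTD and one with an external reference to Python's xml.parsers.expat and reports what the DOCTYPE declaration says. The file name bookstore.dtd and the helper describe_doctype are assumptions for illustration; the parser here only reads the declaration, and does not fetch or apply the external file.

```python
import xml.parsers.expat

def describe_doctype(text):
    """Return (root_name, system_id, has_internal_subset) from the DOCTYPE."""
    found = {}

    def on_doctype(name, system_id, public_id, has_internal_subset):
        found["info"] = (name, system_id, bool(has_internal_subset))

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(text, True)
    return found.get("info")

# DTD declared inline, between [ and ] inside the DOCTYPE.
inline_doc = """<?xml version="1.0"?>
<!DOCTYPE bookstore [
  <!ELEMENT bookstore (#PCDATA)>
]>
<bookstore>inline DTD</bookstore>"""

# DTD referenced externally; "bookstore.dtd" is a hypothetical file name.
external_doc = """<?xml version="1.0"?>
<!DOCTYPE bookstore SYSTEM "bookstore.dtd">
<bookstore>external DTD</bookstore>"""

print(describe_doctype(inline_doc))
print(describe_doctype(external_doc))
```

In both cases the name after !DOCTYPE is the root element; the difference is whether the declarations sit inside the document or in a separate file named by SYSTEM.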
Defining XML DTD
<?xml version="1.0" ?>
<!DOCTYPE bookstore [
  <!ELEMENT bookstore (book+)>
  <!ELEMENT book (title,author,year,price)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT author (#PCDATA)>
  <!ELEMENT year (#PCDATA)>
  <!ELEMENT price (#PCDATA)>
  <!ATTLIST book category CDATA #REQUIRED>
  <!ATTLIST title lang CDATA #REQUIRED>
]>
<bookstore>
  <book category="PHP">
    <title lang="en">PHP Made Easy</title>
    <author>XYZ</author>
    <year>2005</year>
    <price>30.00</price>
  </book>
  <book category="CHILDREN">
    <title lang="en">Harry Potter</title>
    <author>J. K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
  <book category="WEB">
    <title lang="en">Learning XML</title>
    <author>Erik T. Ray</author>
    <year>2003</year>
    <price>39.95</price>
  </book>
</bookstore>
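- !DOCTYPE bookstore defines that the root element of this document is bookstore.
- !ELEMENT bookstore (book+) defines that the bookstore element must contain at least one book element.
- !ELEMENT book defines that the book element contains four elements, in this order: “title, author, year, price”.
- !ELEMENT title defines the title element to be of type “#PCDATA”.
- !ELEMENT author defines the author element to be of type “#PCDATA”.
- !ELEMENT year defines the year element to be of type “#PCDATA”.
- !ELEMENT price defines the price element to be of type “#PCDATA”.
- !ATTLIST book category CDATA #REQUIRED defines that every book element must have a category attribute containing character data.
- !ATTLIST title lang CDATA #REQUIRED defines that every title element must have a lang attribute containing character data.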
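Python's standard library does not include a validating parser, so the sketch below is not DTD validation. It is a hand-written check that mirrors the structural rules of the bookstore DTD (root element, at least one book, child order, required attributes), just to show what those declarations demand. The function name check_bookstore and the sample documents are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Child order required by <!ELEMENT book (title,author,year,price)>.
REQUIRED_CHILDREN = ["title", "author", "year", "price"]

def check_bookstore(text):
    """Return a list of rule violations; an empty list means the checks passed."""
    root = ET.fromstring(text)
    problems = []
    if root.tag != "bookstore":
        problems.append("root element must be bookstore")
    books = list(root)
    if not books:
        problems.append("bookstore needs at least one book")
    for i, book in enumerate(books, start=1):
        if book.tag != "book":
            problems.append(f"child {i}: expected book, found {book.tag}")
            continue
        if "category" not in book.attrib:
            problems.append(f"book {i}: missing category attribute")
        if [child.tag for child in book] != REQUIRED_CHILDREN:
            problems.append(f"book {i}: children must be title, author, year, price")
        title = book.find("title")
        if title is not None and "lang" not in title.attrib:
            problems.append(f"book {i}: title is missing lang attribute")
    return problems

good = """<bookstore>
  <book category="WEB">
    <title lang="en">Learning XML</title>
    <author>Erik T. Ray</author>
    <year>2003</year>
    <price>39.95</price>
  </book>
</bookstore>"""

bad = """<bookstore>
  <book>
    <title>Learning XML</title>
    <price>39.95</price>
  </book>
</bookstore>"""

print(check_bookstore(good))
for problem in check_bookstore(bad):
    print(problem)
```

The second document is well formed but would not be valid against the DTD: the book has no category, two children are missing, and the title has no lang attribute. A real validating parser would report these from the DTD itself rather than from hard-coded rules.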
The Building Blocks of XML Documents
Seen from a DTD point of view, all XML documents (and HTML documents) are made up of building blocks such as elements, attributes and entities. Two more, PCDATA and CDATA, describe the text itself:
PCDATA means parsed character data.
PCDATA is text that WILL be parsed by a parser. The text will be examined by the parser for entities and markup.
CDATA means character data.
CDATA is text that will NOT be parsed by a parser.
Tags inside the text will NOT be treated as markup and entities will not be expanded.
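The difference is easy to see in a parser. The sketch below uses Python's xml.etree.ElementTree and, as an assumption about how best to illustrate unparsed text, a CDATA section (<![CDATA[ ... ]]>) inside the document. Note that this is not the same thing as the CDATA keyword used for attribute types in an ATTLIST declaration.

```python
import xml.etree.ElementTree as ET

# Parsed character data: entities are expanded and <b> becomes a child element.
parsed = ET.fromstring("<msg>1 &lt; 2 &amp; <b>bold</b></msg>")

# Same characters inside a CDATA section: everything is kept as literal text.
literal = ET.fromstring("<msg><![CDATA[1 &lt; 2 &amp; <b>bold</b>]]></msg>")

print(repr(parsed.text))
print([child.tag for child in parsed])
print(repr(literal.text))
print([child.tag for child in literal])
```

In the first document the parser expands &lt; and &amp; and finds one child element; in the second the text comes back exactly as written and there are no child elements.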
Validating With the XML Parser
If an XML file contains an error, the XML parser may report it when we try to open the document. In Microsoft's XML parser (MSXML), the parseError object lets you retrieve the error code, the error text, and even the line that caused the error.
Note: The load() method is used for files, while the loadXML() method is used for strings.
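The same idea exists in other parsers under different names. As a rough stand-in, and not the parseError API itself, the sketch below uses Python's xml.etree.ElementTree: a ParseError carries an error code and a (line, column) position, fromstring() takes a string, and parse() takes a file name or file object. The helper parse_report and the sample documents are made up for illustration.

```python
import io
import xml.etree.ElementTree as ET

# The closing tag on line 3 is misspelled, so the document is not well formed.
broken = "<bookstore>\n  <book>\n    <title>Learning XML</titel>\n  </book>\n</bookstore>"

def parse_report(text):
    """Parse a string; return None on success or (code, line, column, message)."""
    try:
        ET.fromstring(text)
        return None
    except ET.ParseError as err:
        line, column = err.position
        return (err.code, line, column, str(err))

report = parse_report(broken)
print(report)

# Parsing from a file-like object instead of a string.
tree = ET.parse(io.StringIO("<bookstore><book/></bookstore>"))
print(tree.getroot().tag)
```

The report should point at line 3, where the mismatched tag is. Like the well-formedness rules, this catches syntax errors only; it does not check the document against a DTD.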