XML Syntax Rules
The syntax rules of XML are very simple and logical. The rules are easy to learn, and easy to use.
XML Documents Must Have a Root Element
XML documents must contain one root element that is the parent of all other elements:
<root>
<child>
<subchild>.....</subchild>
</child>
</root>
In this example, <note> is the root element:
<?xml version="1.0" encoding="UTF-8"?>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
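The root-element rule can be checked with any conforming parser. The following is a minimal sketch that assumes Python's standard xml.etree.ElementTree module; the variable names are illustrative only. It parses the note document above and then tries a document with two top-level elements.

```python
import xml.etree.ElementTree as ET

# The note document from the text, as bytes so the prolog's encoding applies.
NOTE = b"""<?xml version="1.0" encoding="UTF-8"?>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>"""

root = ET.fromstring(NOTE)
child_tags = [child.tag for child in root]
print(root.tag, child_tags)

# A document with two top-level elements has no single root and is rejected.
try:
    ET.fromstring(b"<note></note><note></note>")
    two_roots_accepted = True
except ET.ParseError as exc:
    two_roots_accepted = False
    print("rejected:", exc)
```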
The XML Prolog
This line is called the XML prolog:
<?xml version="1.0" encoding="UTF-8"?>
The XML prolog is optional. If it exists, it must come first in the document.
XML documents can contain international characters, like Norwegian øæå or French êèé.
To avoid errors, you should specify the encoding used, or save your XML files as UTF-8.
UTF-8 is the default character encoding for XML documents.
UTF-8 is also the standard encoding for HTML5 and the usual choice for CSS, JavaScript, PHP, and SQL files.
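To see why the declared encoding matters, here is a small sketch, again assuming Python's xml.etree.ElementTree. It parses the same text saved as UTF-8 without a prolog, as ISO-8859-1 with a matching prolog, and as ISO-8859-1 with no declaration at all. The sample text and variable names are made up for the illustration.

```python
import xml.etree.ElementTree as ET

text = "Norwegian øæå, French êèé"
body = f"<greeting>{text}</greeting>"

# Saved as UTF-8 with no prolog: the parser falls back to UTF-8.
utf8_text = ET.fromstring(body.encode("utf-8")).text

# Saved as ISO-8859-1 with a prolog that says so: decoded correctly.
latin1_doc = ('<?xml version="1.0" encoding="ISO-8859-1"?>' + body).encode("iso-8859-1")
latin1_text = ET.fromstring(latin1_doc).text

# Saved as ISO-8859-1 but undeclared: the bytes are not valid UTF-8.
try:
    ET.fromstring(body.encode("iso-8859-1"))
    undeclared_accepted = True
except ET.ParseError:
    undeclared_accepted = False

print(utf8_text == text, latin1_text == text, undeclared_accepted)
```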
All XML Elements Must Have a Closing Tag
In HTML, some elements might work well, even with a missing closing tag:
<p>This is a paragraph.
<br>
In XML, it is illegal to omit the closing tag. All elements must have a closing tag:
<p>This is a paragraph.</p>
<br />
The XML prolog does not have a closing tag.
This is not an error. The prolog is a declaration, not an element, so it does not need one.
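The closing-tag rule can be tried out with a short helper. This is a sketch, assuming Python's xml.etree.ElementTree; is_well_formed is a hypothetical helper name, and it wraps each fragment in a dummy <root> element because a document needs a single root.

```python
import xml.etree.ElementTree as ET

def is_well_formed(fragment):
    """Wrap the fragment in a dummy root and report whether it parses."""
    try:
        ET.fromstring(f"<root>{fragment}</root>")
        return True
    except ET.ParseError:
        return False

fragments = [
    "<p>This is a paragraph.",      # HTML style, closing tag missing
    "<br>",                         # HTML style, never closed
    "<p>This is a paragraph.</p>",  # closed with an end tag
    "<br />",                       # empty element that closes itself
]
results = {fragment: is_well_formed(fragment) for fragment in fragments}
for fragment, ok in results.items():
    print(f"{fragment!r}: {'well-formed' if ok else 'error'}")
```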
XML Tags are Case Sensitive
XML tags are case sensitive. The tag <Letter> is different from the tag <letter>.
Opening and closing tags must be written with the same case:
<Message>This is incorrect</message>
<message>This is correct</message>
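Case sensitivity affects both lookups and well-formedness. The sketch below assumes Python's xml.etree.ElementTree; the <mail> wrapper element and its contents are invented for the illustration.

```python
import xml.etree.ElementTree as ET

# <Letter> and <letter> are two different element names.
doc = ET.fromstring("<mail><Letter>first</Letter><letter>second</letter></mail>")
upper = doc.find("Letter").text
lower = doc.find("letter").text
print(upper, lower)

# An end tag whose case differs from the start tag does not close it.
try:
    ET.fromstring("<Message>This is incorrect</message>")
    mismatch_error = None
except ET.ParseError as exc:
    mismatch_error = str(exc)
print("error:", mismatch_error)
```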
XML Elements Must be Properly Nested
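In HTML, you might see improperly nested elements:
<b><i>This text is bold and italic</b></i>
In XML, all elements must be properly nested within each other:
<b><i>This text is bold and italic</i></b>
Here, "properly nested" means that because the <i> element is opened inside the <b> element, it must also be closed inside the <b> element.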
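Proper nesting is the same discipline as balanced brackets: the most recently opened element must be the first one closed. The following is a deliberately simplified stack-based sketch, not a real XML parser. It ignores comments, the prolog, and attribute values that contain ">", and the function name first_nesting_error is hypothetical.

```python
import re

# Matches start tags, end tags, and self-closing tags (simplified).
TAG = re.compile(r"<(/?)([A-Za-z_][\w.-]*)[^<>]*?(/?)>")

def first_nesting_error(markup):
    """Return (expected, found) for the first mis-nested end tag, or None."""
    open_elements = []
    for match in TAG.finditer(markup):
        closing, name, self_closing = match.groups()
        if closing:
            # An end tag must match the most recently opened element.
            expected = open_elements.pop() if open_elements else None
            if expected != name:
                return (expected, name)
        elif not self_closing:
            open_elements.append(name)
    return None

bad = first_nesting_error("<b><i>This text is bold and italic</b></i>")
good = first_nesting_error("<b><i>This text is bold and italic</i></b>")
print(bad, good)
```

For the improperly nested line the sketch reports that </i> was expected where </b> was found; for the properly nested line it reports nothing.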
XML Attribute Values Must be Quoted
XML elements can have attributes in name/value pairs just like in HTML.
In XML, the attribute values must always be quoted.
INCORRECT:
<note date=12/11/2007>
<to>Tove</to>
<from>Jani</from>
</note>
CORRECT:
<note date="12/11/2007">
<to>Tove</to>
<from>Jani</from>
</note>
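When XML is generated from a program, a library function can add the quotes. The sketch below assumes Python's xml.sax.saxutils.quoteattr and xml.etree.ElementTree; it builds both versions of the note and checks which one parses.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

date = "12/11/2007"
quoted = quoteattr(date)  # wraps the value in quotes and escapes special characters

good = f"<note date={quoted}><to>Tove</to><from>Jani</from></note>"
bad = f"<note date={date}><to>Tove</to><from>Jani</from></note>"

parsed_date = ET.fromstring(good).get("date")

try:
    ET.fromstring(bad)
    unquoted_accepted = True
except ET.ParseError:
    unquoted_accepted = False

print(quoted, parsed_date, unquoted_accepted)
```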
Entity References
Some characters have a special meaning in XML.
If you place a character like "<" inside an XML element, it will generate an error because the parser interprets it as the start of a new element.
This will generate an XML error:
<message>salary < 1000</message>
To avoid this error, replace the "<" character with an entity reference:
<message>salary &lt; 1000</message>
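There are 5 pre-defined entity references in XML:
Entity reference | Character | Meaning |
&lt; | < | less than |
&gt; | > | greater than |
&amp; | & | ampersand |
&apos; | ' | apostrophe |
&quot; | " | quotation mark |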
Only < and & are strictly illegal in XML, but it is a good habit to replace > with &gt; as well.
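In practice the replacement is usually done by a library rather than by hand. This is a minimal sketch assuming Python's xml.sax.saxutils.escape, which replaces &, <, and > by default; the sample string is invented. The parser turns the entity references back into the original characters.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = "salary < 1000 & bonus > 0"
escaped = escape(raw)
print(escaped)

# Parsing resolves the entity references back to the raw characters.
roundtrip = ET.fromstring(f"<message>{escaped}</message>").text

# All five pre-defined entity references resolve to one character each.
five = ET.fromstring("<m>&lt;&gt;&amp;&apos;&quot;</m>").text
print(roundtrip, five)
```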
Comments in XML
The syntax for writing comments in XML is similar to that of HTML:
<!-- This is a comment -->
Two dashes in the middle of a comment are not allowed.
Not allowed:
<!-- This is a -- comment -->
Allowed, because the dashes are separated by a space:
<!-- This is a - - comment -->
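The comment rules can be confirmed with a parser. The sketch below assumes Python's xml.etree.ElementTree, whose default parser also discards comments instead of adding them to the tree; the <note> wrapper is only there to give each document a root.

```python
import xml.etree.ElementTree as ET

# A normal comment is accepted and dropped from the parsed tree.
root = ET.fromstring("<note><!-- This is a comment --><to>Tove</to></note>")
remaining = [child.tag for child in root]

# Dashes separated by a space are fine.
ET.fromstring("<note><!-- This is a - - comment --></note>")

# Two adjacent dashes inside a comment are rejected.
try:
    ET.fromstring("<note><!-- This is a -- comment --></note>")
    double_dash_accepted = True
except ET.ParseError:
    double_dash_accepted = False

print(remaining, double_dash_accepted)
```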
White-space is Preserved in XML
XML does not collapse multiple white-space characters (HTML collapses multiple white-space characters into a single space):
XML: | Hello           Tove |
HTML: | Hello Tove |
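The difference is easy to observe. In this sketch, which assumes Python's xml.etree.ElementTree, the parsed text keeps every space, and the HTML-style result is imitated with a plain string operation. The <greeting> element name and the count of 11 spaces are arbitrary.

```python
import xml.etree.ElementTree as ET

original = "Hello" + " " * 11 + "Tove"

# The XML parser hands back the text with all 11 spaces intact.
xml_text = ET.fromstring(f"<greeting>{original}</greeting>").text

# Imitation of HTML rendering: runs of white-space become one space.
html_like = " ".join(xml_text.split())

print(repr(xml_text))
print(repr(html_like))
```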
XML Stores New Line as LF
Windows applications store a new line as carriage return and line feed (CR+LF).
Unix and Mac OS X use LF.
Old Mac systems use CR.
XML stores a new line as LF: an XML parser converts CR+LF and CR line endings to a single LF before passing the text on.
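Line-ending normalization can be seen by feeding a parser all three conventions at once. This is a small sketch assuming Python's xml.etree.ElementTree; the <log> element and its contents are made up.

```python
import xml.etree.ElementTree as ET

# One Windows (CR+LF), one old Mac (CR), and one Unix (LF) line ending.
mixed = b"<log>windows\r\nold mac\runix\nend</log>"

text = ET.fromstring(mixed).text
print(repr(text))
```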