Whitespace and XML.

Posted: Fri Dec 07, 2007 7:50 pm
by KBleivik
Whitespace can be a nightmare in XML, and more precisely in XSL(T) and related technologies. Knowing how to handle it can save you hours of work and let you sleep better at night. A space, a tab, a line feed and a carriage return all count as whitespace, and they may be interpreted differently by different parsers and browsers.

Note: No whitespace is allowed before the XML declaration:

<?xml version="1.0" ?>
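
The rule above is easy to check for yourself. A minimal sketch, assuming Python's standard xml.etree.ElementTree parser; the helper name parses and the sample documents are made up for illustration:

```python
import xml.etree.ElementTree as ET


def parses(doc: str) -> bool:
    """Return True if the parser accepts the document, False on a parse error."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample documents, not taken from the post.
clean = '<?xml version="1.0"?><letter/>'
leading_space = " " + clean      # one space before the declaration
leading_newline = "\n" + clean   # an empty line before the declaration

print(parses(clean))            # True
print(parses(leading_space))    # False
print(parses(leading_newline))  # False
```

Other parsers should reject the same input, but the wording of the error message differs from one to the next.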

<elemenTag>Some text here</elemenTag>

is different from:
Some text here

that is again different from

Some text here
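
To see that such variants really are different to a parser, here is a small sketch using Python's standard xml.etree.ElementTree. The three variants are assumptions chosen for illustration (no padding, padding with spaces, padding with line breaks and indentation):

```python
import xml.etree.ElementTree as ET

# Three assumed spellings of the "same" element.
variants = [
    "<elemenTag>Some text here</elemenTag>",
    "<elemenTag> Some text here </elemenTag>",
    "<elemenTag>\n    Some text here\n</elemenTag>",
]

# The parser hands the character data over exactly as written.
texts = [ET.fromstring(v).text for v in variants]
for t in texts:
    print(repr(t))

# Three different strings as parsed ...
distinct = len(set(texts))
# ... but only one once runs of whitespace are collapsed by hand.
normalized = {" ".join(t.split()) for t in texts}

print(distinct)         # 3
print(len(normalized))  # 1
```

Whether the extra whitespace matters is up to the application that reads the text, which is why the hints that follow are useful.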


Some useful hints:

Example 1: Use xml:space to handle white space. Valid values are preserve and default.
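
A sketch of how an application might read xml:space, assuming Python's standard xml.etree.ElementTree. The attribute is only a signal to the application, so the parser itself keeps the whitespace either way. The function name effective_space and the sample letter are hypothetical:

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:space under the predefined XML namespace.
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"


def effective_space(root):
    """Map each element tag to the xml:space value in effect for it.

    The value is inherited from the nearest ancestor that sets it;
    "default" is assumed when nothing is set.
    """
    result = {}

    def walk(elem, inherited):
        mode = elem.get(XML_SPACE, inherited)
        result[elem.tag] = mode
        for child in elem:
            walk(child, mode)

    walk(root, "default")
    return result


# Hypothetical sample document.
doc = ET.fromstring(
    "<letter>"
    '<to xml:space="preserve">  WPW   members  </to>'
    "<message>  plain  </message>"
    "</letter>"
)

modes = effective_space(doc)
print(modes)
print(repr(doc.find("to").text))  # the parser did not touch the spaces
```

What "default" means in practice is decided by the consuming application, so it is worth testing with the tools you actually use.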

Example 2: Write your own XML Schema to handle white space.

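Code:

<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <xsd:element name="myElement">
        <xsd:simpleType>
            <xsd:restriction base="xsd:string">
                <xsd:whiteSpace value="collapse"/>
            </xsd:restriction>
        </xsd:simpleType>
    </xsd:element>
</xsd:schema>

Valid values for xsd:whiteSpace are preserve, replace and collapse.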
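
What the three values do to a string can be sketched in a few lines of plain Python. This is an illustration of the normalization rules, not a schema validator, and the function name apply_whitespace_facet is made up:

```python
import re


def apply_whitespace_facet(value: str, mode: str) -> str:
    """Normalize a string the way the three whiteSpace values describe."""
    if mode == "preserve":
        # Leave the value untouched.
        return value
    if mode not in ("replace", "collapse"):
        raise ValueError(f"unknown whiteSpace value: {mode!r}")
    # replace: every tab, line feed and carriage return becomes a space.
    replaced = re.sub(r"[\t\n\r]", " ", value)
    if mode == "replace":
        return replaced
    # collapse: after replace, squeeze runs of spaces and trim both ends.
    return re.sub(r" +", " ", replaced).strip(" ")


sample = "  Some\ttext \n here  "
print(repr(apply_whitespace_facet(sample, "preserve")))  # '  Some\ttext \n here  '
print(repr(apply_whitespace_facet(sample, "replace")))   # '  Some text   here  '
print(repr(apply_whitespace_facet(sample, "collapse")))  # 'Some text here'
```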

Example 3: Line break in XSLT.

The examples below, taken from the SitePoint book No Nonsense XML Web Development With PHP by Thomas Meyer (April 2006), show how different style sheets can be used to transform the same XML document. You should view them in several browsers, both older and current versions.

Note the following quotation from page 48 of that book:

"Well by default, the XSLT standard mandates that whenever there is only whitespace (including line breaks) between two tags, the whitespace should be ignored. But when there is text between two tags, (e.g. TO:), then the whitespace in and around that text should be passed along to the output".

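The distinction in the quotation, whitespace-only text between two tags versus whitespace around real text, can be made visible without an XSLT processor. A rough sketch, assuming Python's standard xml.dom.minidom and a letter document with a letter root element (the root name is inferred from the style sheets in this post); it only classifies the text nodes and does not perform a transformation:

```python
from xml.dom import minidom

# Assumed sample letter; the <letter> root is inferred from the style sheets.
LETTER = """<letter>
    <to>WPW members</to>
    <message>The importance of well-formed and valid tagging.</message>
</letter>"""


def text_nodes(node):
    """Yield every text node below `node`, in document order."""
    for child in node.childNodes:
        if child.nodeType == child.TEXT_NODE:
            yield child
        else:
            yield from text_nodes(child)


dom = minidom.parseString(LETTER)
all_text = [n.data for n in text_nodes(dom.documentElement)]

# Text nodes made of nothing but spaces, tabs and line breaks.
whitespace_only = [t for t in all_text if t.strip() == ""]
# Text nodes that carry real content.
significant = [t for t in all_text if t.strip() != ""]

print(len(all_text))         # 5
print(len(whitespace_only))  # 3
print(significant)
```

The three whitespace-only nodes are the indentation and line breaks between the tags. They are the kind of node that tends to show up as stray blank lines in text output.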

Code:

<?xml version="1.0"?>
<letter>
    <to>WPW members</to>
    <message>The importance of well-formed and valid tagging.</message>
</letter>

Code:

<?xml version="1.0"?>
<?xml-stylesheet type="text/css" href="letter.css" version="1.0"?>
<letter>
    <to>WPW members</to>
    <message>The importance of well-formed and valid tagging.</message>
</letter>

Style sheet code, letter.css

Code:

letter {
    display: block;
    margin: 10px;
    padding: 5px;
    width: 300px;
    height: 100px;
    border: 1px solid #000000;
    overflow: auto;
    background-color: #cccccc;
    font: 12px Arial;
}
to, from {
    display: block;
    font-weight: bold;
}
message {
    display: block;
    font: 11px Arial;
}

Code:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="letter2text.xsl" version="1.0"?>
<letter>
    <to>WPW members</to>
    <message>The importance of well-formed and valid tagging.</message>
</letter>

Style sheet code, letter2text.xsl

Code:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>
    <xsl:template match="/letter">
        <xsl:apply-templates select="*"/>
    </xsl:template>
    <xsl:template match="to">
        <xsl:text>TO: </xsl:text>
    </xsl:template>
    <xsl:template match="from">
        <xsl:text>FROM: </xsl:text>
    </xsl:template>
    <xsl:template match="message">
        <xsl:text>MESSAGE: </xsl:text>
    </xsl:template>
</xsl:stylesheet>

Code:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="letter2html.xsl" version="1.0"?>
<letter>
    <to>WPW members</to>
    <message>The importance of well-formed and valid tagging.</message>
</letter>

Style sheet code, letter2html.xsl

Code:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="html"/>
    <xsl:template match="/letter">
        <xsl:apply-templates select="*"/>
    </xsl:template>
    <xsl:template match="to">
        <b>TO: </b><xsl:apply-templates/><br/>
    </xsl:template>
    <xsl:template match="from">
        <b>FROM: </b><xsl:apply-templates/><br/>
    </xsl:template>
    <xsl:template match="message">
        <b>MESSAGE: </b><xsl:apply-templates/><br/>
    </xsl:template>
</xsl:stylesheet>

Code:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="letter2xml.xsl" version="1.0"?>
<letter>
    <to>WPW members</to>
    <message>The importance of well-formed and valid tagging.</message>
</letter>

Style sheet code, letter2xml.xsl

Code:

    <xsl:output method="xml" indent="yes"/>
    <xsl:template match="/letter">
    <xsl:template match="to">
    <xsl:template match="from">
    <xsl:template match="message">

Code:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="letter2xhtml.xsl" version="1.0"?>
<letter>
    <to>WPW members</to>
    <message>The importance of well-formed and valid tagging.</message>
</letter>

Style sheet code, letter2xhtml.xsl

Code:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" indent="yes" omit-xml-declaration="yes"
        media-type="application/xhtml+xml" encoding="iso-8859-1"
        doctype-public="-//W3C//DTD XHTML 1.0 Transitional//EN"
        doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"/>
    <xsl:template match="/letter">
        <xsl:apply-templates select="*"/>
    </xsl:template>
    <xsl:template match="to">
        <b>TO: </b><xsl:apply-templates/><br/>
    </xsl:template>
    <xsl:template match="from">
        <b>FROM: </b><xsl:apply-templates/><br/>
    </xsl:template>
    <xsl:template match="message">
        <b>MESSAGE: </b><xsl:apply-templates/><br/>
    </xsl:template>
</xsl:stylesheet>

Pay careful attention to the last two examples. Besides demonstrating the value of one source serving many applications (formats), they show how the message is presented in different browsers and how different search engine (SE) bots may interpret the markup.

1. Explain the differences between different web browsers and browser versions. View the source.

2. Explain how today's SE bots see the markup.

3. Explain how tomorrow's SE bots may see the code, and how important well-formed (and valid) code may be.

4. Try the code on your own test or web server: nest tags "illegally", leave tags open, change one letter in some tags (e.g. from lowercase to uppercase), and so on.
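
For exercise 4 you can get a quick first answer before opening a browser. A minimal sketch, assuming Python's standard xml.etree.ElementTree; the helper name well_formed and the shortened sample letters are made up for illustration:

```python
import xml.etree.ElementTree as ET


def well_formed(doc: str) -> bool:
    """Return True if the document parses, False if it is not well-formed."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


# Shortened, hypothetical versions of the letter, one per kind of damage.
cases = {
    "original": "<letter><to>WPW members</to></letter>",
    "illegal nesting": "<letter><to><b>WPW members</to></b></letter>",
    "tag left open": "<letter><to>WPW members</letter>",
    "case changed": "<letter><To>WPW members</to></letter>",
}

results = {name: well_formed(doc) for name, doc in cases.items()}
for name, ok in results.items():
    print(f"{name}: {'well-formed' if ok else 'NOT well-formed'}")
```

A strict XML parser rejects all three damaged versions. How forgiving a given browser or bot is with the same input is exactly what the exercise asks you to find out.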