XHTML/HTML

January 7, 2005 (20 comments)

There has always been a lot of discussion about which one to use, and people have attacked this from different angles, with arguments including:

  • Internet Explorer, which has a 70% market share, enters "Quirks Mode" when an XHTML document has the XML prologue
  • If there's a slight error in an XHTML page that is correctly served as application/xhtml+xml, browsers which support that type refuse to render it
  • XHTML is the future, so why stay with HTML?

So, basically, if we could hand out equally valid XHTML 1.1 and HTML 4.01 Strict depending on whether the UA supports application/xhtml+xml, would all our problems be gone? Wrong. Browsers that don't support CSS will still fail, so make sure you use structured, semantic mark-up; then, even without styles, it will look alright 99% of the time, and be readable and understandable :)

But how are we meant to sniff whether the UA supports application/xhtml+xml, and through that XHTML? JavaScript? No, it can be disabled. So what? PHP, or another server-side language.

If your server supports PHP, here's the code:

<?php
// Settings
$charset = 'utf-8';

// Output filter for the text/html case: turn XHTML syntax into HTML 4.01
// by dropping the self-closing " />" and renaming xml:lang to lang.
function fix_code($buffer) {
    $str = str_replace(' />', '>', $buffer);
    return str_replace('xml:lang', 'lang', $str);
}

// Either header may be missing from a request, so default to an empty string.
$accept = isset($_SERVER['HTTP_ACCEPT']) ? $_SERVER['HTTP_ACCEPT'] : '';
$agent  = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';

// Send XHTML to UAs that say they accept it, and to the validators.
if (stristr($accept, 'application/xhtml+xml') || stristr($agent, 'W3C_Validator') || stristr($agent, 'WDG_Validator')) {
    $mime = 'application/xhtml+xml';
} else {
    $mime = 'text/html';
}
header("Content-Type: $mime");

if ($mime == 'application/xhtml+xml') {
    echo '<?xml version="1.0" encoding="' . $charset . '"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
        "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">';
} else {
    echo '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
        "http://www.w3.org/TR/html4/strict.dtd">';
    // Everything output from here on passes through fix_code().
    ob_start('fix_code');
}
?>
 
<html <?php if ($mime == 'application/xhtml+xml') { echo 'xmlns="http://www.w3.org/1999/xhtml"'; } ?> xml:lang="en">
 <head>
  <meta http-equiv="content-type" content="<?php echo $mime; ?>; charset=<?php echo $charset; ?>" />

I cleaned up this version, putting the settings at the top, making it easier for someone who doesn't know PHP to use, at the cost of a couple of milliseconds :P
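One thing the Accept check ignores is the q values a UA can attach to each type in its Accept header (Glen raises this in the comments). Here is a minimal sketch of how that could be handled without any RegEx. The function names accept_q and prefers_xhtml are made up for this example and are not part of the script above. It also assumes that a tie goes to XHTML and that wildcards such as */* should not count as XHTML support.

```php
<?php
// Hypothetical helper (not part of the original script): returns the q value
// the Accept header gives to $type, or 0.0 if the type is not listed.
// Wildcard ranges such as */* are deliberately not matched.
function accept_q($accept, $type) {
    foreach (explode(',', $accept) as $range) {
        $params = array_map('trim', explode(';', $range));
        $name = strtolower(array_shift($params));
        if ($name !== $type) {
            continue;
        }
        $q = 1.0; // a media range without a q parameter counts as q=1
        foreach ($params as $param) {
            if (stripos($param, 'q=') === 0) {
                $q = (float) substr($param, 2);
            }
        }
        return $q;
    }
    return 0.0;
}

// Hypothetical helper: true when the UA lists application/xhtml+xml with a
// non-zero q and does not rate text/html higher (assumption: ties go to XHTML).
function prefers_xhtml($accept) {
    $xhtml = accept_q($accept, 'application/xhtml+xml');
    return $xhtml > 0 && $xhtml >= accept_q($accept, 'text/html');
}

$accept = isset($_SERVER['HTTP_ACCEPT']) ? $_SERVER['HTTP_ACCEPT'] : '';
$mime = prefers_xhtml($accept) ? 'application/xhtml+xml' : 'text/html';
```

The $mime it produces could stand in for the result of the plain stristr() test on the Accept header; the validator checks on the user agent would still be needed alongside it. I haven't benchmarked it against the simple version.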

If anyone can convert this to ASP they will be thanked, get the code put up in the post and be credited for it...

The above represented my views at the time; however, things change, so please read XHTML/HTML Followup.

[Updated 10th January 2005 - Adding xml:lang to lang support]

[Updated 10th January 2005 - Bug Fix]

[Updated 2nd March 2005 - Mistakes pointed out by Mithoric]

[Updated 21st August 2005 - New Views]

Comments

  1. Ollie
    says

    January 7, 2005 15:40:00+00:00

    I've never seen something like this before; I think I might use it if I ever bother to redo my site, or if someone ever asks for XHTML 1.1. Anyway, nice idea :) .

    "but compromising a couple of milliseconds"

    Boy, those milliseconds do slip by... :P

  2. Geoffrey Sneddon
    says

    January 7, 2005 15:57:08+00:00

    Just a note, you can of course add attributes to both the html/head, and add elements before the <meta> tag...

    It's actually used here, so please use valid code in comments :P, so, you can see here how it turns out in a realistic environment.

  3. Jalenack
    says

    January 8, 2005 22:13:51+00:00

    I love this script...I've used it on almost all my pages created recently...is it much different from the one you posted originally at CF? (besides the settings stuff at the top...i can figure those out :P ) .. anyhoo, tis good. THanx

  4. Geoffrey Sneddon
    says

    January 8, 2005 22:21:22+00:00

    There are no differences between this and the CF one, apart from the settings being separated.

  5. Ryankop
    says

    January 9, 2005 03:03:27+00:00

    I personally like old HTML better than Xhtml. Sorry mates.

    Even Geoffrey knows that :P

  6. Geoffrey Sneddon
    says

    January 9, 2005 03:04:19+00:00

    You're just too lazy to learn XHTML/CSS :P

  7. Andrew K
    says

    January 9, 2005 23:17:34+00:00

    You should add something like...return (str_replace('xml:lang', 'lang', $buffer));

    ...so that all xml:lang attributes also get converted to their html equiv. Although you're switching between xml:lang and lang for the html element, multi language sites will also contain xml:lang/lang attributes in the content.

  8. Geoffrey Sneddon
    says

    January 10, 2005 00:23:00+00:00

    Good point Andrew (works out who before seeing where the link goes... I really must get off CF :D), I'll switch it in a moment...

  9. Geoffrey Sneddon
    says

    January 10, 2005 01:07:13+00:00

    Sorry about this getting errors in any browser that supports XHTML, just having a couple of problems, but they're now all fixed :) and the new code is up...

  10. Glen
    says

    January 17, 2005 19:33:26+00:00

    I've lost the URL of the page this was originally from, but the version i saw incorporated q ratings to check firstly if the UA supported application/xhtml+xml, and then to see if it preferred application/xhtml+xml over text/html.

    You reckon it's worth adding those to this one?

  11. Geoffrey Sneddon
    says

    January 17, 2005 23:29:40+00:00

    This version has no code borrowed from that other version, although the idea for doing this comes from it. This script is made to be lightweight, unlike that one, which had a lot of RegEx replaces that add overhead to the script. Anyhow, I'll try and dig up the URI and do some benchmarking...

  12. Hugo Leite
    says

    January 18, 2005 01:40:36+00:00

    Hi there, liked the code you have, but there's a better (faster) way at http://www.htmldog.com/guides/htmladvanced/declarations/

    HtmlDog has good examples and explanations(?) anyone who wants to learn basic or advanced xhtml should stop by it.

  13. Geoffrey Sneddon
    says

    January 18, 2005 18:41:03+00:00

    IE doesn't support any type of XHTML, so I really don't see the point of sending it XHTML 1.0, you might as well send it as XHTML 1.1, IE understands that just as well...

  14. Geoff
    says

    January 18, 2005 19:39:11+00:00

    IE doesn't support any type of XHTML, so I really don't see the point of sending it XHTML 1.0, you might as well send it as XHTML 1.1, IE understands that just as well?

    well you could send xhtml 1.0 in html compliance mode and save yourself some work (don't have to replace the /> with >)

  15. Geoffrey Sneddon
    says

    January 18, 2005 22:12:15+00:00

    I think you're missing the point (see below): IE still sees XHTML 1.0 (even when served as text/html) as very messy HTML, so it enters quirks mode, whereas it recognises the HTML 4.01 Strict doctype and enters standards mode...

  16. Mithoric
    says

    March 1, 2005 21:23:21+00:00

    Actually I believe that diagram is more suited to you. You definitely seem to have the wrong opinion of IE. For one thing, it doesn't have a 90% market share. Never has. And secondly a plain xhtml page will display near perfect enough in IE as in FF (which although I use over IE definitely has some major major flaws in its rendering and not mentioning the fact that it's slow as hell when compared to IE in that respect). Just because CSS2 isn't completely supported by IE doesn't mean that you shouldn't use xhtml. I dare you to find a completely x* compliant browser, go on! That's right, not even your beloved gecko based browser can stand up to that challenge.

  17. Geoffrey Sneddon
    says

    March 2, 2005 19:34:21+00:00

    Hmm... IE fails on the XML prologue, and XHTML is meant to be sent as application/xhtml+xml, and with that the XML prologue. So, instead of sending it XHTML as text/html with no XML prologue, I'm sending it HTML 4.01 Strict and (obviously) no XML prologue. (W3C Note on XHTML Media Types)

    As for the market share, that's my mistake; it did once have an 80% market share, although that's now fallen to nearer 70%.

    Lastly, I never said any browser had full X* compliance, and never meant to anywhere...

    Also, I obviously deleted the image when cleaning up my web stuff, although I'll try and dig it up and put it back online...

  18. Mithoric
    says

    March 3, 2005 23:24:55+00:00

    I'm not an idiot, I can see what you're doing (Mhtml on CF BTW), I've been reading W3 docs for the last 5 years. But what you are doing is WRONG. You can't just send data of one type and disguise it as another type. The thought of it is just plain ridiculous, it's extremely bad programming practice. What if I were to send a png with a different MIME and no extension? The browser would be beside itself with all this data and no idea what it's doing, it would most likely attempt to render it anyway but then fall flat on its face. It's the same thing with what you're doing, except that you are lucky that it uses the same rendering engine with just different rules either way.

    You end up with xhtml code under html 4 standard, code which WAS NOT for that standard.

  19. Geoffrey Sneddon
    says

    March 4, 2005 15:58:50+00:00

    I see your point...

    I accept it is wrong, but until all major browsers support XHTML sent as application/xhtml+xml with the XML prologue, I'll keep doing it, as I don't like sending XHTML as text/html (although I'll admit I've done it).

    The main thing is that HTML 4.01 Strict differs from XHTML 1.1 in only a few ways:

    • The DTD is not modularised, which is completely irrelevant when talking about the code that the DTD allows
    • In HTML 4.01 Strict some tags, such as <img> and <br>, should be left open rather than self-closed
    • The xml:lang attribute doesn't exist, as HTML isn't based on XML
    • XHTML 1.1 should be sent as application/xhtml+xml, not text/html

    IMO those are the main differences, and although XHTML code was not written for the HTML 4.01 Strict standard, after the changes have been applied it is almost identical, although it is almost certainly still wrong.

    As for XHTML 2, that changes a lot, and although someone will almost certainly write a script to convert it to HTML, that person won't be me: it's vastly different from the HTML 4.01 Strict standard.

  20. gary hartney » MIME Artistry
    says

    August 22, 2005 00:41:39+01:00

    [...] By Geoffrey Sneddon, originally posted on his blog. [...]
