tedd on 10 October 2011, 7 years ago

It would be interesting to see the popularity of the different doctypes, including the new, simple HTML5 one.

I'm teaching that in my college classes -- it seems less confusing for students.



Sam Soltano (site administrator) on 10 October 2011, 7 years ago

Hello Tedd,

Thank you for that proposal. We do already extract the markup language from the doctype, with the exception of HTML5. We don't report HTML5 because the use of the minimal HTML5 doctype is only a very weak indication that the page actually uses HTML5. What we very often see under the HTML5 doctype is a mixture of HTML 4 and HTML5 with some XHTML thrown in.
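To illustrate what reading the markup language from the doctype can look like, here is a minimal Python sketch. It is not our actual implementation: the function name `markup_from_doctype` and both regular expressions are assumptions made up for this example, and it only recognises W3C public identifiers. Note that it can only report that the HTML5 doctype is present, which says nothing about the markup that follows.

```python
import re

# Hypothetical helper for illustration only, not the site's real code.
# Matches the whole doctype declaration and captures everything after "<!DOCTYPE".
DOCTYPE_RE = re.compile(r"<!DOCTYPE\s+([^>]*)>", re.IGNORECASE)

# Pulls the language name and version out of a W3C public identifier,
# e.g. "-//W3C//DTD XHTML 1.0 Transitional//EN" gives ("XHTML", "1.0").
PUBLIC_ID_RE = re.compile(
    r'PUBLIC\s+["\']-//W3C//DTD\s+(X?HTML)\s+([\d.]+)', re.IGNORECASE
)


def markup_from_doctype(page):
    """Return a label for the declared markup language, or None if no doctype."""
    match = DOCTYPE_RE.search(page)
    if match is None:
        return None
    declaration = match.group(1).strip()
    if declaration.lower() == "html":
        # The minimal doctype: only a weak hint that the page is really HTML5.
        return "HTML5 doctype"
    public_id = PUBLIC_ID_RE.search(declaration)
    if public_id:
        return f"{public_id.group(1).upper()} {public_id.group(2)}"
    return "unknown"
```

For example, `markup_from_doctype("<!DOCTYPE html><p>hi")` returns `"HTML5 doctype"`, while a page starting with the HTML 4.01 Strict doctype returns `"HTML 4.01"`.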

Modern browsers are so terribly good at accepting and interpreting any HTML tag soup that developers often don't really care about the doctype and the corresponding "correct" HTML version. We have not yet found a meaningful way to classify a page as HTML5. Any proposals towards that goal are welcome.
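As a starting point for such proposals, the following rough Python sketch takes an inventory of a page's tags using the standard library's `html.parser`. It is not a classifier and not something we run: the class name `TagInventory` and the two tag sets are assumptions chosen for this example, and both sets are deliberately incomplete. It only makes the kind of mixture described above visible.

```python
from html.parser import HTMLParser

# Illustrative, incomplete tag sets (assumptions for this sketch).
HTML5_ONLY = {"article", "section", "nav", "header", "footer",
              "aside", "figure", "video", "audio", "canvas"}
OBSOLETE_IN_HTML5 = {"font", "center", "frameset", "frame",
                     "marquee", "big", "strike", "tt"}


class TagInventory(HTMLParser):
    """Collects HTML5-only tags, obsolete tags, and XHTML-style self-closing tags."""

    def __init__(self):
        super().__init__()
        self.html5 = set()
        self.obsolete = set()
        self.xhtml_self_closing = 0

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names already lowercased.
        if tag in HTML5_ONLY:
            self.html5.add(tag)
        elif tag in OBSOLETE_IN_HTML5:
            self.obsolete.add(tag)

    def handle_startendtag(self, tag, attrs):
        # Called for tags written as <br />, the XHTML habit.
        self.xhtml_self_closing += 1
        self.handle_starttag(tag, attrs)


def inventory(page):
    """Parse the page and return the filled-in TagInventory."""
    parser = TagInventory()
    parser.feed(page)
    parser.close()
    return parser
```

A page that fills all three buckets at once is exactly the tag soup in question, and how to weigh the buckets against each other is the open part.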

