HTML is an SGML-based language, and SGML-based languages are not easy to extend or consume generically. (SGML is an international standard for defining markup languages such as HTML; see Wikipedia for more information on SGML.) The major issue is that such languages are too flexible. The two major problems are:
As a result, any tool that consumes these languages (e.g., a browser or an editor) must be aware of every aspect of the language.
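One way to see why a consumer needs language-specific knowledge is to feed HTML to a tokenizer that knows nothing about HTML's rules. The sketch below uses Python's standard `html.parser` module; the `TagLogger` class and `log_tags` helper are names made up for this illustration, and the sample markup relies on the fact that HTML allows the end tag of a `p` element to be omitted.

```python
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Record start and end tags exactly as they appear in the source."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))


def log_tags(markup):
    """Return the list of tag events found in the given markup."""
    parser = TagLogger()
    parser.feed(markup)
    parser.close()
    return parser.events


# Two paragraphs, neither explicitly closed. This is acceptable HTML.
events = log_tags("<p>first<p>second")
print(events)
```

The tokenizer reports two start tags and no end tags. Nothing in the markup itself says that the second `<p>` ends the first paragraph; a tool has to know that rule about `p` in advance in order to build the right document tree.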
On the other hand, XML-based languages (XML being a subset of SGML; again, see Wikipedia for more information on XML) are stricter:
XML also enforces case sensitivity and use of quotation marks to enclose attribute values, but it is the two issues above that make XML so easily extensible.
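The strictness described here can be observed with any generic XML parser. The following is a minimal sketch using Python's standard `xml.etree.ElementTree` module; the `is_well_formed` helper is a hypothetical name introduced only for this example, and the sample fragments are illustrative.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if a generic XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


samples = [
    '<p class="intro">Hello</p>',  # quoted attribute, matching case
    '<p class=intro>Hello</p>',    # attribute value not quoted
    '<P>Hello</p>',                # start and end tag differ in case
]

for markup in samples:
    print(markup, "->", is_well_formed(markup))
```

Only the first fragment is accepted. The parser needs no knowledge of what `p` or `class` mean to reach that verdict, which is what makes generic XML tooling possible.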
A nice side effect of this strictness was that XHTML provided freedom from choice. And with this freedom comes the knowledge that you can look at any valid XHTML document and know what to expect. There is no need for a style guide to tell authors when to and when not to put quotation marks around their attribute values, or whether to write in lowercase or uppercase letters.
As it turns out, XHTML wasn't adopted as wholeheartedly as had been anticipated, and so it didn't add as much value as we all had hoped it would. The W3C stopped work on XHTML 2 in 2009.