- Writing JavaScript for XHTML
- Problem: Nothing Works
- Solution: The CDATA Trick
- Problem: Names in XHTML and HTML are represented in different cases
- Solution: Use or convert to lower case
- Problem: My Cookie Isn’t Saved!
- Solution: Use the Storage Object
- Problem: I Can’t Use document.write()
- Solution: Use DOM Methods
- Problem: I want to remain forward compatible!
- Solution: Avoid HTML-specific DOM
- Problem: My Favourite JS Library still Breaks
- I Read about E4X. Now, This Is Perfect, Isn’t It?
- Finally: Content Negotiation
- Further Reading
- See also
- JavaScript | How do you create a DOM document object from a string of HTML markup on the client?
- Example
- Method 2: Via the HTML iframe element
- Additional information
Writing JavaScript for XHTML
In practice, very few XHTML documents are served over the Web with the correct MIME media type, application/xhtml+xml. Whilst authored to the stricter rules of XML, they are sent with the media type for HTML (text/html). The receiving browser considers the content to be HTML, and does not utilise its XML parser.
There are a number of reasons for this. Partially it is because, prior to version 9, Internet Explorer was incapable of handling XHTML sent with the official XHTML media type at all. (Rather than displaying content, it would present the user with a file download dialog.) But it is also founded in the experience that JavaScript, authored carefully for HTML, can break when placed within an XML environment.
This article shows some of the reasons, along with strategies to remedy the problems. It will encourage web authors to use more XML features and make their JavaScript interoperable with real XHTML applications.
(Note that XHTML documents which behave correctly in both application/xhtml+xml and text/html environments are sometimes known as ‘polyglot’ documents.)
To test the following examples locally, use Firefox’s extension switch: for local files, Firefox chooses the parser by file extension. Just write an ordinary (X)HTML file and save it once as test.html and once as test.xhtml.
Problem: Nothing Works
After switching the MIME type, suddenly no inline script works anymore. Even the plain old alert() method is gone. The code looks something like this:
<script type="text/javascript">
<!--
window.alert("Hello World!");
//-->
</script>
Solution: The CDATA Trick
This problem usually arises when inline scripts are enclosed in comments. This was common practice in HTML, to hide the scripts from browsers not capable of JS. In the age of XML, comments are what they were intended to be: comments. Before processing the file, all comments will be stripped from the document, so enclosing your script in them is like throwing your lunch in a piranha pool. Moreover, there’s really no point to commenting out your scripts: no browser written in the last ten years will display your code on the page.
The easy solution is to do away with the commenting entirely:
<script type="text/javascript">
window.alert("Hello World!");
</script>
This will work so long as your code doesn’t contain characters which are ‘special’ in XML, which usually means < and &. If your code contains either of these, you can work around this with CDATA sections:
<script type="text/javascript">
<![CDATA[
var i = 0;
while (++i < 10) {
  // ...
}
]]>
</script>
Note that the CDATA section is only necessary because of the < in the code; otherwise you could have ignored it.
A third solution is to use only external scripts, neatly sidestepping the special-character problem.
Alternatively, the CDATA section can be couched within comments so as to be able to work in either application/xhtml+xml or text/html:
<script type="text/javascript">
//<![CDATA[
var i = 0;
while (++i < 10) {
  // ...
}
//]]>
</script>
And if you really need compatibility with very old browsers that do not recognize the script or style tags, and would therefore display their contents on the page, you can use this:
<script type="text/javascript"><!--//--><![CDATA[//><!--
...
//--><!]]></script>
See this document for more on the issues related to application/xhtml+xml and text/html (at least as far as XHTML 1.* and HTML 4; HTML5 addresses many of these problems).
Problem: Names in XHTML and HTML are represented in different cases
Scripts that used getElementsByTagName() with an upper case HTML name no longer work, and properties like nodeName or tagName return upper case in HTML and lower case in XHTML.
Solution: Use or convert to lower case
For methods like getElementsByTagName(), passing the name in lower case will work in both HTML and XHTML. For name comparisons, first convert to lower case before doing the comparison (e.g., el.nodeName.toLowerCase() === 'html'). This will ensure that documents in HTML will compare correctly and will do no harm in XHTML where the names are already lower case.
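The comparison described above could be wrapped in a small function. This is only a sketch: hasTagName is a hypothetical helper name, not part of any DOM API.

```javascript
// Hypothetical helper (not a DOM API): case-insensitive element name check.
function hasTagName(node, name) {
  // nodeName is upper case in HTML documents and lower case in XHTML,
  // so both sides are lower-cased before comparing.
  return node.nodeName.toLowerCase() === name.toLowerCase();
}
```

A call such as hasTagName(document.documentElement, 'html') should then give the same answer whichever parser produced the document.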
Problem: My Cookie Isn’t Saved!
We have already seen that the document object in XML files is different from the one in HTML files. Now we take a look at one widely used property that is missing in XML files: in XML documents there is no document.cookie. That is, you can write something like document.cookie = "key=value"; in XML as well, but nothing is saved in cookie storage.
Solution: Use the Storage Object
Firefox 2 enabled a new feature, the HTML 5 Storage object. Although this feature is not without its critics, you can use it to work around the missing cookie if your document is of type XML. Again, you will have to write your own wrapper to respect any given combination of MIME type and browser.
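Such a wrapper might look roughly like the following sketch. The name makeStore and the in-memory fallback are assumptions made for illustration. The function expects to be handed a Storage object (for example sessionStorage) and falls back to a plain object when none is usable.

```javascript
// Hypothetical wrapper (names are illustrative, not from any library).
// 'storage' is expected to be a Storage object such as sessionStorage,
// or null/undefined when the browser offers none.
function makeStore(storage) {
  var memory = {}; // fallback that lasts only as long as the page
  var usable = !!storage && typeof storage.setItem === 'function';
  return {
    set: function (key, value) {
      if (usable) {
        storage.setItem(key, String(value));
      } else {
        memory[key] = String(value);
      }
    },
    get: function (key) {
      if (usable) {
        return storage.getItem(key);
      }
      return Object.prototype.hasOwnProperty.call(memory, key) ? memory[key] : null;
    }
  };
}
```

Passing the storage object in, rather than reading a global inside the function, keeps the browser and MIME type decision in one place at the call site.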
Problem: I Can’t Use document.write()
This problem has the same cause as the one above. This method does not exist in XMLDocuments anymore. There are reasons why this decision was made, one being that a string of invalid markup will instantly break the whole document.
Solution: Use DOM Methods
Many people avoided DOM methods because of the amount of typing needed to create one simple element, when document.write() worked. Now you can’t do this as easily as before. Use DOM methods to create all of your elements, attributes and other nodes. This is XML-proof, as long as you keep the namespace problem in focus (e.g., there is a document.createElementNS method).
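As a rough illustration of the DOM approach, the sketch below builds a paragraph with createElementNS and createTextNode. The helper name appendParagraph is hypothetical, and the document is passed in as a parameter purely so the function does not depend on a global.

```javascript
// Namespace URI of XHTML elements.
var XHTML_NS = 'http://www.w3.org/1999/xhtml';

// Hypothetical helper: append <p>text</p> to 'parent' using DOM methods only.
function appendParagraph(doc, parent, text) {
  // createElementNS makes the namespace explicit, which matters in XML.
  var p = doc.createElementNS(XHTML_NS, 'p');
  // A text node needs no escaping of < or &, unlike a markup string.
  p.appendChild(doc.createTextNode(text));
  parent.appendChild(p);
  return p;
}
```

In a page this might be called as appendParagraph(document, document.body, 'Hello World!').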
Of course, you can still use strings like in document.write(), but it takes a little more effort. For example:
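var string = '<p xmlns="http://www.w3.org/1999/xhtml">Hello World!</p>';
var parser = new DOMParser();
// parseFromString() returns a whole Document, so import its root element
var parsedDocument = parser.parseFromString(string, "text/xml");
body.appendChild(document.importNode(parsedDocument.documentElement, true)); // assuming 'body' is the body element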
But be aware that if your string is not well-formed XML (e.g., you have an & where it should not be), this method will not give you your content: instead of throwing an exception, the parser returns a document containing a parsererror element.
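One way to catch that case early is sketched below. It assumes the browser reports the failure as a parsererror element somewhere in the returned document. The helper name parseXmlOrThrow is hypothetical.

```javascript
// Hypothetical helper: parse an XML string and throw on malformed input.
// 'parser' is expected to be a DOMParser instance, e.g. new DOMParser().
function parseXmlOrThrow(parser, xmlString) {
  var doc = parser.parseFromString(xmlString, 'application/xml');
  // A failed parse is signalled by a <parsererror> element in the result,
  // not by an exception, so it has to be checked for explicitly.
  if (doc.getElementsByTagName('parsererror').length > 0) {
    throw new Error('String is not well-formed XML');
  }
  return doc;
}
```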
Problem: I want to remain forward compatible!
Given the direction away from formatting attributes and the possibility of XHTML becoming eventually more prominent (or at least the document author having the possibility of later wanting to make documents available in XHTML for browsers that support it), one may wish to avoid features which are not likely to stay compatible into the future.
Solution: Avoid HTML-specific DOM
The HTML DOM, even though it is compatible with XHTML 1.0, is not guaranteed to work with future versions of XHTML (perhaps especially the formatting properties which have been deprecated as element attributes). The regular XML DOM provides sufficient methods via the Element interface for getting/setting/removing attributes.
Problem: My Favourite JS Library still Breaks
If you use JavaScript libraries like the famous prototype.js or Yahoo’s one, there is bad news for you: As long as the developers don’t apply the fixes mentioned above, you won’t be able to use them in your XML-XHTML applications.
Two possible ways remain, but neither is very promising: take the library, recode it and publish it; or e-mail the developers, e-mail your friends to e-mail the developers and e-mail your customers to e-mail the developers. If they get the hint and are not too annoyed, perhaps they will start to implement XML features in their libraries.
I Read about E4X. Now, This Is Perfect, Isn’t It?
As a matter of fact, it isn’t. E4X is a new method of using and manipulating XML in JavaScript. But although it was standardized by ECMA, no interface was defined to let E4X objects interact with the DOM objects our document consists of. So, for every advantage E4X has, without a DOM interface you can’t use it productively to manipulate your document. However, it can be used for data, and be converted into a string which can then be converted into a DOM object. DOM objects can similarly be converted into strings which can then be converted into E4X.
Finally: Content Negotiation
Now, how do we decide when to serve XHTML as XML? We can do this on the server side by evaluating the HTTP request headers. Every browser sends with its request a list of MIME types it understands, as part of the HTTP content negotiation mechanism. So if the browser tells our server that it can handle XHTML as XML, that is, the Accept: field in the HTTP header contains application/xhtml+xml somewhere, we are safe to send the content as XML.
In PHP, for example, you would write something like this:
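if (isset($_SERVER['HTTP_ACCEPT']) && strpos($_SERVER['HTTP_ACCEPT'], "application/xhtml+xml") !== false) {
    header("Content-type: application/xhtml+xml");
    // Send the XML declaration only when the document is served as XML
    echo '<?xml version="1.0"?>'."\n";
} else {
    header("Content-type: text/html");
}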
This branch also sends the XML declaration, which is strongly recommended when the document is an XML file. If the content is sent as HTML, an XML declaration would break IE’s Doctype switch, so we don’t want it there.
For completeness, here is the Accept field that Firefox 2.0.0.9 sends with its requests:
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
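The same decision can be sketched in JavaScript for a server-side script. The function name acceptsXhtml is hypothetical, and honouring the q parameter is an addition to the simple substring test described above: a client that lists application/xhtml+xml with q=0 is explicitly refusing it.

```javascript
// Hypothetical helper: does an Accept header value allow application/xhtml+xml?
function acceptsXhtml(acceptHeader) {
  if (typeof acceptHeader !== 'string') {
    return false;
  }
  var entries = acceptHeader.split(',');
  for (var i = 0; i < entries.length; i++) {
    var parts = entries[i].split(';');
    var type = parts[0].trim().toLowerCase();
    if (type !== 'application/xhtml+xml') {
      continue;
    }
    // Default quality is 1; an explicit q=0 means "not acceptable".
    var q = 1;
    for (var j = 1; j < parts.length; j++) {
      var param = parts[j].trim().toLowerCase();
      if (param.indexOf('q=') === 0) {
        q = parseFloat(param.slice(2));
      }
    }
    return q > 0;
  }
  return false;
}
```

For the Firefox header shown above, the third entry matches with no q parameter, so the function returns true.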
Further Reading
You will find several useful articles in the developer wiki:
DOM 2 methods you will need are:
See also
JavaScript | How do you create a DOM document object from a string of HTML markup on the client?
As its first parameter, the parseFromString() method of DOMParser takes a string that is, in essence, the complete markup of an HTML page.
As its second parameter, the method takes the MIME type that determines how the future document is parsed. It is one of:
- "text/html"
- "text/xml"
- "application/xml"
- "application/xhtml+xml"
- "image/svg+xml"
To create an HTML document, choose the string "text/html".
Example
We have a string, a potential fragment of a future document:
Create a new DOMParser object:
let ndp = new DOMParser()
Call the parsing method on the string:
let nDoc = ndp.parseFromString(stroka, "text/html")
The result is a document object (document) whose body element contains the complete markup we passed in the string.
Moreover, if we pass not just an HTML fragment but complete HTML markup, with doctype, comments and html elements, the resulting document is assembled just as well. This matters!
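The DOMParser steps can be gathered into one function, sketched here. The name parseHtmlString is hypothetical, and the optional parser parameter exists only so the function can be exercised outside a browser. The variable names stroka, ndp and nDoc follow the example.

```javascript
// Hypothetical wrapper around the DOMParser steps.
// 'parser' is optional; in a browser a new DOMParser is created.
function parseHtmlString(stroka, parser) {
  let ndp = parser || new DOMParser();
  // "text/html" selects the HTML parser, which also accepts fragments
  // and places their content inside the body of the new document.
  let nDoc = ndp.parseFromString(stroka, "text/html");
  return nDoc;
}
```

In a browser, parseHtmlString('<p>Hello</p>').body.firstChild should then be the p element from the string.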
Method 2: Via the HTML iframe element
We use JavaScript to create a new iframe element object.
let nif = document.createElement('iframe');
We insert this iframe into the currently open document so that it takes part in rendering the page.
We set the srcdoc attribute of our iframe to the string we want parsed into an HTML document.
Then we wait for the iframe’s load event, which signals that the document object model has been built. From that point on, the new document can be worked with using all the standard DOM methods.
let nDoc;
nif.onload = (event) => {
  // Log to the console, option 1
  console.log(nif.contentDocument);
  // Log to the console, option 2
  console.log(event.target.contentDocument);
  // ...or...
  // Store in a variable, option 1
  nDoc = nif.contentDocument;
  // Store in a variable, option 2
  nDoc = event.target.contentDocument;
};
We then obtain a new document object. Note: a nested document! Not the original one open in the browser tab, but the one nested in the iframe. Only a rendered iframe has a contentDocument property that contains the document we need. If the iframe is not rendered, but merely created within the tree of the parent document, the contentDocument property returns null.
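Put together, the iframe approach might look like the sketch below. The function name parseViaIframe and its callback parameter onReady are hypothetical. The host document is passed in explicitly, and the handler is registered before insertion so the load event cannot be missed.

```javascript
// Hypothetical helper: parse 'stroka' by loading it into an iframe.
// 'hostDocument' is the page's document; 'onReady' receives the nested document.
function parseViaIframe(stroka, hostDocument, onReady) {
  let nif = hostDocument.createElement('iframe');
  // Register the handler first so the load event is not missed.
  nif.onload = (event) => {
    onReady(event.target.contentDocument);
  };
  // srcdoc holds the markup the iframe should render.
  nif.srcdoc = stroka;
  // contentDocument is only populated once the iframe is in the document.
  hostDocument.body.appendChild(nif);
  return nif;
}
```

The returned iframe can be removed by the caller once the nested document is no longer needed. Whether that is appropriate depends on the page.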
Additional information
There is such a thing as a "document fragment". It is very important to distinguish it from a URI fragment, which is denoted by the hash sign # and is tied to an element on the page through its id.
The "document fragment" is described in detail in the DOM standard, in the section "5.5. Interface Range". There is also a separate section in the DOM Parsing and Serialization standard, titled "8. Extensions to the Range interface".
Objects that implement the Range interface are called live ranges.
What does this look like in practice? Let us create a new Range object.
On the resulting Range object, call the createContextualFragment() method, passing it our string of HTML markup:
let df = nRange.createContextualFragment(stroka)
As a result, we get back a new document fragment object. What are the advantages?
We can traverse the elements nested in the document fragment. This is very handy, because we may not always need a full document object.
For example, calling querySelectorAll() with the argument '*' returns all the element objects sent in the original string.
Or we can select all objects of a given type:
[...df.querySelectorAll('*')].filter(element => element.nodeName == 'P')
or
[...df.querySelectorAll('*')].filter(element => element.tagName == 'P')
Unfortunately, the convenient getElementsByTagName() method does not work on document fragment objects.
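A possible substitute is sketched below. It relies on querySelectorAll() being available on document fragments and on a type selector such as 'p' matching elements by name. The helper name fragmentElementsByTagName is hypothetical.

```javascript
// Hypothetical helper: a getElementsByTagName() substitute for fragments.
function fragmentElementsByTagName(df, tagName) {
  // A type selector selects by element name, so no filtering is needed.
  // Array.from turns the static NodeList into a plain array.
  return Array.from(df.querySelectorAll(tagName));
}
```

Assuming the range was created with document.createRange() (the original does not show which way it was created), usage could be: let nRange = document.createRange(); let df = nRange.createContextualFragment(stroka); let paragraphs = fragmentElementsByTagName(df, 'p');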