[langsec-discuss] Internet Explorer XML entity processing

Harald Lampesberger h.lampesberger at cdcc.faw.jku.at
Wed Mar 20 10:53:47 UTC 2013

Hello langsec,

I recently worked through the XML standards and stumbled upon a
potential (?) weird machine in IE's XML entity declaration
implementation. I don't know if the behavior changes with IE settings or
security zones...

* XML entity example:
<!DOCTYPE lol [
<!ENTITY var "value">
]>
<lol>look at &var;</lol>
result: <lol>look at value</lol>

* So far nothing new, a classic DoS attack starts like this [2]:
<!DOCTYPE lol [
<!ENTITY v1 "&v2;&v2;">
<!ENTITY v2 "&v3;&v3;">
<!ENTITY v3 "&v4;&v4;">
<!ENTITY v4 "a">
]>
<lol>look at &v1;</lol>
result: <lol>look at aaaaaaaa</lol>

Entities are resolved recursively in the infoset, nice.
The XML spec predefines five entities to keep the alphabets of element
content and markup characters disjoint:
&quot; &amp; &apos; &gt; &lt;
All Unicode characters are also available through character references:
&#x30; = &#48; = 0
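
To make the recursive resolution concrete, here is a minimal Python
sketch of a toy expander. It is not how IE or any real XML parser is
implemented; the names expand() and chain() are made up for
illustration, and it only models text substitution, not markup.

```python
import re

# Entity table for the doubling example (names taken from the post).
ENTITIES = {"v1": "&v2;&v2;", "v2": "&v3;&v3;", "v3": "&v4;&v4;", "v4": "a"}

# Matches hex character references, decimal character references,
# and named entity references, in that order.
REF = re.compile(r"&(#x[0-9A-Fa-f]+|#[0-9]+|\w+);")


def expand(text, entities):
    """Recursively replace references in text (toy model, no error handling)."""
    def resolve(match):
        ref = match.group(1)
        if ref.startswith("#x"):
            return chr(int(ref[2:], 16))
        if ref.startswith("#"):
            return chr(int(ref[1:]))
        # Named entity: its replacement text may contain further references.
        return expand(entities[ref], entities)
    return REF.sub(resolve, text)


def chain(depth):
    """Build a doubling chain v1..v<depth> shaped like the example."""
    table = {f"v{i}": f"&v{i + 1};" * 2 for i in range(1, depth)}
    table[f"v{depth}"] = "a"
    return table


print(expand("look at &v1;", ENTITIES))
print(len(expand("&v1;", chain(11))))
```

Each extra level doubles the output, so a chain of depth n expands to
2^(n-1) characters, which is what makes this pattern a DoS vector.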

* We can create new elements, that are also evaluated, in the infoset
through entities:
<!DOCTYPE lol [
<!ENTITY v1 "&#x3C;rofl&#x3E;&v2;&#x3C;/rofl&#x3E;">
<!ENTITY v2 "content">
]>
<lol>&v1;</lol>
result: <lol><rofl>content</rofl></lol>
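
This behavior is not specific to browsers. As a rough cross-check,
the sketch below feeds the same kind of document to Python's standard
xml.etree.ElementTree, which is backed by expat. The closing "]>" and
the document element are my assumptions about the full document, and
the outcome may depend on the expat version in use.

```python
import xml.etree.ElementTree as ET

# Same entity declarations as above; "]>" and <lol>&v1;</lol> are assumed.
DOC = """<!DOCTYPE lol [
<!ENTITY v1 "&#x3C;rofl&#x3E;&v2;&#x3C;/rofl&#x3E;">
<!ENTITY v2 "content">
]>
<lol>&v1;</lol>"""

root = ET.fromstring(DOC)

# If the parser re-parses the entity's replacement text as markup,
# <rofl> shows up as a real child element, not as escaped text.
children = [(child.tag, child.text) for child in root]
print(children)
print(ET.tostring(root, encoding="unicode"))
```

The character references are resolved when the entity is declared, so
the replacement text already contains literal angle brackets by the
time &v1; is referenced in the content.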

* And we can also declare new entities through parameter entities (based
on [1]):
<!DOCTYPE lol [
<!ENTITY v3 "&v2;">
<!ENTITY % v1 "<!ENTITY v2 'content'>">
%v1;
]>
<lol>&v3;</lol>
result: <lol>content</lol>
This example mixes "general entities" (&bla;) and "parameter entities"
(%bla;). Parameter entities are evaluated before the general entities.
IE crunches it but Chrome reports an error.
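
The two-phase evaluation can be sketched with a toy model in Python.
This is only an illustration under simplifying assumptions: it uses a
well-formed variant of the example (single quotes inside the parameter
entity and an explicit %v1; reference), it processes declarations with
regular expressions instead of a real DTD parser, and declare() and
expand() are hypothetical helper names.

```python
import re

# Assumed well-formed variant of the internal subset from the example.
SUBSET = """
<!ENTITY v3 "&v2;">
<!ENTITY % v1 "<!ENTITY v2 'content'>">
%v1;
"""

PE_DECL = re.compile(r'<!ENTITY\s+%\s+(\w+)\s+(["\'])(.*?)\2\s*>', re.S)
GE_DECL = re.compile(r'<!ENTITY\s+(\w+)\s+(["\'])(.*?)\2\s*>', re.S)


def declare(subset):
    """Phase 1: expand parameter entities, then collect general entities."""
    params = {m.group(1): m.group(3) for m in PE_DECL.finditer(subset)}
    # Drop the parameter entity declarations so their literals are not
    # mistaken for general entity declarations.
    body = PE_DECL.sub("", subset)
    # Replace each %name; reference with its replacement text.
    body = re.sub(r"%(\w+);", lambda m: params[m.group(1)], body)
    return {m.group(1): m.group(3) for m in GE_DECL.finditer(body)}


def expand(text, general):
    """Phase 2: recursively resolve general entity references."""
    return re.sub(r"&(\w+);",
                  lambda m: expand(general[m.group(1)], general), text)


general = declare(SUBSET)
print(general)
print(expand("<lol>&v3;</lol>", general))
```

In this model v2 only exists after %v1; has been expanded, which is
the sense in which parameter entities can write new declarations.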

The last example in particular looks like a tag system or semi-Thue
system and definitely has computational power. The question is, how
much? Another question is, does IE use the same XML library as WCF?

When I have more time, I will continue my search for the weird machine
and try to come up with better examples. I also appreciate feedback :-)


[1] http://www.w3.org/TR/xml11/#sec-entexpand
[2] http://clawslab.nds.rub.de/wiki/index.php/XML_Entity_Expansion

