Dodo

Dodo is a pure PHP implementation of the HTML DOM, based on a rigorous PHP binding of the WHATWG WebIDL specification of the DOM API. It aims to be modern and correct, and where possible complete. Like the domino JavaScript library it was inspired by, it is intended for server-side DOM manipulation and so deliberately avoids implementing the portions of the DOM dealing with dynamic loading, browser layout, and other similar features. It does aim to be *fast*.

In MediaWiki
Dodo will become available in MediaWiki as a composer dependency of  when it is mature.

Over time it is expected to gradually replace all usage of the native PHP DOM extension, which is buggy, ill-maintained, and out-of-date.

Everywhere else
Install the wikimedia/dodo package from Packagist:

composer require wikimedia/dodo

Semantic versioning is used.

The major version number will be incremented for every change that breaks backwards compatibility.

Architecture overview
For full reference documentation, please see the documentation generated from the source (or the source itself):


 * Generated API documentation

The API implemented by Dodo is mostly defined by the PHP binding for WebIDL. This is described in the IDLeDOM documentation.

Examples
In the above code sample, we first construct an HTML document with the given title, then parse an HTML string in order to populate the <body> element of the document. In order to demonstrate how property-style access is supported, we then re-serialize the body and return it.

Performance
Dodo has not yet been fully benchmarked, but we hope it will be competitive with the native PHP DOM extension.

There are two aspects of performance: memory usage and speed.

In order to minimize memory usage, the number of fields in each DOM Node has been minimized wherever possible. Fields like the nodeType and nodeName are not actually stored in the DOM Node, but implemented via dynamic dispatch based on the type of the object.
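The idea of answering nodeType and nodeName through dynamic dispatch rather than stored fields can be sketched as follows. This is a minimal illustration only: the class names SketchNode, SketchElement, and SketchText are hypothetical and are not Dodo's actual classes.

```php
<?php
// Hypothetical sketch; these classes are NOT part of Dodo.
abstract class SketchNode {
    public const ELEMENT_NODE = 1;
    public const TEXT_NODE = 3;

    // No $nodeType or $nodeName field is declared here: each subclass
    // answers these questions by overriding a method, so the answer
    // costs no per-object memory.
    abstract public function getNodeType(): int;
    abstract public function getNodeName(): string;
}

final class SketchElement extends SketchNode {
    private string $localName;

    public function __construct(string $localName) {
        $this->localName = $localName;
    }

    public function getNodeType(): int {
        return self::ELEMENT_NODE;
    }

    public function getNodeName(): string {
        // Derived from the one field an element must store anyway.
        return strtoupper($this->localName);
    }
}

final class SketchText extends SketchNode {
    private string $data;

    public function __construct(string $data) {
        $this->data = $data;
    }

    public function getNodeType(): int {
        return self::TEXT_NODE;
    }

    public function getNodeName(): string {
        // Identical for every text node, so nothing is stored at all.
        return '#text';
    }
}

$element = new SketchElement('div');
$text = new SketchText('hello');
```

In this sketch a text node carries only its character data, and an element carries only its local name; the type and name are properties of the class, not of the instance.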

For speed, Dodo uses a fairly common optimization that represents node children in a linked list (a circular linked list, in particular) and avoids creating the backing arrays required by the spec to implement (e.g.) Node::children unless they are requested. Writing your code to iterate using Node::firstChild and Node::nextSibling instead of iterating over the Node::children array will be fastest (as it is in most browser DOM implementations as well).
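A rough sketch of this layout, under the assumption that siblings form a circular doubly linked list and that the child array is built only on demand, might look like the following. ListNode and all of its members (including getChildArray and childCache) are hypothetical names chosen for illustration; they do not reflect Dodo's internals.

```php
<?php
// Hypothetical sketch; ListNode is NOT a Dodo class.
class ListNode {
    public string $label;
    public ?ListNode $parent = null;
    public ?ListNode $firstChild = null;
    // Circular links: the last child's $next points back at the first child.
    public ListNode $next;
    public ListNode $prev;
    // Backing array, left null until somebody actually asks for it.
    public ?array $childCache = null;

    public function __construct(string $label) {
        $this->label = $label;
        $this->next = $this;
        $this->prev = $this;
    }

    // Simplified: assumes $child is not already attached elsewhere.
    public function appendChild(ListNode $child): void {
        $child->parent = $this;
        if ($this->firstChild === null) {
            $this->firstChild = $child;
            $child->next = $child;
            $child->prev = $child;
        } else {
            $first = $this->firstChild;
            $last = $first->prev; // O(1) access to the tail via the circle
            $last->next = $child;
            $child->prev = $last;
            $child->next = $first;
            $first->prev = $child;
        }
        $this->childCache = null; // any previously built array is stale
    }

    public function getFirstChild(): ?ListNode {
        return $this->firstChild;
    }

    public function getNextSibling(): ?ListNode {
        // Wrapping around to the first child means there is no next sibling.
        if ($this->parent === null || $this->next === $this->parent->firstChild) {
            return null;
        }
        return $this->next;
    }

    public function getChildArray(): array {
        if ($this->childCache === null) {
            $this->childCache = [];
            for ($n = $this->getFirstChild(); $n !== null; $n = $n->getNextSibling()) {
                $this->childCache[] = $n;
            }
        }
        return $this->childCache;
    }
}

$list = new ListNode('ul');
foreach (['a', 'b', 'c'] as $label) {
    $list->appendChild(new ListNode($label));
}

// Preferred traversal: walks the links and never allocates the array.
$labels = [];
for ($n = $list->getFirstChild(); $n !== null; $n = $n->getNextSibling()) {
    $labels[] = $n->label;
}
```

The circular arrangement gives constant-time access to the last child through the first child's previous link, so appending does not require a separate tail pointer on the parent.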

In order to achieve maximum performance, the getters and setters required by the DOM spec are implemented as explicit methods, for example Node::getNodeType. The complex getter/setter behavior required by the DOM can't be guaranteed via PHP properties, but we do support property-style access (e.g. $node->nodeType) via the magic methods __get and __set. These impose a performance penalty, however, so for best performance client code should call the getter and setter methods explicitly.
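The relationship between explicit accessors and property-style access can be illustrated with a small sketch. It assumes a simple naming convention (property foo maps to getFoo and setFoo); the class MagicNode is hypothetical, and Dodo's actual __get and __set implementations may dispatch differently.

```php
<?php
// Hypothetical sketch; MagicNode is NOT a Dodo class.
class MagicNode {
    private string $value = '';

    // Explicit accessors: the fast path, called directly.
    public function getNodeValue(): string {
        return $this->value;
    }

    public function setNodeValue(string $value): void {
        $this->value = $value;
    }

    // Read-only in this sketch: there is a getter but no setter.
    public function getNodeType(): int {
        return 3;
    }

    // Slow path: PHP calls this only for undeclared or inaccessible
    // properties, then we pay for a string build and a method lookup.
    public function __get(string $name) {
        $getter = 'get' . ucfirst($name);
        if (method_exists($this, $getter)) {
            return $this->$getter();
        }
        throw new LogicException("Unknown property: $name");
    }

    public function __set(string $name, $value): void {
        $setter = 'set' . ucfirst($name);
        if (method_exists($this, $setter)) {
            $this->$setter($value);
            return;
        }
        throw new LogicException("Cannot set property: $name");
    }
}

$node = new MagicNode();
$node->nodeValue = 'hello';       // routed through __set to setNodeValue()
$viaProperty = $node->nodeValue;  // routed through __get to getNodeValue()
$viaMethod = $node->getNodeValue(); // direct call, no magic involved
```

Both access styles yield the same result here; the direct method call simply skips the extra indirection, which is why it is the better choice in hot loops.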