Class HTMLPurifier_Lexer_DOMLex
Parser that uses PHP 5's DOM extension (part of the core).
In PHP 5, the DOM XML extension was revamped into DOM and added to the core. It gives us a forgiving HTML parser, which we use to transform the HTML into a DOM, and then into the tokens. It is blazingly fast (for large documents, it performs twenty times faster than HTMLPurifier_Lexer_DirectLex), and is the default choice for PHP 5.
- HTMLPurifier_Lexer
- HTMLPurifier_Lexer_DOMLex
Direct known subclasses
HTMLPurifier_Lexer_PH5P
Note: PHP's DOM extension does not actually parse any entities; we use our own function to do that.
Warning: DOM tends to drop whitespace, which may wreak havoc on indenting. If this is a huge problem because the HTML is hand-edited and you cannot set up a parser cache that stores the output of HTML Purifier while keeping the original HTML around, you may want to run Tidy on the resulting output or use HTMLPurifier_Lexer_DirectLex.
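As a minimal sketch of the second option in the warning above, the lexer can be selected through HTML Purifier's `Core.LexerImpl` configuration directive. The `require_once` path is an assumption about how the library is loaded in your project, and the sample markup is purely illustrative.

```php
<?php
// Assumption: HTML Purifier is installed and reachable through this loader;
// adjust the path (or use your own autoloader) for your setup.
require_once 'HTMLPurifier.auto.php';

$config = HTMLPurifier_Config::createDefault();

// Core.LexerImpl chooses the lexer implementation. 'DirectLex' bypasses
// the DOM extension, so its whitespace handling does not apply.
$config->set('Core.LexerImpl', 'DirectLex');

$purifier = new HTMLPurifier($config);

// Illustrative hand-edited fragment with indentation worth preserving.
echo $purifier->purify("<p>Hand-edited\n    <b>markup</b></p>");
```

The trade-off is speed: as noted in the description, DOMLex is considerably faster on large documents, so switching lexers is mainly worthwhile when whitespace fidelity matters more than throughput.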
Located at x2engine/framework/vendors/htmlpurifier/HTMLPurifier.standalone.php
protected | tokenizeDOM( | Iterative function that tokenizes a node, putting it into an accumulator. To iterate is human, to recurse divine - L. Peter Deutsch
protected array | transformAttrToAssoc( | Converts a DOMNamedNodeMap of DOMAttr objects into an assoc array.
public string | callbackUndoCommentSubst( array $matches ) | Callback function for undoing escaping of stray angled brackets in comments
public string | callbackArmorCommentEntities( array $matches ) | Callback function that entity-izes ampersands in comments so that callbackUndoCommentSubst doesn't clobber them
protected string | wrapHTML( string $html, | Wraps an HTML fragment in the necessary HTML
Methods inherited from HTMLPurifier_Lexer
CDATACallback(), create(), escapeCDATA(), escapeCommentedCDATA(), extractBody(), normalize(), parseData(), removeIEConditional()
Properties inherited from HTMLPurifier_Lexer
$_special_entity2str, $tracksLineNumbers
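The descriptions of tokenizeDOM and transformAttrToAssoc above can be illustrated with a small, self-contained sketch that uses only PHP's DOM extension. This is not the library's implementation: the function names `attrsToAssoc` and `tokenizeIteratively` and the simple array-based token shape are hypothetical, chosen only to show an explicit-stack (non-recursive) walk that appends to an accumulator.

```php
<?php
// Hypothetical helper: turn a DOMNamedNodeMap of DOMAttr objects
// into an associative array of name => value.
function attrsToAssoc(?DOMNamedNodeMap $map): array
{
    $assoc = [];
    if ($map === null) {
        return $assoc;
    }
    foreach ($map as $attr) {
        $assoc[$attr->name] = $attr->value;
    }
    return $assoc;
}

// Hypothetical walker: visits a node tree with an explicit stack instead of
// recursion and appends simple start/text/end tokens to an accumulator.
function tokenizeIteratively(DOMNode $root): array
{
    $tokens = [];
    // Each entry is [node, closing]; closing entries emit the end token.
    $stack = [[$root, false]];
    while ($stack) {
        [$node, $closing] = array_pop($stack);
        if ($closing) {
            $tokens[] = ['end', $node->nodeName];
            continue;
        }
        if ($node->nodeType === XML_TEXT_NODE) {
            $tokens[] = ['text', $node->nodeValue];
            continue;
        }
        if ($node->nodeType !== XML_ELEMENT_NODE) {
            continue; // This sketch ignores comments and other node types.
        }
        $tokens[] = ['start', $node->nodeName, attrsToAssoc($node->attributes)];
        $stack[] = [$node, true];
        // Push children in reverse so they are popped in document order.
        for ($i = $node->childNodes->length - 1; $i >= 0; $i--) {
            $stack[] = [$node->childNodes->item($i), false];
        }
    }
    return $tokens;
}

$doc = new DOMDocument();
$doc->loadHTML('<html><body><div><a href="x.html" title="T">link</a></div></body></html>');
$div = $doc->getElementsByTagName('div')->item(0);
$tokens = tokenizeIteratively($div);
```

An explicit stack keeps memory use independent of PHP's call depth, which is one plausible reason to prefer iteration over recursion for deeply nested markup.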