Class HTMLPurifier_Lexer_DOMLex

Parser that uses PHP 5's DOM extension (part of the core).

In PHP 5, the DOM XML extension was revamped into DOM and added to the core. It gives us a forgiving HTML parser, which we use to transform the HTML into a DOM, and then into tokens. It is blazingly fast (for large documents, it performs twenty times faster than HTMLPurifier_Lexer_DirectLex) and is the default choice for PHP 5.
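The HTML-to-DOM-to-tokens flow described above can be illustrated with a minimal sketch. It is not the class's actual implementation: the function names `sketch_tokenize_html` and `sketch_tokenize_node` are hypothetical, tokens are plain arrays rather than `HTMLPurifier_Token` objects, the tree walk is written recursively for brevity, and the wrapping `div` is only an assumption about how a fragment might be given a single root.

```php
<?php
// Hypothetical helper (not part of HTML Purifier): flattens one DOM node
// and its descendants into a list of simple token arrays.
function sketch_tokenize_node(DOMNode $node, array &$tokens): void
{
    if ($node instanceof DOMText) {
        $tokens[] = ['text', $node->data];
        return;
    }
    if (!($node instanceof DOMElement)) {
        return; // comments and other node types are ignored in this sketch
    }
    $tokens[] = ['start', $node->tagName];
    foreach ($node->childNodes as $child) {
        sketch_tokenize_node($child, $tokens);
    }
    $tokens[] = ['end', $node->tagName];
}

// Hypothetical helper: parses an HTML fragment with the DOM extension
// and returns the flattened token list.
function sketch_tokenize_html(string $html): array
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    // Wrap the fragment so that it has a single, known root element.
    $doc->loadHTML('<html><body><div>' . $html . '</div></body></html>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $root = $doc->getElementsByTagName('div')->item(0);
    $tokens = [];
    foreach ($root->childNodes as $child) {
        sketch_tokenize_node($child, $tokens);
    }
    return $tokens;
}

$tokens = sketch_tokenize_html('<b>Hi</b> there');
```

The wrapper element itself is skipped, so only tokens for the fragment's own nodes end up in `$tokens`.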

HTMLPurifier_Lexer

HTMLPurifier_Lexer_DOMLex

Direct known subclasses

HTMLPurifier_Lexer_PH5P

Note: Any empty elements will have empty tokens associated with them, even if this is prohibited by the spec. This cannot be fixed until the spec comes into play.
Note: PHP's DOM extension does not actually parse any entities; we use our own function to do that.
Warning: DOM tends to drop whitespace, which may wreak havoc on indenting. If this is a huge problem, because the HTML is hand-edited and you are unable to get a parser cache that caches the output of HTML Purifier while keeping the original HTML lying around, you may want to run Tidy on the resulting output or use HTMLPurifier_Lexer_DirectLex.
Located at x2engine/framework/vendors/htmlpurifier/HTMLPurifier.standalone.php

Methods summary
`public` `__construct( )`
    Overrides `HTMLPurifier_Lexer::__construct()`

`public HTMLPurifier_Token[]` `tokenizeHTML( string $html, HTMLPurifier_Config $config, HTMLPurifier_Context $context )`
    Parameters: `$html` (`string`); `$config` (`HTMLPurifier_Config`); `$context` (`HTMLPurifier_Context`)
    Returns: `HTMLPurifier_Token[]`
    Overrides `HTMLPurifier_Lexer::tokenizeHTML()`

`protected HTMLPurifier_Token` `tokenizeDOM( DOMNode $node, HTMLPurifier_Token[] & $tokens )`
    Iterative function that tokenizes a node, putting it into an accumulator. To iterate is human, to recurse divine - L. Peter Deutsch
    Parameters: `$node` (`DOMNode`) DOMNode to be tokenized; `$tokens` (`HTMLPurifier_Token[]`) Array-list of already tokenized tokens.
    Returns: `HTMLPurifier_Token` of node appended to previously passed tokens.

`protected boolean` `createStartNode( DOMNode $node, HTMLPurifier_Token[] & $tokens, boolean $collect )`
    Parameters: `$node` (`DOMNode`) DOMNode to be tokenized; `$tokens` (`HTMLPurifier_Token[]`) Array-list of already tokenized tokens; `$collect` (`boolean`) Says whether or not start and close tokens are collected; set to false at first recursion because it's the implicit DIV tag you're dealing with.
    Returns: `boolean`, true if the token needs an end token.

`protected` `createEndNode( DOMNode $node, HTMLPurifier_Token[] & $tokens )`
    Parameters: `$node` (`DOMNode`); `$tokens` (`HTMLPurifier_Token[]`)

`protected array` `transformAttrToAssoc( DOMNamedNodeMap $node_map )`
    Converts a DOMNamedNodeMap of DOMAttr objects into an assoc array.
    Parameters: `$node_map` (`DOMNamedNodeMap`) DOMNamedNodeMap of DOMAttr objects.
    Returns: `array` Associative array of attributes.

`public` `muteErrorHandler( integer $errno, string $errstr )`
    An error handler that mutes all errors.
    Parameters: `$errno` (`integer`); `$errstr` (`string`)

`public string` `callbackUndoCommentSubst( array $matches )`
    Callback function for undoing escaping of stray angled brackets in comments.
    Parameters: `$matches` (`array`)
    Returns: `string`

`public string` `callbackArmorCommentEntities( array $matches )`
    Callback function that entity-izes ampersands in comments so that callbackUndoCommentSubst doesn't clobber them.
    Parameters: `$matches` (`array`)
    Returns: `string`

`protected string` `wrapHTML( string $html, HTMLPurifier_Config $config, HTMLPurifier_Context $context )`
    Wraps an HTML fragment in the necessary HTML.
    Parameters: `$html` (`string`); `$config` (`HTMLPurifier_Config`); `$context` (`HTMLPurifier_Context`)
    Returns: `string`
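As a rough illustration of what `transformAttrToAssoc()` and `muteErrorHandler()` are documented to do, the following sketch converts a `DOMNamedNodeMap` into an associative array and silences parser warnings while loading markup. The functions `sketch_attr_to_assoc` and `sketch_mute_error_handler` are hypothetical stand-ins written for this example, not the library's own code, and the way they are wired together here is an assumption.

```php
<?php
// Hypothetical stand-in for transformAttrToAssoc(): maps each DOMAttr
// in the node map to an attribute-name => attribute-value pair.
function sketch_attr_to_assoc(?DOMNamedNodeMap $nodeMap): array
{
    $attrs = [];
    if ($nodeMap === null) {
        return $attrs; // nodes without attributes yield an empty array
    }
    foreach ($nodeMap as $attr) {
        $attrs[$attr->name] = $attr->value;
    }
    return $attrs;
}

// Hypothetical stand-in for muteErrorHandler(): returning true tells PHP
// that the error was handled, so nothing is reported.
function sketch_mute_error_handler(int $errno, string $errstr): bool
{
    return true;
}

$doc = new DOMDocument();
// Mute warnings only for the duration of the parse, then restore.
set_error_handler('sketch_mute_error_handler');
$doc->loadHTML('<html><body><a href="/x" title="T">link</a></body></html>');
restore_error_handler();

$link = $doc->getElementsByTagName('a')->item(0);
$attrs = sketch_attr_to_assoc($link->attributes);
```

Restoring the previous handler immediately after the parse keeps the muting from hiding unrelated errors elsewhere in the application.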

Methods inherited from HTMLPurifier_Lexer
`CDATACallback(), create(), escapeCDATA(), escapeCommentedCDATA(), extractBody(), normalize(), parseData(), removeIEConditional()`

Properties inherited from HTMLPurifier_Lexer
`$_special_entity2str, $tracksLineNumbers`
