Processing Instructions
The HTML5 spec does not allow processing instructions. We do. This is a server-side library, and we believe they are useful.
Take the document:
<!DOCTYPE html>
<html>
<?foo bar?>
</html>
The <?foo bar?> is a processing instruction. Processing instructions start with <?, are followed by a node name (foo in this case), and close with ?>.
When this is parsed using \HTML5::loadHTML(), the processing instruction node will be an instance of \DOMProcessingInstruction with a nodeName property of foo and a data property of bar.
Processing instructions become useful when we act on them, for example by manipulating the DOM. An instruction processor takes an instruction and acts on it. It is defined by the interface \HTML5\InstructionProcessor, which has a single method, process. For example, let's create a dummy counter.
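To make the node properties above concrete, here is a minimal sketch. It uses PHP's built-in XML parser (DOMDocument::loadXML) as a stand-in for \HTML5::loadHTML(), which is an assumption made only so the snippet runs without the library; the resulting node class and its properties are the same ones described above.

```php
<?php
// Stand-in parser: PHP's built-in XML loader, not this library.
$doc = new DOMDocument();
$doc->loadXML('<html><?foo bar?></html>');

// The processing instruction is the first child of the root element.
$pi = $doc->documentElement->firstChild;

$isPi = $pi instanceof DOMProcessingInstruction;
$name = $pi->nodeName; // the instruction's node name (also available as $pi->target)
$data = $pi->data;     // everything after the node name
```

Here $name holds the node name and $data holds the rest of the instruction's content.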
<?php
use \HTML5\InstructionProcessor;

class foo implements InstructionProcessor {
    // Number of processing instructions seen so far.
    public $bar = 0;

    public function process(\DOMElement $element, $name, $data) {
        $this->bar++;
        // Hand the element back unchanged.
        return $element;
    }
}
This class is really simple. Every time a processing instruction is encountered, the counter is incremented and the element passed to process is returned unchanged. The returned element is what is attached to the DOM. If a processing instruction should be replaced with a different element, return that element instead.
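As a slightly less trivial illustration, the sketch below shows a processor that changes the DOM in response to an instruction. Everything in it is hypothetical: the GreetProcessor class, the greet instruction name, and the locally declared InstructionProcessor interface, which is only a stand-in with the same process signature so the snippet runs without the library. It also assumes that $element is the element the tree builder is working on when the instruction is found, and it calls process by hand where the tree builder would normally do so.

```php
<?php
// Stand-in for \HTML5\InstructionProcessor so the sketch runs on its own.
// In real code, drop this declaration and import the library's interface.
interface InstructionProcessor {
    public function process(\DOMElement $element, $name, $data);
}

// Hypothetical processor: a "greet" instruction adds a <span> greeting.
class GreetProcessor implements InstructionProcessor {
    public function process(\DOMElement $element, $name, $data) {
        if ($name === 'greet') {
            $text = 'Hello, ' . trim($data);
            $span = $element->ownerDocument->createElement('span', $text);
            $element->appendChild($span);
        }
        // Instructions with any other name leave the element untouched.
        return $element;
    }
}

// Simulate the call made for a "greet" instruction whose data is "World".
$doc = new DOMDocument();
$body = $doc->appendChild($doc->createElement('body'));
$processor = new GreetProcessor();
$result = $processor->process($body, 'greet', 'World');
$html = $doc->saveXML($body);
```

Dispatching on $name inside process lets one processor handle several instruction names while ignoring any it does not recognize.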
The instruction processor must be attached to the DOM tree builder before it can be used. To do this we need a custom parsing function. Because we already have the building blocks, this is quite simple.
// Scanner and Tokenizer are assumed to live alongside DOMTreeBuilder in \HTML5\Parser.
use \HTML5\Parser\DOMTreeBuilder;
use \HTML5\Parser\Scanner;
use \HTML5\Parser\Tokenizer;

function my_parser(\HTML5\Parser\InputStream $input) {
    // Create an instance of the instruction processor.
    $foo = new foo();
    $events = new DOMTreeBuilder();
    // Attach it to the event-based DOM tree builder.
    $events->setInstructionProcessor($foo);
    $scanner = new Scanner($input);
    $parser = new Tokenizer($scanner, $events);
    $parser->parse();
    return $events->document();
}
To parse the document, use my_parser instead of one of the built-in parsers, and the instruction processor will be called for each processing instruction.
For more details on how this works, take a peek inside \HTML5\Parser\DOMTreeBuilder.