@@ -17,13 +17,15 @@ $html = $converter->convert($djotString);
1717public function __construct(
1818 bool $xhtml = false,
1919 bool $warnings = false,
20- bool $strict = false
20+ bool $strict = false,
21+ bool|SafeMode|null $safeMode = null
2122)
2223```
2324
2425- `$xhtml`: When `true`, produces XHTML-compatible output (self-closing tags like `<br />`).
2526- `$warnings`: When `true`, collects warnings during parsing (see [Error Handling](#error-handling)).
2627- `$strict`: When `true`, throws `ParseException` on parse errors (see [Error Handling](#error-handling)).
28+ - `$safeMode`: When `true` or a `SafeMode` instance, enables XSS protection (see [Safe Mode](#safe-mode)).
2729
2830### Methods
2931
@@ -133,6 +135,124 @@ public function clearWarnings(): self
133135
134136Clears any collected warnings.
135137
138+ #### setSafeMode
139+
140+ ``` php
141+ public function setSafeMode(bool|SafeMode|null $safeMode): self
142+ ```
143+
144+ Enable, disable, or configure safe mode after construction. Pass `true` for defaults, a `SafeMode` instance for custom configuration, or `null`/`false` to disable.
145+
146+ ## Safe Mode
147+
148+ Safe mode provides built-in XSS protection for user-generated content.
149+
150+ ### Basic Usage
151+
152+ ``` php
153+ use Djot\DjotConverter;
154+
155+ // Enable with sensible defaults
156+ $converter = new DjotConverter(safeMode: true);
157+ $html = $converter->convert($userInput);
158+ ```
159+
160+ ### What Safe Mode Does
161+
162+ 1. **URL Sanitization**: Blocks dangerous URL schemes in links and images
163+    - Blocked by default: `javascript:`, `vbscript:`, `data:`, `file:`
164+    - Safe URLs like `https:`, `mailto:`, and relative paths are allowed
165+
166+ 2. **Attribute Filtering**: Strips event handler attributes and other dangerous attributes
167+    - Blocks attributes starting with `on` (e.g., `onclick`, `onload`, `onerror`)
168+    - Blocks specific dangerous attributes (`srcdoc`, `formaction`)
169+    - Allows safe attributes like `class`, `id`, `data-*`
170+
171+ 3. **Raw HTML Handling**: Controls how raw HTML is processed
172+    - `escape` (default): HTML-encodes raw HTML so it displays as text
173+    - `strip`: Removes raw HTML entirely
174+    - `allow`: Passes raw HTML through (not recommended)
175+
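The first two rules above can be illustrated with a small standalone sketch. This is not the library's implementation: `sketchIsUrlSafe()` and `sketchFilterAttributes()` are hypothetical helpers written only to show the idea of a scheme blocklist and a prefix/name attribute filter, and they assume PHP 8.0 or later (for `str_starts_with`).

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of the Djot API): returns false when the URL
 * uses one of the blocked schemes, true otherwise.
 */
function sketchIsUrlSafe(string $url, array $blockedSchemes): bool
{
    // Drop whitespace and control characters, then lowercase, so that
    // variants such as " JaVa\tScript:" are still recognised.
    $normalized = strtolower(preg_replace('/[\x00-\x20]+/', '', $url) ?? '');

    // A scheme is a letter followed by letters, digits, "+", "." or "-", then ":".
    if (preg_match('/^([a-z][a-z0-9+.\-]*):/', $normalized, $m) !== 1) {
        return true; // No scheme at all: relative path, query, or fragment.
    }

    return !in_array($m[1], $blockedSchemes, true);
}

/**
 * Hypothetical helper (not part of the Djot API): removes attributes whose
 * name is blocked outright or starts with a blocked prefix.
 */
function sketchFilterAttributes(array $attributes, array $blockedPrefixes, array $blockedNames): array
{
    return array_filter(
        $attributes,
        static function ($name) use ($blockedPrefixes, $blockedNames): bool {
            $lower = strtolower((string) $name);
            if (in_array($lower, $blockedNames, true)) {
                return false;
            }
            foreach ($blockedPrefixes as $prefix) {
                if (str_starts_with($lower, $prefix)) {
                    return false;
                }
            }
            return true;
        },
        ARRAY_FILTER_USE_KEY
    );
}

$blocked = ['javascript', 'vbscript', 'data', 'file'];

var_dump(sketchIsUrlSafe('https://example.com', $blocked)); // bool(true)
var_dump(sketchIsUrlSafe('javascript:alert(1)', $blocked)); // bool(false)
var_dump(sketchIsUrlSafe('../img/logo.png', $blocked));     // bool(true)

var_dump(sketchFilterAttributes(
    ['class' => 'highlight', 'onclick' => 'hack()'],
    ['on'],
    ['srcdoc', 'formaction']
)); // only 'class' => 'highlight' remains
```

Normalising before matching is the important design choice here: a comparison against the raw string would miss mixed-case or whitespace-padded schemes. How the real `SafeMode` class normalises input is not stated in this document, so treat the details as assumptions.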
176+ ### SafeMode Class
177+
178+ ``` php
179+ use Djot\SafeMode;
180+
181+ // Factory methods
182+ $safeMode = SafeMode::defaults(); // Standard protection
183+ $safeMode = SafeMode::strict(); // Strips raw HTML completely
184+ ```
185+
186+ #### Configuration Methods
187+
188+ ``` php
189+ // URL scheme control
190+ $safeMode->setDangerousSchemes(['javascript', 'vbscript', 'data']);
191+ $safeMode->addDangerousScheme('ftp');
192+ $safeMode->getDangerousSchemes();
193+
194+ // Whitelist approach (only these schemes allowed)
195+ $safeMode->setAllowedSchemes(['https', 'mailto']);
196+ $safeMode->getAllowedSchemes();
197+
198+ // Attribute filtering
199+ $safeMode->setBlockedAttributePrefixes(['on']); // Blocks onclick, onload, etc.
200+ $safeMode->setBlockedAttributes(['srcdoc', 'formaction']);
201+ $safeMode->getBlockedAttributePrefixes();
202+ $safeMode->getBlockedAttributes();
203+
204+ // Raw HTML handling
205+ $safeMode->setRawHtmlMode(SafeMode::RAW_HTML_ESCAPE); // Default
206+ $safeMode->setRawHtmlMode(SafeMode::RAW_HTML_STRIP); // Remove completely
207+ $safeMode->setRawHtmlMode(SafeMode::RAW_HTML_ALLOW); // Pass through
208+ $safeMode->getRawHtmlMode();
209+ ```
210+
211+ #### Validation Methods
212+
213+ ``` php
214+ $safeMode->isUrlSafe('https://example.com'); // true
215+ $safeMode->isUrlSafe('javascript:alert(1)'); // false
216+
217+ $safeMode->isAttributeSafe('class'); // true
218+ $safeMode->isAttributeSafe('onclick'); // false
219+
220+ $safeMode->sanitizeUrl('javascript:alert(1)'); // ''
221+ $safeMode->filterAttributes([
222+ 'class' => 'highlight',
223+ 'onclick' => 'hack()',
224+ ]); // ['class' => 'highlight']
225+ ```
226+
227+ ### Custom Configuration Example
228+
229+ ``` php
230+ use Djot\DjotConverter;
231+ use Djot\SafeMode;
232+
233+ // Only allow HTTPS links, strip raw HTML
234+ $safeMode = SafeMode::defaults()
235+ ->setAllowedSchemes(['https'])
236+ ->setRawHtmlMode(SafeMode::RAW_HTML_STRIP);
237+
238+ $converter = new DjotConverter(safeMode: $safeMode);
239+ ```
240+
241+ ### Enabling After Construction
242+
243+ ``` php
244+ $converter = new DjotConverter();
245+
246+ // Enable later
247+ $converter->setSafeMode(true);
248+
249+ // Or with custom config
250+ $converter->setSafeMode(SafeMode::strict());
251+
252+ // Disable
253+ $converter->setSafeMode(false);
254+ ```
255+
136256## Error Handling
137257
138258The parser can optionally report warnings and errors with line/column information.
@@ -429,25 +549,95 @@ $renderer->setTableCellSeparator(' | ');
429549
430550### MarkdownRenderer
431551
432- Renders an AST Document to CommonMark-compatible Markdown.
552+ Renders an AST Document to CommonMark-compatible Markdown. Useful for:
553+ - Converting Djot content to Markdown for systems that only support Markdown
554+ - Migrating content between formats
555+ - Generating Markdown documentation from Djot source
433556
434557``` php
558+ use Djot\DjotConverter;
435559use Djot\Renderer\MarkdownRenderer;
436560
561+ $converter = new DjotConverter();
562+ $document = $converter->parse($djotText);
563+
437564$renderer = new MarkdownRenderer();
438565$markdown = $renderer->render($document);
439566```
440567
568+ **Conversion Table:**
569+
570+ | Djot | Markdown Output |
571+ |------|-----------------|
572+ | `*strong*` | `**strong**` |
573+ | `_emphasis_` | `*emphasis*` |
574+ | `{-deleted-}` | `~~deleted~~` (GFM) |
575+ | `{+inserted+}` | `<ins>inserted</ins>` |
576+ | `{=highlighted=}` | `<mark>highlighted</mark>` |
577+ | `^superscript^` | `<sup>superscript</sup>` |
578+ | `~subscript~` | `<sub>subscript</sub>` |
579+ | `` `code` `` | `` `code` `` |
580+ | `[text](url)` | `[text](url)` |
581+ | `` | `` |
582+ | `# Heading` | `# Heading` |
583+ | `> quote` | `> quote` |
584+ | `- list` | `- list` |
585+ | `1. ordered` | `1. ordered` |
586+ | `- [ ] task` | `- [ ] task` |
587+ | `:symbol:` | `:symbol:` |
588+ | `[^note]` | `[^note]` |
589+ | `$math$` | `$math$` |
590+ | `$$display$$` | `$$display$$` |
591+ | Tables | GFM tables with alignment |
592+ | Divs | Content only (no wrapper) |
593+ | Spans | Content only (no wrapper) |
594+ | Definition lists | Bold term + `: description` |
595+ | Line blocks | Hard breaks (`\n`) |
596+ | Raw HTML | Passed through |
597+ | Comments | Stripped |
598+
441599**Behavior:**
442- - Converts Djot to CommonMark Markdown
443- - Uses GFM extensions where available (strikethrough with `~~`)
444- - Falls back to HTML for features without Markdown equivalents:
445-   - Superscript: `<sup>text</sup>`
446-   - Subscript: `<sub>text</sub>`
447-   - Highlight: `<mark>text</mark>`
448-   - Insert: `<ins>text</ins>`
449- - Preserves table alignment
450- - Preserves footnotes
600+ - Produces CommonMark-compatible output
601+ - Uses GFM extensions where available (strikethrough, tables, task lists, footnotes)
602+ - Falls back to inline HTML for features without Markdown equivalents
603+ - Escapes special Markdown characters in text content
604+ - Handles nested backticks in code spans and fenced blocks
605+ - Preserves table column alignment
606+ - Normalizes multiple blank lines
607+
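The nested-backtick handling mentioned above can be sketched as follows. This is an illustration of the usual CommonMark technique (pick a delimiter longer than any backtick run in the content), not the renderer's actual code; `sketchLongestBacktickRun()`, `sketchCodeSpan()`, and `sketchFenceFor()` are hypothetical names.

```php
<?php
declare(strict_types=1);

/** Hypothetical helper: length of the longest run of consecutive backticks. */
function sketchLongestBacktickRun(string $code): int
{
    preg_match_all('/`+/', $code, $matches);
    $longest = 0;
    foreach ($matches[0] as $run) {
        $longest = max($longest, strlen($run));
    }
    return $longest;
}

/** Hypothetical helper: wraps inline code in a delimiter that cannot clash with its content. */
function sketchCodeSpan(string $code): string
{
    $delimiter = str_repeat('`', sketchLongestBacktickRun($code) + 1);

    // CommonMark strips one space from each side of a code span when both
    // sides have one, so pad when the content touches the delimiter with a
    // backtick, or when it already begins and ends with a space.
    $needsPadding = $code !== '' && (
        $code[0] === '`'
        || $code[-1] === '`'
        || ($code[0] === ' ' && $code[-1] === ' ' && trim($code, ' ') !== '')
    );
    $pad = $needsPadding ? ' ' : '';

    return $delimiter . $pad . $code . $pad . $delimiter;
}

/** Hypothetical helper: fence for a code block, at least three backticks long. */
function sketchFenceFor(string $code): string
{
    return str_repeat('`', max(3, sketchLongestBacktickRun($code) + 1));
}

echo sketchCodeSpan('code'), "\n";      // `code`
echo sketchCodeSpan('a `b` c'), "\n";   // ``a `b` c``
echo sketchCodeSpan('`x`'), "\n";       // `` `x` ``
echo sketchFenceFor("```\ninner\n```"), "\n"; // ````
```

Using "longest run plus one" is slightly more conservative than CommonMark requires (any delimiter length that does not occur in the content would do), but it keeps the rule easy to reason about.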
608+ **Example:**
609+
610+ ``` php
611+ $djot = <<<'DJOT'
612+ # Hello *World*
613+
614+ This has {=highlighted=} and {-deleted-} text.
615+
616+ | Name | Score |
617+ |-------|------:|
618+ | Alice | 95 |
619+ DJOT;
620+
621+ $document = $converter->parse($djot);
622+ $markdown = (new MarkdownRenderer())->render($document);
623+ ```
624+
625+ Output:
626+ ``` markdown
627+ # Hello **World**
628+
629+ This has <mark>highlighted</mark> and ~~deleted~~ text.
630+
631+ | Name | Score |
632+ | --- | ---: |
633+ | Alice | 95 |
634+ ```
635+
636+ **Limitations:**
637+ - Djot divs (`::: class`) lose their class/attributes (content is preserved)
638+ - Djot spans (`[text]{.class}`) lose their attributes (content is preserved)
639+ - Definition lists are approximated (not native Markdown)
640+ - Some whitespace/formatting may differ from original
451641
452642## AST Node Types
453643