HTML cleansing (feincms3.cleanse
)¶
HTML cleansing is by no means only useful for user generated content. Managers also copy-paste content from word processing programs, the rich text editor’s output isn’t always (almost never) in the shape we want it to be, and a strict allowlist based HTML sanitizer is the best answer I have.
- class feincms3.cleanse.CleansedRichTextField(*args, **kwargs)[source]¶
This is a subclass of django-ckeditor’s
RichTextField
. The recommended configuration is as follows:CKEDITOR_CONFIGS = { "default": { "toolbar": "Custom", "format_tags": "h1;h2;h3;p;pre", "toolbar_Custom": [[ "Format", "RemoveFormat", "-", "Bold", "Italic", "Subscript", "Superscript", "-", "NumberedList", "BulletedList", "-", "Anchor", "Link", "Unlink", "-", "HorizontalRule", "SpecialChar", "-", "Source", ]], }, } # Settings for feincms3.plugins.richtext.RichText CKEDITOR_CONFIGS["richtext-plugin"] = CKEDITOR_CONFIGS["default"]
The corresponding
HTML_SANITIZERS
configuration for html-sanitizer would look as follows:HTML_SANITIZERS = { "default": { "tags": { "a", "h1", "h2", "h3", "strong", "em", "p", "ul", "ol", "li", "br", "sub", "sup", "hr", }, "attributes": { "a": ("href", "name", "target", "title", "id", "rel"), }, "empty": {"hr", "a", "br"}, "separate": {"a", "p", "li"}, # Additional default settings not listed here. }, }
At the time of writing those are the defaults of html-sanitizer, so you don’t have to do anything.
If you want or require a different cleansing function, simply override the default with
CleansedRichTextField(cleanse=your_function)
. The cleansing function receives the HTML as its first and only argument and returns the cleansed HTML.