Jinja2 escape all HTML but img, b, etc

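You can write your own filter. The scrubber library does a good job of cleaning up HTML. The filter needs to wrap the returned string in markupsafe.Markup (exposed as jinja2.Markup before Jinja2 3.1) so that the template does not re-escape it.

Edit: a code example

import scrubber
from markupsafe import Markup

def sanitize_html(text):
    # Scrub the HTML, then mark the result as safe so Jinja2 does not escape it again
    return Markup(scrubber.Scrubber().scrub(text))

# jinja_env is assumed to be your existing jinja2.Environment
jinja_env.filters['sanitize_html'] = sanitize_html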

The Bleach library also handles this well.

For example, assuming the variable 'jinja_env' is in scope:

from bleach import clean
from markupsafe import Markup

def do_clean(text, **kw):
    """Perform clean and return a Markup object to mark the string as safe.
    This prevents Jinja from re-escaping the result."""
    return Markup(clean(text, **kw))

jinja_env.filters['clean'] = do_clean

Then in a template you might have something like:

<p>{{ my_variable|clean(tags=['img', 'b', 'i', 'em', 'strong'], attributes={'img': ['src', 'alt', 'title', 'width', 'height']}) }}</p>

You can also pass a callable (instead of a list) for the attributes, which allows more thorough validation (e.g. checking that src contains a valid URL). The Bleach documentation shows an example.
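As a rough sketch of such a callable: Bleach calls it with the tag name, the attribute name and the attribute value, and keeps the attribute when it returns True. The name allow_img_attr, the set of permitted attributes and the rule "only absolute http(s) URLs" are assumptions made for illustration, not something Bleach prescribes.

```python
from urllib.parse import urlparse

# Illustrative choice of harmless <img> attributes (an assumption, adjust as needed)
ALLOWED_IMG_ATTRS = {"alt", "title", "width", "height"}

def allow_img_attr(tag, name, value):
    """Return True if the attribute should be kept on the tag."""
    if name in ALLOWED_IMG_ATTRS:
        return True
    if name == "src":
        # Keep src only when it is an absolute http(s) URL with a host
        parsed = urlparse(value)
        return parsed.scheme in ("http", "https") and bool(parsed.netloc)
    # Everything else (onerror, style, ...) is dropped
    return False

# Possible usage with the filter above, from Python code:
# do_clean(text, tags=['img', 'b'], attributes={'img': allow_img_attr})
```

To use it from a template, the callable would have to be made available there, for instance through the template context.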


You'll want to parse the input on submission using a whitelist approach: there are several good examples in this question, and viable options out there.
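To illustrate what a whitelist approach means, here is a minimal sketch that uses only the standard library. The names WhitelistSanitizer and sanitize_on_submit and the allowed tag and attribute sets are hypothetical, and the sketch does not validate URL schemes in src, so for real input a maintained sanitizing library is likely the safer choice.

```python
from html import escape
from html.parser import HTMLParser

# Illustrative whitelist (an assumption): tags and per-tag attributes to keep
ALLOWED_TAGS = {"b", "i", "em", "strong", "img"}
ALLOWED_ATTRS = {"img": {"src", "alt", "title"}}
VOID_TAGS = {"img"}  # tags that never get a closing tag

class WhitelistSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED_TAGS:
            return  # drop the tag itself; its text is still kept, escaped
        allowed = ALLOWED_ATTRS.get(tag, set())
        kept = "".join(
            f' {name}="{escape(value or "", quote=True)}"'
            for name, value in attrs
            if name in allowed
        )
        self.parts.append(f"<{tag}{kept}>")

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS and tag not in VOID_TAGS:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        # Text is always re-escaped before it is stored
        self.parts.append(escape(data, quote=False))

def sanitize_on_submit(raw_html):
    """Return raw_html with only whitelisted tags and attributes left."""
    parser = WhitelistSanitizer()
    parser.feed(raw_html)
    parser.close()  # flush any buffered text
    return "".join(parser.parts)
```

The sanitized string is what you would store, so that only already-cleaned HTML ever reaches the template.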

Once you have done that, apply the safe filter to any variable containing HTML that should not be escaped:

{{comment|safe}}