
Markdown in Python: parsing, rendering, and conversion libraries

The five Python libraries for working with Markdown — markdown, mistune, markdown-it-py, mistletoe, and Pandoc — compared on speed, features, and security.

If you need to parse or render Markdown in a Python app — a blog engine, a docs site, an AI assistant turning chat output into HTML, a Jupyter pipeline — Python has five practical options. This guide compares them on speed, feature coverage, security, and the use cases each one wins.

The five libraries

| Library | Maintained | Speed | GFM extensions | Plugins |
| --- | --- | --- | --- | --- |
| markdown (Python-Markdown) | Yes, mature | Slow | Via extensions | Mature ecosystem |
| mistune | Yes | Fastest pure-Python | Yes (built-in) | Plugin API |
| markdown-it-py | Yes | Fast | Yes (built-in) | Port of JS markdown-it plugins |
| mistletoe | Yes | Fast | Yes | Limited |
| pandoc (via pypandoc) | Yes (wrapped) | Slow (subprocess) | Yes | Pandoc filters |

If you want a quick answer:

  • Default pick: markdown-it-py (fast, GFM-compatible, plugin ecosystem)
  • Maximum speed: mistune
  • Maximum compatibility with non-Markdown formats: pandoc
  • Already invested in Python-Markdown: stay there

Library 1: markdown (Python-Markdown)

The oldest and most-used. Conservative, well-tested, slow.

pip install markdown

import markdown

html = markdown.markdown(text)

# With extensions for tables, fenced code, etc.
html = markdown.markdown(
    text,
    extensions=['tables', 'fenced_code', 'codehilite', 'toc'],
)

The extension system is its strongest feature. The ecosystem includes:

  • tables: GFM tables
  • fenced_code: triple-backtick fences
  • codehilite: Pygments-powered syntax highlighting
  • toc: auto-generated table of contents
  • meta: MultiMarkdown-style metadata (key: value lines at the top of the document; not a full YAML frontmatter parser)
  • pymdown-extensions: a separately installed bundle (pip install pymdown-extensions) of 30+ additional extensions

Used by MkDocs, Pelican, and many Django blog engines.

Best for: established Django/Flask projects, MkDocs sites, anyone with existing Python-Markdown extension code. See MkDocs: getting started for the MkDocs context.

Library 2: mistune

The fastest pure-Python Markdown renderer of the five. Largely CommonMark-compatible, with GFM features (tables, task lists, strikethrough) available as bundled plugins.

pip install mistune

import mistune

markdown = mistune.create_markdown(
    escape=False,            # allow raw HTML (set to True for user input!)
    plugins=['table', 'task_lists', 'strikethrough', 'url'],
)

html = markdown(text)

Benchmarks usually show mistune 2-5x faster than Python-Markdown on the same input. The plugin API is straightforward — you can write a custom renderer that emits whatever output you want (HTML, AST, plain text).

Best for: high-volume rendering pipelines, sites with thousands of pages built per minute, AI assistants that render Markdown on every response.

Library 3: markdown-it-py

A direct port of the JS markdown-it library to Python. CommonMark-compliant, with GFM features such as tables and strikethrough available as built-in rules or plugins, and actively developed.

pip install markdown-it-py

from markdown_it import MarkdownIt

md = MarkdownIt('commonmark', {'breaks': True, 'html': False})
md.enable(['table', 'strikethrough'])

html = md.render(text)

Used by MyST-Parser (the MyST technical-documentation Markdown flavor) and Jupyter Book, and growing fast in the scientific-Python ecosystem.

Best for: anyone who wants CommonMark conformance, Jupyter Book integrations, or the MyST docs ecosystem.

Library 4: mistletoe

Pure Python, fast, CommonMark-compliant, easily extensible via Python class inheritance instead of plugin APIs.

pip install mistletoe

from mistletoe import markdown
html = markdown(text)

The killer feature: custom renderers are just Python subclasses. To target LaTeX, plain text, or a JSON AST, subclass a renderer and override the render methods for the tokens you care about (mistletoe also ships LaTeX and AST renderers you can start from). Less plugin surface than markdown-it-py, more direct hackability.

Best for: when you need to render Markdown to a non-HTML target (LaTeX, plain text, custom XML).

Library 5: pandoc via pypandoc

Not a Python library — a Python wrapper around the Pandoc CLI. Heavyweight (requires Pandoc installed system-wide) but unbeatable for format conversion.

pip install pypandoc
# Plus: brew install pandoc (or apt, choco, etc.)

import pypandoc

# Markdown → HTML
html = pypandoc.convert_text(md_text, 'html', format='gfm')

# Markdown → DOCX
pypandoc.convert_text(md_text, 'docx', format='gfm', outputfile='output.docx')

# Markdown → PDF (needs LaTeX install)
pypandoc.convert_text(md_text, 'pdf', format='gfm', outputfile='output.pdf')

Best for: scripts that need to produce DOCX, PDF, EPUB, or LaTeX from Markdown. See Markdown to PDF and Markdown to Word for the format-specific stories.
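Because pypandoc depends on an external binary, scripts tend to fail late and cryptically when Pandoc is missing. A minimal sketch of an early check follows, assuming Pandoc is expected on PATH (pypandoc can also locate it in other ways, which this check would not see). The names pandoc_on_path and convert_markdown are hypothetical helpers, not part of pypandoc.

```python
import shutil
from typing import Optional


def pandoc_on_path() -> bool:
    """True if a `pandoc` executable is discoverable on PATH."""
    return shutil.which("pandoc") is not None


def convert_markdown(md_text: str, to: str, outputfile: Optional[str] = None) -> str:
    """Convert GFM text via pypandoc, failing early with a readable message.

    Hypothetical helper: assumes Pandoc should be found on PATH.
    """
    if not pandoc_on_path():
        raise RuntimeError(
            "pandoc not found on PATH; install it (brew, apt, choco) first"
        )
    # Imported lazily so the PATH check above works even without pypandoc.
    import pypandoc

    return pypandoc.convert_text(md_text, to, format="gfm", outputfile=outputfile)
```

Calling convert_markdown(md_text, 'docx', outputfile='output.docx') then behaves like the DOCX example above, but raises a clear RuntimeError on machines where Pandoc was never installed.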

Sanitization (the part everyone misses)

Most Python Markdown libraries don't sanitize HTML by default. If your Markdown source contains:

Some prose.

<script>
  fetch('/admin/users').then(...)
</script>

…most libraries pass that <script> through to the output. If that HTML is then rendered in another user's browser, you've shipped XSS.

Two-layer defense:

  1. Configure the renderer to refuse raw HTML. Most have an option: mistune.create_markdown(escape=True), MarkdownIt('commonmark', {'html': False}). With raw HTML disabled, the <script> tag becomes literal &lt;script&gt; in the output.
  2. Sanitize after rendering. Use bleach on the output HTML before storing or displaying:
import bleach

clean = bleach.clean(
    rendered_html,
    tags=['p', 'a', 'em', 'strong', 'code', 'pre', 'ul', 'ol', 'li',
          'h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'blockquote', 'br',
          'table', 'thead', 'tbody', 'tr', 'th', 'td', 'img'],
    attributes={'a': ['href', 'title'], 'img': ['src', 'alt', 'title']},
    strip=True,
)

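Bleach gives you a fine-grained allowlist of tags and attributes. With strip=True, any tag not in the allowlist is removed from the output. Note that the bleach maintainers announced the project's deprecation in January 2023, so for new projects it is worth evaluating an actively maintained allowlist sanitizer such as nh3.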
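A cheap way to keep both layers honest is a regression check in your test suite. The sketch below uses only the standard library's html.parser to flag script-like tags, on* event attributes, and javascript: URLs in rendered output. It is a coarse detector for tests under those assumptions, not a sanitizer and not a replacement for an allowlist. FORBIDDEN_TAGS and find_active_content are hypothetical names.

```python
from html.parser import HTMLParser
from typing import List

# Hypothetical deny rules for a regression test; tune them to your allowlist.
FORBIDDEN_TAGS = {"script", "iframe", "object", "embed", "style"}


class ActiveContentFinder(HTMLParser):
    """Collects tags and attributes that should never survive sanitization."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.findings: List[str] = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag in FORBIDDEN_TAGS:
            self.findings.append(f"<{tag}>")
        for name, value in attrs:
            if name.startswith("on"):
                # Coarse match for event handlers such as onclick or onerror.
                self.findings.append(f"{tag}[{name}]")
            elif name in {"href", "src"}:
                if (value or "").strip().lower().startswith("javascript:"):
                    self.findings.append(f"{tag}[{name}=javascript:]")


def find_active_content(html: str) -> List[str]:
    """Return a list of suspicious constructs found in rendered HTML."""
    finder = ActiveContentFinder()
    finder.feed(html)
    finder.close()
    return finder.findings
```

In a test, assert that find_active_content(clean) is empty for a corpus of hostile inputs. Escaped text such as &lt;script&gt; is plain character data, so it is not flagged.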

See Markdown to HTML for the broader sanitization story.

Jupyter notebooks (Markdown cells)

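Jupyter notebooks render Markdown cells in the browser: JupyterLab uses a JavaScript Markdown renderer rather than a Python library such as markdown-it-py, and nbconvert relies on mistune when it converts notebooks to HTML. The dialect is roughly GFM + LaTeX math ($…$ for inline, $$…$$ for display).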

Common questions about headings in Jupyter notebooks:

  • "Why aren't my headings clickable in the navigation panel?" JupyterLab's outline panel only picks up # headings, not bold-then-newline approximations.
  • "Why does my numbered list restart at 1 every cell?" Each cell is a separate Markdown document. Continuation between cells isn't supported.
  • "How do I get a TOC?" JupyterLab 3.0 and later ship a built-in Table of Contents panel (formerly the jupyterlab-toc extension); it generates one from the headings.
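If you need the same outline outside the JupyterLab UI, say for a README or a CI check, a notebook is just JSON and the # headings can be collected directly. This is a minimal standard-library sketch, assuming nbformat 4 notebooks where each cell has a cell_type and a source that is either a string or a list of lines. It only recognizes # style headings and skips fenced code naively. notebook_headings and format_toc are hypothetical names.

```python
import json
import re
from typing import List, Tuple

# "# Title" through "###### Title", with an optional closing "##" run.
HEADING = re.compile(r"^(#{1,6})\s+(.*?)(?:\s+#+)?\s*$")


def notebook_headings(ipynb_json: str) -> List[Tuple[int, str]]:
    """Return (level, title) pairs for # headings found in Markdown cells."""
    notebook = json.loads(ipynb_json)
    headings: List[Tuple[int, str]] = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "markdown":
            continue
        source = cell.get("source", "")
        # nbformat stores source as one string or as a list of line strings.
        text = "".join(source) if isinstance(source, list) else source
        in_fence = False
        for line in text.splitlines():
            if line.lstrip().startswith("```"):
                in_fence = not in_fence  # naive fence tracking
                continue
            match = None if in_fence else HEADING.match(line)
            if match and match.group(2):
                headings.append((len(match.group(1)), match.group(2)))
    return headings


def format_toc(headings: List[Tuple[int, str]]) -> str:
    """Render headings as an indented bullet list, two spaces per level."""
    return "\n".join(f"{'  ' * (level - 1)}- {title}" for level, title in headings)
```

Reading a file with open(path, encoding='utf-8').read() and passing the result through both functions gives a plain-text outline. Bold-then-newline pseudo-headings are ignored, which mirrors the outline panel behaviour described above.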

For LaTeX math support, both inline ($E=mc^2$) and block math ($$\int…$$) work in Markdown cells. See Markdown syntax cheat sheet for the full math syntax.

Speed benchmark (rough)

Rendering a 100KB Markdown document on a 2026-vintage laptop:

| Library | Time |
| --- | --- |
| mistune | ~15 ms |
| markdown-it-py | ~30 ms |
| mistletoe | ~40 ms |
| markdown (Python-Markdown) | ~120 ms |
| pandoc (subprocess) | ~300 ms |

Numbers vary with content complexity: documents heavy in inline code or tables render more slowly. The ranking, though, tends to stay the same across inputs.

Picker

| Your situation | Pick |
| --- | --- |
| Building docs with MkDocs | markdown (Python-Markdown); MkDocs uses it |
| High-volume render path | mistune |
| CommonMark + plugin ecosystem | markdown-it-py |
| Render to LaTeX / custom format | mistletoe |
| Need DOCX / PDF output | pypandoc |
| Render in Jupyter cells | Nothing to install (JupyterLab renders cells in the browser); for MyST or Jupyter Book content, markdown-it-py |
| HTTP-callable, no install | Markdown Tidy API |
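If you expect to switch engines later, one option is to hide the choice behind a single function. The sketch below wires the four pure-Python libraries from this guide behind a hypothetical render_markdown helper, importing each library lazily so only the engine you actually use needs to be installed. The engine keys and the ENGINES table are assumptions for illustration, and the output should still go through the sanitization step described earlier.

```python
from typing import Callable, Dict


def _render_python_markdown(text: str) -> str:
    import markdown  # pip install markdown
    return markdown.markdown(text, extensions=["tables", "fenced_code"])


def _render_mistune(text: str) -> str:
    import mistune  # pip install mistune
    render = mistune.create_markdown(escape=True, plugins=["table", "strikethrough"])
    return render(text)


def _render_markdown_it(text: str) -> str:
    from markdown_it import MarkdownIt  # pip install markdown-it-py
    md = MarkdownIt("commonmark", {"html": False})
    md.enable(["table", "strikethrough"])
    return md.render(text)


def _render_mistletoe(text: str) -> str:
    import mistletoe  # pip install mistletoe
    return mistletoe.markdown(text)


# Hypothetical engine keys; rename them to match your own configuration.
ENGINES: Dict[str, Callable[[str], str]] = {
    "python-markdown": _render_python_markdown,
    "mistune": _render_mistune,
    "markdown-it-py": _render_markdown_it,
    "mistletoe": _render_mistletoe,
}


def render_markdown(text: str, engine: str = "markdown-it-py") -> str:
    """Render Markdown to HTML with the named engine."""
    try:
        renderer = ENGINES[engine]
    except KeyError:
        raise ValueError(
            f"unknown engine {engine!r}; choose from {sorted(ENGINES)}"
        ) from None
    return renderer(text)
```

With this in place, moving from one row of the picker to another is a one-word configuration change, for example render_markdown(text, engine='mistune').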
