Markdown in Python: parsing, rendering, and conversion libraries
The five Python libraries for working with Markdown — markdown, mistune, markdown-it-py, mistletoe, and Pandoc — compared on speed, features, and security.
If you need to parse or render Markdown in a Python app — a blog engine, a docs site, an AI assistant turning chat output into HTML, a Jupyter pipeline — Python has five practical options. This guide compares them on speed, feature coverage, security, and the use cases each one wins.
The five libraries
| Library | Maintained | Speed | GFM extensions | Plugins |
|---|---|---|---|---|
| markdown (Python-Markdown) | Yes, mature | Slow | Via extensions | Mature ecosystem |
| mistune | Yes | Fastest pure-Python | Yes (built-in) | Plugin API |
| markdown-it-py | Yes | Fast | Yes (built-in) | Port of JS markdown-it plugins |
| mistletoe | Yes | Fast | Yes | Limited |
| pandoc (via pypandoc) | Yes (wrapped) | Slow (subprocess) | Yes | Pandoc filters |
If you want a quick answer:
- Default pick: markdown-it-py (fast, GFM-compatible, plugin ecosystem)
- Maximum speed: mistune
- Maximum compatibility with non-Markdown formats: pandoc
- Already invested in Python-Markdown: stay there
Library 1: markdown (Python-Markdown)
The oldest and most-used. Conservative, well-tested, slow.
pip install markdown
import markdown
html = markdown.markdown(text)
# With extensions for tables, fenced code, etc.
html = markdown.markdown(
    text,
    extensions=['tables', 'fenced_code', 'codehilite', 'toc'],
)
The extension system is its strongest feature. The ecosystem includes:
- tables: GFM-style tables
- fenced_code: triple-backtick fences
- codehilite: Pygments-powered syntax highlighting
- toc: auto-generated table of contents
- meta: document metadata from a MultiMarkdown-style key: value header (it does not parse full YAML)
- pymdown-extensions: a curated bundle of 30+ additional extensions, installed as a separate package
Used by MkDocs, Pelican, and many Django blog engines.
Best for: established Django/Flask projects, MkDocs sites, anyone with existing Python-Markdown extension code. See MkDocs: getting started for the MkDocs context.
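When many documents share one extension set, a reusable `markdown.Markdown` instance avoids rebuilding the parser on every call and also exposes the table of contents produced by the toc extension. The sketch below assumes the built-in tables and toc extensions; the sample text and the `render` helper are invented for illustration.

```python
import markdown

# Sample input, invented for illustration.
SAMPLE = """# Release notes

## Fixed

| Bug | Status |
|---|---|
| Crash on save | Fixed |
"""

# One parser instance, reused across documents.
md = markdown.Markdown(extensions=["tables", "toc"])


def render(text: str) -> tuple[str, str]:
    """Return (body_html, toc_html) for one Markdown document."""
    md.reset()  # clear per-document state left over from a previous convert()
    body = md.convert(text)
    return body, md.toc  # the toc extension sets .toc during convert()


body_html, toc_html = render(SAMPLE)
```

The toc extension also adds an id attribute to each heading, which is what the links in `toc_html` point at.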
Library 2: mistune
The fastest pure-Python Markdown renderer. CommonMark + GFM compatible.
pip install mistune
import mistune
markdown = mistune.create_markdown(
    escape=False,  # allow raw HTML (set to True for user input!)
    plugins=['table', 'task_lists', 'strikethrough', 'url'],
)
html = markdown(text)
Benchmarks usually show mistune 2-5x faster than Python-Markdown on the same input. The plugin API is straightforward — you can write a custom renderer that emits whatever output you want (HTML, AST, plain text).
Best for: high-volume rendering pipelines, sites with thousands of pages built per minute, AI assistants that render Markdown on every response.
Library 3: markdown-it-py
A direct port of the JS markdown-it library to Python. CommonMark-compliant by spec, GFM via plugins, the most active development of the bunch.
pip install markdown-it-py
from markdown_it import MarkdownIt
md = MarkdownIt('commonmark', {'breaks': True, 'html': False})
md.enable(['table', 'strikethrough'])
html = md.render(text)
Used by MyST-Parser (the parser behind MyST, the technical-doc Markdown extension) and Jupyter Book, and growing fast in the scientific-Python ecosystem.
Best for: anyone who wants CommonMark conformance, Jupyter integrations, the MyST docs ecosystem.
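Like its JavaScript parent, markdown-it-py parses into a flat token stream before rendering, and `parse()` exposes that stream. The sketch below pulls headings out of the tokens and confirms that raw HTML is escaped when `html` is off; the sample text is invented for illustration.

```python
from markdown_it import MarkdownIt

# CommonMark preset, raw HTML disabled, tables switched on.
md = MarkdownIt("commonmark", {"html": False}).enable("table")

SAMPLE = "# Title\n\nIntro with <b>raw HTML</b>.\n\n## Details\n"

tokens = md.parse(SAMPLE)

# Each heading_open token is followed by an inline token holding the text.
headings = [
    (int(tok.tag[1]), tokens[i + 1].content)
    for i, tok in enumerate(tokens)
    if tok.type == "heading_open"
]

html = md.render(SAMPLE)
```

Working on tokens rather than on rendered HTML is usually the more robust way to build outlines or lint rules on top of the parser.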
Library 4: mistletoe
Pure Python, fast, CommonMark-compliant, easily extensible via Python class inheritance instead of plugin APIs.
pip install mistletoe
from mistletoe import markdown
html = markdown(text)
The killer feature: custom renderers are just Python subclasses. If you want a LaTeX renderer, a plain-text renderer, a JSON-AST renderer, write the subclass. Less plugin surface than markdown-it-py, more direct hackability.
Best for: when you need to render Markdown to a non-HTML target (LaTeX, plain text, custom XML).
Library 5: pandoc via pypandoc
Pandoc itself is not a Python library; pypandoc is a thin Python wrapper around the Pandoc CLI. Heavyweight (requires Pandoc installed system-wide) but unbeatable for format conversion.
pip install pypandoc
# Plus: brew install pandoc (or apt, choco, etc.)
import pypandoc
# Markdown → HTML
html = pypandoc.convert_text(md_text, 'html', format='gfm')
# Markdown → DOCX
pypandoc.convert_text(md_text, 'docx', format='gfm', outputfile='output.docx')
# Markdown → PDF (needs LaTeX install)
pypandoc.convert_text(md_text, 'pdf', format='gfm', outputfile='output.pdf')
Best for: scripts that need to produce DOCX, PDF, EPUB, LaTeX from Markdown. See Markdown to PDF and Markdown to Word for the format-specific stories.
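Because the conversion happens in a separate Pandoc process, every call pays process start-up cost. The stdlib-only sketch below shows roughly what such a wrapper involves. It is not pypandoc's actual implementation, `build_pandoc_args` and `convert` are hypothetical helpers, and it assumes a `pandoc` executable on PATH.

```python
import shutil
import subprocess


def build_pandoc_args(source_format, target_format, output_path=None):
    """Build a pandoc command line that reads the source text from stdin."""
    args = ["pandoc", f"--from={source_format}", f"--to={target_format}"]
    if output_path is not None:
        # Binary targets such as docx have to be written to a file.
        args.append(f"--output={output_path}")
    return args


def convert(text, target_format, source_format="gfm", output_path=None):
    """Run pandoc once; return its stdout (empty when writing to a file)."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc executable not found on PATH")
    result = subprocess.run(
        build_pandoc_args(source_format, target_format, output_path),
        input=text,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout
```

For batch jobs it is usually cheaper to hand Pandoc many files in one invocation than to spawn one process per document.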
Sanitization (the part everyone misses)
Most Python Markdown libraries don't sanitize HTML by default. If your Markdown source contains:
Some prose.
<script>
fetch('/admin/users').then(...)
</script>
…most libraries pass that <script> through to the output. If that HTML is then rendered in another user's browser, you've shipped XSS.
Two-layer defense:
- Configure the renderer to refuse raw HTML. Most have an option: mistune.create_markdown(escape=True), MarkdownIt('commonmark', {'html': False}). With raw HTML disabled, the <script> tag is escaped and appears as literal text (&lt;script&gt;) in the output instead of running.
- Sanitize after rendering. Use bleach on the output HTML before storing or displaying:
import bleach
clean = bleach.clean(
    rendered_html,
    tags=['p', 'a', 'em', 'strong', 'code', 'pre', 'ul', 'ol', 'li',
          'h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'blockquote', 'br',
          'table', 'thead', 'tbody', 'tr', 'th', 'td', 'img'],
    attributes={'a': ['href', 'title'], 'img': ['src', 'alt', 'title']},
    strip=True,
)
Bleach gives you a fine-grained allowlist of tags and attributes. Anything not in the allowlist gets stripped.
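A misconfigured renderer or a loosened allowlist is easy to miss by eye, so a regression check in the test suite can help. The stdlib-only sketch below scans rendered HTML for markup that sanitized output should never contain. It is a check, not a sanitizer, and `FORBIDDEN_TAGS`, `UnsafeMarkupFinder`, and `find_unsafe_markup` are hypothetical names with a deliberately short rule set.

```python
from html.parser import HTMLParser

# Tags that should never survive sanitization (illustrative, not exhaustive).
FORBIDDEN_TAGS = {"script", "iframe", "object", "embed", "style"}


class UnsafeMarkupFinder(HTMLParser):
    """Collect tags and attributes that sanitized output should not contain."""

    def __init__(self):
        super().__init__()
        self.problems = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag in FORBIDDEN_TAGS:
            self.problems.append(f"<{tag}>")
        for name, value in attrs:
            if name.startswith("on"):  # onclick, onerror, ...
                self.problems.append(f"{tag}[{name}]")
            url = (value or "").strip().lower()
            if name in ("href", "src") and url.startswith("javascript:"):
                self.problems.append(f"{tag}[{name}=javascript:]")


def find_unsafe_markup(html):
    """Return a list of findings; an empty list means nothing was flagged."""
    finder = UnsafeMarkupFinder()
    finder.feed(html)
    finder.close()
    return finder.problems
```

In a test, the rendered and sanitized output of a few hostile inputs can be passed through `find_unsafe_markup` and asserted to be empty.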
See Markdown to HTML for the broader sanitization story.
Jupyter notebooks (Markdown cells)
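JupyterLab renders Markdown cells in the browser with a JavaScript renderer (marked), not with a Python library. Python-side tooling differs: nbconvert uses mistune for Markdown-to-HTML export, and MyST tooling such as Jupyter Book uses markdown-it-py. The cell dialect is roughly GFM + LaTeX math ($…$ for inline, $$…$$ for display).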
Common questions about headings in Jupyter notebooks:
- "Why aren't my headings clickable in the navigation panel?" JupyterLab's outline panel only picks up # headings, not bold-then-newline approximations.
- "Why does my numbered list restart at 1 every cell?" Each cell is a separate Markdown document. Continuation between cells isn't supported.
- "How do I get a TOC?" JupyterLab 3.0 and later ship a built-in Table of Contents panel generated from the headings; older versions need the jupyterlab-toc extension.
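Since a notebook file is plain JSON, the headings an outline can pick up are easy to list outside Jupyter. The sketch below is a simplified approximation, not JupyterLab's own logic: `notebook_headings` is a hypothetical helper, it only recognizes # headings, and it skips fenced code with a naive toggle.

```python
import re

# "# Title" through "###### Title"; closing hashes are not stripped.
ATX_HEADING = re.compile(r"^(#{1,6})\s+(.+?)\s*$")
FENCE = re.compile(r"^\s*(```|~~~)")


def notebook_headings(notebook):
    """Return (cell_index, level, title) for each # heading in Markdown cells."""
    found = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "markdown":
            continue
        source = cell.get("source", "")
        # The notebook format stores source as a string or a list of lines.
        text = source if isinstance(source, str) else "".join(source)
        in_fence = False
        for line in text.splitlines():
            if FENCE.match(line):
                in_fence = not in_fence
                continue
            match = None if in_fence else ATX_HEADING.match(line)
            if match:
                found.append((index, len(match.group(1)), match.group(2)))
    return found


# A tiny in-memory notebook, invented for illustration.
nb = {
    "cells": [
        {"cell_type": "markdown", "source": ["# Setup\n", "Some text.\n"]},
        {"cell_type": "code", "source": ["# just a Python comment\n"]},
        {"cell_type": "markdown", "source": "**Results**\n\n## Plots\n"},
    ]
}
headings = notebook_headings(nb)
```

The bold "Results" line is not reported, which mirrors why bold-only pseudo-headings never appear in the outline. A real .ipynb file can be loaded into the same dict shape with `json.load`.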
For LaTeX math support, both inline ($E=mc^2$) and block math ($$\int…$$) work in Markdown cells. See Markdown syntax cheat sheet for the full math syntax.
Speed benchmark (rough)
Rendering a 100KB Markdown document on a 2026-vintage laptop:
| Library | Time |
|---|---|
| mistune | ~15 ms |
| markdown-it-py | ~30 ms |
| mistletoe | ~40 ms |
| markdown (Python-Markdown) | ~120 ms |
| pandoc (subprocess) | ~300 ms |
Numbers vary with content complexity (lots of inline code is slower; lots of tables is slower). But the order is stable across input.
Picker
| Your situation | Pick |
|---|---|
| Building docs with MkDocs | markdown (Python-Markdown) — MkDocs uses it |
| High-volume render path | mistune |
| CommonMark + plugin ecosystem | markdown-it-py |
| Render to LaTeX / custom format | mistletoe |
| Need DOCX / PDF output | pypandoc |
| Render MyST / Jupyter Book content | markdown-it-py (MyST-Parser uses it) |
| HTTP-callable, no install | Markdown Tidy API |
Related
- Markdown to HTML — the language-agnostic story
- Markdown in React — JavaScript-side equivalent
- GitHub Flavored Markdown — the dialect most libraries target
- MkDocs: getting started — the Python static-site context