
R Markdown: a practical introduction for data scientists

R Markdown combines R code, Markdown prose, and computed output in one reproducible document. The minimum you need to know to write your first .Rmd file.

R Markdown is the workflow most data scientists who use R reach for when they need to combine analysis code, prose explanation, and computed output (tables, plots) in a single document that's reproducible, shareable, and exportable to PDF, HTML, or Word.

If you're new to R Markdown, or you've been writing Jupyter notebooks and someone on your team wants .Rmd files instead, this is the practical introduction.

What R Markdown is

A document format (file extension .Rmd) that mixes:

  1. YAML frontmatter at the top: title, author, output format
  2. Markdown prose: standard Markdown for narrative
  3. R code chunks: fenced blocks tagged ```{r} that R executes when you "knit" the document
  4. Computed output: plots, tables, and model summaries that R inlines into the rendered document

Knit the .Rmd and you get a finished HTML / PDF / Word file with the prose, the code (optionally hidden), and the output, all in order.

Minimal example

---
title: 'Q3 sales analysis'
author: 'Data Team'
output: html_document
---

## Setup

The data lives in `data/sales.csv`. Load it.

```{r setup, message=FALSE}
library(dplyr)
library(ggplot2)
sales <- read.csv("data/sales.csv")
```

## Headline number

```{r}
mean(sales$revenue)
```

The chart below shows the trend.

```{r, echo=FALSE}
ggplot(sales, aes(x = month, y = revenue)) + geom_line()
```

Knit it (RStudio: the Knit button; R console: rmarkdown::render("file.Rmd")) and you get a styled HTML report with the prose, the setup code, the mean revenue value, and the chart inline.
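As a rough sketch of the console route, the snippet below writes a throwaway .Rmd to a temporary directory and renders it with rmarkdown::render(). The file name demo.Rmd and its contents are made up for illustration, and the render step is skipped when rmarkdown or Pandoc is not installed.

```r
# Build a tiny, hypothetical .Rmd in a temp directory.
fence <- strrep("`", 3)  # three backticks, built here to keep this listing readable
rmd_path <- file.path(tempdir(), "demo.Rmd")
writeLines(c(
  "---",
  "title: 'Demo'",
  "output: html_document",
  "---",
  "",
  "## Headline number",
  "",
  paste0(fence, "{r}"),
  "mean(c(10, 20, 30))",
  fence
), rmd_path)

# Render only when the toolchain is present; render() returns the output path.
can_render <- requireNamespace("rmarkdown", quietly = TRUE) &&
  rmarkdown::pandoc_available()
html_path <- if (can_render) rmarkdown::render(rmd_path, quiet = TRUE) else NA_character_
print(html_path)
```

Running render() from the console also leaves the full error message in front of you when a chunk fails.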

R Markdown vs Jupyter notebooks

People who already use Jupyter often ask why bother with R Markdown:

  • Plain text source. .Rmd is text, so git diff shows real diffs. .ipynb is JSON with embedded base64 images, so git diff is a wall of noise.
  • One file in, one document out. Jupyter notebooks blur the line between "interactive REPL" and "shareable document". R Markdown is firmly the latter.
  • Pandoc under the hood. R Markdown uses Pandoc for the final rendering step, which means you get every Pandoc feature for free: footnotes, citations, references, custom templates.

People who already use Jupyter often stick with it because the interactivity is better for exploration. The clean split: explore in Jupyter, write the final report in R Markdown.

For more on Jupyter + Markdown, see "Headings in Jupyter Notebook" and the wider "Markdown in Python" write-up.

Code chunk options that matter

Every ```{r} chunk accepts options that change how it's executed and rendered. The ones you'll use most:

  • echo = FALSE — run the code but don't show it in the output
  • eval = FALSE — show the code but don't run it
  • message = FALSE, warning = FALSE — hide library loading noise
  • include = FALSE — run silently; don't show code or output
  • cache = TRUE — cache the result so re-knits skip re-running this chunk
  • fig.width = 8, fig.height = 5 — control output figure dimensions

A chunk that combines several of these options:

```{r model, echo=FALSE, message=FALSE, cache=TRUE}
fit <- lm(revenue ~ ad_spend + season, data = sales)
summary(fit)
```
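When most chunks in a report want the same options, one common pattern is to set defaults once in the setup chunk with knitr::opts_chunk$set() and override them per chunk where needed. A minimal sketch, assuming knitr is installed; the particular defaults are illustrative, not a recommendation.

```r
if (requireNamespace("knitr", quietly = TRUE)) {
  # Defaults applied to every later chunk unless its header overrides them.
  knitr::opts_chunk$set(
    echo = FALSE,
    message = FALSE,
    warning = FALSE,
    fig.width = 8,
    fig.height = 5
  )
  # Read a default back to confirm it took effect.
  echo_default <- knitr::opts_chunk$get("echo")
  print(echo_default)
}
```

A chunk that should still show its code can then set echo=TRUE in its own header.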

Output formats

The output: field in the YAML controls what knit produces. The common ones:

  • html_document: HTML with styling, the default for fast iteration
  • pdf_document: PDF via LaTeX (needs a LaTeX install)
  • word_document: Word .docx
  • github_document: Markdown that renders nicely on GitHub
  • xaringan::moon_reader: slides
  • bookdown::gitbook: multi-page book

You can list multiple formats and switch between them at knit time:

output:
  html_document:
    toc: true
    theme: cosmo
  pdf_document:
    toc: true
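
With several formats in the YAML, the output_format argument of rmarkdown::render() picks which one to build, and passing "all" builds every format listed. The sketch below uses a hypothetical two-format file (HTML and Word, so no LaTeX is needed) and skips rendering when rmarkdown or Pandoc is missing.

```r
# Hypothetical .Rmd with two output formats declared in its YAML header.
rmd_path <- file.path(tempdir(), "multi.Rmd")
writeLines(c(
  "---",
  "title: 'Multi-format demo'",
  "output:",
  "  html_document:",
  "    toc: true",
  "  word_document: default",
  "---",
  "",
  "## Section",
  "",
  "Some prose."
), rmd_path)

can_render <- requireNamespace("rmarkdown", quietly = TRUE) &&
  rmarkdown::pandoc_available()

outputs <- character()
if (can_render) {
  # Build just one of the listed formats...
  word_only <- rmarkdown::render(rmd_path, output_format = "word_document", quiet = TRUE)
  # ...or every format listed in the YAML header.
  outputs <- rmarkdown::render(rmd_path, output_format = "all", quiet = TRUE)
}
print(basename(outputs))
```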

R Markdown to PDF

The PDF output requires LaTeX. Easiest install: the tinytex R package:

install.packages("tinytex")
tinytex::install_tinytex()

Then output: pdf_document in your YAML works.
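
Before switching a report to pdf_document, it can help to check whether a LaTeX engine is visible from R. A small base-R sketch; it only looks for the pdflatex and xelatex executables on the PATH, which is an assumption about how your LaTeX is installed.

```r
# Sys.which() returns "" for executables it cannot find on the PATH.
engines <- Sys.which(c("pdflatex", "xelatex"))
has_latex <- any(nzchar(engines))

if (has_latex) {
  found <- names(engines)[nzchar(engines)]
  message("LaTeX engine found: ", paste(found, collapse = ", "))
} else {
  message("No LaTeX engine on the PATH; install TinyTeX before knitting to PDF.")
}
```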

If you don't want to install LaTeX (it's 1-4 GB), an alternative path: knit to HTML, then convert that HTML to PDF. See Markdown to PDF for the four-way comparison; the browser print-to-PDF or Markdown Tidy options skip LaTeX entirely.

R Markdown to Word

Set output: word_document. Knit produces a .docx with the prose, the code (shown or hidden), the computed output, and tables/plots inline. For stakeholders who'll keep editing the report after you hand it off, this is the right format. See Markdown to Word for the wider story on Markdown → DOCX.

Common R Markdown pitfalls

  • Caching gone wrong. cache = TRUE is fast, but the cache is keyed on the chunk's source code and options, not on the data it reads. If your CSV changes, the cached result stays stale until you invalidate it. Use cache.extra = file.mtime("data.csv") to invalidate on file change.
  • Plots not showing. Almost always a fig.width / fig.height issue or a chunk option that suppressed output (include = FALSE).
  • Knit fails silently. Run rmarkdown::render("file.Rmd") from the R console (not the Knit button) and read the full error; RStudio's Knit button truncates it.
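
The cache.extra trick works because knitr folds chunk option values into the cache key, so a value that changes with the data forces a re-run. A content hash is an alternative to file.mtime() that ignores timestamp-only changes. The sketch below uses tools::md5sum() from base R on a made-up CSV to show the key changing when the data does; the file name and columns are hypothetical.

```r
# Hypothetical data file used only for this sketch.
csv_path <- file.path(tempdir(), "sales.csv")
write.csv(data.frame(month = 1:3, revenue = c(10, 20, 30)),
          csv_path, row.names = FALSE)
key_before <- unname(tools::md5sum(csv_path))

# Simulate the CSV being updated with a new revenue figure.
write.csv(data.frame(month = 1:3, revenue = c(10, 20, 35)),
          csv_path, row.names = FALSE)
key_after <- unname(tools::md5sum(csv_path))

# Different content gives a different key, so a chunk keyed on it would re-run.
# In the .Rmd, the chunk header would carry the key, for example:
#   {r load, cache=TRUE, cache.extra=tools::md5sum("data/sales.csv")}
print(key_before != key_after)
```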

R Markdown vs Quarto

In 2026, the heir to R Markdown is Quarto (.qmd files). Same idea, different file extension, broader language support (R, Python, Julia, Observable all in one document). Posit (the company behind R Markdown) is investing in Quarto. New projects should consider starting in Quarto; existing R Markdown work doesn't need to migrate.
