Building Accessible Tables in PDFs

Tables are one of the most common accessibility failures in PDFs. Learn how to structure header cells, scope, and spanned cells so screen readers cope.

PDF Compliance Team · March 27, 2026 · 8 min read

Tables are one of the most common places PDF accessibility breaks down. A table that looks perfectly clear on screen can become an incomprehensible stream of numbers for someone using a screen reader, because the visual grid that lets a sighted reader connect a value to its row and column header doesn't exist in the underlying tags. To make a table accessible, you have to encode that structure — which cell is a header, which cells it governs, and how the rows and columns relate — into the document's tag tree.

This guide walks through how accessible tables work in a tagged PDF: the difference between data and layout tables, the required tag structure, header cells and the Scope attribute, the Headers/ID mechanism for complex tables, and how to handle spanned and nested cells. It builds on the broader principles in our WCAG 2.2 and PDF/UA accessible PDF guide and the fundamentals in PDF tags and reading order.

Data tables vs layout tables

The first question to ask is whether your table is actually a table. In a data table, the cells hold related information — figures, dates, statuses — and the position of each cell within a row and column carries meaning. A budget table, a schedule, a price list: these are data tables, and they must be tagged as tables so assistive technology can expose the relationships.

A layout table, by contrast, is a grid used purely to position content on the page — two columns of body text, an image beside a caption, a header banner. The grid has no informational meaning; it's a visual convenience.

The guidance here is simple: avoid layout tables in accessible PDFs. When a screen reader hits a Table tag it announces "table," reports the dimensions, and switches into a cell-by-cell navigation mode — confusing for content that isn't really tabular. If you've used a table only for layout, either re-tag the region with ordinary structure elements (paragraphs, figures, headings) or mark the table as an artifact so assistive technology ignores it. Reserve real Table tags for real data.

The required table structure

A correctly tagged data table mirrors the structure you'd recognize from HTML. The tags are:

  • Table — the container for the whole table.
  • TR — a table row. Every row, header or body, is a TR.
  • TH — a header cell.
  • TD — a data cell.
  • THead, TBody, and TFoot — optional row groups that separate the header row(s), the body, and any footer rows.

Every cell must live inside a TR, and every TR inside the Table (directly or in a THead/TBody/TFoot group). Cell counts should be consistent across rows once you account for spans. Two common failures are tagging cells as TD when they should be TH, and leaving stray text inside the Table but outside any TR. Both break the grid.

Here is a minimal, well-formed structure for a simple table with a header row and two data rows:

Table
├── THead
│   └── TR
│       ├── TH (Scope=Column) "Region"
│       ├── TH (Scope=Column) "Q1"
│       └── TH (Scope=Column) "Q2"
└── TBody
    ├── TR
    │   ├── TH (Scope=Row) "North"
    │   ├── TD "120"
    │   └── TD "135"
    └── TR
        ├── TH (Scope=Row) "South"
        ├── TD "98"
        └── TD "110"

Notice that the first cell in each body row is a TH, not a TD — the region name is a row header that labels the figures beside it.
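
The nesting rules above can be checked mechanically. The following is a minimal sketch, assuming the tag tree has already been read into a simple in-memory model; the Tag class and the structure_problems function are hypothetical names invented for this illustration, not part of any real PDF library.

```python
from dataclasses import dataclass, field

# Hypothetical in-memory model of a tag tree (not a real PDF library API).
@dataclass
class Tag:
    kind: str                                  # e.g. "Table", "TR", "TH", "TD"
    children: list = field(default_factory=list)

ROW_GROUPS = {"THead", "TBody", "TFoot"}
CELLS = {"TH", "TD"}

def structure_problems(table):
    """Return a list of nesting problems found under a Table tag."""
    if table.kind != "Table":
        return [f"root is {table.kind}, expected Table"]
    problems = []
    for child in table.children:
        if child.kind == "Caption":            # a Table may also carry a Caption
            continue
        if child.kind == "TR":
            rows = [child]
        elif child.kind in ROW_GROUPS:
            rows = child.children
        else:
            problems.append(f"{child.kind} directly inside Table")
            continue
        for row in rows:
            if row.kind != "TR":
                problems.append(f"{row.kind} inside row group, expected TR")
                continue
            for cell in row.children:
                if cell.kind not in CELLS:
                    problems.append(f"{cell.kind} inside TR, expected TH or TD")
    return problems

good = Tag("Table", [
    Tag("THead", [Tag("TR", [Tag("TH"), Tag("TH"), Tag("TH")])]),
    Tag("TBody", [Tag("TR", [Tag("TH"), Tag("TD"), Tag("TD")])]),
])
print(structure_problems(good))  # []
```

A check like this only covers nesting. It says nothing about whether a cell that should be a TH was tagged as a TD, which still needs a human look at the table.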

Header cells and the Scope attribute

Marking a cell as TH tells assistive technology "this is a header," but it doesn't say what the header applies to. That's the job of the Scope attribute, which takes one of these values:

  • Column — the header labels the cells in its column (the typical top-row header).
  • Row — the header labels the cells in its row (a left-hand row label, like the region names above).
  • Both — the header labels both its row and its column (occasionally needed for a corner cell).

In the example above, "Region," "Q1," and "Q2" carry Scope=Column, while "North" and "South" carry Scope=Row. With scope set correctly, a screen reader can announce the relevant header before each data cell as the user moves around — which is what turns a wall of numbers back into meaning.

Scope is enough for the great majority of tables: any table where each data cell is governed by exactly one column header and, optionally, one row header. Set it on every header cell and most of your work is done.
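
To make the idea concrete, here is a rough sketch of how Scope lets software find the headers for a data cell. It assumes a simple grid with no spanned cells, stored as a list of rows; the th, td, and headers_for helpers are hypothetical and only illustrate the lookup, not how any particular screen reader is implemented.

```python
def th(text, scope):
    return {"tag": "TH", "text": text, "scope": scope}

def td(text):
    return {"tag": "TD", "text": text}

def headers_for(grid, row, col):
    """Headers governing grid[row][col]: row headers first, then column headers.

    Assumes a regular grid with no ColSpan or RowSpan.
    """
    row_headers = [
        cell["text"] for cell in grid[row][:col]
        if cell["tag"] == "TH" and cell.get("scope") in ("Row", "Both")
    ]
    col_headers = [
        grid[r][col]["text"] for r in range(row)
        if grid[r][col]["tag"] == "TH"
        and grid[r][col].get("scope") in ("Column", "Both")
    ]
    return row_headers + col_headers

# The Region / Q1 / Q2 table from the structure example.
grid = [
    [th("Region", "Column"), th("Q1", "Column"), th("Q2", "Column")],
    [th("North", "Row"), td("120"), td("135")],
    [th("South", "Row"), td("98"), td("110")],
]
print(headers_for(grid, 1, 2))  # ['North', 'Q2']
```

If the Scope values were missing, both lists would come back empty and the value 135 would be read with no context at all.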

Headers and IDs for complex tables

Scope assumes a clean grid. Some tables aren't clean — multiple levels of headers, headers outside the first row or column, or a data cell governed by more than one header that scope alone can't express. For these, PDF supports the explicit Headers/ID association, the same model as HTML's headers and id attributes:

  1. Give each header cell a unique ID (a TH with an identifier).
  2. On each data cell, set a Headers attribute listing the IDs of every header that applies to it.

This spells out the relationships cell by cell instead of inferring them from position. It's more work, so use it only where Scope genuinely can't do the job — a financial statement with grouped column headers, say, or a matrix where a value depends on two header levels. For a plain table, Headers/ID is overkill.
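
The two steps can be sketched as a small lookup. This assumes the cells have been extracted into plain dictionaries; the field names (id, headers) and the resolve_headers function are hypothetical choices for this illustration and do not reflect how a PDF stores these attributes on disk.

```python
def resolve_headers(cells):
    """Return (data cell text, [header texts]) pairs using explicit IDs."""
    by_id = {}
    for cell in cells:
        if cell["tag"] == "TH" and "id" in cell:
            if cell["id"] in by_id:
                raise ValueError(f"duplicate header ID: {cell['id']}")
            by_id[cell["id"]] = cell["text"]
    resolved = []
    for cell in cells:
        if cell["tag"] == "TD":
            # A KeyError here means the cell lists an ID that no header carries.
            names = [by_id[h] for h in cell.get("headers", [])]
            resolved.append((cell["text"], names))
    return resolved

# A grouped header "2024" above "Q1" and "Q2", with one row labelled "North".
cells = [
    {"tag": "TH", "id": "y2024", "text": "2024"},
    {"tag": "TH", "id": "q1", "text": "Q1"},
    {"tag": "TH", "id": "q2", "text": "Q2"},
    {"tag": "TH", "id": "north", "text": "North"},
    {"tag": "TD", "text": "120", "headers": ["north", "y2024", "q1"]},
    {"tag": "TD", "text": "135", "headers": ["north", "y2024", "q2"]},
]
print(resolve_headers(cells))
# [('120', ['North', '2024', 'Q1']), ('135', ['North', '2024', 'Q2'])]
```

Failing loudly on a duplicate or unknown ID is a deliberate choice in this sketch: a dangling reference is exactly the kind of error that otherwise goes unnoticed until someone hears the wrong header.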

Spanned and merged cells

Merged cells are tagged with span attributes that tell assistive technology how many rows or columns a cell covers:

  • ColSpan — the number of columns a cell spans (a header straddling several columns).
  • RowSpan — the number of rows a cell spans (a label running down several rows).

Set the span on the cell that visually occupies the merged area, and do not add empty placeholder cells for the covered positions — the span already accounts for them. A grouped header like "2024" sitting above "Q1" and "Q2" would be a TH with ColSpan=2. Getting spans right is what keeps the grid's cell count consistent and lets the reader's table navigation stay aligned with what's on the page.
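
One way to see why placeholders cause trouble is to compute the effective width of each row once spans are counted. The sketch below assumes each row is given as a list of (ColSpan, RowSpan) pairs, one per tagged cell, and that the table is otherwise well formed; row_widths is a hypothetical helper written for this illustration.

```python
def row_widths(rows):
    """Effective column count of each row, counting ColSpan and RowSpan coverage."""
    widths = []
    carried = {}  # column index -> rows still covered by a RowSpan cell above
    for row in rows:
        col = 0
        next_carried = {}
        # The (0, 1) sentinel flushes columns covered from above that sit
        # to the right of the row's last tagged cell.
        for colspan, rowspan in list(row) + [(0, 1)]:
            # Skip columns occupied by a RowSpan cell from an earlier row.
            while carried.get(col, 0) > 0:
                if carried[col] > 1:
                    next_carried[col] = carried[col] - 1
                col += 1
            if rowspan > 1:
                for c in range(col, col + colspan):
                    next_carried[c] = rowspan - 1
            col += colspan
        widths.append(col)
        carried = next_carried
    return widths

# "Region" spans two header rows; "2024" spans the Q1 and Q2 columns.
correct = [
    [(1, 2), (2, 1)],          # Region (RowSpan=2), 2024 (ColSpan=2)
    [(1, 1), (1, 1)],          # Q1, Q2
    [(1, 1), (1, 1), (1, 1)],  # North, 120, 135
]
print(row_widths(correct))  # [3, 3, 3]

# Same table with an empty placeholder cell added under "Region".
with_placeholder = [
    [(1, 2), (2, 1)],
    [(1, 1), (1, 1), (1, 1)],
    [(1, 1), (1, 1), (1, 1)],
]
print(row_widths(with_placeholder))  # [3, 4, 3]
```

The second result shows the placeholder pushing the middle row out to four columns, which is the kind of irregular row that throws table navigation out of alignment.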

Nested tables

A table can contain another table inside a cell, and that's valid as long as the structure is clean: the inner Table lives wholly within a TD (or TH) with its own complete TR/TH/TD tree. Nesting is hard to follow by keyboard, though, so treat it as a last resort — often a nested table is better split into two or flattened into one. If you must nest, make the inner table's headers and scope just as complete as the outer table's.

How a screen reader announces a well-tagged table

When the structure is right, the experience is predictable. On entering the table the screen reader announces it and reports the number of rows and columns. As the user moves cell by cell, it reads the relevant headers before each cell's content. For the value 135 above, it would announce something like "North, Q2, 135," giving both row and column context. The user can jump to the start of a row or column, skip the table, or read it straight through. None of that works without correct TH tags, Scope values, and consistent rows.

How to verify

Don't assume the tags are right because the table looks right. To check:

  • Inspect the tag tree. Confirm the Table's children are TR elements (directly or inside THead/TBody/TFoot groups), that header cells are TH (not TD), and that every TH has a Scope.
  • Use a checker. A PDF/UA validator based on the Matterhorn Protocol can flag problems such as missing scope, irregular row lengths, and cells outside a TR.
  • Navigate with a screen reader. Move through the table cell by cell and listen for the correct headers. If the wrong header is announced, your scope or Headers/ID mapping is off.
  • Test the table in isolation. Read just the table from its first cell and confirm the announced dimensions match what you see.
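
The "every TH has a Scope" part of the tag tree inspection is easy to script once the cells are available as data. This is a small sketch under that assumption; the dictionary layout and the headers_missing_scope name are hypothetical, and a script like this complements a PDF/UA validator rather than replacing one.

```python
VALID_SCOPES = {"Column", "Row", "Both"}

def headers_missing_scope(cells):
    """Return the text of every TH whose Scope is absent or not a valid value."""
    return [
        cell["text"] for cell in cells
        if cell["tag"] == "TH" and cell.get("scope") not in VALID_SCOPES
    ]

cells = [
    {"tag": "TH", "text": "Region", "scope": "Column"},
    {"tag": "TH", "text": "Q1"},                    # Scope never set
    {"tag": "TH", "text": "North", "scope": "row"}, # wrong case, so not valid
    {"tag": "TD", "text": "120"},
]
print(headers_missing_scope(cells))  # ['Q1', 'North']
```

The check is case-sensitive on purpose, since PDF name values such as Row and Column are case-sensitive.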

Tables sit alongside the other structural pieces of an accessible document — reading order, alternative text, and interactive content. For the form equivalent of this structural work, see accessible PDF forms.

Key takeaways

  • Tag only real data tables as tables; avoid layout tables, and mark them as artifacts or re-tag them with ordinary structure if you've used them for positioning.
  • The required structure is Table → TR → TH/TD, optionally grouped with THead/TBody/TFoot, with every cell inside a row.
  • Set Scope (Column, Row, or Both) on every header cell — this handles most tables.
  • Use Headers/ID only for complex tables that scope can't describe, and use RowSpan/ColSpan for merged cells without empty placeholders.
  • Verify by inspecting the tag tree, running a Matterhorn-based checker, and navigating the table with a screen reader.