Complete Guide to PDF Accessibility: Tagged PDFs, Screen Readers, Heading Structure, Tables, Forms, Fixing Scans, and Publishing Operations
Executive Summary (Key Points First)
- PDFs are convenient, but they have many accessibility “pitfalls.” A PDF without tags (structure), correct reading order, alternative text, and proper table associations is difficult to read aloud and hard to reuse.
- The goal is not to stop publishing PDFs, but to provide equivalent information in HTML where needed and treat the PDF as a supplement. If a PDF is unavoidable, make Tagged PDF (structured PDF) the standard.
- Scan-only PDFs (image PDFs) should, in principle, be converted to text with OCR and then tagged; if that is not feasible, provide an HTML version.
- This guide summarizes concrete improvements for common business PDFs: tables, forms, figures, page numbers, bookmarks, and more—plus how to operate and publish them sustainably.
Who this is for (specific): Local governments/public agencies, corporate PR/IR/recruiting, educational institutions, general affairs departments publishing internal rules, production agencies, documentation teams
Target accessibility level: WCAG 2.1 AA (treat PDFs as part of web content and evaluate against 1.1.1 / 1.3.1 / 1.3.2 / 1.4.3 / 2.4.x / 3.1.x / 4.1.2)
1. Introduction: PDFs Are “Easy to Distribute,” But Not Always “Readable”
PDFs are strong for printing, archiving, and fixed layouts, so they’re widely used in official documents in government, education, and business. However, when a blind or low-vision user relies on a screen reader, a PDF becomes effectively inaccessible if it:
- has no tags (structure such as headings/paragraphs),
- has a broken reading order,
- lacks alternative information for figures,
- treats tables as mere aligned text.
This affects not only users with disabilities, but also people reading on mobile, people searching for specific sections, and anyone who needs to copy/reuse content.
This article explains, from a practical standpoint, how to transform PDFs from “documents you merely hand out” into “documents everyone can read,” covering mindset, production, conversion, publishing, and operations.
2. Essential Accessibility Elements in PDFs: Start with “Tags (Structure)”
2.1 What Is a Tagged PDF (Structured PDF)?
A tagged PDF contains internal logical structure such as:
- Headings (H1/H2…)
- Paragraphs (P)
- Lists (L)
- Tables (Table / TR / TH / TD)
- Figures (Figure)
With this structure, screen readers can understand section boundaries and document hierarchy.
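One thing a tag structure makes checkable is whether the heading hierarchy is consistent. The following is a minimal sketch, assuming the tag names have already been extracted in reading order into a plain list; the `tags` sample and the function name are hypothetical, and no PDF library is involved.

```python
import re

HEADING = re.compile(r"^H([1-6])$")

def find_skipped_headings(tags):
    """Return (index, previous_level, level) for each heading that skips a level."""
    problems = []
    previous = 0  # no heading seen yet
    for index, tag in enumerate(tags):
        match = HEADING.match(tag)
        if not match:
            continue  # P, L, Table, Figure and so on do not affect the hierarchy
        level = int(match.group(1))
        if level > previous + 1:
            problems.append((index, previous, level))
        previous = level
    return problems

# Hypothetical tag sequence for a short document, in reading order.
tags = ["H1", "P", "H2", "P", "H4", "P", "H2", "L"]
print(find_skipped_headings(tags))  # [(4, 2, 4)]: an H4 directly under an H2
```

Moving back up the hierarchy (H4 followed by H2) is not flagged, because closing a subsection is normal; only downward jumps that skip a level are reported.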
2.2 Reading Order
In PDFs, visual layout often differs from the order that assistive technologies read, especially with:
- two-column layouts (left → right),
- multiple sidebars/columns,
- captions vs. body text.
Correct reading order requires that the order of the tags (the logical structure) follow the intended sequence, so check it explicitly instead of trusting the visual layout.
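To illustrate why a two-column page breaks, here is a rough sketch that compares a naive top-to-bottom sort with a column-aware sort. The `Block` type, the coordinates, and the assumption that the column gap sits at the middle of the page are all hypothetical; real layouts need a more careful column detection.

```python
from dataclasses import dataclass

@dataclass
class Block:
    text: str
    left: float  # distance from the left page edge
    top: float   # distance from the top page edge

def reading_order(blocks, page_width):
    """Order blocks column by column: the whole left column, then the right."""
    middle = page_width / 2

    def key(block):
        column = 0 if block.left < middle else 1
        return (column, block.top)

    return sorted(blocks, key=key)

# Hypothetical blocks on a two-column page that is 595 units wide.
blocks = [
    Block("Left column, paragraph 1", left=50, top=100),
    Block("Right column, paragraph 1", left=320, top=100),
    Block("Left column, paragraph 2", left=50, top=300),
    Block("Right column, paragraph 2", left=320, top=300),
]

# Naive order zigzags between the columns, which is what a listener hears
# when the reading order was never fixed.
naive = sorted(blocks, key=lambda b: (b.top, b.left))
fixed = reading_order(blocks, page_width=595)

print([b.text for b in naive])
print([b.text for b in fixed])
```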
2.3 Bookmarks
For long PDFs, bookmarks serve as the navigation. Generating them from the heading structure makes it much easier to jump to sections.
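The idea of deriving bookmarks from headings can be sketched as follows. This assumes the headings are available as `(level, title, page)` tuples with levels starting at 1; the sample titles and page numbers are made up, and writing the outline into an actual PDF is left to whatever authoring tool is in use.

```python
def build_outline(headings):
    """Nest (level, title, page) tuples into a bookmark tree. Levels start at 1."""
    root = {"title": None, "page": None, "children": []}
    stack = [(0, root)]  # path of (level, node) from the root to the current node
    for level, title, page in headings:
        node = {"title": title, "page": page, "children": []}
        # Climb back up until the top of the stack is a shallower heading.
        while stack[-1][0] >= level:
            stack.pop()
        stack[-1][1]["children"].append(node)
        stack.append((level, node))
    return root["children"]

def render(nodes, depth=0):
    """Return the outline as indented text lines, one per bookmark."""
    lines = []
    for node in nodes:
        lines.append("  " * depth + f"{node['title']} (p. {node['page']})")
        lines.extend(render(node["children"], depth + 1))
    return lines

# Hypothetical headings taken from a tagged document.
headings = [
    (1, "Introduction", 1),
    (2, "Background", 2),
    (2, "Scope", 3),
    (1, "Results", 5),
]
outline = build_outline(headings)
print("\n".join(render(outline)))
```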
3. Images, Charts, Tables: Fixing the PDF’s “Weak Spots”
3.1 Alternative Text for Figures and Images
As on the web, figures in PDFs need alternative text. Keep the alt text short and provide the details in the body text.
Example:
- Short alt: “Sales trend 2020–2024 (upward trend)”
- Body supplement: “Sales increased from 120 in 2020 to 210 in 2024…”
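A simple review aid for this rule might look like the sketch below. It assumes the figure alt texts have been collected into a dictionary keyed by a figure ID; the IDs and the 120-character limit are hypothetical house conventions, not a WCAG requirement.

```python
MAX_ALT_CHARS = 120  # assumed house limit, not taken from any standard

def check_alt(figures):
    """Return (figure_id, problem) pairs for figures whose alt text needs work."""
    problems = []
    for figure_id, alt in figures.items():
        text = (alt or "").strip()
        if not text:
            problems.append((figure_id, "missing alt text"))
        elif len(text) > MAX_ALT_CHARS:
            problems.append((figure_id, "alt text too long; move detail to the body"))
    return problems

# Hypothetical figures: one with the short alt from the example, two without.
figures = {
    "fig1": "Sales trend 2020–2024 (upward trend)",
    "fig2": "",
    "fig3": None,
}
print(check_alt(figures))
```

A check like this only finds missing or overlong text. Whether the alt text actually conveys the figure's meaning still needs a human reviewer.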
3.2 Tables Require “Tagged Relationships”
A table that looks fine visually (with borders) can be a meaningless blob without tags. You must define:
- Header cells (TH)
- Data cells (TD)
- Row/column relationships
This lets screen readers convey which row and column each value belongs to.
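What these relationships buy the listener can be shown with a small sketch. It assumes the simplest possible layout, where the first row holds column headers (TH) and the first column holds row headers (TH); the table contents, including the "Staff" column, are invented for illustration.

```python
def announce_cells(table):
    """Pair every data cell (TD) with its row header and column header (TH)."""
    column_headers = table[0][1:]
    announcements = []
    for row in table[1:]:
        row_header, values = row[0], row[1:]
        for column_header, value in zip(column_headers, values):
            announcements.append(f"{row_header}, {column_header}: {value}")
    return announcements

# Hypothetical table: row 0 and column 0 are header cells.
table = [
    ["Year", "Sales", "Staff"],
    ["2020", "120", "15"],
    ["2024", "210", "22"],
]
for line in announce_cells(table):
    print(line)
```

Without the header association, the same table would be read as a flat run of values ("2020 120 15 2024 210 22") with no indication of what each number means.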
3.3 Charts Should Include “Narrated Numbers”
Charts are hard to interpret via screen reading, so always summarize key points in text.
Example:
“Sales increased every year from 2020 and reached the highest level in 2024.”
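If the chart data is available, a first draft of such a summary can be generated and then edited by hand. The sketch below assumes a simple year-to-value mapping; the 2020 and 2024 values follow the earlier example, while the intermediate years are invented.

```python
def summarize(series, label="Sales"):
    """Draft a one-sentence text summary of a yearly series."""
    years = sorted(series)
    first, last = years[0], years[-1]
    values = [series[year] for year in years]
    rising = all(a < b for a, b in zip(values, values[1:]))
    peak_year = max(years, key=lambda year: series[year])
    trend = "increased every year" if rising else "changed"
    return (
        f"{label} {trend} from {series[first]} in {first} to "
        f"{series[last]} in {last}; the highest value was in {peak_year}."
    )

# Hypothetical data; only the first and last values come from the example above.
sales = {2020: 120, 2021: 140, 2022: 165, 2023: 190, 2024: 210}
print(summarize(sales))
```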
4. Scan PDFs (Image-Only PDFs): The Highest-Priority Fix
4.1 Why They’re Risky
Scan PDFs are essentially “paper turned into images,” meaning:
- not searchable,
- not copyable,
- not screen-readable.
From an accessibility standpoint, this is the most serious issue.
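A quick way to triage a large archive is to look at how much text each page yields. The sketch below works on per-page text that has already been extracted by some tool (for example, pypdf's `PdfReader(path).pages` together with `page.extract_text()`); the thresholds are assumptions to be tuned per collection.

```python
def looks_scan_only(page_texts, min_chars_per_page=20, empty_share=0.9):
    """Return True when nearly every page yields almost no extractable text."""
    if not page_texts:
        return True  # nothing extractable at all
    empty = sum(
        1 for text in page_texts
        if len((text or "").strip()) < min_chars_per_page
    )
    return empty / len(page_texts) >= empty_share

# Hypothetical extraction results for two documents.
scanned = ["", "  ", ""]
born_digital = ["1. Introduction: PDFs are easy to distribute.", ""]

print(looks_scan_only(scanned))       # True
print(looks_scan_only(born_digital))  # False
```

Passing this check does not make a PDF accessible. A scan with an OCR text layer will pass and can still be untagged, so this is only a first filter.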
4.2 Recommended Fix Options (in order)
- Provide equivalent information in HTML (most reliable)
- OCR to convert to text + add tags
- If that’s still difficult, publish an accompanying text version (Word/text/HTML)
4.3 Notes After OCR
OCR output almost always contains errors, so proofreading is mandatory, especially for:
- proper nouns,
- numbers,
- tables,
- ruby/furigana.
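Proofreading can be focused with a small helper that points reviewers at likely trouble spots. This sketch flags tokens that mix letters and digits, a common symptom of confusions such as O/0 and l/1; the sample sentence is invented, and the heuristic will also flag legitimate tokens such as "A4", so it is meant to direct a human reviewer rather than to correct text automatically.

```python
import re

TOKEN = re.compile(r"\w+")

def suspicious_tokens(text):
    """Return tokens that contain both letters and digits."""
    flagged = []
    for token in TOKEN.findall(text):
        has_digit = any(ch.isdigit() for ch in token)
        has_alpha = any(ch.isalpha() for ch in token)
        if has_digit and has_alpha:
            flagged.append(token)
    return flagged

# Hypothetical OCR output with a lowercase L and capital O in place of digits.
sample = "Sales rose from l20 in 2020 to 21O in 2O24."
print(suspicious_tokens(sample))  # ['l20', '21O', '2O24']
```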
5. PDF Form Accessibility: Making Documents Fillable
Application forms and surveys become impossible to submit if their PDFs aren’t accessible.
5.1 Field Names (Labels)
Set clear names for each field:
- “Full Name,” “Address,” “Phone Number”
- Include “(Required)” for mandatory fields
5.2 Tab Order (Input Sequence)
If tab order is wrong, keyboard users can’t operate the form. Ensure the logical order is:
- top to bottom,
- left to right.
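The top-to-bottom, left-to-right rule can be checked mechanically once field positions are known. The sketch below assumes each field is given as `(name, top, left)` in its current tab order, and that fields on the same row share the same `top` value; the field positions are hypothetical.

```python
def expected_tab_order(fields):
    """Return field names sorted top to bottom, then left to right."""
    return [name for name, _, _ in sorted(fields, key=lambda f: (f[1], f[2]))]

def tab_order_is_logical(fields):
    """Compare the current tab order with the expected visual order."""
    current = [name for name, _, _ in fields]
    return current == expected_tab_order(fields)

# Hypothetical form: (name, top, left), listed in the current tab order.
fields = [
    ("Full Name", 100, 50),
    ("Phone Number", 200, 300),
    ("Address", 200, 50),
]

print(tab_order_is_logical(fields))  # False
print(expected_tab_order(fields))    # ['Full Name', 'Address', 'Phone Number']
```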
5.3 Errors and Instructions
PDFs are less flexible than web forms for validation messages. Reduce mistakes by stating before fields:
- what to enter,
- formatting rules (e.g., “no hyphens”).
6. Text, Color, Layout: “Readable Design” Is Still Possible in PDFs
- Base font size: 12pt or larger (12–14pt is safer for long text)
- Avoid overly tight line spacing (packing text harms readability)
- Maintain sufficient contrast (same principle as the web: WCAG 1.4.3 requires at least 4.5:1 for normal text and 3:1 for large text)
- Two-column layouts often break reading order—prefer single-column when possible
- Keep important information out of footnotes; place it in the body text
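The contrast item can be verified numerically. The sketch below implements the contrast ratio as defined in WCAG 2.1 (relative luminance of sRGB colors); the function names and sample colors are illustrative, and colors are assumed to be plain 8-bit RGB tuples.

```python
def relative_luminance(rgb):
    """Relative luminance of an 8-bit sRGB color, per the WCAG 2.1 definition."""
    def channel(value):
        c = value / 255
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (channel(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(foreground, background):
    """Contrast ratio between two colors, from 1.0 (none) to 21.0 (black on white)."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)

def passes_aa(foreground, background, large_text=False):
    """WCAG 1.4.3 thresholds: 4.5:1 for normal text, 3:1 for large text."""
    threshold = 3.0 if large_text else 4.5
    return contrast_ratio(foreground, background) >= threshold

white = (255, 255, 255)
print(round(contrast_ratio((0, 0, 0), white), 2))  # 21.0
print(passes_aa((118, 118, 118), white))           # True: #767676 on white
print(passes_aa((119, 119, 119), white))           # False: #777777 on white
```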
7. Decide Whether You Should Publish a PDF: Best Practices for Formats
PDFs are suitable for:
- print-first application forms,
- contracts requiring fixed layout,
- materials whose main purpose is archiving/distribution.
But for frequently referenced web information (procedures, FAQs, updates), HTML is usually more accessible because it’s:
- easier to search,
- easier to read aloud,
- easier to read on smartphones.
Conclusion: Provide key information in HTML and use PDFs as supplements. This is the most realistic approach.
8. Testing: How to Check PDF Accessibility
8.1 Minimum Check (5 minutes)
- Text is selectable (not image-only)
- Heading structure exists (bookmarks are ideal)
- Reading order is natural (no jumping back and forth between columns)
- Images/figures have alternatives
- Tables are understandable when read aloud (headers are recognized)
- For forms: tab order and labels are correct
8.2 Assistive Technology Testing
PDF viewing environments vary. At minimum, validate on:
- Windows (NVDA + a PDF reader)
- macOS (VoiceOver + Preview, etc.)
9. Operations: A Realistic Roadmap for Organizations with Many PDFs
9.1 How to Prioritize
Prioritize PDFs that are:
- frequently used,
- required for procedures/applications,
- legally/strategically important (high impact).
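One way to turn these three criteria into a work queue is sketched below. The `PdfAsset` fields, the download threshold for "frequently used", and the sample documents are all hypothetical; the ranking simply counts how many criteria a document meets and breaks ties by usage, which is one possible policy rather than a prescribed one.

```python
from dataclasses import dataclass

@dataclass
class PdfAsset:
    name: str
    monthly_downloads: int
    required_for_procedure: bool
    high_impact: bool

def criteria_met(asset, frequent_threshold=500):
    """Count how many of the three prioritization criteria the asset meets."""
    return sum([
        asset.monthly_downloads >= frequent_threshold,
        asset.required_for_procedure,
        asset.high_impact,
    ])

def remediation_queue(assets):
    """Most criteria first; ties are broken by download count."""
    return sorted(
        assets,
        key=lambda a: (criteria_met(a), a.monthly_downloads),
        reverse=True,
    )

# Hypothetical inventory.
assets = [
    PdfAsset("Event flyer 2019", 30, False, False),
    PdfAsset("Annual report", 600, False, True),
    PdfAsset("Application form", 800, True, True),
    PdfAsset("Internal rules", 100, True, False),
]
for asset in remediation_queue(assets):
    print(criteria_met(asset), asset.name)
```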
9.2 “Make All New Publications Accessible First”
Fixing legacy assets takes time. Standardizing new documents as tagged PDFs (or HTML + PDF) prevents future debt.
9.3 Alternative Access Pathways
If a PDF can’t be fixed quickly, link to an accessibility statement and clearly offer:
- text versions on request,
- phone/email support.
This creates an escape route so people aren’t left behind.
10. Common Failure Patterns and Fixes
| Failure | What Happens | Fix |
|---|---|---|
| Scan-only PDF | Not screen-readable | Provide HTML or OCR + tags |
| Untagged PDF | No structure | Add tags + bookmarks |
| Two-column order breaks | Screen reader confusion | Single-column or fix reading order |
| Table turned into an image | Relationships lost | Rebuild as a real, tagged table + summarize in text |
| No figure explanation | Meaning missing | Alt text + body explanation |
| Unnamed form fields | Not fillable | Add labels and tab order |
11. Value by Audience (Specific)
- Blind/low-vision users: Tags, order, and alternatives allow comprehension.
- Users with cognitive differences: Bookmarks and headings clarify structure and enable navigation.
- Older adults: Proper contrast and font sizing improve readability and zoom behavior.
- Local governments/companies: Fewer inquiries, stronger accountability, fairer procedures.
- Production teams: Standardization stabilizes quality and reduces rework costs.
12. Evaluating Accessibility Level (What This Guide Aims For)
- Key WCAG 2.1 AA perspectives applied to PDFs
- 1.1.1 Non-text Content: alternatives for figures
- 1.3.1 Info and Relationships: tagged structure, table relationships
- 1.3.2 Meaningful Sequence: reading order
- 1.4.3 Contrast: text/background
- 2.4.x Navigation: bookmarks, heading structure
- 3.1.x Readability: terminology and writing structure
- 4.1.2 Name, Role, Value: form field naming (tab order itself falls under 2.4.3 Focus Order)
- Instead of treating PDFs as isolated artifacts, this assumes operations that include HTML equivalents and alternative access support to approach AA-level user experience.
13. Conclusion: From “Handout PDFs” to “Readable Information”
- PDFs are not readable without tags (structure), reading order, and alternatives.
- Provide key information in HTML, and use PDFs as supplements.
- Scan PDFs are the top priority: OCR + tags, or provide a text/HTML version.
- Pay special attention to tables, figures, and forms so meaning is conveyed via screen reading.
- Standardize new publications first; fix legacy assets in priority order.
- Even when you can’t fix quickly, provide an alternative pathway so no one is left behind.
A well-prepared PDF can be a highly valuable information asset.
Let’s build PDFs that anyone can read, through both production and operations, carefully and consistently.

