PDF/A vs PDF: Understanding the Difference and When You Need Compliance

PDF/A is a specialized subset of the PDF standard designed for long-term digital archiving. While a regular PDF can contain dynamic content like JavaScript, multimedia, and external references, PDF/A prohibits all of that so the document renders the same way decades from now. PdfBroker.io's WeasyPrint service generates PDF/A-compliant documents from HTML and CSS via a REST API call, with no local libraries or desktop tools required.

This guide explains the practical differences between PDF and PDF/A, when compliance actually matters, and how to generate PDF/A documents programmatically.

What Is PDF/A?

PDF/A is an ISO-standardized version of PDF (ISO 19005) built for one purpose: ensuring that a document can be reliably reproduced in the future without depending on external software, fonts, or resources. It achieves this through a set of restrictions on what a PDF file is allowed to contain.

A PDF/A file must:

  • Embed all fonts — no references to system fonts. Every character must render without relying on the viewer's environment.
  • Include a color profile — typically sRGB or a calibrated CMYK profile, so colors display consistently.
  • Contain XMP metadata — standardized, machine-readable metadata embedded in the file.
  • Be self-contained — no external content references, no linked images, no remote resources.

A PDF/A file must not contain:

  • JavaScript or executable code
  • Audio or video content
  • Encryption or password protection
  • Transparency (PDF/A-1 only; permitted from PDF/A-2 onward)
  • References to external fonts or resources
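The prohibited features above can be spot-checked with a rough byte scan. The following is a minimal sketch, not a validator: the helper name `find_forbidden_features` and the marker list are illustrative assumptions, and because PDF objects are often stored in compressed streams, an empty result does not prove compliance.

```python
# Heuristic pre-check only: a real PDF/A validator parses the full file structure.
# The marker list is an illustrative assumption, not an exhaustive rule set.
FORBIDDEN_MARKERS = {
    b"/JavaScript": "JavaScript action",
    b"/Encrypt": "encryption dictionary",
    b"/RichMedia": "rich media (audio/video) annotation",
    b"/Movie": "movie annotation",
    b"/Sound": "sound object",
}


def find_forbidden_features(pdf_bytes: bytes) -> list[str]:
    """Return labels for forbidden-feature markers found in the raw bytes."""
    return [
        label
        for marker, label in FORBIDDEN_MARKERS.items()
        if marker in pdf_bytes
    ]


# Tiny hand-written fragment that declares a JavaScript action
sample = b"%PDF-1.7\n1 0 obj\n<< /S /JavaScript /JS (app.alert('hi')) >>\nendobj\n"
print(find_forbidden_features(sample))
```

A scan like this is only useful as an early warning in a pipeline; a dedicated PDF/A validator remains the authoritative check.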

PDF/A Conformance Levels

PDF/A comes in several versions, each building on the last. The version you need depends on your industry, regulatory environment, and document complexity.

PDF/A-1 (ISO 19005-1)

The original standard from 2005. Based on PDF 1.4.

  • PDF/A-1b ("basic") — guarantees visual reproducibility. The document will look the same, but text extraction and accessibility are not guaranteed. This is the most common conformance level for archival.
  • PDF/A-1a ("accessible") — adds requirements for logical document structure and Unicode character mapping, enabling reliable text extraction and basic accessibility.

PDF/A-2 (ISO 19005-2)

Released in 2011, based on PDF 1.7. Adds support for:

  • JPEG 2000 compression
  • Transparency (which PDF/A-1 prohibited)
  • Embedded PDF/A files as attachments
  • Digital signatures based on the PAdES profile

Available as PDF/A-2b, PDF/A-2a, and PDF/A-2u (adds Unicode mapping to the "b" level).

PDF/A-3 (ISO 19005-3)

Released in 2012. Identical to PDF/A-2 but allows embedding any file type as an attachment — not just other PDF/A files. This is critical for electronic invoicing formats like ZUGFeRD and Factur-X, where an XML data file is embedded alongside the human-readable PDF.

Available as PDF/A-3b, PDF/A-3a, and PDF/A-3u.
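A PDF/A file declares its version and conformance level in its XMP metadata through the `pdfaid:part` and `pdfaid:conformance` properties. The sketch below reads that claim from raw bytes. It assumes the XMP packet is stored uncompressed, and the function name `claimed_pdfa_level` is hypothetical. The result is only what the file claims about itself, not proof of compliance.

```python
import re
from typing import Optional

# XMP can serialize a property as an attribute (pdfaid:part="3")
# or as an element (<pdfaid:part>3</pdfaid:part>); match both forms.
_PART = re.compile(rb'pdfaid:part(?:="|>)(\d)')
_CONFORMANCE = re.compile(rb'pdfaid:conformance(?:="|>)([ABUabu])')


def claimed_pdfa_level(pdf_bytes: bytes) -> Optional[str]:
    """Return the declared level (e.g. 'PDF/A-3b'), or None if no claim is found."""
    part = _PART.search(pdf_bytes)
    conformance = _CONFORMANCE.search(pdf_bytes)
    if part is None or conformance is None:
        return None
    level = conformance.group(1).decode("ascii").lower()
    return f"PDF/A-{part.group(1).decode('ascii')}{level}"


# Hand-written XMP fragment in attribute form
xmp = (
    b'<rdf:Description xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/" '
    b'pdfaid:part="3" pdfaid:conformance="B"/>'
)
print(claimed_pdfa_level(xmp))
```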

When Do You Actually Need PDF/A?

PDF/A compliance is not just a technical nicety — it is a legal or regulatory requirement in many contexts.

Common cases where PDF/A is required or expected:

  • Government submissions — many national archives and government agencies require PDF/A for official records. The US National Archives, the German Federal Archives, and EU institutions all mandate or recommend PDF/A.
  • Legal documents — court filings in several jurisdictions require PDF/A to ensure long-term readability. The European e-Justice portal specifies PDF/A for electronic document exchange.
  • Financial and tax records — tax authorities in Germany (GoBD), France, and other EU countries require or recommend PDF/A for digital invoice archival.
  • Electronic invoicing — ZUGFeRD (Germany/EU) and Factur-X (France) standards are built on PDF/A-3, with structured XML data embedded in the PDF.
  • Healthcare records — medical records retention requirements often specify PDF/A for patient documents that must remain accessible for decades.
  • Insurance — policy documents, claims records, and regulatory filings frequently require PDF/A.

You probably don't need PDF/A when:

  • The document is transient — receipts, email confirmations, or temporary reports that won't be stored long-term.
  • You're generating documents for immediate viewing or printing and don't need archival guarantees.
  • Your documents rely on interactive elements (scripted forms, multimedia) that PDF/A prohibits.

PDF/A vs PDF/UA: What About Accessibility?

PDF/A ensures a document is visually reproducible. PDF/UA (ISO 14289) ensures a document is accessible — readable by screen readers and assistive technology. They solve different problems:

Concern                    | PDF/A                        | PDF/UA
Long-term archival         | Yes                          | Not a goal
Visual reproducibility     | Yes                          | Not a goal
Screen reader support      | Only PDF/A-1a/2a/3a          | Yes
Tagged document structure  | Optional (except "a" levels) | Required
European Accessibility Act | Not sufficient alone         | Required

The European Accessibility Act (EAA), which applies from 28 June 2025, requires many digital products and services, including the documents they deliver, to be accessible. PDF/UA is the relevant standard for PDF documents. If you need both archival and accessibility, you can generate documents that meet both standards, or produce separate PDF/A and PDF/UA versions.

PdfBroker.io's WeasyPrint service supports both pdf/a-3b and pdf/ua-1 variants. See Getting Started with PDF/A and PDF/UA Compliance via API for implementation details.

Generating PDF/A Documents with PdfBroker.io

PdfBroker.io's WeasyPrint API generates PDF/A-compliant documents from HTML and CSS. You write your document template in HTML, send it to the API with the desired pdf-variant parameter, and receive a compliant PDF in return. The service handles font embedding, color profile inclusion, and XMP metadata automatically.
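The request body described above can be sketched as a small helper. The field names `htmlBase64String` and `weasyPrintToPdfArguments` are taken from the examples in this guide. The function name `build_weasyprint_payload` is a hypothetical convenience and not part of any PdfBroker.io client library.

```python
import base64
import json


def build_weasyprint_payload(html: str, pdf_variant: str) -> dict:
    """Build the JSON body: base64-encoded HTML plus the pdf-variant argument."""
    encoded_html = base64.b64encode(html.encode("utf-8")).decode("ascii")
    return {
        "htmlBase64String": encoded_html,
        "weasyPrintToPdfArguments": {"pdf-variant": pdf_variant},
    }


source_html = "<html lang='en'><body><p>Hello</p></body></html>"
payload = build_weasyprint_payload(source_html, "pdf/a-3b")
print(json.dumps(payload, indent=2))
```

Keeping the encoding step in one place makes it easy to reuse the same payload shape for every variant you generate.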

Prerequisites

  • A PdfBroker.io account with a Basic plan or above (WeasyPrint is a premium service)
  • Your API credentials (Client ID and Client Secret) from the members area

C# with PdfBroker.Client

using System;
using System.Collections.Generic;
using System.IO;
using PdfBroker.Client;
using PdfBroker.Common.RequestObjects;

var client = new PdfBrokerClientService("YOUR_CLIENT_ID", "YOUR_CLIENT_SECRET");

var html = @"<!DOCTYPE html>
<html lang='en'>
<head>
  <meta charset='utf-8'>
  <style>
    body { font-family: Arial, sans-serif; margin: 2cm; }
    h1 { color: #333; }
  </style>
</head>
<body>
  <h1>Annual Report 2025</h1>
  <p>This document is archived as PDF/A-3b for long-term preservation.</p>
</body>
</html>";

var request = new WeasyPrintRequestDto
{
    HtmlBase64String = Convert.ToBase64String(
        System.Text.Encoding.UTF8.GetBytes(html)),
    WeasyPrintToPdfArguments = new Dictionary<string, string>
    {
        { "pdf-variant", "pdf/a-3b" }
    }
};

byte[] pdfBytes = await client.WeasyPrintAsByteArrayAsync(request);
await File.WriteAllBytesAsync("annual-report-pdfa.pdf", pdfBytes);

cURL

# Authenticate
ACCESS_TOKEN=$(curl -s -X POST https://login.pdfbroker.io/connect/token \
  -d "grant_type=client_credentials&client_id=YOUR_CLIENT_ID&client_secret=YOUR_CLIENT_SECRET" \
  | jq -r '.access_token')

# Encode your HTML (printf avoids a trailing newline; -w 0 is the GNU base64 flag that disables line wrapping)
HTML_BASE64=$(printf '%s' '<!DOCTYPE html><html lang="en"><head><meta charset="utf-8"></head><body><h1>Annual Report</h1><p>Archived as PDF/A-3b.</p></body></html>' | base64 -w 0)

# Generate PDF/A (--fail makes curl exit with an error instead of saving an error response as the PDF)
curl --fail -X POST https://api.pdfbroker.io/api/pdf/weasyprint \
  -H "Authorization: Bearer $ACCESS_TOKEN" \
  -H "Content-Type: application/json" \
  -d "{
    \"htmlBase64String\": \"$HTML_BASE64\",
    \"weasyPrintToPdfArguments\": {
      \"pdf-variant\": \"pdf/a-3b\"
    }
  }" \
  --output annual-report-pdfa.pdf

Python

import requests
import base64

# Authenticate
token_resp = requests.post(
    "https://login.pdfbroker.io/connect/token",
    data={
        "grant_type": "client_credentials",
        "client_id": "YOUR_CLIENT_ID",
        "client_secret": "YOUR_CLIENT_SECRET",
    },
)
token_resp.raise_for_status()  # fail early on bad credentials
access_token = token_resp.json()["access_token"]

# Your HTML document
html = """<!DOCTYPE html>
<html lang="en">
<head><meta charset="utf-8"></head>
<body>
  <h1>Annual Report 2025</h1>
  <p>Archived as PDF/A-3b for long-term preservation.</p>
</body>
</html>"""

html_b64 = base64.b64encode(html.encode("utf-8")).decode("ascii")

# Generate PDF/A-3b
response = requests.post(
    "https://api.pdfbroker.io/api/pdf/weasyprint",
    headers={"Authorization": f"Bearer {access_token}"},
    json={
        "htmlBase64String": html_b64,
        "weasyPrintToPdfArguments": {"pdf-variant": "pdf/a-3b"},
    },
)

# Stop on an HTTP error so an error body is never saved as a PDF
response.raise_for_status()

with open("annual-report-pdfa.pdf", "wb") as f:
    f.write(response.content)
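Before archiving the bytes returned by the API, it can be worth confirming that they at least look like a PDF. This is a minimal sketch with a hypothetical helper name, `looks_like_pdf`. It checks only the header and end-of-file markers, so it catches saved error messages but says nothing about PDF/A conformance.

```python
def looks_like_pdf(data: bytes) -> bool:
    """Check for the PDF header at the start and the EOF marker near the end."""
    return data.startswith(b"%PDF-") and b"%%EOF" in data[-1024:]


# Hand-written samples: a PDF-shaped byte string and a JSON error body
pdf_like = b"%PDF-1.7\n1 0 obj\n<< >>\nendobj\n%%EOF\n"
error_body = b'{"error": "unauthorized"}'

print(looks_like_pdf(pdf_like))
print(looks_like_pdf(error_body))
```

In the Python example above, the check would run on `response.content` before the file is written.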

Supported PDF/A Variants

PdfBroker.io's WeasyPrint service supports the following pdf-variant values:

Value    | Standard | Use Case
pdf/a-1b | PDF/A-1b | Basic archival, maximum compatibility
pdf/a-2b | PDF/A-2b | Archival with transparency and JPEG 2000 support
pdf/a-3b | PDF/A-3b | Archival with arbitrary file attachments (ZUGFeRD, Factur-X)
pdf/ua-1 | PDF/UA-1 | Accessibility compliance
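The use cases in the table can be turned into a simple selection rule. The sketch below is one possible reading of the table, not an official recommendation. The function name `choose_pdfa_variant` and its parameters are hypothetical, and your archive or regulator may mandate a specific level regardless of features.

```python
# pdf-variant values listed in this guide
SUPPORTED_VARIANTS = ("pdf/a-1b", "pdf/a-2b", "pdf/a-3b", "pdf/ua-1")


def choose_pdfa_variant(needs_attachments: bool = False,
                        needs_transparency: bool = False) -> str:
    """Pick the oldest PDF/A "b" level that still permits the required features."""
    if needs_attachments:
        # Arbitrary embedded files (e.g. invoice XML) require PDF/A-3
        return "pdf/a-3b"
    if needs_transparency:
        # Transparency is prohibited in PDF/A-1 and permitted from PDF/A-2
        return "pdf/a-2b"
    return "pdf/a-1b"


print(choose_pdfa_variant(needs_attachments=True))
print(choose_pdfa_variant(needs_transparency=True))
print(choose_pdfa_variant())
```

Preferring the oldest level that fits follows the "maximum compatibility" note in the table; choosing the newest level everywhere is an equally defensible policy.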

Tips for PDF/A-Ready HTML

Writing HTML that produces good PDF/A output is mostly about being explicit and self-contained:

  1. Always set lang on the <html> tag — this becomes the document language metadata in the PDF.
  2. Use web-safe fonts or embed custom fonts — PdfBroker.io embeds fonts automatically, but referencing obscure system fonts may produce unexpected results. Use the resources object to include custom font files.
  3. Avoid external resources — don't reference images via URL. Either embed images as base64 data URIs in your HTML, or pass them via the resources object.
  4. Use CSS for layout, not tables — CSS Paged Media properties (@page, page-break-before, margins) give you precise control over the printed layout.
  5. Plan around the color profile: WeasyPrint handles sRGB embedding automatically and operates in the RGB color space, so check whether RGB output is acceptable if you have specific CMYK requirements.
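Several of these tips can be combined in a small template helper. This is a sketch under stated assumptions: the function names `to_data_uri` and `build_html`, the `.new-page` class, and the page size are illustrative choices, and the logo bytes come from a file you supply.

```python
import base64
from html import escape

# CSS Paged Media rules: page size, margins, and an opt-in page break class
PAGE_CSS = """
@page { size: A4; margin: 2cm; }
.new-page { page-break-before: always; }
"""


def to_data_uri(data: bytes, mime_type: str) -> str:
    """Inline binary data as a base64 data URI so no external URL is needed."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def build_html(title: str, body_html: str, logo_png: bytes, lang: str = "en") -> str:
    """Assemble a self-contained document with lang set and the logo inlined."""
    logo_uri = to_data_uri(logo_png, "image/png")
    return (
        "<!DOCTYPE html>\n"
        f'<html lang="{escape(lang)}">\n'
        f'<head><meta charset="utf-8"><title>{escape(title)}</title>'
        f"<style>{PAGE_CSS}</style></head>\n"
        f'<body><img src="{logo_uri}" alt="Logo">{body_html}</body>\n'
        "</html>"
    )
```

The resulting string has no external references, so it can be base64-encoded and sent to the API as shown in the earlier examples.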

Summary

PDF/A is the ISO standard for long-term digital document archival. It guarantees visual reproducibility by requiring embedded fonts, color profiles, and self-contained content — while prohibiting JavaScript, multimedia, and external dependencies. You need PDF/A when regulations, legal requirements, or retention policies demand documents that will render identically years from now.

PdfBroker.io's WeasyPrint service generates PDF/A documents from HTML and CSS with a single API parameter. No local PDF libraries, no desktop tools, no manual validation — just send HTML and receive a compliant PDF.