Sustainability of Digital Formats: Planning for Library of Congress Collections


PDF/A-2, PDF for Long-term Preservation, Use of ISO 32000-1 (PDF 1.7)

Format Description Properties

Identification and description

Full name ISO 19005-2. Document management - Electronic document file format for long-term preservation - Part 2: Use of ISO 32000-1 (PDF 1.7)
Description

PDF/A-2 is a constrained form of Adobe PDF version 1.7 (as defined in ISO 32000-1) intended to be suitable for long-term preservation of page-oriented documents for which PDF is already being used in practice.

See PDF/A_family for more information about the PDF/A family of standards. See PDF/A-1 for information about the previous version of the standard. PDF/A-2 was not intended to replace PDF/A-1. See the press release from the PDF/A Association, which states, "PDF/A-1 remains in effect without limitation, and where PDF/A-1 features meet current requirements sufficiently well, there is no immediate reason to switch to PDF/A-2." The primary difference between PDF/A-2 and its predecessor PDF/A-1 is the move to a later underlying version of PDF (PDF 1.7 as opposed to PDF 1.4). The added capabilities come from compliance with PDF 1.7 (as defined in ISO 32000-1) and include support for:

  • Improvements to tagged PDF (for enhanced accessibility)
  • Compressed Object and XRef streams (for smaller file sizes)
  • PDF/A-compliant file attachments, portable collections and PDF packages, allowing archiving of sets of documents as individual documents in one file
  • Transparency for graphical elements
  • JPEG 2000 compression
  • Application of digital signatures in accordance with the PAdES (PDF Advanced Electronic Signatures) standard

The PDF/A-2 standard defines three levels of conformance: level A satisfies all requirements in the specification; level B is a lower level of conformance, "encompassing the requirements of this part of ISO 19005 regarding the visual appearance of electronic documents, but not their structural or semantic properties." An intermediate level of conformance was introduced for PDF/A-2; level U conformance represents level B conformance with the additional requirement that all text in the document have Unicode equivalents.
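
The relationship among the three levels can be sketched as a simple ordering. The snippet below is only an illustration: it assumes each level includes the requirements of the one below it (B, then U, then A), as the description above suggests. The names `LEVEL_RANK` and `meets_level` are hypothetical and do not come from the standard.

```python
# Conformance levels of PDF/A-2, ordered from least to most demanding.
# Assumption for this sketch: each level includes the requirements of
# the levels ranked below it (B < U < A).
LEVEL_RANK = {"B": 0, "U": 1, "A": 2}


def meets_level(actual: str, required: str) -> bool:
    """Return True if a file at level `actual` also satisfies `required`."""
    return LEVEL_RANK[actual.upper()] >= LEVEL_RANK[required.upper()]


print(meets_level("A", "U"))  # True: level A covers the Unicode requirement
print(meets_level("B", "U"))  # False: level B does not require Unicode text
```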

The support for JPEG 2000 in PDF/A-2 is based on the support for JPEG 2000 in the parent PDF 1.7 specification (ISO 32000-1:2008). See Notes below for information about how the JPXDecode filter used in PDF 1.7 relates to the JPEG 2000 specifications. ISO 19005-2:2011 adds a few restrictions for PDF/A-2 to increase compatibility with PDF/X and PDF/E, e.g., constraining the number of color channels (to 1, 3, or 4).

Production phase A final-state format for delivery to end users and long-term preservation of the document as disseminated to users.
Relationship to other formats
    Subtype of PDF_family, Portable Document Format
    Subtype of PDF_1_7, PDF, Version 1.7 (ISO 32000-1:2008)
    Subtype of PDF/A_family, PDF for Long-term Preservation
    Has earlier version PDF/A-1, PDF for Long-term Preservation, Use of PDF 1.4
    Has subtype PDF/A-2a, PDF for Long-term Preservation, Use of ISO 32000-1 (PDF 1.7), Level A Conformance
    Has subtype PDF/A-2u, PDF for Long-term Preservation, Use of ISO 32000-1 (PDF 1.7), Level U Conformance
    Has subtype PDF/A-2b, PDF for Long-term Preservation, Use of ISO 32000-1 (PDF 1.7), Level B Conformance
    Has extension PDF/A-3, PDF for Long-term Preservation, Use of ISO 32000-1, with Embedded Files
    Has later version PDF/A-4, PDF for Long-term Preservation, Use of ISO 32000-2 (PDF 2.0)

Local use

LC experience or existing holdings See PDF/A_family.
LC preference See PDF/A_family.

Sustainability factors

Disclosure Open standard, approved in October 2010 and published by ISO in 2011. Developed by the working group (WG 5) under ISO/TC 171 SC2, Document Imaging Applications, Application Issues. It is a Joint Working Group, including participation from ISO/TC 46 SC11, Archives/records Management, ISO/TC 130, Graphics, and ISO/TC 42, Photography. At the time, AIIM (The Association for Information and Image Management) was acting as secretariat for the working group under the auspices of ISO. See PDF/A_family for information about the secretariat since 2017.
    Documentation

ISO 19005-2:2011. Document management -- Electronic document file format for long-term preservation -- Part 2: Use of ISO 32000-1 (PDF/A-2).

The standard cannot be used without ISO 32000-1, Document management -- Portable document format -- Part 1: PDF 1.7, which it uses as a normative reference. An equivalent to the first edition of ISO 32000-1 is also available at https://opensource.adobe.com/dc-acrobat-sdk-docs/pdfstandards/PDF32000_2008.pdf.

Adoption

The compilers of this resource are unable to determine the degree to which different versions and conformance levels of PDF/A are created and used. See PDF/A_family for a discussion of adoption of PDF/A in general. Comments welcome.

PDF/A-2 was not intended to make PDF/A-1 obsolete, but to add support for features that would be valuable in some circumstances. The addition of support for JPEG 2000 compression has led to recommendations for use of PDF/A-2b when scanning from paper, because of reduction in file size. See, for example, How to Pick the Right Version of PDF/A from PDFtron, a company producing software tools to enable digital document processing capabilities in enterprise and commercial applications. In Digitization of Textual Documents Using PDF/A, Yan Han and Xueheng Wan describe their methodology, based on open source software, for converting page images scanned from historical materials as uncompressed TIFFs (see TIFF_UNC) to use JPEG 2000 compression (see J2K_C) and combine images for all pages in a document with page-level metadata into a single PDF/A-2b file. They were able to recover raster images with data streams identical to the original TIFFs. Based on their experience with over 600,000 page images, they argue that the resulting PDF/A-2b file has advantages as an archival master over the use of a set of separate TIFF or JPEG 2000 images.

    Licensing and patents See PDF/A_family.
Transparency See PDF/A_family.
Self-documentation See PDF/A_family.
External dependencies See PDF/A_family.
Technical protection considerations See PDF/A_family.

Quality and functionality factors

Text
Normal rendering See PDF/A_family.
Integrity of document structure See PDF/A_family.
Integrity of layout and display See PDF/A_family.
Support for mathematics, formulae, etc. See PDF/A_family.
Functionality beyond normal rendering See PDF/A_family.

File type signifiers and format identifiers

Tag Value Note
Filename extension pdf
The standard does not indicate that a different extension should be used to distinguish PDF from PDF/A.
Internet Media Type See related format.  See PDF/A_family.
Magic numbers See related format.  See PDF/A_family.
Indicator for profile, level, version, etc. See note.  The standard specifies that the PDF/A version and conformance level of a file shall be specified using the PDF/A Identification extension schema defined in the standard. This schema has two mandatory elements: pdfaid:part (integer) and pdfaid:conformance (closed list of text values). A PDF/A-2 file should have the integer value 2 for pdfaid:part. See PDF/A_family for an example of the identification markup.
File signature See related format.  See PDF/A_family.
Pronom PUID See note.  There is no PRONOM entry specifically for PDF/A-2. See https://www.nationalarchives.gov.uk/PRONOM/fmt/476 for conformance level PDF/A-2a; https://www.nationalarchives.gov.uk/PRONOM/fmt/478 for PDF/A-2u; and https://www.nationalarchives.gov.uk/PRONOM/fmt/477 for PDF/A-2b.
Wikidata Title ID See note.  There is no Wikidata Title ID specifically for PDF/A-2. See https://www.wikidata.org/wiki/Q26545877 for conformance level PDF/A-2a; https://www.wikidata.org/wiki/Q26547266 for PDF/A-2u; and https://www.wikidata.org/wiki/Q26546575 for PDF/A-2b.
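
As a rough illustration of how the identifiers above fit together, the following sketch reads pdfaid:part and pdfaid:conformance from an XMP packet and looks up the matching PRONOM PUID. Several things here are assumptions rather than statements from this page: the namespace URI is the one commonly used for the PDF/A identification schema, the sample packet is invented, and `read_pdfa_id` and `PUIDS` are hypothetical names. Extracting the XMP packet from a PDF file is out of scope.

```python
import xml.etree.ElementTree as ET

# Namespace URIs (assumed; this page does not spell them out).
PDFAID_NS = "http://www.aiim.org/pdfa/ns/id/"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# PRONOM PUIDs for the PDF/A-2 conformance levels, as listed above.
PUIDS = {(2, "A"): "fmt/476", (2, "B"): "fmt/477", (2, "U"): "fmt/478"}


def read_pdfa_id(xmp: str):
    """Return (part, conformance) from an XMP packet, or None if absent."""
    root = ET.fromstring(xmp)
    for desc in root.iter(f"{{{RDF_NS}}}Description"):
        # The two properties may be written as attributes or as child elements.
        part = desc.get(f"{{{PDFAID_NS}}}part") or desc.findtext(f"{{{PDFAID_NS}}}part")
        conf = desc.get(f"{{{PDFAID_NS}}}conformance") or desc.findtext(
            f"{{{PDFAID_NS}}}conformance"
        )
        if part and conf:
            return int(part.strip()), conf.strip().upper()
    return None


# Invented sample packet for a PDF/A-2b file.
SAMPLE_XMP = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about="" xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/">
      <pdfaid:part>2</pdfaid:part>
      <pdfaid:conformance>B</pdfaid:conformance>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>"""

ident = read_pdfa_id(SAMPLE_XMP)
print(ident)             # (2, 'B')
print(PUIDS.get(ident))  # fmt/477
```

A file claiming a different part number (for example 1 or 3) would simply return no PUID from this table, since only the PDF/A-2 entries are listed here.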

Notes

General

Support for JPEG 2000 in PDF/A-2 is based on the support in PDF 1.7 (ISO 32000-1:2008). This is covered in subclause 7.4.9 JPXDecode Filter, which states:

  • The JPEG 2000 specifications define two widely used formats, JP2 and JPX, for packaging the compressed image data. JP2 is a subset of JPX. These packagings contain all the information needed to properly interpret the image data, including the colour space, bits per component, and image dimensions. In other words, they are complete descriptions of images (as opposed to image data that require outside parameters for correct interpretation). The JPXDecode filter shall expect to read a full JPX file structure—either internal to the PDF file or as an external file.

    NOTE 5: To promote interoperability, the specifications define a subset of JPX called JPX baseline (of which JP2 is also a subset). The complete details of the baseline set of JPX features are contained in ISO/IEC 15444-2, Information Technology—JPEG 2000 Image Coding System: Extensions (see the Bibliography). See also <https://www.jpeg.org/jpeg2000/>.

    Data used in PDF image XObjects shall be limited to the JPX baseline set of features, except for enumerated colour space 19 (CIEJab). In addition, enumerated colour space 12 (CMYK), which is part of JPX but not JPX baseline, shall be supported in a PDF.

PDF/A-2 imposes a few additional constraints on number of color channels, bit depth, and colorspaces for compatibility with versions of PDF/X and PDF/E current in 2010.
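
The channel-count constraint can be approximated with a small check on the JPEG 2000 data itself. The sketch below walks the box structure of a JP2/JPX byte string, reads the component count from the image header (ihdr) box, and compares it with the allowed values 1, 3, or 4. It rests on assumptions not stated on this page: the box layout follows the JP2 file format in the JPEG 2000 specifications, and the component count is used as a stand-in for the number of color channels, which ignores any opacity channels. The helper names and the sample bytes are invented for illustration.

```python
import struct

ALLOWED_CHANNELS = {1, 3, 4}  # values permitted by PDF/A-2, per the text above


def iter_boxes(data, start=0, end=None):
    """Yield (type, body_start, body_end) for each box in data[start:end]."""
    end = len(data) if end is None else end
    pos = start
    while pos + 8 <= end:
        length, box_type = struct.unpack(">I4s", data[pos:pos + 8])
        header = 8
        if length == 1:  # a 64-bit extended length follows the type field
            if pos + 16 > end:
                break
            (length,) = struct.unpack(">Q", data[pos + 8:pos + 16])
            header = 16
        elif length == 0:  # box runs to the end of the enclosing data
            length = end - pos
        if length < header:
            break  # malformed box; stop rather than loop forever
        yield box_type, pos + header, pos + length
        pos += length


def jp2_component_count(data):
    """Return the component count from the ihdr box, or None if not found."""
    for box_type, body_start, body_end in iter_boxes(data):
        if box_type == b"jp2h":  # header superbox that contains ihdr
            for inner_type, s, _ in iter_boxes(data, body_start, body_end):
                if inner_type == b"ihdr":
                    _height, _width, components = struct.unpack(">IIH", data[s:s + 10])
                    return components
    return None


def make_box(box_type, body):
    """Build a box with a 32-bit length field (helper for the sample data)."""
    return struct.pack(">I4s", 8 + len(body), box_type) + body


# Invented minimal header sequence: signature, file type, and image header.
ihdr = make_box(b"ihdr", struct.pack(">IIHBBBB", 100, 200, 3, 7, 7, 0, 0))
sample = (
    make_box(b"jP  ", b"\r\n\x87\n")
    + make_box(b"ftyp", b"jp2 \x00\x00\x00\x00jp2 ")
    + make_box(b"jp2h", ihdr)
)

count = jp2_component_count(sample)
print(count, count in ALLOWED_CHANNELS)  # 3 True
```

A fuller check would also consult the channel definition box, where present, to separate color channels from opacity channels, and would cover the bit depth and color space constraints mentioned above.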

See also PDF/A_family.

History PDF/A-1 was published in 2005. The specification for PDF/A-2 was published as ISO 19005-2:2011 in July 2011. PDF/A-2 was not intended to replace PDF/A-1. PDF/A-3 was published in 2012; it is identical to PDF/A-2 except that it allows files in any format to be embedded within the PDF. For more of the history of PDF/A, see PDF/A_family and https://en.wikipedia.org/wiki/PDF/A.

Format specifications


Useful references

URLs


Last Updated: 07/29/2022