Digital content must be formatted in order to be usable. The data, whether text, image, sound or video, must be given a structure and stored in a file. Far more information is now created, and in a greater variety of formats, than ever before, making it increasingly difficult for libraries to identify what is of value and ensure its longevity.
The fluidity of digital formats has long been a concern for digital library specialists. Formats can change rapidly as designers alter features, and individual file formats can be very complex. For example, the widely used Portable Document Format (PDF) generally represents page-oriented documents, but these documents can be laden with images, graphics and multimedia content such as video and audio. The current PDF format specification is more than 1,300 pages long. The complexity and fluidity of format families like PDF make a strong case for a resource like the Sustainability of Digital Formats Web site, which provides information about digital formats.
To help its staff plan for the future, the Library of Congress created the Sustainability of Digital Formats Web site. Since a wide range of formats may be proposed for acquisition by the Library, the Web site identifies and describes formats that are promising for long-term sustainability and also those that are unsuitable. At this writing, the Formats Web site provides information on nearly 400 file formats, including variants and subtypes.
The Formats site is intended to provide a descriptive and informative counterpoint to the Library's Recommended Formats Statement, which lists the formats, analog and digital, that are preferred for the Library's acquisitions processes.
Although initiated to serve Library of Congress staff, the Formats Web site is accessible to the public as a free and open resource. Comments and suggestions about the site are welcome.
Updated August 24, 2015