Sustainability of Digital Formats: Planning for Library of Congress Collections


Sound >> Curator's View - ARCHIVED INFORMATION

NOTE: This content was last reviewed and updated in 2004 and remains here for background information only. Beginning in 2015, the Library of Congress has published its format preferences as the Recommended Formats Statement, updated annually.

The illustrative tables presented here are intended to suggest how a curator of sound recordings would determine format preferences.  The first table illustrates a planning matrix that would result from analyzing the significant or essential characteristics for subcategories of sound recordings.  The second table illustrates how this analysis of significant features would be combined with technical information about formats to produce a set of format-preference statements for the various content subcategories.


Table 1: Significant characteristics of sound content subcategories

| ID | Description | Fidelity (resolution) [1] | Sound field (beyond stereo) [2] | Rendering expectations beyond normal [4] | Special functionality required by custodians [5] | Special functionality expected by end users | Effect of technical protection [6] |
| --- | --- | --- | --- | --- | --- | --- | --- |
| S1 | Audio, surround sound | Very important | Retain with minimal change [3] | Surround sound, multiple speakers | Downsample, take excerpts, etc., without artifacting |  | Must not affect fidelity or normal rendering |
| S2 | Audio, mono or stereo. Includes most audio content, such as: audio content via copyright; audio downloaded from the Web, acquired from collectors, producing organizations; home-produced sound, oral histories; music in note-based form (e.g., MIDI) preserved as audio | Very important or important, depending on item |  |  | Downsample, take excerpts, etc., without artifacting |  | Must not affect fidelity or normal rendering |
| S3 | Audio, streamed webcast, harvested in bulk | Less important |  |  |  |  | Must not affect normal rendering |
| S4 | Audio incidental to Web harvesting (e.g., background audio) [7] | Not important |  |  |  |  | Not important |
| S5 | Note-based representations (e.g., MIDI) when special functionality is required; may include waveform samples [8] | Retain precision of original, retain samples | N/A | Through specialized performance software | Retain functionality of original via performance & composition software [8] | Retain functionality of original via performance & composition software | Must not affect functionality for end users |
| S6 | Synthetic encoding for non-music (e.g., sound effects, voice for telephone assistance) [For future consideration.] |  |  |  |  |  |  |
| S7 | Recorded books [9] |  |  |  |  | Various [10] |  |

Notes:

1. For example, digital audio in the widely adopted linear PCM format associates fidelity (resolution) with sampling frequency and bit-depth.

2. Generally speaking, this characteristic is associated with surround sound, although it may also pertain to multi-channel audio (e.g., narration available in English and French).  There is related interest in metadata that offers a map of the channels. 

3. Reduction to two-loudspeaker rendering should be feasible with appropriate software.  Normalization to stereo may be appropriate for content that has a more complex sound field originally but where this particular characteristic is not deemed necessary for retention.

4. Normal rendering means playback in mono or stereo through one or two speakers (or equivalent headphones) using software providing user control over volume, balance, fast forward, go-to-track, etc.  Normal rendering would also allow playback through software that allows sound analysis and excerpting.  Normal rendering must not be limited to specific hardware models or devices and must be feasible for current users and future users and scholars.

5. Normal functionality for custodians includes the ability to preserve digital content and provide service to users and designated communities now and for decades to come.  Thus custodians must be able to replicate the content on new media, migrate and normalize it in the face of changing technology, and disseminate it to users at a resolution consistent with network bandwidth constraints.

6. Technical protection must not prevent custodians from taking appropriate steps to preserve the digital content and make it accessible to future generations.  See Notes 4 and 5.

7. In contrast, audio files harvested from the Web through a program targeted specifically at sound capture would be considered as S1 or S2.

8. For music composed using digital composition systems, custodians and user communities will need to develop guidelines for deciding when the functionality inherent in a note-based representation is an essential characteristic and when the composition should be preserved as audio.  The files of composers at the leading edge of digital composition will often be in non-standard note-based representations and will require special consideration.

9. Recorded books in this table refers to commercially published recordings and not the specialized digital talking books for the blind and vision-impaired as specified in ANSI/NISO standard Z39.86.

10. Desired functions include bookmarking; holding last position (where play left off); display of time elapsed, time remaining, and the ability to go to a specified time; support for navigation, e.g., to chapters, sections, or illustrations; display of descriptive information, e.g., title, name of author, name of narrator, name of chapter titles; and re-read capability (repeat last sentence or paragraph).
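As a rough illustration of Note 1, the sampling frequency and bit depth that determine linear PCM fidelity can be read from the header of a WAVE file. The sketch below uses Python's standard `wave` module and assumes an uncompressed LPCM WAVE file; the function names `describe_pcm` and `make_silent_wave` are hypothetical and are not part of any Library of Congress tooling.

```python
import io
import wave


def describe_pcm(source):
    """Return the basic fidelity parameters of a linear PCM WAVE file.

    `source` may be a file path or a binary file-like object.
    """
    with wave.open(source, "rb") as wav:
        channels = wav.getnchannels()
        bit_depth = wav.getsampwidth() * 8   # sample width is reported in bytes
        sample_rate = wav.getframerate()     # sampling frequency in Hz
        frames = wav.getnframes()
    return {
        "channels": channels,
        "bit_depth": bit_depth,
        "sample_rate_hz": sample_rate,
        "duration_s": frames / sample_rate,
        # Uncompressed data rate: samples per second x bits per sample x channels.
        "data_rate_bps": sample_rate * bit_depth * channels,
    }


def make_silent_wave(seconds=1, sample_rate=44100, sample_width=2, channels=2):
    """Build a small in-memory WAVE file of silence, purely for demonstration."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00" * (seconds * sample_rate * sample_width * channels))
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    print(describe_pcm(make_silent_wave()))
```

A curator could use a report of this kind to check whether an item meets the fidelity expected for its subcategory before deciding whether downsampling for service copies is appropriate.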


Table 2: Format preferences for sound content subcategories

| ID | Description | Preferred formats [1]: encoding type | Preferred formats [1]: file type, subtype | Acceptable formats: encoding type | Acceptable formats: file type, subtype |
| --- | --- | --- | --- | --- | --- |
| S1 | Audio, surround sound [2] | 5.1 or 7.1 surround, high-quality lossy [3] | AAC_ADIF | 5.1 or 7.1 surround, high-quality lossy | AAC_MP4; QTA_AAC; WMA_WMA9_PRO |
| S2 | Audio, mono or stereo. Includes most audio content, such as: audio content via copyright; audio downloaded from the Web, acquired from collectors, producing organizations; home-produced sound, oral histories; music in note-based form (e.g., MIDI) preserved as audio | Linear PCM (i.e., uncompressed) | WAVE-LPCM-BWF; WAVE-LPCM; AIFF-LPCM | High-quality lossy [3] or low-quality lossy [3] | MP3 with ID3; AAC_ADIF; AAC_MP4; QTA_AAC; WMA_WMA9_PRO; WMA_WMA9 |
| S3 | Audio, streamed webcast, harvested in bulk | Low-quality lossy [3] | MP3; AAC_ADIF; AAC_MP4; QTA_AAC; WMA_WMA9 |  |  |
| S4 | Audio incidental to Web harvesting (e.g., background audio) | As available | Any |  |  |
| S5 | Note-based representations (e.g., MIDI) when special functionality is required; may include waveform samples | MIDI, Downloadable sounds | SMF; XMF; RMID | From tracker software | MODS |
| S6 | Synthetic encoding for non-music (e.g., sound effects, voice for telephone assistance) [For future consideration.] |  | None established |  |  |
| S7 | Recorded books | Markup language, may contain LPCM sound and other elements | DTB |  |  |
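The preferences in Table 2 amount to a lookup from content subcategory and file type to a preference status. The following is a minimal sketch of that lookup, with the format identifiers transcribed from the table; the dictionary layout, the function name `preference_status`, and the returned labels are assumptions made for illustration only.

```python
# Preferred and acceptable file types per subcategory, transcribed from Table 2.
PREFERENCES = {
    "S1": {"preferred": {"AAC_ADIF"},
           "acceptable": {"AAC_MP4", "QTA_AAC", "WMA_WMA9_PRO"}},
    "S2": {"preferred": {"WAVE-LPCM-BWF", "WAVE-LPCM", "AIFF-LPCM"},
           "acceptable": {"MP3 with ID3", "AAC_ADIF", "AAC_MP4",
                          "QTA_AAC", "WMA_WMA9_PRO", "WMA_WMA9"}},
    "S3": {"preferred": {"MP3", "AAC_ADIF", "AAC_MP4", "QTA_AAC", "WMA_WMA9"},
           "acceptable": set()},
    "S5": {"preferred": {"SMF", "XMF", "RMID"},
           "acceptable": {"MODS"}},
    "S7": {"preferred": {"DTB"},
           "acceptable": set()},
}

ANY_FORMAT = {"S4"}        # incidental Web audio: kept "as available"
NO_PREFERENCE = {"S6"}     # synthetic non-music encodings: none established


def preference_status(subcategory, file_type):
    """Classify a file type for a subcategory (hypothetical helper)."""
    if subcategory in ANY_FORMAT:
        return "preferred"
    if subcategory in NO_PREFERENCE:
        return "none established"
    entry = PREFERENCES[subcategory]   # raises KeyError for unknown subcategories
    if file_type in entry["preferred"]:
        return "preferred"
    if file_type in entry["acceptable"]:
        return "acceptable"
    return "not listed"


if __name__ == "__main__":
    print(preference_status("S2", "WAVE-LPCM-BWF"))
    print(preference_status("S2", "MP3 with ID3"))
```

Keeping the table as data rather than as branching logic makes it straightforward to revise when formats are added as preferred or acceptable, as Note 1 below anticipates.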

Notes:

1. Other device-independent digital formats for sound exist and may be added as preferred or acceptable in the future.  For example, another proposed approach to encoding audio bitstreams, sometimes referred to as Direct Stream Digital, uses pulse density modulation (PDM) instead of pulse code modulation (PCM) and is claimed to result in higher fidelity.

Some of the formats listed here include elements for technological protection. If implemented, such protections may defeat preservation. Thus the assignment of preferred or acceptable status to these formats assumes that protections have not been implemented or that the Library is in a position to overcome them.

This table excludes formats limited to particular tangible media, since they are inappropriate for a general strategy for long-term preservation of digital content.  Hence, the regular Audio CD is not listed; nor are surround-sound formats intended for home-theater use that are limited to certain tangible media (e.g., DVD-Audio and Super Audio CD).

2. Surround sound is important to retain when it is an important element of the artist's intent (e.g. sounds intended to move around in the performance space).  However, for much conventional audio, reduction to stereo is appropriate.

3. The degree of compression and the specific encoding algorithms applied to audio produce files at varying levels of quality.  For example, in a current prototyping project, the Motion Picture, Broadcasting, and Recorded Sound Division regards an MP3 file (derived from a PCM bitstream at 44.1 kHz sampling) compressed to a data rate of 128 kbps per channel as a reasonable "high quality" service version.  The fidelity offered by such a file is roughly comparable to an audio CD played on normal consumer equipment in routine circumstances.
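To put the data rate in Note 3 in context, the arithmetic below compares the lossy rate with the uncompressed rate of its PCM source. This is only a sketch: the 44.1 kHz sampling frequency comes from the note, but the 16-bit depth (the audio CD value) and the reading of 128 kbps as 128,000 bits per second are assumptions, and the function names are hypothetical.

```python
def pcm_rate_per_channel(sample_rate_hz, bit_depth):
    """Uncompressed linear PCM data rate for one channel, in bits per second."""
    return sample_rate_hz * bit_depth


def compression_ratio(sample_rate_hz, bit_depth, lossy_bps_per_channel):
    """Ratio of the uncompressed PCM rate to the lossy rate, per channel."""
    return pcm_rate_per_channel(sample_rate_hz, bit_depth) / lossy_bps_per_channel


# 44.1 kHz is taken from Note 3; 16 bits per sample is an assumed bit depth.
source_rate = pcm_rate_per_channel(44_100, 16)    # 705,600 bits per second
ratio = compression_ratio(44_100, 16, 128_000)    # roughly 5.5 to 1

if __name__ == "__main__":
    print(f"PCM source: {source_rate} bps per channel")
    print(f"Compression ratio: {ratio:.2f}:1")
```

Under these assumptions, the service version carries a little under one fifth of the data of its source, which gives a sense of the trade-off between fidelity and network bandwidth described for custodians in Table 1.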



Last Updated: 04/23/2019