October 25, 2010 -- Digital storage users and providers from a variety of cultural and research institutions met for two days of presentations and discussions as part of a continuing initiative to find and develop common understandings and possible solutions to common problem sets.
The Library of Congress National Digital Information Infrastructure and Preservation Program sponsored the Designing Storage Architectures for Digital Preservation meeting, held on September 27-28, 2010, in Washington, DC. The conference brought together technical industry experts, vendors, IT professionals, owners and managers of digital collections, government data specialists and other practitioners of digital preservation. The presentations and a short summary are online.
Each presentation was limited to five minutes so that more time could be spent in exchanges among users, providers and researchers. The subsequent discussions were lively and direct, and included topics such as data integrity, the increasing use of the concept of "resilience," access, risk and loss, scaling storage for small and large collections and the reliability and stability of various media.
The meeting began with a comparison of the expectations and storage trends in cultural institutions (e.g., libraries, archives and research entities) and the high performance computing world. This provided a context for continuing conversations to try to build bridges among communities of storage users.
Users focused on current and future storage requirements, and on what they have learned from multiple iterations and growth in their storage systems. Two presentations, one from the Academy of Motion Picture Arts and Sciences and one from Major League Baseball, described new system architectures that are being used to provide both long-term storage and access.
Vendors spoke about their technological advances in storage media and their forecasts for developments in the near future. Occasionally there was a clear reminder from both vendors and users that storage companies are commercial businesses whose advancements in technology are based on financial incentives. One industry representative suggested that there are two requirements at play in the room: 1) preservation and 2) profit from technological advancements. He said, "Votes are always made with budgets for advancements in technology." Another participant pointed out that consumers drive the storage market, not enterprise-level requirements.
Participants discussed the non-technical aspects of preservation, such as how much loss is acceptable, whether the concept of "permanence" is useful, and what the incentives are for creating and improving technical solutions.
One part of the agenda focused on new vendor technologies in data de-duplication and compression. One participant challenged the value of these technologies in this community, and argued that compressed data is at greater risk of loss. This was typical of exchanges that led to meaningful dialogues between stakeholders.
A few of the participants acknowledged that there is reluctance on the part of both users and vendors to report data loss and storage failures. Several participants described past and current efforts to do deliberate testing and to create incentives to share reporting on failures.
One researcher described an ongoing study of system usage patterns, using logs. He requested that participants share any data and insights on access, with the hope of being able to remove the "probably" issue of building solutions.
Martha Anderson, director of NDIIPP Program Management, closed the meeting by describing the importance of being able to change assumptions over time, as we all learn more about how the users of the present and future will want to access and manage digital content.