June 10, 2010 -- The National Digital Information Infrastructure and Preservation Program is a long name. All those nouns are packed together to convey the program’s broad engagement in promoting enduring access to significant digital content. The term "infrastructure" appears in the middle, which is appropriate given the central role that tools, services and other underlying systems play in digital preservation.
Leslie Johnston recently joined NDIIPP as manager of technical architecture initiatives and has a fresh perspective on the program’s infrastructure work. The first thing she mentions is working with others. "Collaboration is a key concept for NDIIPP," she said. "The Library knew from the start that it had to work with a network of partners to make progress with a scalable digital preservation architecture and it is great to see how many institutions are active in our partnership network."
In terms of work over the next few years, Johnston sees a focus on tools and technology needed to enable digital preservation across a variety of communities through a distributed infrastructure. Johnston explained, "Every institution that collects content is looking at building capacity in terms of software, storage and other technical components—these are issues we all have in common."
Johnston has been with NDIIPP since March of 2010, but has focused on managing and preserving digital information for years—"Since the beginning," she laughed. "When we started out, no one knew what digital preservation was, or what it might be, or what we even meant by preservation."
Changes in user expectations and a migration to digital formats within scholarly communication have both been among the driving forces behind NDIIPP. Johnston described the mounting pressure on collecting institutions to acquire and make available digital content—as well as to store that data long-term. "Even ten years ago, no one thought they’d need the skills and resources that they do now," remarked Johnston. "Supporting digital information is a new operational mandate for libraries, and it’s almost completely unfunded. So people are looking for opportunities to collaborate, and that’s the role NDIIPP has been playing."
The NDIIPP website currently hosts a growing directory of tools and resources related to digital preservation, an effort that Johnston credits as a major success of the project thus far. Describing the project’s goals more broadly, Johnston explained, "I think we can make digital preservation more approachable and less frightening to people. A lot of institutions are so daunted by the prospect of saving their digital data that they’re frozen, and aren’t preserving anything as a result."
Among the many complicating factors in digital preservation is the software used to create and store files; as recently highlighted by the Planets Project, file formats eventually become obsolete, which can render files inaccessible after a relatively short time. NDIIPP’s response to this threat is to support the development of open-source software for meeting long-term preservation needs. Johnston remarked that, compared with proprietary software, open-source software "is easier to preserve and customize for individual purposes." She continued, "Open-source software can be developed more collaboratively between different groups to address a shared need."
The spirit of collaborative development is already evident in other NDIIPP projects, such as the partnership with the California Digital Library that led to the creation of BagIt.
BagIt is a packaging specification for large file transfers between institutions, or even within a single institution. The Library and the California Digital Library developed it jointly in response to difficulties both groups had with large digital file transfers, initially of web capture files. Since the software was released on SourceForge, Johnston commented, "I keep hearing from people who are finding new ways to use it."
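For readers curious about what such a package looks like, the following is a minimal sketch in Python, not the Library's own software. It assumes the layout described in the published BagIt specification: a `data/` directory holding the payload, a `bagit.txt` declaration, and a manifest listing a checksum for each payload file. The function names (`make_bag`, `verify_bag`), the choice of SHA-256, and the version string written to `bagit.txt` are illustrative assumptions.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_bag(source_dir: Path, bag_dir: Path) -> None:
    """Copy source_dir into bag_dir/data and write the two required tag files.

    Hypothetical helper: bag_dir must not exist yet.
    """
    data_dir = bag_dir / "data"
    shutil.copytree(source_dir, data_dir)  # also creates bag_dir

    # Bag declaration (version string is an assumption for this sketch).
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )

    # Payload manifest: one "<checksum>  <path relative to the bag>" per file.
    lines = []
    for path in sorted(p for p in data_dir.rglob("*") if p.is_file()):
        relative = path.relative_to(bag_dir).as_posix()
        lines.append(f"{sha256_of(path)}  {relative}\n")
    (bag_dir / "manifest-sha256.txt").write_text("".join(lines), encoding="utf-8")


def verify_bag(bag_dir: Path) -> list[str]:
    """Return a list of problems found; an empty list means every file matches."""
    problems = []
    manifest = (bag_dir / "manifest-sha256.txt").read_text(encoding="utf-8")
    for line in manifest.splitlines():
        expected, relative = line.split(maxsplit=1)
        target = bag_dir / relative
        if not target.is_file():
            problems.append(f"missing: {relative}")
        elif sha256_of(target) != expected:
            problems.append(f"checksum mismatch: {relative}")
    return problems
```

The sending institution would call `make_bag` before a transfer and the receiving institution would call `verify_bag` afterward; any file that was dropped or altered in transit shows up in the returned list. A production tool would also check for payload files that are missing from the manifest, which this sketch omits.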
The DuraCloud project likewise promises to find a wide variety of applications across different fields of study. The idea behind the project is to use offsite processing and data storage (in short, cloud computing technology) to help institutions manage and preserve their expanding digital collections. Participants are drawn not only to the storage that DuraCloud offers, but also to its file conversion services.
One of the initial three participants in the project is the New York Public Library, whose extensive image collections (and related digitization efforts) demand additional flexible computing power, software services and storage space. As for the appeal to preservationists, Johnston explained, "One of the best ways to mitigate your preservation risks is to store your data in more than one location."
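The principle of keeping copies in more than one location can be illustrated with a short Python sketch. It is a simplified illustration under stated assumptions, not a description of how DuraCloud works: the "locations" here are ordinary directories standing in for separate storage sites, and the function names (`replicate`, `audit`) are hypothetical.

```python
import hashlib
import shutil
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def replicate(source: Path, locations: list[Path]) -> list[Path]:
    """Copy source into each location and confirm each copy matches the original."""
    expected = file_digest(source)
    copies = []
    for location in locations:
        location.mkdir(parents=True, exist_ok=True)
        copy = location / source.name
        shutil.copy2(source, copy)
        if file_digest(copy) != expected:
            raise IOError(f"copy at {copy} does not match the original")
        copies.append(copy)
    return copies


def audit(copies: list[Path], expected_digest: str) -> list[Path]:
    """Return the copies that are missing or no longer match the recorded digest."""
    damaged = []
    for copy in copies:
        if not copy.is_file() or file_digest(copy) != expected_digest:
            damaged.append(copy)
    return damaged
```

Recording the digest at the time of replication and running `audit` periodically lets an institution notice when one copy has been lost or corrupted and restore it from another location that still matches.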
In addition to building upon the directory of preservation software tools, Johnston emphasized that NDIIPP will continue to act as a catalyst over the coming years, inspiring more collaboration among its partner projects nationwide. Regarding the rapid pace of technological change, Johnston admitted, "It’s hard to imagine where the project will be in five years. No one can imagine what technologies we’re going to be dealing with at that point. But our core activities won’t change. We’ll still be working to develop best practices for preservation. We’ll still be needed."