June 2007: Digital Preservation Partners Engage in Some Public Relations
The partners in the Library's National Digital Information Infrastructure and Preservation Program (NDIIPP) have been coming to Washington to meet since January 2005, when they held the first of their semiannual get-togethers. They came again this year, albeit in June, and this time they not only held their meeting but also engaged in some public relations with attendees at the Annual Meeting of the American Library Association, who were also in Washington.
On June 25, following a presentation on the Web Capture project, Laura Campbell, associate librarian for Strategic Initiatives, introduced "the interesting projects" and their representatives. (Webcasts of the presentations can be viewed here.) "We are delighted to have these partners as trusted agents in the NDIIPP program," she said.
Tracy Seneca from the California Digital Library spoke about "the Web at risk" and what CDL is doing about the problem. CDL is in the process of developing "tools to allow librarians to capture and curate Web publications," said Seneca, who added that 30 curatorial partners are working with CDL to build its Web archive.
Seneca noted that among the lessons learned on this project is that "curators don't always want [to capture] the entire Web site…. They have to be able to selectively capture Web information." Much of the work of CDL and its partners is devoted to capturing government Web sites, which, Seneca noted, "can be extremely vulnerable to political change . . . and small agencies have limited infrastructure for preservation."
As for Web crawlers in general, "they cast a wide net. You don't know what you will get." However, if the sites are not captured, "you may not know what was at risk until it's gone."
Martin Halbert of Emory University, who is leading the MetaArchive of Southern culture project, focused on the content the project is gathering. He spoke of four steps to selecting and collecting content: scoping, describing, inventorying and harvesting. Like NDIIPP itself, Halbert's project has formed a collaborative network. "The first question to ask yourself in forming a digital preservation network is, 'Is there a subject area that the members share in common?'" In order to determine the scope of what was to be collected, team members had to determine "What is Southern?" The team decided to use "as inclusive a definition as possible."
Next on the agenda was Steve Morris of North Carolina State University, which is leading a project to collect and preserve digital geospatial data resources, including digitized maps, from state and local government agencies in North Carolina. Although this project focuses solely on North Carolina, it serves as a demonstration project for other states.
Micah Altman of Harvard University represented the Data-PASS (Data Preservation Alliance for the Social Sciences) project. According to Altman, "The social sciences have the best questions because they are all about people: family, culture, folklore, economics, labor, attitudes, stereotypes, politics, justice. … But we don't have very reliable answers, and the answers change, which makes preserving this data even more important."
Julie Sweetkind-Singer of Stanford University spoke for the National Geospatial Digital Archive project, which is led by the University of California at Santa Barbara. She noted that "nearly all map creation is digital. … and thus geospatial data is often at risk of loss."
"There are no cheap, easy solutions" for saving this material.
Preserving records from the so-called dot-com era is the project of David Kirsch of the University of Maryland's Robert H. Smith School of Business, who was the final presenter. "What can you do?" he asked the audience: "Encourage others to do the same."
The following morning, Kirsch and the University of Maryland generously hosted the meeting of partners in the NDIIPP network. He introduced the business school's senior associate dean, Arjang A. Assad, who he said "has recognized the value of the history of business."
Assad noted that NDIIPP is "making the digital age concrete" by collecting and preserving at-risk content.
Martha Anderson, acting NDIIPP director, said she welcomed the opportunity to listen to the partners discuss their current work as well as the "face-to-face time together in between sessions."
Campbell thanked all in attendance for their support. "Thanks to you, we are on Congress' radar screen." She was referring to the fact that the House and Senate versions of the legislative branch appropriations bills both contain line items for dedicated NDIIPP funding. (At this time, the final bill had not been passed.)
"The good work that you have been doing has not been lost on any of our congressional committee members," who have oversight of the Library of Congress.
William LeFurgy, an NDIIPP program manager, introduced Doug Robinson, executive director of the National Association of State Chief Information Officers (NASCIO). LeFurgy has been working with representatives from the states, which also face various digital preservation issues, such as the retention of court, property, tax and other records. Representatives from all 50 states came to Washington for a series of workshops in which they could discuss their common needs and concerns. Get the complete report here. (PDF, 877KB)
According to Robinson, "We have a digital preservation gap. It is clearly not in the top 10 in terms of" issues that state CIOs are engaged in. One of the reasons may be that the median term in office of state CIOs is a mere 23 months – not enough time to develop policies for long-term preservation of digital state records.
Although digital preservation is far from the top of the list for most state spending budgets, Robinson is optimistic: "It is moving up the list of priorities," he said. A goal for NASCIO is to "create a common language for federal, state and local governments to communicate" about which records need to be saved and how to save them.
Robinson believes that the "hair on fire model" can be very effective in motivating state legislators to channel more resources to digital preservation. Disaster recovery efforts following Hurricane Katrina have forced states to consider the issues more strategically. "Fear will bring this around," he said. "It is a wonderful motivator."
Following a networking break, Mike Wash, chief technology officer for the Government Printing Office (GPO), addressed the group. Wash joined GPO in 2004 to build a technology management program at the agency and develop the Future Digital System (FDsys). FDsys is a digital content system that will allow federal content creators to easily create and submit content that can then be preserved, authenticated, managed and delivered upon request. FDsys will form the core of GPO's future operations; it is expected to be launched in early 2008.
GPO's biggest challenge, said Wash, is that "the public expects that access to all government information be electronic." GPO is charged with managing the publications of the federal government.
Ken Thibodeau, director of the Electronic Records Archives system for the National Archives and Records Administration, faces similar issues with the content his agency manages: the records of the federal government. When ERA launches in 2010, users will expect access to the government's electronic records. Thibodeau said ERA will require 250 petabytes of storage space. "There isn't a system anywhere that [currently] has that capacity." In 2005, the Archives awarded Lockheed Martin a $308 million contract to construct such a system.
On the morning of June 27, John Spencer, president of BMS/Chace LLC, spoke of preservation issues inherent in digital recordings. BMS/Chace is one of NDIIPP's newest partners in the Preserving Creative America project.
"Current recording techniques have made the management of our assets very complex," he said. "None of the major record labels uses existing standards" for preservation. "There needs to be control of these assets from creation through preservation."
Following Spencer's presentation, Anderson remarked that "you gave us some insight into our dream of born-archival digital objects."
The remainder of the meeting consisted of breakout sessions and a promise from Anderson that the Library and its partners would "continue to work to share common goals and a vision for the future."