Library of Congress

Digital Preservation

Archive-It Tuned Into Customers
Kristine Hanna

July 26, 2010 -- Kristine Hanna, the director of web archiving services at the Internet Archive, has no illusions about the nature of web content. "One of our favorite expressions is, 'the web is a mess,'" she laughed. "It truly is, and it's not getting any better. We have these librarians and archivists who are interested in highly curated collections of born-digital content. So, the question is, how do we get from 'the web is a mess' to highly curated collections?"

The Internet Archive addresses this problem with its web archiving service, Archive-It. Archive-It is a subscription service that allows participating organizations to designate collections of web pages for long-term digital preservation. Through a web-based application, subscribers can identify, catalog, manage and access their digital collections, which users can browse with full-text search. The Internet Archive issued its latest release of the application, Archive-It 3.5, on July 8, 2010.

The previous release of the Archive-It application was in late 2009, and Hanna said that the team aims to issue a new release roughly every six months to best respond to user needs. She explained, "We try to do usability testing throughout the year with two subsets," namely current partners and people who do not yet use the application. "We try to work with them to understand how we can improve the web application."

Hanna elaborated on the usability issues that have come up with the application, explaining, "Our user base has grown to over 130 partner institutions, and they're a very diverse group. We want to make sure that the added flexibility of the application doesn't mean it's not focused enough to be useful."

Information about the added features of the new release is available on the Archive-It Release Roadmap page. New features include quality assurance reports after each collection effort (or "crawl") that display errors and out-of-scope web addresses, more precise crawl options such as one-page captures, and user-added custom metadata fields for fine-grained collection management.

For its next release, the team will focus on metadata integration, both within the web application and with other catalog systems. Hanna explained that the current metadata functionality in the application is "not nearly as robust as we'd like it to be," but noted that an update to the 3.5 release allowing custom field entry was a step in the right direction.

For routine user testing, the Archive-It team organizes webinar walkthroughs of the application along with surveys. Hanna said that they also hope to begin a series of focus group discussions over the summer. "We have a special user interface focus group, where we sit down with librarians and archivists, both users and non-users, and we go over every little thing on the site to see if it makes sense to them and is easy to work with."

Hanna encouraged anyone interested in Archive-It services and future developments to contact the team at (external link).

Release notes are available on the Archive-It wiki. Recorded videos of online training sessions for the application are also available.