Lead Partner: Old Dominion University, Department of Computer Science
Additional Partner: Los Alamos National Laboratory, Digital Library Research and Prototyping Team
This project seeks to integrate preservation capabilities into standard Web practices. The project assumes that the core technologies for creating a “preservation-ready” web are in place; what is needed is a concerted, high-profile effort to instantiate the technologies in simple protocols, methodologies and software.
One of the project's main current efforts is Memento, which aims to make accessing the Web of the past as straightforward as accessing the current Web. If you know the URI of a Web resource, the technical framework proposed by Memento lets you see a version of that resource as it existed at some date in the past: you enter the URI in your browser as you always do and specify the desired date in a browser plug-in.
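As a rough illustration of asking for a resource "as of" a date, the sketch below builds the request headers a client might send. It assumes the `Accept-Datetime` header name and HTTP-date format used by the later Memento specification (RFC 7089); the text above does not name a header, and the function name `datetime_headers` is hypothetical.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def datetime_headers(when: datetime) -> dict:
    """Return request headers asking for a version close to `when`.

    Assumption: the server understands the Accept-Datetime header
    as defined in RFC 7089; this is not stated in the text above.
    """
    # HTTP dates are always expressed in GMT, so normalise to UTC first.
    when_utc = when.astimezone(timezone.utc)
    return {"Accept-Datetime": format_datetime(when_utc, usegmt=True)}


# Example: ask for the resource as it existed on 31 May 2007, 20:35 UTC.
headers = datetime_headers(datetime(2007, 5, 31, 20, 35, tzinfo=timezone.utc))
print(headers["Accept-Datetime"])  # Thu, 31 May 2007 20:35:00 GMT
```

A client would send these headers along with an ordinary GET for the resource's URI. A server that does not recognise the header would simply return the current version.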
Objectives
- Support enhancements to a technical framework aimed at better integrating the current and the past Web. The framework adds a time dimension to the HTTP protocol and, inspired by RFC 2295, introduces the notion of transparent content negotiation in the datetime dimension.
- Promote the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) as a mechanism for discovering resources in the deep web.
- Continue to explore the benefits of using the mod_oai Apache module both to support efficient discovery of new and updated resources during normal web crawling and to prepare "preservation-ready" resources for harvest.
- Apply the OAI-PMH to a variety of new applications, including the creation of an Apache module that allows for the harvesting of events directly from the Apache logs themselves.
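To make the OAI-PMH objectives above more concrete, here is a minimal sketch of how a harvester might build an incremental `ListRecords` request. The verb and the `metadataPrefix` and `from` arguments come from the OAI-PMH specification. The base URL and the function name `list_records_url` are hypothetical; a real mod_oai installation would publish its own base URL and supported metadata formats.

```python
from typing import Optional
from urllib.parse import urlencode


def list_records_url(base_url: str,
                     metadata_prefix: str = "oai_dc",
                     from_date: Optional[str] = None) -> str:
    """Build an OAI-PMH ListRecords request URL.

    `from_date` (e.g. "2008-01-01") limits the response to records
    changed on or after that date, which is what makes incremental
    discovery of updates cheaper than re-crawling a whole site.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date is not None:
        params["from"] = from_date
    return base_url + "?" + urlencode(params)


# Hypothetical base URL, used only for illustration.
url = list_records_url("http://example.org/modoai", from_date="2008-01-01")
print(url)
```

The same pattern extends to other OAI-PMH verbs such as `Identify` or `ListIdentifiers` by changing the `verb` parameter.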
More detailed project information can be found at the project Web site (external link).
Highlights
- The release of MementoFox, a plugin for the Firefox browser (external link)
- Webcast: Memento: Time Travel for the Web
- Paper: Memento: Time Travel for the Web (2009) (external link)
- Webcast: Thinking Differently about Web Page Preservation
- Dissertation: Integrating Preservation Functions Into the Web Server (PDF, 3.83MB) (2008) (external link)
- Presentation: Tools for a Preservation-Ready Web (PPT, 1.43MB) (2008)
- Paper: A Quantitative Evaluation of Dissemination-Time Preservation Metadata (Proceedings of ECDL 2008) (PDF, 322KB) (external link)
- Paper: Site Design Impact on Robots: An Examination of Search Engine Crawler Behavior at Deep and Wide Websites (D-Lib Magazine, March/April 2008) (external link)
- Paper: Integrating Preservation Functions Into the Web Server (PDF, 4.4MB) (2008) (external link)
- Paper: Lazy preservation: Reconstructing Websites by Crawling the Crawlers (Proceedings of the Eighth ACM International Workshop on Web Information and Data Management, November 2006) (external link)
- Michael Nelson of the TPRW project is a digital preservation pioneer