Library of Congress

Digital Preservation

MetaArchive Interview

Is creation of the MetaArchive a direct result of the NDIIPP award or was this a project that was already in the development stages?

While developing a digital archive and cross-institutional scholarly portal service had long been a goal and while research had already been conducted on the utility and feasibility of such a project, the creation of the MetaArchive was a direct result of the NDIIPP award.

What are some examples of at-risk materials that you will preserve?

The MetaArchive project participants are considering a range of at-risk materials to preserve, from Web-based exhibitions to datasets to creative works that were born digital. One born-digital resource being considered for preservation is Southern Spaces, Emory University’s peer-reviewed Internet journal and scholarly forum that provides open access to essays, gateways, events and conferences, interviews and performances, and annotated Weblinks on the real and imagined spaces of the American South. This journal publishes multimedia scholarship that takes advantage of capabilities of online publication that print journals cannot reproduce. As such, the scholarly articles in this journal are born digital and thus without analog counterparts. The fact that these at-risk digital materials also represent Southern Studies scholarship further recommends them for preservation through the MetaArchive project.

Virginia Tech’s collection of electronic theses and dissertations (ETDs) helped to surface the valuable data gathered and published by graduate students and has become a global resource for accessing primary research. Preserving this collection, currently stored in a broad range of formats, will help to ensure the continued viability of this important resource. Virginia Tech’s ETDs also help promote e-publishing by allowing students to convey their work through a range of multimedia. In recent years many theses and dissertations have been born digital; digital preservation is the only way to protect these at-risk materials.

Among the other at-risk collections initially identified for harvesting and preservation is Georgia Tech’s institutional repository, SMARTech (or Scholarly Materials and Research at Georgia Tech). Because this repository contains self-archived copies of pre-print publications, conference proceedings, research data, honors theses and a range of other unique scholarly materials in many different electronic formats, it represents at-risk digital collections of scholarship. Such collections are not simply valuable to the scholarly communities that access them; they are also records of the intellectual heritage of the institution.

Auburn University’s digital copies of its yearbook, Glomerata (1897 to the present), likewise are unique records of institutional history as well as at-risk digital materials. The digitized images of the Glomerata pages are stored on a single server at Auburn, with backup copies on multiple DVDs. Both methods have proven to be unreliable within the past year, and their deficiencies resulted in hours spent redoing scanning, cropping and layout. Keeping redundant copies in distributed caches would appear to be the most reliable preservation method currently available. Florida State University, which will be contributing a digital image collection of its campus architecture (the FSU Historic Photograph Collection, 1940-1990), likewise faces the problem of maintaining the integrity of these digital masters. The images are in a transitional location, awaiting streamlining into a stable digital preservation environment. They are at risk of degradation and loss because there are currently no systematic, periodic integrity verification processes.
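To make concrete what a systematic, periodic integrity verification process might involve, here is a minimal sketch in Python. It is an illustration only, not a description of any tool used at Auburn or Florida State: the function names (`sha256_of`, `build_manifest`, `verify`) and the choice of SHA-256 are assumptions made for the example. The idea is to record a checksum for every file once, then recompute and compare on a schedule.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root: Path) -> Dict[str, str]:
    """Record a checksum for every file under root (hypothetical helper)."""
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify(root: Path, manifest: Dict[str, str]) -> List[str]:
    """Compare current files against the manifest and list any problems."""
    problems = []
    for name, expected in manifest.items():
        p = root / name
        if not p.is_file():
            problems.append(f"missing: {name}")
        elif sha256_of(p) != expected:
            problems.append(f"changed: {name}")
    return problems
```

Run on a schedule, a check like this would flag a damaged or missing master soon after the damage occurs rather than when someone next opens the file. It only detects problems; repairing them requires a second good copy.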

Image masters from the Kentucky Quilt Project at the University of Louisville provide the foundation for online exhibitions and databases and, given the nature of their analog counterparts, constitute a more flexible medium for research and education in this subject area. While these materials were not born digital, they are highly valuable components of derivative digital works. Further, as David Seaman (director of the Digital Library Federation) has pointed out, digital masters are typically overlooked in preservation efforts, despite the fact that these materials are typically stored on CD-ROMs with little to no systematic checks of the data’s integrity over time.

How will these materials complement your analog archives?

The at-risk materials we are considering for preservation provide unique digital perspectives on Southern culture and history. Even those that were digitized from analog materials serve a distinct role in research and education by extending the reach and the interactive possibilities of the original materials. As an illustration, Web exhibitions may often be viewed as mere online reproductions of previous physical exhibits, when in fact these Web sites can function as classroom teaching tools, image databases for research and catalogs for planning archive visits. In this sense, the digital materials we will be preserving will extend the reach and the use of analog archives.

These at-risk materials also complement the content focus of each institution’s analog archives. For example, as part of its collection policy, Emory University’s Manuscript, Archives and Rare Book Library (MARBL) gathers original works on Georgia authors; African American history and culture; the histories of Emory University, Atlanta, Georgia and the South; and the history of higher education in the South. One of the at-risk collections Emory University has already targeted for digital preservation is Southern Changes, the quarterly journal of the Southern Regional Council that for the past 20 years has served as an alternative and groundbreaking news outlet for stories on social justice in the South. A number of the digital collections that Georgia Tech is considering for preservation continue its institutional focus on architecture and urban planning: The Buildings of Georgia Tech from 1888 to 1908; Photographs of the Historic American Buildings Survey Georgia; and "Splendid Growth": Architectural Drawings of the A. French Textile Building.

How do you define "Southern" and how do you define "culture" for the purposes of this project? Is geography the most important factor in this definition?

The definition of Southern culture and history used in this project is constructed with broad strokes, with an eye toward capturing not simply people, places, events and folkways commonly associated with the South but also cultures and histories elided in traditional notions of the South and the Southern way of life. The Content Committee responsible for this definition owes a debt of gratitude to the editors of the Encyclopedia of Southern Culture, on whose introduction we relied heavily.

A discussion of Southern culture and history must always begin with clarification of the terms. "Southern" is a term that, to most, brings to mind a particular region. However, upon closer inspection, the South and its boundaries are not so easily mapped. One could begin and end with the 11 former Confederate states, though that excludes the four other slave states that remained part of the Union. One could consider the "census South": the Confederacy with the addition of Delaware, Maryland, West Virginia, Oklahoma and the District of Columbia. There is also the Gallup organization’s South that includes the Confederate 11 plus Oklahoma and Kentucky, and the National Endowment for the Humanities includes Puerto Rico and the Virgin Islands in its South Atlantic Humanities Center. We understand the South as not simply a monolithic region but instead composed of many diverse sub-regions, such as the Black Belt, the Appalachian Mountains, the Piedmont and the Gulf Coast — a multitude of distinct Souths that resist easy encapsulation. Materials that shed light on these regions or that help to constitute the cultures and histories of those regions are of great interest to this project.

The South is also an identity. Southerners who move outside of the region, however defined, retain much of their culture and infuse their new locales with vestiges of their former homes. Conversely, people born outside of the South who come to live within the region find that their work and lives are influenced by their adopted home and themselves become a part of the evolving South.

As the Encyclopedia’s editors and authors did, we will rely on a cultural definition of the South more inclusive than not, focusing largely on the former states of the Confederacy but without excluding the margins of the region where different cultures of the South are evident. After careful contemplation of the meaning of "culture," the editors of the Encyclopedia planned their work "to carry out [T.S.] Eliot’s belief that ‘culture is not merely the sum of several activities, but a way of life.’" This project will preserve materials documenting change over time in all aspects of the Southern "way of life" and encompassing the multitudinous and co-existing cultures and histories in the South — all of which are valuable contributions to our collective sense of what it means to be Southern and to belong to the South.

What methodologies are you using to determine which materials to preserve?

We preserve items that meet criteria for both at-risk materials and Southern cultural heritage materials; we prioritize preservation efforts based on (1) the fitness of the materials to our collection focus and (2) the urgency of preserving the at-risk materials.

At-risk materials are understood as those materials that, according to the NDIIPP parameters, are Web-based or born-digital materials without analog counterparts. Southern cultural heritage materials represent or capture the distinctly Southern way of life. This includes materials that originated from or describe those areas geographically identified as part of the South; however, this also includes materials describing or originating from areas not typically associated with the South but nevertheless capturing or referencing Southern-ness, broadly construed.

As a way of helping partner institutions identify and prioritize objects and collections relating to Southern culture, we asked them to consider the range of themes outlined in the Encyclopedia of Southern Culture’s Table of Contents:

  • Agriculture, Business, and Industry
  • Art and Architecture
  • Education
  • Environment
  • Ethnicity
  • Folk Art
  • Folklife
  • Foodways
  • Gender
  • Geography
  • History, Manners, and Myth
  • Language
  • Law and Politics
  • Literature
  • Media
  • Music
  • Race
  • Recreation
  • Religion
  • Science and Medicine
  • Social Class
  • Transportation
  • Urbanization
  • Violence

Partner institutions first identify materials relevant to the project’s focus on the cultural heritage of the American South. Then the institutions consider whether these materials meet the definition for at-risk materials and make archiving decisions based on other aspects of these materials (e.g., their storage medium, copyright restrictions, online availability and so forth). For example, objects and collections only available offline are considered at high risk and would therefore receive highest priority for preservation. In deciding what to preserve and the priority for preservation, MetaArchive partner institutions also consider materials’ and collections’ potential long-term value for scholarship. For example, such primary resources as creative works, datasets and video or audio recordings may provide the source material for new scholarship and thus are important to preserve because of their long-term value. As they make these decisions, partner institutions also attend to other factors that work against the likelihood that certain materials will be preserved. Such factors include the absence of clear ownership or support; a broad range of digital formats comprising the digital object or collection as a whole; content-rich Web resources with dynamic components; and materials based on older or outmoded technology.

Will the materials draw only from Emory and its partners or will you be seeking materials from throughout the South, as you define it?

Expediency dictates that most of the materials we preserve will come from our own and our partner institutions’ collections. However, we have not limited our efforts to preserving only those materials, as the larger goal of collaboratively preserving the digital cultural heritage of the American South necessitates looking beyond our own materials to locate objects and collections that capture the Southern way of life and that are in danger of being lost to future scholars.

For example, the Emory University MetaScholar Initiative recently undertook the preservation of digital-audiotape (DAT) recordings of original interviews used in the radio program American Routes. These original and unduplicated recordings constitute both at-risk materials and Southern cultural materials. They capture the character and voices of musicians and music of the coastal South. The DATs of these interviews, which were rescued from a building in New Orleans’ French Quarter in November 2005, are "at risk" not only because they have no duplicates but also because digital audiotape is a discontinued medium for which players are no longer being manufactured. Poor storage conditions following Hurricane Katrina threatened to destroy these tapes and, with them, voices and sounds that have helped distinguish Gulf Coast cultures and Southern music. Over the next year Emory’s MetaScholar Initiative plans to digitize more than 250 hours of these DATs.

You will be using a "distributed preservation network infrastructure based on the LOCKSS software." How did you determine to use this approach?

The ease and minimal cost of using this system, combined with its ability to ensure the integrity of copies, recommended it to our project. LOCKSS software allows for a peer-to-peer decentralized approach to protecting, sharing and providing persistent access to digital resources. In this context, that means there is no single central location in which the materials are stored. Instead, in the LOCKSS model, one literally makes "Lots of Copies" to "Keep Stuff Safe," and those copies are kept at the various nodes of the partner institutions. So, instead of having things locked down at a storage facility, we have things LOCKSSed at six institutions: Emory University, Auburn University, Georgia Tech, Virginia Tech, the University of Louisville and Florida State University.

Each institution replicates the collections of all the others. If a disaster struck, it would have to encompass all of those schools and locations in order to cause a catastrophic loss of our digital heritage. By creating multiple copies of the same content, continually comparing these copies with each other to repair incongruities and storing these copies in several different locations rather than one central location, we better ensure the long-term preservation and accessibility of digital content.
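The compare-and-repair idea can be sketched in a few lines of Python. This is a deliberately simplified illustration and not the LOCKSS software itself, which has its own polling protocol that the sketch does not reproduce. The function name `audit_and_repair`, the in-memory dictionary standing in for the six sites, and the simple majority rule are all assumptions made for the example.

```python
import hashlib
from collections import Counter
from typing import Dict, List


def digest(content: bytes) -> str:
    """Return the SHA-256 hex digest of one copy of an object."""
    return hashlib.sha256(content).hexdigest()


def audit_and_repair(copies: Dict[str, bytes]) -> List[str]:
    """Compare every site's copy and overwrite minority copies.

    `copies` maps a site name to that site's copy of the same object.
    Returns the names of the sites that were repaired. Raises ValueError
    when no version is held by a strict majority, since the sketch then
    has no basis for choosing a "good" copy.
    """
    votes = Counter(digest(c) for c in copies.values())
    winner, count = votes.most_common(1)[0]
    if count <= len(copies) / 2:
        raise ValueError("no majority version; manual review needed")
    good = next(c for c in copies.values() if digest(c) == winner)
    repaired = [site for site, c in copies.items() if digest(c) != winner]
    for site in repaired:
        copies[site] = good  # replace the damaged copy with the majority one
    return repaired


# Hypothetical example: six sites hold a copy, and one has been corrupted.
sites = ["Emory", "Auburn", "Georgia Tech", "Virginia Tech",
         "Louisville", "Florida State"]
copies = {site: b"page scan" for site in sites}
copies["Auburn"] = b"page sc\x00n"
print(audit_and_repair(copies))  # ['Auburn']
```

The sketch shows why more copies help: a single damaged copy is outvoted by the five intact ones and restored from them, whereas with only two copies a disagreement cannot be resolved automatically.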

In the course of this project, we will expand the size of digital objects that the LOCKSS system has preserved to date, essentially pushing the frontiers of LOCKSS itself. For instance, Emory University’s Internet-only journal Southern Spaces, currently 160 billion bytes in size, exemplifies the types of born-digital materials that digital preservation efforts must learn to accommodate. In many ways this project will not simply maintain the future accessibility of digital cultural heritage materials on the American South but also help to enhance technology available for preserving the growing array of digital materials.

Do you have any estimates as to the size of the archive you will create during the three years of this project?

We have three terabytes of space allocated to the project, and expect to fill that within the project period.

Can you mention a few of the strengths that each of the partners brings to this project in terms of both content that will be contributed to the MetaArchive as well as technical competence?

MetaArchive project partners bring to this project a combination of content and technical strengths: some contribute more to the content of the archive, some more to its technical development, and others equally in both areas. The following descriptions of the participating institutions will focus primarily on their strengths and will highlight some individuals whose experience is particularly noteworthy, given the challenges of this digital preservation project.

The digital image collection that Auburn University is contributing to this project, the Alabama Cooperative Extension Service (ACES) Photographs, 1920s-1960s, depicts the faces and places of rural Alabama life in the last century. The college yearbook, Glomerata, captures the university’s own history and that of the surrounding community from the end of the 19th century to the present day. Director of Library Technology Aaron Trehub, who is helping develop the MetaArchive conspectus, has a distinguished background in organizational models for interinstitutional collaboration; this experience will contribute considerably to the functioning of this project.

Over the past five years Emory University’s MetaScholar Initiative has focused on supporting a range of scholarly work, from building collaborative partnerships between libraries and museums to developing an online journal for publishing multimedia Southern Studies scholarship – all part of an ultimate goal of realizing the possibilities for research and scholarship in the digital age. Its digital library projects regularly undertake the preservation of Southern cultural memory as part of their objective. Martin Halbert, lead PI on the MetaArchive project, has been executive director of the MetaScholar Initiative since its inception.

As one of the southeastern United States’ pioneers in the construction of digital institutional repositories, Florida State University offers such unique digital documentation as Honors in the Major Theses from 2004 and historic photographs of the university spanning the years 1940-1990. Florida State will also preserve its collection of Digitized Juvenile Literature, American children’s books published in the mid-19th to early-20th centuries. Florida State’s University Librarian Robert McDonald, who is serving as a MetaArchive Steering Committee member, has been an active participant in Internet II efforts to increase the capacity of Internet infrastructure. He brings to the MetaArchive project a strong background in networking and will advise on the structural capacity of each site, ways of improving network bandwidth, and replication strategies within this shared infrastructure.

A number of digital image collections at Georgia Tech contribute to its institutional history and by extension to the histories of Atlanta, Georgia and the South: The Buildings of Georgia Tech from 1888 to 1908; Photographs of the Historic American Buildings Survey Georgia; Georgia Tech Photograph Collection; Georgia Tech Publications; and Georgia Tech Advertisements. Additionally, the university brings its experience with institutional repositories to bear on issues of digital preservation. As a technologist and as an archivist, Tyler Walters, currently associate director for Technology & Resource Services for Georgia Tech’s libraries and serving on the MetaArchive Content and Preservation Committees, bridges both sides of the digital preservation field.

The University of Louisville’s Jean Thomas collection, Kentucky Quilt Project and Bernheim Foundation oral history interviews enrich the cultural content of the MetaArchive through their documentation of Appalachian and Southern folklife. Delinda Buie, curator of Rare Books at the University of Louisville and serving on MetaArchive’s Content Committee, will bring her understanding of archival community concerns to bear on issues of digital preservation. Dwayne Butler, who is well known for his background in copyright law, makes an invaluable contribution to this project by offering his legal expertise combined with his experience as a librarian and currently as the Evelyn J. Schneider Endowed Chair for Scholarly Communication at the University of Louisville Libraries.

Virginia Tech has long been a leader in the development of digital libraries, particularly the collection of electronic theses and dissertations (ETDs). Gail McMillan, director of the university’s Digital Library and Archives, offers the MetaArchive her unique perspective as a global leader in the development and storage of ETDs. This background promises to contribute considerably to the conceptual process of adapting LOCKSS in new ways to digital archives. In addition to this technical contribution, Virginia Tech’s broad array of digital collections — from recipe books and church minutes to online exhibitions and Web sites on Virginia and Southern history — will enrich the MetaArchive’s body of Southern cultural heritage materials.