On November 4, 2008, election night, almost every television news station and Web news site displayed an interactive map of the United States with real-time voting results.
The maps shared a common feature: geospatial technology analyzing a range of geographic-based coordinates and data. The technology has evolved over the last three decades from specialized academic obscurity to the ultimate form of public acceptance: it is now just another information-age appliance running discreetly in the background and taken for granted in a Google world. Larry Carver is among the visionaries who enabled this transformation.
Carver began his career at the library at the University of California at Santa Barbara, where he helped build an impressive collection of maps, aerial photography and satellite imagery. His efforts resulted in the development of the Map and Imagery Laboratory (MIL) in 1972.
As the MIL collections grew, Carver felt that geospatial data presented a unique challenge to the library. He believed that coordinate-based collections should be managed differently than book-based collections. But not everyone agreed with him. "It became apparent that handling traditional geospatial content in a typical library context was just not satisfactory and another means to control that data was important," he said. "It wasn't as easy as it sounds. I was in a very conservative environment and they were not easily convinced that this was something a library should do."
Carver and others spent years developing an exhaustive set of requirements for building a geospatial information management system. The system had a number of innovative ideas. "We included traditional methods of handling metadata but also wanted to search by location on the Earth's surface," Carver said. "The idea was that if you point to a place on the Earth you could ask the question, 'What information do you have about that space?' as opposed to a traditional way of having to know ahead of time who wrote about it."
An opportunity to develop that system arrived in 1994 when the National Science Foundation funded Carver and his team to build the Alexandria Digital Library (ADL). "We produced the first operational digital library that was based on our research," Carver said. "Our concentration was to be able to develop a system that could search millions of records with latitude and longitude coordinates and present those results via the Internet."
In the process of system development, Carver accumulated a great deal of national spatial content, much of which came from government agencies. NASA donated its Landsat satellite data collection. Other content came from the U.S. Geological Survey, the Department of Agriculture, the National Oceanic and Atmospheric Administration and private corporations. Some of the information was on film. "A lot of original film was supposed to go to the EROS Data Center, and some of it did, but a lot of it was destroyed," said Carver.
The ADL also digitized Digital Orthophoto Quadrangles from the USGS. "The most recent acquisition has been the Citipix collection, which is a collection of images shot in 2001 and 2002: 67 major cities at a 6-inch resolution," Carver said. "It is basically a historical snapshot of the entire urban United States." In all, the UCSB libraries have between 6 and 7 million images and over 500,000 maps in their collections.
The geospatial collections are not limited to images. "The ADL engine is agnostic when it comes to data with geospatial coordinates," said Carver. "It doesn't have to be a map or a photograph. ADL doesn't care what the object is as long as it has a geographical footprint of some kind." This enables searches for any data that includes a coordinate system, such as demographic data, census and socio-economic data, real estate and even biological systems. The possibilities are almost limitless; any information that can be related to a spot on the Earth can be searched through the use of an online map client such as Globetrotter.
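The footprint idea Carver describes can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not ADL's actual schema or search engine: the `Footprint` and `Item` classes, the `search` function and the sample catalog entries are all hypothetical. Each item, whatever its type, carries a bounding box in decimal degrees, and a query returns every item whose box overlaps the area the user points to.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Footprint:
    """Bounding box in decimal degrees (hypothetical, not ADL's schema)."""
    west: float
    south: float
    east: float
    north: float

    def intersects(self, other: "Footprint") -> bool:
        # Two boxes overlap when they overlap on both axes.
        # This simple test ignores boxes that cross the 180° meridian.
        return (self.west <= other.east and other.west <= self.east
                and self.south <= other.north and other.south <= self.north)


@dataclass(frozen=True)
class Item:
    title: str
    kind: str            # map, image, table... the search never inspects it
    footprint: Footprint


def search(items, query):
    """Return every item whose footprint overlaps the query area."""
    return [item for item in items if item.footprint.intersects(query)]


# Made-up catalog entries with approximate coordinates.
catalog = [
    Item("Santa Barbara aerial photo", "image", Footprint(-119.9, 34.3, -119.6, 34.5)),
    Item("California census tracts", "table", Footprint(-124.5, 32.5, -114.1, 42.0)),
    Item("Tennessee land-use map", "map", Footprint(-90.3, 35.0, -81.6, 36.7)),
]

# "What information do you have about that space?"
query = Footprint(-120.0, 34.0, -119.5, 34.6)
results = search(catalog, query)
for item in results:
    print(item.kind, item.title)
```

The query box around Santa Barbara matches both the aerial photo and the statewide census table, because the search looks only at the footprint and never at the kind of object. A system searching millions of records would presumably rely on a spatial index rather than this linear scan.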
In 2004, UCSB became an NDIIPP partner along with Stanford University in the National Geospatial Digital Archive Project and later, in a second phase, with the University of Tennessee at Knoxville and Vanderbilt University. This latter work related to the development of the Federated Archive Cyberinfrastructure Testbed (FACIT).
"FACIT creates a whole series of distributed nodes that know about each other," said Carver. "And as information is added in one node, any one archive can decide how many replicas it wants. That has two advantages. It provides an almost fool-proof backup system for your own archive and it also enables faster searching through distribution of data copies."
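The replication scheme Carver outlines can be sketched in a few lines. This is a toy model under stated assumptions, not FACIT's actual design or protocol: the `ArchiveNode` class, the `federate` and `holders` helpers, the node names and the object identifier are all hypothetical. Each node knows its peers, and the archive that adds an object chooses how many replicas to push out.

```python
class ArchiveNode:
    """One archive in a hypothetical federation (illustrative only)."""

    def __init__(self, name):
        self.name = name
        self.peers = []      # other nodes this node knows about
        self.holdings = {}   # object id -> content

    def add(self, object_id, content, replicas):
        """Store the object locally, then copy it to up to `replicas` peers."""
        self.holdings[object_id] = content
        targets = self.peers[:replicas]
        for peer in targets:
            peer.holdings[object_id] = content
        return [peer.name for peer in targets]


def federate(nodes):
    """Make every node aware of every other node."""
    for node in nodes:
        node.peers = [other for other in nodes if other is not node]


def holders(nodes, object_id):
    """Names of all nodes that can answer a request for the object."""
    return [node.name for node in nodes if object_id in node.holdings]


# Hypothetical node names and object id.
nodes = [ArchiveNode(name) for name in ("node-a", "node-b", "node-c", "node-d")]
federate(nodes)

# The adding archive decides how many replicas it wants.
copied_to = nodes[0].add("scene-001", b"...", replicas=2)
print(copied_to)
print(holders(nodes, "scene-001"))
```

In this sketch the object ends up on three nodes, the original plus two replicas, so losing any one archive does not lose the object, and a search can be answered by whichever holder is closest or least busy.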
Awareness of the value of geospatial data has been slowly spreading beyond the scientific community for the past decade. Businesses are especially interested in data that displays comparative changes over time, and Carver figured out a way to meet these needs through a "fee for service" arrangement. He said, "A lot of the materials that we collected were unique and they got the interest of environmental companies, lawyers and just about anybody who was trying to manage land resources. So we set up a model for charging for access to those collections."
The basic concepts behind ADL have been widely adopted by Google Earth, Wikipedia and others. And Carver could not be more delighted.
"I think it's wonderful," he said. "We weren't trying to be the only game in town; we were just trying to raise consciousness way back in the early 1980s that this was a viable way of handling geospatial material. This approach lets people interact with data in a realistic way without having a great deal of knowledge about an individual object. It was a new way of dealing with massive amounts of information in an environment that made finding and accessing information much easier."