December 2010 - Michele Kimpton has played a pivotal role in some of the most significant advances in Internet-related digital preservation. Her contributions have also had an international impact on saved content and the tools used for preservation.
Kimpton was a mechanical engineer early in her career. Eventually she got her MBA, which helped her jump from engineering to product management to sales management and into a leadership position. She said, "I was able to take a product from an idea through to developing a market and a sales channel…basically all the aspects of running a business."
She worked in Hong Kong and Paris, where she learned how to adapt, negotiate and get things done in vastly different cultures. Kimpton said, "I had to figure out how to motivate people to do what I hoped they would do and question what was in it for them and what we could provide that was of value to them."
Her Hong Kong clients were stunned that a woman was actually representing her company and giving technical presentations. Kimpton found that to be fun, especially since her Asian clients were also eager to understand new technology and appreciative of her support.
She had very different types of clients in Europe. "They considered Americans much less experienced and cultured, so I had to take a different approach," Kimpton said. "I couldn't talk as the expert. I had to be humble, gracious and take a subtler approach than with the Asian clients." She became adept at adjusting her goals and strategies depending on the culture she was working with.
By the time she returned to Silicon Valley at the end of 1996 she had resolved to make a major life change. Internet businesses were booming and venture capitalists were throwing staggering sums of money at every plausible business idea. It was a hub of great technological potential, so Kimpton switched her focus to software engineering. She also wanted to work for a non-profit organization and give something back to society.
But she did not dive into the Internet right away. "I quit the corporate world. I had been abroad for two years and I wanted to do something totally different for a while," Kimpton said. So she opened a brew pub.
Kimpton and a partner created the Ross Valley Brew Pub in an idyllic California town on the rural western edge of Marin County, bordered by coastal mountains and the Pacific Ocean. She built the business from the ground up and she considers that her real learning experience for running a start-up.
Kimpton ran the brew pub for two years before entering the dot com arena. Among the start-ups she founded, most notable was a photo-sharing site, which she sold just as the dot com era started to collapse. And then she met Brewster Kahle.
They hit it off right away. Kimpton was moved by Kahle's passionate vision of saving the Web for future generations. Kahle's Internet Archive was at that point a lightweight operation and just the non-profit high-tech company she was looking for. Kahle asked Kimpton to help make IA a major player in collecting and preserving cultural heritage content on the Web.
National libraries and other large cultural institutions were potential clients and Kahle wanted Kimpton to help develop mutually beneficial relationships with them. But it was not easy. Kimpton said, "It quickly became apparent that even though national libraries were interested in harvesting the Web, to some degree they all wanted their own control. They didn't want to say to Brewster, 'Here's $1 million. You do it. We trust you.'"
Kimpton set about getting national libraries to collectively care about and support Web archiving, and from that challenge she helped found the International Internet Preservation Consortium.
Her business experiences in Europe and Asia helped her to work effectively with the multiple cultures in the IIPC. "Each organization had different cultural backgrounds and opinions, which they brought to the party with their strengths and weaknesses," Kimpton said. "And I had to take all of that into account. There's no one solution or agreement that satisfies everybody."
And since IA is not a national library it had little clout in the IIPC formation process. So even though Kimpton was involved at the start of the IIPC – and despite the value of what IA was offering – she had to be tactful and unobtrusive.
By the early 2000s, Silicon Valley was ailing and IA attracted talented refugee engineers and software developers to San Francisco from "down the peninsula," people who were burned out on 70-hour work weeks, unstable stock options and failed businesses. Like Kimpton, they decided to make a social contribution. As they joined IA, Kimpton gained access to some very bright minds, which proved to be crucial as she developed a remarkable tool: a new Web crawler that would eventually be used by major cultural institutions all over the world.
When Kimpton started at IA, its Web crawls were done by Alexa Internet, a search engine and page-ranking company that Kahle co-founded and later sold to Amazon. One of Alexa's features was archiving Web pages as they were crawled. Alexa would give a copy of those pages to IA, which IA would archive and make available to the public after a six-month waiting period.
Kimpton questioned IA's use of Alexa for crawls. "It was pretty risky," she said. "If Alexa went away or if that agreement fell through we were going to be stuck. And if archiving the Web was our mission, we needed a tool to do it properly."
IA worked with the IIPC to gather requirements for what they would like a Web-archiving tool to do. Kimpton put together an extraordinary team of technologists from the IA – and attracted European engineers from the IIPC who traveled to San Francisco to help – and together they built Heritrix, a flexible, scalable and multifaceted crawler.
"Heritrix was adopted by all the members of the IIPC," she said. "After all, it came from their requirements and they participated in its development."
But it isn't just the usefulness of the tool that impresses Kimpton; it is also its quick design cycle. "It was one of these software projects that went the way I like to run them," she said. "Within three months of receiving the requirements we had the software and we were crawling. You don't write a heavy design requirement and take a year to build it because by the time you build it your requirement is obsolete." Kimpton carries this attitude forward today: get a version out into the world as soon as possible and refine it as you go along.
Around 2005, Kimpton and her family moved from California to New England to be closer to relatives. It wasn't long before Kimpton was offered the job of turning the Massachusetts Institute of Technology DSpace project into a nonprofit business. "They wanted me to develop its own funding model and build its community outside of the MIT leadership," said Kimpton. She accepted their offer.
While developing the project, Kimpton realized that DSpace needed to create awareness about its user community: which institutions were using it, what they were using it for, where they were located and how to find them. A feature she helped develop is on the DSpace website; you can click the button labeled "Who is Using DSpace" and see every institution in every country, and more information. That became a key feature for creating and expanding the community because people wanted to see who else was using it. When she started at DSpace there were 300 institutions using it; today there are almost 1,000.
After about three years it became clear to Kimpton that Fedora Commons was doing something similar to DSpace, providing a repository platform to manage, provide access to and preserve digital collections. Kimpton said, "They used slightly different technology but we really had a lot of crossover in the use cases and communities."
Kimpton and Sandy Payette, executive director of Fedora Commons, compared goals, brought their teams together and saw that there was benefit in collaboration. They could be more efficient, raise more funding and look at more succinct technology packages. By July 2009 they were one organization, called DuraSpace.
Early on in their relationship, they brainstormed about what their communities needed. Kimpton said, "It was obvious that commodity, scalable and virtualizable cloud infrastructure – compute and storage – was emerging in a big way."
They saw that most members of their communities didn’t have the resources to take advantage of cloud technology and they set about developing a technology, DuraCloud, to address the need. DuraCloud is a platform that enables digital preservation services and storage across multiple cloud providers, providing a simple end-to-end solution from content repository to cloud store. And, true to Kimpton’s quick design-cycle approach, as soon as DuraSpace had something running in the cloud they began a pilot program so people could start uploading content and providing feedback.
DuraSpace is currently running a large-scale pilot with support from the National Digital Information Infrastructure and Preservation Program. "We've launched the platform as open source so anyone can download it and test it and use it," she said. "We're launching it as a service, which we'll host and manage, and hope to start in Spring 2011."
In weighing the benefits and risks of all the different architectures for managing and preserving content, Kimpton sees great promise for the cloud. "Cloud infrastructure is going to be ubiquitous," she said. "Compute and storage are going to be commodity things just like your electricity. Twenty years from now there'll be a bunch of data centers that provide commodity infrastructure and you just buy what you need to use." She anticipates that it will be a huge paradigm shift for universities because they won't have to build data centers; they will provision some percentage of their IT infrastructure outside of the campus.
Kimpton believes cloud infrastructure will enable collaboration and reuse of content more easily across institutions. DuraSpace is getting more frequent requests from its partners to set up a common infrastructure across institutions for a particular project or set of content where many institutions can easily access and participate in the development of that content.
Kimpton points to Galaxy Zoo as an example of this type of collaborative work on a massive amount of content.
On a local scale, she cites the growing number of desktop cloud-backup utilities. "These technologies will become simple to use," she said. "A lot of people don't back up their hard drive because it's a pain in the neck. But if you have a folder sitting on your desktop and you click on it and you see a mirror copy of your existing desktop and you know that it's someplace out there but you don't care where... when people see the value of cloud technology and that it's drop-dead easy, then it will take off."