UP CLOSE | A digital library?

Photo by Zoe Gorman.

Earlier this month, library digitization assistant Kelly Perry placed an 1845 German medical history book on a machine three times her size.

Perry can choose to turn the pages herself, but on this occasion, she switched on a large metal arm affixed to the device — called a Kirtas machine — to leaf through the book as cameras hovering roughly four feet above the book snapped photographs.

Once Perry captured an image of each page, she sent the book back to the medical school library. It takes her between 30 minutes and an hour to digitize a 250-page book. Factoring in time for processing and uploading the book to the library’s website, it would take about a month for the digital volume to reach users at a total cost of about $150.

Multiply that by 12.5 million volumes.

Though Yale library administrators and staff interviewed said preserving and expanding access to Yale’s holdings is a fundamental goal of any library, the University system has no plan for how to tackle the gargantuan task of digitizing its complete collection.

Yale’s libraries have made an effort to expand some online collections. Scholars now travel to campus and visit Yale’s library facilities simply to log onto University computers that provide access to digital archives, said Ann Okerson, former head of library collections.

Still, University Librarian Susan Gibbons said Yale’s libraries are struggling to determine what proportion of their resources should be poured into expanding online collections. With the exception of the Beinecke, the University libraries have yet to allocate a portion of their budget towards digitizing current holdings, instead relying on outside donors who provide funding to digitize a specific set of works.

Librarians worldwide are adjusting to serve users in an age dominated by online resources and e-readers, and students such as Michael Zucker ’12, who reads and annotates books for class using a Kindle, represent the challenges libraries face if they wish to remain relevant.

“If I could download it all on my Kindle, I wouldn’t care about the library,” Zucker said.

But could digital collections ever truly replace Yale’s 12.5 million print volumes?

THE COST OF EASY ACCESS

Many books in Yale’s collection are delicate and require special care that increases the labor and material costs of digitization.

“I guess my answer is if money was no consideration and you were willing to accept a pretty modest level of quality control, and you were prepared to accept some percentage damage to the collections, it could be done,” said George Miles ’74 GRD ’77, curator of western Americana at the Beinecke Rare Book & Manuscript Library.

The Beinecke devotes a set portion of its budget to digitizing its rare materials and cataloguing them online. The library establishes its own list of digitization projects, and also accepts requests from scholars. Though scholars pay a fee to access those digitized materials, the fee does not cover the full cost of producing a digital copy.

Beinecke Librarian E.C. Schroeder said the Beinecke underwrites digitization for its patrons to help the library build an accessible, online collection.

“We see ourselves as providing a research service,” Schroeder said. “It feeds our mission of providing access to the collections. We don’t charge people to use the library itself.”

From 2009 to 2010, the number of digital pages the Beinecke produced nearly doubled. The Beinecke has produced 45,000 images so far this calendar year.

Miles said the Beinecke hires professional photographers as staff to scan, photograph and write descriptions of fragile materials in a special studio in the library’s basement.

Chris Edwards, digital studio manager for the Beinecke, said photographers tailor their digitization technique to the size, shape and condition of every item. A Kirtas machine can capture images faster, but its automated air suction arm is more likely to damage older books than a trained librarian’s own hand.

Yale was able to keep all three Kirtas machines it received for free through an aborted partnership with Microsoft intended to expand digital collections, Gibbons said.

The Medical School library digitizes some of its collections using a Kirtas machine and funds from the library’s general budget, said Regina “Kenny” Marone, director of the medical library, though many digitization projects are driven by outside donors. Still, Marone said finding a place for digitization within the library’s budget is difficult.

“You’ve got so many demands pressing upon you,” she said. “How do [you] squeeze it into the current budget? It’s a challenge.”

SUSTAINABLE FORMATS

While some print books have lasted hundreds of years, newer formats have not fared so well.

The University Library’s preservation department is working to move outdated material forms — such as audio cassettes and VHS tapes — to digital formats.

Regardless of how they are captured, Gibbons said electronic files can deteriorate faster than print materials and may cost more to maintain. Digital librarians must continuously analyze files for signs of corruption, she said, since even a small change in the code could render the file inoperable.

Roberta Pilette, director of preservation for the University Library, said she is concerned her department will again have to convert these newly digitized files in the coming decades.

Digital technology is changing constantly, she said, and it is not clear whether the programs used to store digital materials will still be used 50 years from now.

“I want to make sure that the file that was done 20 years ago digitally can be opened on your latest Apple,” Pilette said of her current digitization projects. “And I don’t know that we’re really there yet.”

But Edwards said he is not concerned about losing the photographs he takes for the Beinecke when digital platforms are updated. The high-quality TIFF (Tagged Image File Format) standard is so widely used that he feels confident the Beinecke will have a means to convert photographs stored as TIFF files to updated forms in the future.

Converting a file to a new format, however, is an expense for libraries, and a risk, Reese added, since some information could be lost in the process.

“Experience has demonstrated that transferring information from one storage platform to another frequently is an imperfect science,” he said. “Things get left out.”

In the 1980s, many libraries began converting print materials to microfilm, Reese said, but the film proved brittle, erosive and hard to read, and the format was a disaster. Gibbons and Butler said the University Library never discards print materials after they have been digitized.

Although digital materials can expand the audience of a book, some of a volume’s character is lost when the text is removed from its original physical format — and in some cases, that character can be vital.

Gibbons said that a digital copy does not convey the same context for a work that is apparent in the physical book, and former Acting University Librarian Jon Butler added that relying on a single, uniform digital copy fails to convey variants among different print copies that might be important to historical scholars.

For example, the size of a book, its cover design and the placement of information on the page can provide scholars with insight into the work and its author. The way a book is bound can help date the work, and indentations on a page can explain how a manuscript was written, Reese said.

The descriptions that accompany digitized holdings contain as many details as possible about the physical text, said Clifford Johnson, a catalog assistant for digital projects at the Beinecke, but describing every detail is impossible. Edwards said he can photograph these elements on the physical volume if a patron requests them.

“The camera technology has really advanced to the point where we are capable of easily and quickly capturing high-resolution images that are very true to the original with very little effort,” Edwards said.

Viewing a text online will suffice for most people, he said, and might prompt some users to visit the library for a look at the real book.

AUTHORIZING ACCESS

Gibbons said she thinks the largest obstacle to digitizing the library’s collections is copyright law: Libraries can digitize holdings at will only if the works in question are in the public domain. Current law stipulates that a work falls out of copyright after an author has been dead for 70 years.

The legal copyright period has only grown longer over the years, Gibbons said — and it is Disney’s fault.

“Disney doesn’t want Mickey Mouse to go into the public domain,” she said, adding that the Disney Company has successfully lobbied to ensure that copyrights on its most famous characters, including Mickey Mouse, do not expire.

If a library wishes to digitize a work written after 1923, its employees can contact the copyright holder to negotiate terms for putting the work on the web, Gibbons said. The copyright holder can request a fee, or ask the library to limit access to members of its educational community alone.

Aside from a few special projects, Gibbons said the University Library is mostly digitizing works written before 1923.

“There is no feasible way for us to contact [owners who own copyright] on an individual basis,” Gibbons said.

Some books printed after 1923 have no known owner. These “orphan works” are a grey area for digitization projects: If a library can prove that it attempted to contact the copyright holder, it may be granted legal permission to reproduce the work. So far, Miles said, American courts have been wary of designating an author’s work an “orphan.”

“How we’re going to work on [orphan works] is going to be an interesting challenge for the whole United States,” Miles said.

The Authors Guild, an advocacy organization for published authors, is suing Hathi Trust, a digital book repository that has partnered with Yale to share University Library holdings, for copyright infringement. The Guild has argued that aggressive open access projects that use orphan works are unfairly profiting from the life’s work of unknown authors, Miles said.

Google Books has managed to resist lawsuits so far, Miles said. He attributed this, in part, to authors’ inability to develop a united complaint against Google.

A GLOBAL DIGITAL COMMUNITY

For now, the cost and time required to digitize all of Yale’s volumes exceed the library’s capacity, Gibbons said. But a growing digital community of libraries can help the University move towards its goal of expanding access to its holdings.

“Companies that were doing big microfilming projects are now doing big digitizing projects,” Reese said. “If the question was could all the books in the Yale library be digitized, the answer is probably yes … Yale won’t necessarily have to be the one to digitize them.”

If another institution has digitized a work also found in Yale’s print collections, for example, the institutions might share access through the Hathi Trust.

Based in Ann Arbor, Mich., and operating in conjunction with the University of Michigan, the Hathi Trust offers libraries across the country access to one another’s collections, Pilette said. Yale joined the Trust in August 2010, becoming the second Ivy League institution to do so, after Columbia University.

Jeremy York, project librarian for Hathi Trust, said most libraries choose to allow unrestricted access to their materials in the repository, but Pilette said that in some cases, copyright law dictates that only the affiliates of the library that owns the print edition may access digital copies of those materials.

Yale has also contributed 27 of the 2,500 items in the World Digital Library, an online collection curated by the Library of Congress and launched in 2009.

Michelle Rago, product manager for the World Digital Library, said that the project is intended to provide users everywhere with a more global view of current events and cultural history. Modern researchers in general, she added, must change their definitions of “going to the library” to include accessing online databases.

Yale has also taken steps to support open access policies. In May, Yale became the first Ivy to unveil a new policy that allows any internet user to access a catalog of millions of images from University museums, libraries and archives. The University does not restrict how these images are used.

“Yale is unique in the way that it opens its doors to the public,” said Meg Bellinger, director of the University’s Office of Digital Assets and Infrastructure. “Part of our mission is to have parallel access to content in the same unobstructed way and hopefully at a very low cost.”

Still, Gibbons said she is concerned that the proliferation of digital resources is discouraging students and scholars from taking advantage of resources that have yet to reach the internet.

“We are fighting the presumption that all information exists on the web,” Gibbons said. “When you look at a collection like Yale’s, because there’s that false impression, [researchers] often limit their [work] to just what’s on the web, and therefore are missing entire collections that are extremely valuable.”

For many, however, the University’s digital resources may be the next best thing.

By allowing access to those who cannot afford to make a trip to New Haven, Edwards said, Yale’s digitization projects level the playing field for academics.

“Yale has things in its collection that you aren’t going to find anywhere else in the world,” Edwards said. “Digitization, I think, is really leading towards the democracy of research. It’s allowing anybody to research what used to be only accessible to the few.”

For now, most of Yale’s treasures will remain on the shelf.
