University and Google Books move forward with digitization

Around 70 percent of the 1 million books that will eventually be included in the Google Books digital archive have already been digitized, University Librarian Karin Trainer said in an e-mail. 

The initiative for digitization began in early 2007, when the University Library and Google agreed to a six-year contract to make less than one-tenth of the University’s 11 million holdings — which include manuscripts and periodicals as well as books — available online through Google Book Search.

With 12 million books in more than 300 languages digitized so far, Google is moving forward with its project by soliciting research proposals, the Chronicle of Higher Education reported last week.

The Internet search company launched Google Book Search in 2004 with the goal of scanning and making available, within the following decade, every book ever published. The project allows users to search for terms within the full texts of books that have been scanned into the database, much as the main Google search engine searches websites for phrases. Google intends for the website to serve primarily as a research tool.

Every month, the University matches Google’s database with its own library database and sends a shipment of books that have not yet been digitized to Google, Trainer explained, adding that the sequence in which the books have been digitized so far has been “more or less random.”

Recently, Google has begun reaching out to scholars to offer grants of $50,000 for one year of humanities text-mining research, according to a call for proposals obtained by the Chronicle. The grants initiative focuses on improving Google’s digital library and book-search metadata.

Details of the grants do not appear on the Google research website, and research proposals have been requested only from “select researchers and faculty members,” the Chronicle reported. The University of Michigan and the University of Illinois at Urbana-Champaign have confirmed solicitations from Google. Submissions are due by April 15.

“As far as I know, no one among our humanities professors has been approached by Google in connection with their upcoming grant program,” Trainer said. She explained that the Google grants are not for research on the digitization process but on text-mining, adding that “there may be a few humanities faculty members at Princeton whose work involves such analysis, but I’d say the number is small.”

Google did not respond to a request for comment. 

The University’s partnership with Google has been headed by Trainer, Deputy University Librarian Marvin Bielawski and University Provost Christopher Eisgruber ’83.

At the time of the contract, Princeton was the 12th library to partner with Google in the project, joining Stanford, Harvard, Oxford, the University of Michigan, the University of California, the University of Texas and the University of Wisconsin. Google’s Library Project now has 28 partners, including seven international libraries.

Debate about copyright infringement has been prominent in the Google Book Search project. In 2005, two lawsuits were brought against the company by the Authors Guild and a group of book publishers partnering with Google in the project, alleging that Google violates publishers’ rights by scanning the full text of books that are still under copyright. 

Google responded by changing the Google Books website so that only a few pages of each fully scanned copyrighted book are available to users. The full texts of the copyrighted books that have already been scanned remain in the Google database, though users cannot access them.

Because the University’s contract allows Google to scan only library texts that are in the public domain, the University has avoided involvement in the lawsuits, Trainer said. The libraries of Stanford, Harvard and Oxford likewise allow Google to access only books in the public domain. The universities of California, Michigan, Texas and Virginia, however, have permitted Google to digitize copyrighted materials.