Open Access News

News from the open access movement


Thursday, November 10, 2005

OCA book-scanning at the U of Toronto

David Kesmodel and Vauhini Vara, "Building an Online Library, One Volume at a Time," Wall Street Journal, November 9, 2005 (free online this week only). (Thanks to ResourceShelf.) A close-up look at Open Content Alliance (OCA) book-digitizing at the University of Toronto. Excerpt:
Ms. Ridolfo is part of a massive undertaking to digitize the world's books. She is one of about a dozen scanners employed by the Internet Archive, a San Francisco nonprofit group that is spearheading the Open Content Alliance, a consortium of business and educational groups that includes Microsoft Corp., Yahoo Inc., Hewlett-Packard Co., Adobe Systems Inc. and several university libraries. The group wants to build an online library of millions of old books and hopes to make a big batch accessible through Web searches as early as next year. For all its technical sophistication, the group needs the manual work of people like Ms. Ridolfo to make digitization a reality....The Internet Archive's effort to get books online is still in its early stages. In the little more than a year since the group started scanning books, it has digitized just 2,800 books, at a cost of about $108,250. Funding has come largely from libraries that have paid to have their texts digitized. Work will likely speed up now that Microsoft and Yahoo are on board; both companies joined the effort in October. Microsoft has pledged to pay for the scanning of about 150,000 books from collections at the U.K.'s British Library and elsewhere, and Yahoo will fund the scanning of 18,000 American classics at the University of California. Mr. Kahle estimates it costs about 10 cents a page to get a book online, taking into account equipment, labor and the cost of hosting the pages on the Internet Archive's Web servers. The funding from Microsoft and Yahoo will be used to expand the scanning operation. So far, books are being scanned at just two locations, in San Francisco and Toronto. Participating libraries ship their books to those scanning centers, where a total of eight scanning machines are in use. 
The group hopes to use new funding to buy more machines, which cost $20,000 to $40,000 each (the more expensive machines can work faster, and can accommodate larger books)....The Internet Archive closely tracks each book that has been scanned, and a computer alerts employees if they try to scan a book that has already been digitized.