Open Access News

News from the open access movement


Thursday, November 03, 2005

More on Brewster Kahle and the OCA

Quinn Norton, Off the shelf and on to the web, The Guardian, November 3, 2005. Excerpt:
Imagine a library where you can find all the books in the first place you look. Imagine you can search, Google-style, over their text, and then feel the pages between your fingers, or see the tea splotches of the first readers, long dead. And imagine doing all of this in your own home. The plan is a book lover's dream; and the particular book lover intent on creating this Open Library is Brewster Kahle, known as the digital librarian of the internet. Kahle made his name indexing and storing the web in his Internet Archive. His non-profit organisation, stationed in an unassuming colonial home in San Francisco's Presidio, has moved on to grab and upload all kinds of media: public domain films, audio archives, and amateur endeavours such as Project Gutenberg, which has been painstakingly hand-typing public domain texts since the 70s. Now he has taken the idea of digitising the text of books one step further, and is storing not just the text, but, incredibly, high-resolution snapshots of book pages, good enough to reproduce every fold, blotch and texture of the world's catalogue of public domain works on your screen. It is an ambitious project, but he has allies among other technologists, and the support of large companies such as Microsoft and Yahoo. A consortium of tech companies, libraries and academic institutions has formed the Open Content Alliance, working together to create the Open Library, the future home of these works....Kahle divides the existing literary world into strata of copyright protection. In-print books are the ones you can buy and often read snippets from via Amazon. Out-of-print publications are harder to reach. What Kahle calls "orphaned works" come next: these books are out of print, and their copyright owner is un-contactable. Generally, these books are found in libraries or not at all. Finally, there is the pre-1926 world of the public domain.
These are books that copyright law allows everyone to reprint, rework and convert into pristine digital formats as they see fit. The majority of works are in the first three categories but the public domain itself remains huge. This is where the Open Library initiative is focused. And that may be why the big boys are so interested. When the impetuous Google Print project set about scanning the very top strata, books still within copyright, it provoked a fire-storm of protest. But Kahle ducks that controversy, and has come up with something more impressive. Not just text, but real books that are free to use, and unladen with lawsuits and licences. Kahle hopes to begin moving up to the next strata, orphaned works. These remain in a legal limbo for now, but Kahle and his supporters hope that future legislation in the US could open up more of these often disregarded works to be used in new ways. He sees Amazon's "search inside the book" and Google Print as moving down to meet him, both burrowing to his ultimate aim. This, he defines, with a slightly tired smile, as: "Universal access to all human knowledge - one page at a time."