Open Access News

News from the open access movement


Thursday, October 26, 2006

Full-text cross-archive search from OpenDOAR

OpenDOAR has created a Google Custom Search engine for the 800+ open-access repositories in its directory.  From today's announcement:

OpenDOAR - the Directory of Open Access Repositories - is pleased to announce the release of a trial search service for academics and researchers around the world....

OpenDOAR already provides a global Directory of freely available open access repositories that hold research material: now it also offers a full-text search service from this list of quality-controlled repositories. This trial service has been made possible through the recent launch by Google of its innovative and exciting Custom Search Engine, which allows OpenDOAR to define a search service based on the Directory holdings.

It is well known that a simple full-text search of the whole web will turn up thousands upon thousands of junk results, with the valuable nuggets of information often being lost in the sheer number of results. Users of the OpenDOAR service can search through the world's open access repositories of freely available research information, with the assurance that each of these repositories has been assessed by OpenDOAR staff as being of academic value. This quality-controlled approach will help to minimise spurious or junk results and lead more directly to useful and relevant information. The repositories listed by OpenDOAR have been assessed for their full-text holdings, helping to ensure that results have come from academic repositories with open access to their contents.

This service does not use the OAI-PMH protocol to underpin the search, or use the metadata held within repositories. Instead, it relies on Google's indexes, which in turn rely on repositories being suitably structured and configured for the Googlebot web crawler. Part of OpenDOAR's work is to help repository administrators improve access to and use of their repositories' holdings: advice about making a repository suitable for crawling by Google is given on the site. This service is designed as a simple and basic full-text search and is intended to complement and not compete with the many value-added search services currently under development.

A key feature of OpenDOAR is that all of the repositories we list have been visited by project staff, tested and assessed by hand. We currently decline about a quarter of candidate sites as being broken, empty, out of scope, etc. This gives a far higher quality assurance to the listings we hold than results gathered by just automatic harvesting. OpenDOAR has now surveyed over 1,100 repositories, producing a classified Directory of over 800 freely available archives of academic information.

Comment.  This is a brilliant use of the new Google technology.  When searching for research on deposit in OA repositories, it's better than straight Google because it eliminates false positives, though straight Google is better if you want to find OA content outside repositories at publisher or personal websites.  It's potentially better than OAIster and other OAI-based search engines because it goes beyond metadata to full text, though not all OA repositories are configured to facilitate full-text Google crawling.  If Google isn't crawling your repository, consult OpenDOAR or try these suggestions.
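
For readers curious what "a search restricted to a vetted list of repositories" amounts to, here is a minimal sketch in Python. It is not how OpenDOAR or Google's Custom Search Engine is implemented. It only illustrates the idea with Google's ordinary `site:` and `OR` query operators. The function names and the repository hostnames are hypothetical, invented for this example, and a query built this way would only be practical for a handful of hosts, not 800+.

```python
from urllib.parse import urlencode, urlparse


def site_restricted_query(terms, repository_urls):
    """Build a query string limited to the given repository hosts.

    `repository_urls` stands in for a curated directory listing; the
    entries may be full URLs or bare hostnames.
    """
    hosts = []
    for url in repository_urls:
        # urlparse() only fills netloc when a scheme is present,
        # so fall back to the raw string for bare hostnames.
        host = urlparse(url).netloc or url
        if host not in hosts:  # drop duplicates, keep first-seen order
            hosts.append(host)
    site_clause = " OR ".join(f"site:{host}" for host in hosts)
    return f"{terms} ({site_clause})"


def search_url(terms, repository_urls):
    """Return an ordinary Google search URL carrying the restricted query."""
    query = site_restricted_query(terms, repository_urls)
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    # Hypothetical repositories, used purely for illustration.
    repos = ["https://eprints.example.org/", "repository.example.edu"]
    print(site_restricted_query("self-archiving policy", repos))
    print(search_url("self-archiving policy", repos))
```

The point of the sketch is the division of labour described in the announcement: people decide which hosts belong on the list, and the search engine's existing full-text index does the rest. No metadata harvesting is involved.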