I've previously discussed the Directory of Open Access Journals (DOAJ) on this blog. DOAJ is an incredibly valuable collection of resources.... But it's hard to browse that fantastic directory without wishing for a magic search box that could search across the entire collection. DOAJ has actually started down that road on its own -- of the 2459 journals in the directory today, 721 can be searched at the article level using the Find Articles tool on the DOAJ site.
But with the advent of CSE [Google Custom Search Engines], it seemed to me that we might be able to create that "magic search box" another way. So I went to the DOAJ site and grabbed the journal metadata in CSV format as described in the FAQ (note that it's licensed CC-BY-SA-1.0). Then with a few Excel hacks, I was able to parse out a list of 1604 domains from that list that hosted English-language DOAJ journals. I carved out the domain names, added a slash and asterisk at the end to indicate I wanted everything in that domain, and dropped them in the batch upload box for CSE. And just like that -- we have what you might call an early prototype of a "magic search box" for English-language DOAJ journals....
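For readers who would rather script this step than do it in Excel, here is a minimal Python sketch of the same idea: filter the journal metadata to one language, pull the host name out of each journal URL, and append `/*`. The column names `URL` and `Language` and the sample rows are assumptions for illustration only; the actual DOAJ CSV headers may differ.

```python
import csv
import io
from urllib.parse import urlparse


def domain_patterns(csv_text, url_field="URL", lang_field="Language",
                    language="English"):
    """Return sorted, de-duplicated 'domain/*' patterns for one language.

    The field names are hypothetical; adjust them to the real CSV headers.
    """
    patterns = set()
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Skip journals that are not listed in the requested language.
        if language.lower() not in (row.get(lang_field) or "").lower():
            continue
        # netloc is the host part of the URL; it is empty if the URL
        # has no scheme, in which case the row is skipped.
        host = urlparse((row.get(url_field) or "").strip()).netloc.lower()
        if host:
            patterns.add(host + "/*")
    return sorted(patterns)


# Made-up sample data standing in for the downloaded metadata file.
sample = """Title,URL,Language
Journal A,http://www.example.org/journal-a/,English
Journal B,http://www.example.org/journal-b/,English
Journal C,http://revista.example.com/,Spanish
"""

print("\n".join(domain_patterns(sample)))
```

The resulting lines can then be pasted into the batch upload box. Two journals on the same host collapse into a single pattern, which is also why domain-level patterns can end up too broad.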
Pretty cool, no? Now, of course, this needs a lot of fine-tuning, which I have already started. The domain-level addresses I used are too broad. In one case, there was actually a journal hosted on geocities.com, so I was initially pulling in everything from that domain -- of course, I fixed that quick. In a lot of cases, journals are hosted on their university publishers' domains, so I'm pulling in all content from that university's site right now. Fortunately, Google Co-op CSE offers some pretty cool capabilities to focus your search on certain pages or portions of a site....
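One way to narrow an overly broad domain-level pattern is to keep the journal's directory path in front of the asterisk. The sketch below is only an illustration of that idea, not the method actually used for the DOAJ CSE; the function name `scoped_pattern`, the example URLs, and the rule that a final path segment containing a dot is a file name are all assumptions.

```python
from urllib.parse import urlparse


def scoped_pattern(journal_url):
    """Build a 'host/path/*' pattern limited to the journal's directory.

    Heuristic (an assumption): a last path segment containing a dot,
    such as index.html, is treated as a file name and dropped.
    """
    parts = urlparse(journal_url.strip())
    host = parts.netloc.lower()
    segments = [s for s in parts.path.split("/") if s]
    if segments and "." in segments[-1] and not parts.path.endswith("/"):
        segments.pop()
    prefix = "/".join(segments)
    return f"{host}/{prefix}/*" if prefix else f"{host}/*"


# Hypothetical journal home pages on shared hosts.
for url in [
    "http://www.example.com/somejournal/index.html",
    "http://press.example.edu/journals/abc/",
    "http://example.org",
]:
    print(scoped_pattern(url))
```

A journal that lives at the root of its own domain still gets the plain `host/*` pattern, so only journals on shared hosts are narrowed.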
It may take a little time to work through 1604 domains to check or fine-tune them all, but the nice thing about CSE is that it's part of Co-op, which is designed for collaborative projects. So if you'd like to help fine-tune the DOAJ CSE, please let me know at lukethelibrarian at gmail dot com and I can send you an invitation to join the crew.
Once that's starting to shape up, I'll work up a CSE that will cover DOAJ's Spanish-language journals. Then, after that... who knows? ...
Posted by Peter Suber at 11/10/2006 11:46:00 AM.