Open Access News

News from the open access movement


Saturday, January 14, 2006

New tool makes OA data more useful

M.L. Baker, Gene Mining Strikes Gold, ExtremeNano, January 1, 2006. Excerpt:
There's a lot of scientific data going to waste. Much of it has been painstakingly gathered through timely and costly experiments and is freely available in public databases. But researchers have been hard-pressed to use existing data to ask new questions, because they lack reliable descriptions and computational tools. Now, scientists at Harvard and Stanford have created a software application that overcomes some of these barriers. The program, called Genotext, trolled through publicly available data and came back with genes implicated in aging, leukemia and injury, as described this month in the journal Nature Biotechnology. The program automatically analyzes text descriptions of different experiments. It then identifies which genes were turned on or off, up or down, in various diseases or environmental conditions. That's no easy task, since a single experiment can collect millions of data points and descriptions of very similar experiments can vary widely. "This is a real advance," said John Wilbanks, head of Science Commons, a nonprofit group dedicated to helping scientists find productive ways to share data. "The use of annotation and knowledge to understand functional relationships between genes is where the field has to go."...Scientific journals often require researchers to deposit their microarray data in publicly available databases. Though data formats to describe which genes are turned up to what level are fairly standard, the same can't be said for descriptions of the conditions and tissues in which the genes are measured. That makes it difficult to compare experiments that probe how the environment might change gene activity or how gene activity differs between sickness and health. "We've all agreed on how to represent the genes, but we haven't agreed on how to represent what we actually did in the experiments," [Atul] Butte [Stanford bioinformatics specialist and the study's lead author] said. 
That's one of the problems that Butte, along with Isaac Kohane, a bioinformaticist at Harvard, set out to solve by creating Genotext....Though Genotext is available for free over the Web, researchers need some programming experience to use it. Plans to create a user-friendly query interface are underway.

More on the author and publisher suits against Google

Mike, How Jealousy Could Destroy The Internet, techdirt, January 13, 2006. Excerpt:
Jealousy is a very powerful emotion. It can make for some great stories in books or movies, but it has no place in the board room -- yet, that's where it is these days. Fred Wilson pointed this out quite clearly last week, when he correctly said that all this talk by the Baby Bells that Google, Microsoft, Vonage and other successful web companies should pay the telcos extra was simple jealousy. Wilson tells the Bells to "dream on," and while we hope he's right, he may be underestimating the destructive power of jealousy. And, it's not just the Baby Bells who are acting this way -- but plenty of online businesses. If they keep it up, they're going to destroy a good thing, just because they can't stand the thought of someone else being successful....Considering the powerful position Google has these days online, it's no surprise that Google is often at the center of this jealousy. It's the main company the Baby Bells want to pay up. It also explains the Google Print controversy -- as authors and book publishers are upset, even though Google is making their content more useful by making it searchable. The latest case of Google jealousy comes courtesy of online publishers, with a Business Week writer suggesting the idea of having major publishers completely cut their content off from Google. It is, as the cliche goes, cutting off your nose to spite your face. People are so upset that Google is successful, they don't seem to notice that it's helped make them more successful too. Google is successful because it's adding value, not just for users, but to the sites it directs its traffic. Cutting Google off makes the content less useful and serves no purpose other than giving in to destructive jealousy. Part of the power of the internet is that it has been able to avoid most of this, by making it easy to link, embed, modify and copy -- to offer new and different ways to manipulate and view the information that's out there. 
It's that openness that makes the whole thing valuable. Throwing up walls, tollbooths and blockades for the sake of jealousy harms the entire system. Those who support these moves claim they do so to try to maximize their own profit -- trying to take a piece of the cut from these other services that make their product valuable. However, that's a short-sighted view. If they want to do that, they should make their products more valuable on their own without cutting off others. The "profit" they believe they're maximizing, they're actually shrinking by cutting off the added value that others provide. The more this jealousy continues, the more harm it does to the overall value presented by the internet.

Quaero update

James Niccolai, Europe's 'Google killer' goes into hiding, InfoWorld, January 13, 2006. Excerpt:
A project to develop advanced multimedia search technologies led by France's Thomson SA has gone into hiding in the face of intense publicity this week that it is building a "Google killer" that will help to improve Europe's standing in the high-tech world. The project, called Quaero, found itself in the spotlight following remarks last week by French President Jacques Chirac in a speech laying out his agenda for France in 2006. "We must take up the challenge posed by the American giants Google and Yahoo," Chirac said, discussing the importance of technology to Europe's economy. "For that, we will launch a European search engine, Quaero." His remarks prompted some commentators to describe Quaero as Europe's next Airbus SA, the aircraft maker that competes with The Boeing Co. in a contest symbolizing the economic rivalry between Europe and the U.S. There was talk of a coming out party next month where Quaero's goals would be described in more detail, although a spokeswoman for the project said no event has been planned. The scrutiny was apparently too much for Thomson's chairman, Frank Dangeard, who imposed a "news blackout" Thursday on Thomson's media staff and ordered the project's Web site to be taken offline. "There's been a lot of noise and our chairman decided we should stop making any comments until a more official press event," said Thomson spokesman Philippe Paban...."Probably politically what's behind it is an uncomfortable feeling of having all access to knowledge and information filtered or provided through a search engine that (comes from) abroad," said Alex Waibel, director of the InterACT Center at Germany's University of Karlsruhe, which is developing Quaero's speech and language processing technologies. "Having said that," he added, "there's also a wish to make search, in a way, much richer, and in particular that involves multimedia and multilingual information."

A primer on OA journals

Mark Funk, Open Access – A Primer, undated. (Thanks to the Krafty Librarian.)

Comment. This is OA-friendly and well done. I hate to pick nits, but there are a few small ways in which Mark could improve the document. A date would help, since the OA scene is always changing. The primer focuses on OA journals and neglects OA archives, but it admits this upfront. (For a two-sided introduction, see my OA Overview.) The only factual mistake is the claim that most OA journals use the "author pays" model. The Kaufman-Wills report of October 2005 showed that only 47% of OA journals charged author-side fees, which is fewer than half and an even smaller percentage than the share of subscription-based journals charging author-side fees. In any case, this model should not be called "author pays" since, as Mark acknowledges, authors rarely pay out of pocket.

More STM publisher blogs

Richard Ackerman has taken up the challenge to find more blogs by STM publishers. His list includes Nascent, Action Potential, and Free Association, all from Nature, as well as TheScientist blog, the Library Journal TechBlog, Lost Boy by Leigh Dodds at Ingenta, and From the Hart by Factiva CEO Clare Hart.

The OCA work agenda for 2006

The Open Content Alliance has released its Work Agenda for 2006. It has launched six working groups to select works for digitization, solicit new contributing institutions, and advise on preservation, book formatting, workflow, and data transfer to the Internet Archive. Each of the groups welcomes public comments and suggestions. Excerpt on the rest of the agenda:
Our focus has been on the OCA's first year. A key milestone will be a public event we are planning for October 2006 to demonstrate the power of collaborative and open efforts to build joint collections. That focus informs the agenda for the coming year. The OCA will initially concentrate on digitally reformatted monographs and serials which represent diverse times, regions and subjects which are in the public domain or available under a Creative Commons license. In other words, the OCA is initially interested in the broad range of digitized documents that are in our libraries and archives. For an October 2006 event, we would like to focus on materials that reflect the history, people, culture, and ecology of North America. This decision is in part a practical one. It establishes essential priorities for the OCA while emphasizing collection depth as a means of encouraging the development of value-added services. It also reflects the general orientation of the initial collections that have been offered to the OCA. (At this stage, OCA is not harvesting metadata.)... The Internet Archive is continuing its role in administering the Open Content Alliance (OCA), but Rick Prelinger, interim OCA director, will unfortunately follow through with his plan to return to the world of moving images. He will help with recruiting OCA staff, and any suggestions for a great Executive Director would be most welcome. The Alfred P. Sloan Foundation has indicated that it may initially help support this position.

Librarians in the age of digitization and OA

Barbara Quint, The Home Guard, The Searcher, January 2006. Excerpt:
What would you do if you had a personal home library numbering in the thousands or even hundreds of thousands of books? Hire a librarian, right?! Well, that’s just what every Web user has as the mammoth book digitization projects by Google, the Open Content Alliance (OCA), Microsoft, Yahoo!, et al., open up their public domain collections. Project Gutenberg has offered tens of thousands of such texts for years. The U.S. Government Printing Office continues to load documents born in public domain, promising eternal archives for them. The open access movement has put masses of scholarly content, similar to what one would expect to find in an academic library’s periodical collection, into the line of sight of Yahoo! Search, Google Scholar, Scirus, and other free Web search engines. And that’s only the material that resembles the traditional content formats that people expect librarians to handle — books and magazines. Then there’s all the content out there on the open Web from authoritative or semi-authoritative or hit-or-miss Web sites. How is a user to tell the wheat from the chaff, the plums from the prunes, the true from the false? Hire an information professional, right?! Well, we know they need us, but do they?...If we information professionals, we librarians, want to serve users, we have to bring our services to where and when the user needs us....Let’s start with three basic principles and one overall goal. Principle One: Our solutions operate on Web time and in Web calendars, i.e., 24/7/365 (366 in leap year). Principle Two: Our solutions conserve our time, energy, and expertise by solving problems as Web-wide as possible. Principle Three: Moving a vendor to provide a solution constitutes a successful solution for us. Goal One: We need to get credit for our solutions, if only in order to get enough influence and resources to make more. Time to roll up our virtual sleeves and get to work.

Washington University joins the OCA

Answering the Tribune's December editorial

Mary M. Case, Health information, Chicago Tribune, December 30, 2005. A letter to the editor.
This is regarding your Dec. 19 editorial "To your e-health." The lack of competition in scientific publishing leads to extraordinarily high prices of research journals. The high prices mean that only health-care professionals and researchers affiliated with well-to-do institutions are able to obtain access to the vast array of relevant published research results. The average citizen faces significant barriers. Parents of children with rare genetic diseases who are active and engaged advocates for their children's health find themselves sneaking into research libraries, hiring students in large medical schools to go to the stacks for them or "borrowing" others' IDs and passwords to search electronic databases--all to read the results of research that is funded with their taxpayer dollars. The National Institutes of Health understands that not only does it have the responsibility of distributing billions of dollars in federal funds to support research, it is incumbent upon it to make sure that the results of that research are widely available to scientists, physicians and the public. Over decades, publishers have clearly demonstrated that their mission to disseminate information is not as important as their opportunity to make money. Through PubMed Central, the NIH is providing the trusted, integrated database that researchers have demanded and the public deserves. For the parents of sick children, the DC Principles Coalition proposal of linking to publisher Web sites, rather than depositing articles in PubMed Central, is one more version of the run-around. It is time for the publishers to stop protecting their own financial wealth and start focusing on our citizens' health.

(PS: An excellent letter! Also see my 12/19/05 response to the Tribune editorial.)


Friday, January 13, 2006

Another journal policy on NIH-funded authors

Judith Gedney Baggs, Open access, Research in Nursing and Health, January 10, 2006. An editorial. Not even an abstract is free online for non-subscribers, at least so far. Excerpt:
A new term has appeared on the horizon of nursing researchers related to publication, it is open access. What is it? Open access, in principle, means publication in a form that allows anyone to have access to the material, so that people are not constrained by having to use a library or to pay for a subscription to a journal....Why would open access be appealing? There are a number of reasons. Librarians, who are troubled by ever-increasing costs for journal subscriptions, believe this would be a good solution. People in rural areas, both in the US and abroad, who do not have access to a good library, would be able to access research that they currently cannot. The National Institutes of Health (NIH) and the U.S. Congress like the idea because it seems reasonable that research that was supported by public funding (e.g., NIH funding), should be accessible to members of the public without their having to pay an additional fee. Open access has the potential to expand the visibility and impact of research by increasing the number of people who can read about it. Why not have open access? Journal publishers make their living from subscription fees....While most peer reviewers are unpaid, there is a complex system supporting editors, editorial boards, and management of the peer review process that is costly....In light of this discussion, and to be open with our authors, I want to share Wiley’s policy related to open access. The entire policy is available [here] at the bottom of the For Authors page. Wiley, publisher of Research in Nursing & Health has agreed to deposit the article, in its final form, in PMC at the time of publication, with the stipulation that it be made available for public access 12 months later. This will be done for any article with an NIH grant mentioned in any part of the manuscript. Authors may request that it not be posted.
With regard to posting manuscripts in an internal website, Wiley’s copyright transfer agreement allows posting unfinished versions of the manuscript on such sites. The final version can be posted to an "electronic reserve room" at their own institution that is for student use. As researchers desiring to publish, it behooves us all to be aware of the policy of any journal we submit our work to with regard to open access. I understand the logic of open access, but I also appreciate the manuscript review process. I have enormous faith in and respect for the reviewers for this journal in assuring that what is published is the best, and, although the charges for some publications are outrageously high, I would not want to see an end to private and societal publication and the peer review process.

Comment. It's one thing for a journal to try to protect its revenue stream, although there's no evidence to date that OA archiving jeopardizes that revenue. But it's quite another to imply that OA is about bypassing peer review when it's about removing access barriers to peer-reviewed literature.

Freeing users to use OA literature

Valerie, Why on earth? The Return of Lady GovDocs, January 12, 2006. Excerpt:
there are some things i just don't get about my library....part of the process for 'acquiring' or providing access to 'open access' or no-fee materials is...contacting the person responsible for the site & getting written notice from them that a license isn't required. (that's right - it's not enough that they're just putting it out there...our legal counsel needs to know that they're not going to sue us for LINKING TO THE SITE.) grr.

Comment. University lawyers are paid to make sure that we err on the side of caution, which in the case of fair-use judgment calls often means that we err on the side of non-use. The hard way to fix this problem is to reform copyright law. The easy way is to make sure that OA content carries some kind of label saying that it's OA, even if the label isn't as formal as a CC license.

More on the new BMJ access policy

Fiona Godlee, Swept along by the tide, BMJ, January 14, 2006. A short elaboration on the new BMJ access policy. Excerpt:
One unwelcome change for some readers has been the closure of access to the BMJ's non-research articles, which up until now were free for the first week of publication. The change was necessary to maintain subscription revenues. The peer reviewed research articles remain open access (free from the day of publication on bmj.com as well as being on PubMed Central), and the whole journal remains free to most countries in the developing world (those on the HINARI list). Non-research articles become free to all after a year of publication. It is always hard to be asked to pay for something that has been free, but we hope that those readers who don't get the BMJ free through their institution will see enough value in it to pay £20/$37/€30 for a year's full online access.

Trove of OA from American Museum of Natural History

The American Museum of Natural History has launched an institutional repository through which it's providing OA for its past and present scientific publications. The program includes its journal, Bulletin of the American Museum of Natural History, which is now OA up to its most recent issue from 2005, and its monograph series, Anthropological Papers of the American Museum of Natural History, which is now OA up to what seems to be the most recently published volume in 2002.

DOAJ reaches milestone, plans changes

This morning the DOAJ reached the major milestone of listing 2,000 OA journals. From today's press release:
As of today the Directory of Open Access Journals (DOAJ) contains 2000 open access journals, i.e. quality controlled scientific and scholarly electronic journals that are freely available on the web. The goal of the Directory of Open Access Journals is still to increase the visibility and accessibility of open access scholarly journals, and thereby promote their increased usage and impact. The directory aims to comprehensively cover all open access scholarly journals that use an appropriate quality control system. Journals in all languages and subject areas will be included in the DOAJ. The selection criteria have been updated based on feedback from users to be more understandable.

The database records are freely available for reuse in library catalogues and other services and can be harvested by using the OAI-PMH, and thereby increase the visibility of the open access journals....New titles are added frequently and to ensure that the holding information is correct you have to update your records regularly. We also have to remove titles from DOAJ if they no longer live up to the selection criteria, e.g. during the last 6 months of 2005 50 titles were removed. We are working with publishers of hybrid journals (subscription based journals where authors/institutions for a publication charge can publish articles in open access) in order to include even these articles in the DOAJ. It is our intention to be able to inform about this in the near future.

Feedback from the community tells us that the DOAJ is an important service. In order to be able to maintain and further develop the service we have decided to launch a Donation Programme that makes it possible for all users/institutions to contribute to the continued maintenance and development of DOAJ....DOAJ is or has been supported by the Information Program of the Open Society Institute, along with SPARC (The Scholarly Publishing and Academic Resources Coalition), SPARC Europe, BIBSAM, the Royal Library of Sweden and Axiell.

IEEE provides OA to editorials and book reviews

The IEEE is now offering free online access to the "non-indexed, ancillary content (often called ephemera) from IEEE publications" such as editorials and book reviews. (Thanks to ResourceShelf.)

New business models for open content

Intelligent Television is working on the economics of open content --not at all limited to television. (Thanks to Open Business.) From the site:
With the support of the Hewlett Foundation in 2005 and 2006, Intelligent Television is bringing together business and industry leaders and culture and education stewards to explore new business collaborations between libraries, museums, archives, universities and commercial media and technology enterprises. Intelligent Television is also commissioning and publishing a working paper on the economics of open content, a vital subject; publishing a report, based on a summary of its public-private meetings and drawing on this working paper, highlighting the emerging economic relationships in this field; and developing and producing two new models for commercial-noncommercial media collaborations around cultural heritage and educational materials. Intelligent Television’s Open Production Initiatives serve as one sort of new model for the distribution of open content and open educational content in particular to the broader interested public—a model based in video and film media, produced in the best traditions of documentary television, and meant to be distributed in various complementary ways. The two Open Production Initiatives for this project are being developed in association with Columbia University Center for New Media Teaching and Learning and the Massachusetts Institute of Technology Open Courseware project.

Blogs by STM publishers

Rafael Sidi, STM Publishers and Blogs, Really Simple Sidi, January 11, 2006. (Thanks to Issues in Scholarly Communication.) Excerpt:
I am kind of surprised that we haven't seen any major STM (Elsevier, Thomson, Wiley, Springer, IEEE etc) publishing companies' senior execs embracing blogging and officially blogging. Here is what David Weinberger said in "Talking from the inside out: The rise of Employee Bloggers" (pdf) a white paper by Edelman and Intelliseek:
"Many corporations are afraid of Weblogs because they are afraid of the sound of the human voice. But that voice-the unfiltered sound of an actual person writing about what she cares about, sounding like herself-is actually the most important way of connecting with customers and partners"
If you see one STM publishing exec official blog, let me know.

(PS: Here are two. Jan Velterop is the Director of Open Access at Springer and writes a blog called The Parachute. Chris Leonard writes a blog called Computing Chris, and wrote it while he was a Publishing Editor for Elsevier, though he's now left the company. I hope his departure was not blog-related.)

Jonathan Band updates his Google Library analysis

Jonathan Band, The Google Library Project: The Copyright Debate, American Library Association Office for Information Technology Policy, January 2006. Updating and extending his earlier pieces (this from 9/05 and this from 10/05). Excerpt:
The Google Library Project has provoked newspaper editorials, public debates, and two lawsuits. Much of the press coverage, however, confuses the facts, and the opposing sides to the controversy often talk past each other without engaging directly. This paper will attempt to set forth the facts and review the arguments in a systematic manner.

(PS: This is the most comprehensive defense to date of the legality of Google's opt-out Library project.)

Technical criteria for OA repository software

Andy Powell, Notes about possible technical criteria for evaluating institutional repository (IR) software, UKOLN, December 2005. Excerpt:
This document attempts to identify some of the technical criteria that might be used to evaluate the different institutional repository (IR) software platform options, particularly in terms of the ‘machine’ interfaces that the repository offers. The list of issues is not intended to be exhaustive, and the approach is based on the assumption that other, non-technical, criteria such as usability and configurability have already received detailed consideration in other documents....Three of the most popular IR software platforms are DSpace, ePrints.org and Fedora (though there are others of course). Trying to compare these three is a little like comparing apples and oranges. DSpace is a Java-servlet application that runs under Apache Tomcat. EPrints.org is written in Perl and typically runs under Apache, using mod-perl to improve performance. Both applications provide the basis for an IR ‘out of the box’, including an end-user Web interface and so on. Both offer similar functionality to the end-user. Fedora on the other hand is more like a software toolkit. It provides the underlying IR framework, but requires custom development of a user-interface, either by layering an existing suite of user-interface tools on top of the Fedora APIs, or by building from scratch. Any decision about which IR software platform to choose must be based not only on the technical and functional capabilities of the system but also in determining best fit with organisational IT strategy and with the availability of local software development effort. However, as a way of helping with that decision making process, it may be sensible to ask the developers of these software platforms to respond to the issues raised in the sections below. Some potential questions are suggested in each section.

Teaching students that not everything is OA

Marylaine Block, Information Literacy: Food for Thought, January 13, 2006. Good teaching exercises for students who "believe everything they need to know is available for free with a simple Google search -- and, if they don't find it there, that it doesn't exist at all."

Comment. I wholeheartedly endorse these teaching exercises. But I have a two-sided response to the belief that if it isn't online [or free online], then it doesn't exist [or isn't worth reading]. On the one hand, on most topics today it's wishful thinking and may remain so for a long time. Don't let students indulge in it and don't fail to teach them what else exists and how to find it. On the other, we should work on making this belief true tomorrow, not just criticize it for being false today. Don't expect students to overlook the spectacular convenience of free online access to scholarship and information. For research authors as well as research readers, it's better to move peer-reviewed research literature into this basket than to keep blaming students for looking first in the basket closest to them.


Thursday, January 12, 2006

The UK text mining center

Julie Nightingale, Digging for data that can change our world, The Guardian, January 10, 2006. Excerpt:
Research tools able to swiftly analyse masses of data could soon bring about advances that scientists up to now can only dream of...Scientific research is being added to at an alarming rate: the Human Genome Project alone is generating enough documentation to "sink battleships". So it's not surprising that academics seeking data to support a new hypothesis are getting swamped with information overload. As data banks build up worldwide, and access gets easier through technology, it has become easier to overlook vital facts and figures that could bring about groundbreaking discoveries. The [UK] government's response has been to set up the National Centre for Text Mining, the world's first centre devoted to developing tools that can systematically analyse multiple research papers, abstracts and other documents, and then swiftly determine what they contain. Text mining uses artificial intelligence techniques to look in texts for entities (a quality or characteristic, such as a date or job title) and concepts (the relationship between two genes, for example)....Initially, the centre is focusing on bioscience and biomedical texts to meet the increasing need for automated ways to interrogate, extract and manage textual information now flooding out of large-scale bio-projects....Text-mining tools in use include Cafetiere, an information extraction tool that annotates text with information about entities and the relationships between them. Termine, a tool for handling terminology, is being re-engineered by the centre so that it can deal with large volumes of data. The centre...will act as a repository for such tools, as well as developing its own. One key task will be plugging the number of different tools for different tasks into one coherent framework. "This infrastructure will allow many people's tools to work together in a mix and match way, the mix of which will depend on the intended application," says Barker.

More on science as collateral damage in the war on music copying

Pierre Baruch, Franck Laloë, and Françoise Praderie, La science, c'est aussi de la culture, Le Monde, January 12, 2006 (in French). (Thanks to Stevan Harnad.) French copyright reforms designed to crack down on file-sharing in music will inadvertently harm science; exceptions in the existing law for research and education are not enforced; and media discussion focuses on music to the exclusion of other affected areas of culture.

Research-sharing limited less by DNA patents, more by Bayh-Dole Act

David Epstein, Good Business, Inside Higher Ed, January 12, 2006. Excerpt:
Tales about business interests in technology impeding the flow of academic information linger in the minds of many researchers like horror stories. But in most cases involving DNA patents, licensing concerns have not restricted sharing among colleagues in academe. A study conducted by LeRoy Walters, professor of bioethics at Georgetown University, and six colleagues — from academe and from private industry — found that, even when universities grant exclusive licensing rights to companies, they insist on the right to share technology for academic research. “The licensing of DNA patents by U.S. academic institutions: an empirical survey,” published this month in Nature Biotechnology, gathered data from 19 technology transfer offices at leading research institutions, some of which are among the most prolific DNA patent holders in the country. All of those respondents, according to the paper, generally retain the freedom to share technology for research purposes. The paper suggests that 1999 guidelines by the National Institutes of Health, which urge grant and contract recipients to share “research tools with all biomedical researchers who request them,” set the tone for academic cooperation, and are widely considered by academic researchers to be stipulations of receiving grants. “It was almost like a gentleman’s agreement when it became clear NIH wanted people to share,” Walters said....Rebecca Eisenberg, a patent law professor at the University of Michigan who specializes in biomedical research, said that...while some things are getting better, data hoarding between colleges and companies is still prevalent since the Bayh-Dole Act of 1980, which allowed colleges and companies to gain exclusive rights to government funded research. “It made companies more reluctant to allow universities to use information freely, because they view them as competitors,” Eisenberg said. “If you’re going to have a mixed system of public and private science, this is going to happen.”

Review of EconPapers

Péter Jacsó reviews EconPapers in the January issue of Gale's Reference Reviews. Excerpt:
[I]n my 2004 review I criticized the meager coverage by EconLit of working papers. The good news is that in 2004 and 2005, the publisher of EconLit added records for 59,000 working papers from Research Papers in Economics (RePEc), the outstanding open-access database specializing in Economics. There have been several applications developed for processing various subsets of the RePEc database. Others, such as the IDEAS database maintained by Christian Zimmermann at the Department of Economics at Connecticut University, process the whole RePEc data set. Also processing the entire data set is the EconPapers database, which I review here. EconPapers (and the RePEc source file) is one of the best examples for successful large scale, collaborative projects among scientists, researchers and their institutions. It has close to 358,000 records for working papers (170,000 items from 1,500 series), journal articles (185,000 items from more than 400 journals), books (600), book chapters (1,020) and computer programs (1,300). Although EconPapers has only about half as many records as EconLit, it makes up for it by the rich content of the individual records. More than one-third of the journal article records and more than two-thirds of the working paper records have abstracts. The majority of the working papers are linked and available online free of charge. Seventy-six percent of the journal article records have links to the full text of the source documents. 
Although these are not open-access documents, many users will have free access to them by virtue of subscriptions by their libraries....EconPapers is yet another worthy and impressive implementation of the excellent Research Papers in Economics (RePEc) database, proving the viability of efficient collaboration among researchers in providing open access to the full-text, or at least to the rich metadata, of their papers to users who otherwise would not have access to traditional indexing/abstracting tools, let alone to full-text journal archives.

Update on Gallica

Nate Anderson, France pushes creation of European Google killer, ars technica, January 11, 2006. Excerpt:
[T]he French have organized several initiatives designed to one-up the Yanks. You'll remember, of course, the digitization project undertaken by the French National Library which was designed to counter Google's own plan to index millions of English-language books. The project, dubbed Gallica, is great if you want to access manuscript images of Proust's À la recherche du temps perdu from the comfort of your living room, but not for much else. Gallica has only 80,000 images online so far, and none of these are searchable by content. While the idea has merit and may turn into an incredible resource, its current incarnation leaves much to be desired and has basically failed to enhance Europe's reputation as a digital pioneer.

Update (1/13/06). Klaus Graf writes to say that Anderson is wrong on every point. According to Gallica's page of Documents Available Online, "Today, this digital library includes more than 75,000 volumes of digitized texts, 70,000 still images, and 30 hours of sound recordings....About 1,250 works in text format have been placed online...." And from a January 10 story in PC Inpact (in French), "It is already estimated that, as part of a mass digitization effort, between 50,000 and 60,000 works will have been processed by the end of 2006, according to Jean-Noël Jeanneney, President of the Bibliothèque nationale de France." (Thanks, Klaus.)

Advice for the new Blackwell journals

John Blossom, Journal Publishers Huddle Under the Wings of Blackwell, ContentBlogger, January 11, 2006. Excerpt:
It's not the best of times for independent scholarly journal publishers, a fact that keeps them moving towards distributors with more marketing and distribution savvy. Blackwell has announced that it will begin 2006 with 39 new publishing partnerships and 59 journal titles added to its [list] of more than 600 society publications. Not a bad short-term solution for journals challenged by open access publishing and lacking the marketing muscle to distinguish themselves via online search solutions....Blackwell offers a quality publishing solution for journals that provides cost-effective technology and marketing infrastructure that can help them to be more effective independent publishers. But in spite of its Synergy online search interface it's still a heavily print-oriented marketing solution....It's important for independent publishers to consider their options for improving online and print marketing through partnership very carefully for options that will carry them aggressively into online revenue streams as more of their audiences make the shift to online as a primary consumption channel. For many publishers the move to Blackwell will be a positive experience in the short run, but it's a move that won't eliminate [the need] to consider long-term marketing solutions carefully.

Criteria for OA government info

Kristin R. Eschenfelder and Clark A. Miller, What Public Information Should Government Agencies Publish? A Comparison of Controversial Web-Based Government Information, a preprint self-archived January 11, 2006.
Abstract: This paper develops a framework to assess the public information provided on program level government agency Websites. The framework incorporates three views of government information obligations stemming from different assumptions about citizen roles in a democracy: the private citizen view, the attentive citizen view, and the deliberative citizen view. The framework is employed to assess state Websites containing controversial policy information about chronic wasting disease, a disease affecting deer and elk in numerous U.S. states and Canada. Using the framework as a guide, the paper considers what information agencies should provide given the three different views of government information obligations. The paper then outlines the costs and benefits of fulfilling each view of government information obligations including issues of limited resources, perceived openness and credibility, press coverage, and policy making control.

Chile asks WIPO to protect the public domain

William New, Chile Urges WIPO To Act To Protect Public Domain, IPWatch, January 12, 2006. Excerpt:
The government of Chile this week submitted a proposal to an upcoming meeting of the World Intellectual Property Organisation’s new committee on the development agenda that calls for positive steps to protect information in the public domain. The first meeting of the new Provisional Committee on Proposals Related to a WIPO Development Agenda will be held in Geneva on 20-24 February. The committee created by the WIPO General Assembly in October reflects a compromise extension of discussions over a proposal to expand WIPO’s focus on developing countries’ needs (IPW, 3 October 2005). The original development agenda proposal was put forward at the 2004 General Assembly by Argentina and Brazil, supported by 12 other Friends of Development. Subsequent proposals have followed. In its proposal, Chile highlights the benefits to society of a rich base of freely available public information. The public domain is of “crucial importance” to researchers, academics, educators, artists, authors and enterprises, as well as all varieties of institutions, it said. Developing countries in particular have raised concern that WIPO’s emphasis on the protection of rights, rather than the protection of public knowledge, may reduce their ability to innovate since most rights belong to developed countries. The proposal, obtained by Intellectual Property Watch, mentions a series of previous documents negotiated by governments in various bodies such as the United Nations Educational, Scientific and Cultural Organisation, and the UN World Summit on the Information Society. Chile calls for an analysis of the implications and benefits of a substantive and accessible public domain, and elaboration of proposals and models for the protection and identification of and access to the contents of the public domain. It further calls for protection of the public domain to be considered in the making of policy at WIPO.

Xerox joins the OCA


Wednesday, January 11, 2006

Authors protest OA book plan at Memorial University

Memorial University of Newfoundland is planning to digitize the works in its library and put them on the web for free online access. When the works are under copyright, it will proceed only with the copyright holder's consent. Canadian authors are protesting anyway. Excerpt from a CBC News story yesterday:
Newfoundland and Labrador writers are fighting a plan to make their work available on the internet for free. Memorial University wants to make much of its library holdings available to the public over the web. However, the association that represents writers in Newfoundland and Labrador says the program could make it harder for its members to sell books. "It's very simple. If you make a work you own it and you should be paid – you should be remunerated for it," said Allison Dyer, president of the Writers' Alliance of Newfoundland and Labrador. Memorial University says the idea behind the project is to make Newfoundland's culture and heritage freely available to everyone. Richard Ellis, Memorial's university librarian, says the university will not post anything on the web without first negotiating for permission. "It ought not to interfere with the livelihood of those people who make a living from publishing whether it be the publishers or the authors," Ellis said. The plan, Ellis noted, is only in preliminary stages. Memorial hopes to begin by posting the library's Newfoundland Studies collection.

Comment. There has to be more to this controversy than we've heard so far. Do the authors understand that the university will respect the decisions of copyright holders? Do they believe the university is making this assurance in bad faith? Have they transferred copyright to publishers and fear that publishers will consent against author wishes? Are they trying to block OA to books in the public domain on a theory that copyright is eternal? I'll post more as I learn more.

Bibliography of OA in Hungary

Tibor Koltay and Erika Tóth have compiled a Bibliography of Open Access in Hungary (in Hungarian). From Koltay's announcement:
Due to the relatively low number of original articles, the bibliography also contains data of abstracts and news-items related to OA, published in Hungarian library and information science periodicals. The bibliography will be constantly updated. Its goal is to raise awareness of OA among Hungarian information professionals and through them among researchers.

Five new OA repositories in Australia

Five Australian universities have launched OA repositories, all using the ProQuest/Bepress DigitalCommons service. (Thanks to Arthur Sale.)

21 more Oxford Open journals from OUP

Oxford University Press has added 21 journals to its Oxford Open program, exactly doubling the number of participating journals.

Library automation tools and IRs

Mark Chillingworth, Library automation market is tracking big IT vendors, Information World Review, January 11, 2006. Excerpt:
Pressure from university IT departments is driving the development and adoption of library automation (LA) tools, and fuelling the current spate of mergers and acquisitions in the LA market....Pressures closer to home are also driving the adoption and development of LA tools. “There will be more integration with e-learning and institutional repositories (IR), which will bring in a lot more publishing and workflow technologies to the library,” said [Rein van Charldorp, managing director of OCLC PICA]. Institutional repositories pose a challenge, he added. “Building the IR is easy, getting the information in and out is much harder. To keep up with this pace you have to invest,” he said.

Comment. I don't see the problem getting content out of an OA repository, if this means finding and downloading it rather than removing it. What will library automation tools do to make discovery and downloading easier? As for getting material in, will the tools streamline or even automate the deposit process? Will they change the culture of inertia? I really can't tell what van Charldorp has in mind.

More on the CURES Act

CURES Act Would Push NIH, Library Journal, January 11, 2006. A short, unsigned note.
The battle for free public access to government-funded research may heat up after Sens. Joe Lieberman (D-CT) and Thad Cochran (R-MS) introduced legislation to establish the American Center for Cures within the National Institutes of Health (NIH). Included in that bill, known as the CURES act, is an aggressive provision to help make taxpayer-funded biomedical research available to all potential users. Although Congress directed the NIH to draft a policy to achieve that goal in 2005, what resulted was a weak policy that simply requested NIH-funded research be deposited into PubMed Central within a year after publication. A provision of the CURES Act, however, if passed, would require research funded by a number of government agencies be made available within six months. In addition, the law would set penalties for non-compliance. SPARC director Heather Joseph said that library groups were "gratified" to see that Congress took universal access to research into account.

Comment. Two quick notes: (1) Why is it "aggressive" to give taxpayers access to the research for which they've already paid? The six-month embargo is a compromise with the public interest that makes the policy even less aggressive. (2) Congress directed the NIH to adopt an OA mandate in mid-2004. See my procedural history of the NIH policy.

Chinese ban on Wikipedia in its third month

Geoffrey York, Chinese ban on Wikipedia prevents research, users say, Globe and Mail, January 10, 2006. Excerpt:
Chinese students and intellectuals are expressing outrage at Beijing's decision to prohibit access to Wikipedia, the fast-growing on-line encyclopedia that has become a basic resource for many in China. Wikipedia, which offers more than 2.2 million articles in 100 languages, has emerged as an important source of scholarly knowledge in China and many other countries. But its stubborn neutrality and independence on political issues such as Tibet and Taiwan has repeatedly drawn the wrath of the Communist authorities. The latest blocking of the website, the third shutdown of the site in China in the past two years, has now continued for more than 10 weeks [starting October 19, 2005] without any explanation and without any indication whether the ban is temporary or permanent. "What idiots these officials are!" said one message on a Chinese site. "They are killing our culture with censorship."

Review of Google Scholar

Rita Vine, Google Scholar, Journal of the Medical Library Association, January 2006. A review. Excerpt:
Although Google Scholar covers a great range of topical areas, it appears to be strongest in the sciences, particularly medicine, and secondarily in the social sciences. The company claims to have full-text content from all major publishers except Elsevier and the American Chemical Society, as well as hosting services such as Highwire and Ingenta. Much of Google Scholar's index derives from a crawl of full-text journal content provided by both commercial and open source publishers. Specialized bibliographic databases like OCLC's Open WorldCat and the National Library of Medicine's PubMed are also crawled. Since 2003, Google has entered into numerous individual agreements with publishers to index full-text content not otherwise accessible via the open Web. Although Google does not divulge the number or names of publishers that have entered into crawling or indexing agreements with the company, it is easy to see why publishers would be eager to boost their content's visibility through a powerhouse like Google....The inadequacies of Google Scholar have already been well documented in reviews. These reviews focused on three major weaknesses of the tool: lack of sufficient advanced search features, lack of transparency of the database content, and uneven coverage of the database. Henderson's review of Google Scholar demonstrated its significant limitations for clinician use. Tests conducted by Jacso showed that Google Scholar typically crawled only a subset of the full available content of individual journals or databases. In February 2005, Vine discovered that Google Scholar was almost a full year behind indexing PubMed records and concluded that “no serious researcher interested in current medical information or practice excellence should rely on Google Scholar for up to date information”. With a simple, basic search interface and only minimal advanced search features, Google Scholar lacks almost every important feature of MEDLINE. 
It does not map to Medical Subject Headings (MeSH); does not permit nested Boolean searching; lacks essential features like explosions, subheadings, or publication-type limits; and offers searchers no ability to benefit from the extraordinary indexing that the National Library of Medicine provides. Google Scholar's closest free Web competitor, the quasi-scientific search tool Scirus from Elsevier, crawls a defined subset of free Web pages plus full-text content from Elsevier journals, patents, preprints, and more. Unlike Google Scholar, the Scirus project team is quick, even eager, to disclose the content of the Scirus database and regularly feeds new partner content into the database in its “About Us” section....Google Scholar has some great features. Its “cited by ×” feature, which links a result to other items in the Google Scholar database that reference the item, is a quick and fast way to find citations. Although it is not comprehensive, no other citation-linking tool in the marketplace is....Cyber sleuths can also use Google Scholar to find a free Web version of an article that might have started out behind a publisher's authentication firewall but has been downloaded by someone and then put on a public Web server.

Update. Dean Giustini has written some comments on Vine's review.

ALPSP survey on self-archiving and journal cancellations

The ALPSP has launched a Library Survey on Self-Archiving and Journal Cancellation. From the introduction:
As you may be aware, some publishers are becoming concerned that if self-archiving of postprints, or even preprints, of journal articles becomes sufficiently widespread, this may lead to a decline in usage at journals’ own websites, and that this in turn may lead to cancellations. In order to understand whether or not our fears are well-founded, we would like to understand more about the process by which you make the decision to cancel journals, what the crucial factors are, and how you would rank them in importance, both now and in the future.

OA advocate Les Carr has criticized the survey for question-begging wording on some questions, confusing structure, and a scope limited to librarians, who are only part of the journal-cancellation process. Today I realized that I'd been following the controversy but hadn't yet blogged the survey itself. Sorry for the delay.

What is commercial use?

Mia Garlick, Discussion Draft - NonCommercial Guidelines, Creative Commons blog, January 10, 2006. Those who use CC licenses that prohibit commercial use will be interested in the new draft guidelines on what counts as commercial and non-commercial use.

OA law review from Utrecht

The Utrecht Law Review is a new peer-reviewed, open-access journal now in its second issue. Excerpt from today's press release:
Utrecht Law Review is an Open Access journal offering an international platform for cross-border legal research. It is a good practice of electronic publishing that has been developed by the DARE project [PS: in English] ‘Truth or DARE’ [PS: Dutch only], to show legal scholars the added value to deposit publications in digital repositories. The aim of the ‘Truth or DARE’ project was to establish a number of good electronic publishing practices for Dutch legal researchers. Specifically entailed are publications by legal scholars in digital repositories, resulting in added value to legal-academic communications as well as optimal user-friendliness for academics. The project focused mainly on added scholarly value, communication / information, the supply process, copyright and visibility and was intended to find the most effective method of creating commitment among the target group of authors....The Editorial Board of the Utrecht Law Review has committed itself until December 2007 to publish in Open Access and deposit the publications in the digital repository of the Utrecht University. In the meantime a sustainable business model for the journal is being investigated.

New members for the OCA

Simon Fraser University, the University of North Carolina at Chapel Hill and its School of Information and Library Science, and Washington University have joined the Open Content Alliance.

Open source, open content, open access in developing countries

Segun Oni and Bolaji Onibudo, Open source: The future of IT in Nigeria, Vanguard, January 11, 2006. Excerpt:
Nigeria has to move away from the Get Rich at the Expense of the Poor syndrome that has plagued many western corporations who continually siphon wealth from Nigeria to their countries. We want to promote home grown software built by Nigerians for Nigerians with wealth creation remaining within the shores of Nigeria....But, as William Gibson reminds us, the future is here, it’s just not well-distributed yet. The answer to our problems is not to redistribute wealth, it’s to redistribute the future. In very practical terms, that’s what open source is about....When intellectual problems become distributed, the search for solutions becomes collaborative and the research agenda is driven not by multinational shareholders but by the passions of the participants, you get not just better results, you get different results. The South-South scientific coalition is a sign that a few countries at least - namely Brazil, South Africa and India - get this. They’re working together, trying to educate more local scientists and allying themselves with open and non-commercial approaches, like the open access movement in scientific publishing (which demands that scientific papers be made freely available online, not published in expensive, limited-circulation hardcopy journals), precisely because they recognize that this makes possible a different kind of science. It makes possible a scientific research agenda based on what their people need, not on what will make Monsanto the most money....[T]here’s something very wrong with a world in which crops, energy systems, essential drugs, access to information, methods for providing clean water, and so on are priced outside the reach of billions simply because of the legacy of past development patterns. They are proprietary knowledge. The greatest strength of the open source model is that it is explicitly non-proprietary. 
It is a direct antidote to legacy ownership of key ideas, because the core concept is that no one should own core concepts. No corporation, no nation, no person can claim ownership over the core concepts in an open source project in order to demand royalties or restrict its use. No one using OS-built medicines, for example, would ever die of AIDS because some Big Pharma executive in New York or Berlin decided that distributing cheap drugs was too great a risk to their patents.

Tuesday, January 10, 2006