Open Access News

News from the open access movement


Saturday, October 08, 2005

Commercial v. non-commercial search engines

Bettina Fabos, The Commercial Search Engine Industry and Alternatives to the Oligopoly, MOKK (Media Research Center at the Department of Sociology and Communications of the Budapest University of Technology and Economics), n.d.

Abstract: This essay details the search engine industry’s transformation into an advertising oligopoly. It discusses how librarians, educators, archivists, activists, and citizens, many of whom are the guardians of indispensable noncommercial websites and portals, can band together against a sea of advertising interests and powerful and increasingly overwhelming online marketing strategies.

From the body of the paper:

Google, Yahoo and MSN...are skewing the nature of all online information in favor of commercial enterprise, and will have enormous impact and power over the direction of information access and, indeed, democratic discourse, in the years to come....Despite the considerable implications of search engine commercialization for knowledge access, the topic has not gained much attention in academic and library spheres....If we want to go beyond a mainstream, commercialized, sponsored online information repository we need to turn to a different structure that offers a more inclusive, democratic information environment. As it turns out, there is hope (although it comes with acronyms that are a lot harder to remember than catchy commercial search engine names like Yahoo! and Google). Numerous computer scientists and digital librarians have been developing open source technology, such as the Open Access Initiative for Metadata Harvesting Protocol (OAI-PMH), iVia, and Data Fountains, that offer (and enhance) a user’s ability to search across multiple (that is, thousands of) subject gateways. These digital repository harvesting services imitate the functions and interface of a search engine, but they can be moulded to search in specific academic areas. In other words, one can create completely noncommercial searching environments that offer the scope and feel of a search engine.
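
To make the harvesting idea concrete, here is a minimal sketch of how a harvester might build OAI-PMH requests for several repositories. The verb `ListRecords` and the arguments `metadataPrefix`, `set`, and `from` come from the OAI-PMH specification. The repository URLs and the function name `list_records_url` are hypothetical and used only for illustration. This is not how iVia or Data Fountains are implemented.

```python
from urllib.parse import urlencode

def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None, from_date=None):
    """Build an OAI-PMH ListRecords request URL for one repository."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        # Optional: restrict the harvest to one set (e.g. a subject area).
        params["set"] = set_spec
    if from_date:
        # Optional: only records added or changed since this date.
        params["from"] = from_date
    return f"{base_url}?{urlencode(params)}"

# Hypothetical endpoints; a real harvester would take these from a registry.
REPOSITORIES = [
    "https://repo-a.example.org/oai",
    "https://repo-b.example.org/oai",
]

urls = [list_records_url(repo, from_date="2005-10-01") for repo in REPOSITORIES]
for url in urls:
    print(url)
```

A harvester would fetch each URL, parse the returned XML metadata records, and add them to one local index. Searching that index is what gives users a single search box over many noncommercial collections.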

Making information scarce instead of open

David Bollier, Herman Daly on the Commonwealth of Nature and Knowledge, On the Commons, October 3, 2005. Excerpt:
Why do economists insist on treating information and creative works as scarce – while making the opposite mistake with respect to the depletable services of nature, which they treat as limitless by pricing at zero? Last week, in the inaugural presentation of the new Forum on Society Wealth lecture series at UMass, Amherst, economist Herman Daly tried to shed some light on these paradoxes....In the information commons...intellectual property law is used to make an essentially limitless resource – knowledge – scarce. The over-propertization of knowledge can have lots of unfortunate effects, from preventing universal access and benefit to inhibiting the development of new knowledge. Economists see the imposition of artificial scarcity on knowledge (via copyright and trademark law) as a necessary condition for enabling market exchange. But the upshot, said Daly, is that “we mistakenly think that scarcity increases public wealth.” In fact, its chief result is the creation of private wealth.

Daly lamented the fact that economics deals mostly with the allocation of a resource among competing users, but fails to deal with issues of scale and just distribution. Economists don’t really address the appropriate physical size of the economy relative to the ecosystem – and thus they ignore the environmental sustainability of the economy. Similarly, economists don’t trouble themselves with the issue of who gets property rights in the first place -- and therefore, whether the distribution of market results are legitimate and just. Neither of these problems – sustainability and just distribution – can be solved from within the market paradigm, Daly warned. They require pressure from outside of the market, from civil society and governments.

New OA journal on communication and action

Systems, Signs & Actions: An International Journal on Communication, Information Technology and Work is a new peer-reviewed, open-access journal sponsored by Linköpings and Aarhus Universities. For the topic and scope of the journal, see the editorial by Peter Bøgh Andersen and Göran Goldkuhl in the inaugural issue. (Thanks to Marcus Zillman.)

Comparing Google Scholar, PubMed, and Scirus

Dean Giustini and Eugene Barsky, A look at Google Scholar, PubMed, and Scirus: comparisons and recommendations, a preprint forthcoming from the Journal of the Canadian Health Libraries Association. Excerpt:
In summary, information professionals have no choice but to recommend Google Scholar under certain conditions and caveats. Librarians should be prepared to teach GS and PubMed side by side and answer questions about it, especially how it compares to commercial tools like OVID. Clearly, GS provides an easy means to access the health literature. Health librarians should not dismiss it outright, especially for simple browsing, known-item searching, and linking to free materials on the open Web. Where literature reviews are required, i.e., grants, clinical trials, or systematic reviews, health librarians will continue to recommend MEDLINE, Cochrane (with Google for grey literature), and other trusted sources. Finally, clinical queries must be answered by replacing requests in context. Health professionals already search Google and will continue to use it (responsibly, one hopes) to satisfy their basic information needs.

New OA journal of palliative care

The Indian Journal of Palliative Care is a new peer-reviewed, open-access journal from MedKnow Publications. (Thanks to D.K. Sahu.) From the announcement:
The Indian Journal of Palliative Care is an interdisciplinary, peer reviewed journal published biannually. The journal welcomes contributions on clinical research, psycho social, ethical and spiritual issues related to palliative care. The website of the journal allows immediate open access to articles published in the journal. The journal is published by Medknow Publications and similar to all other journals published by Medknow the open access is without article submission, processing or publication fee. Articles could be submitted in the following sections: original articles, review articles, clinical guidelines, case reports, case discussions, narratives, reports on important meetings, book reviews, short reports and letters to the editor. Electronic submission of articles via email is welcomed.

Friday, October 07, 2005

New & forthcoming journals from BMC

Journal of Ethnobiology and Ethnomedicine launched in August at BioMed Central. It is not yet mirrored at PubMed Central, but will be shortly.

Journal of Ethnobiology and Ethnomedicine - Fulltext v1+ (2005+); ISSN: 1746-4269.

In addition, I've listed eight forthcoming titles from BioMed Central:

Biological Knowledge; ISSN: 1745-4743.

Biology Direct; ISSN: 1745-6150.

Diagnostic Pathology; ISSN: 1746-1596.

International Breastfeeding Journal; ISSN: 1746-4358.

Journal of Biomedical Discovery and Collaboration; ISSN: 1747-5333.

Philosophy, Ethics, and Humanities in Medicine; ISSN: 1747-5341.

Substance Abuse Treatment, Prevention, and Policy; ISSN: 1747-597X.

Synthetic and Systems Biology; ISSN: 1747-8332.

Two geology journals providing free access

Bulletin of Geosciences, from the Czech Geological Survey, and Revista Mexicana de Ciencias Geologicas, hosted by Universidad Nacional Autonoma de Mexico (UNAM), provide free online access to current geological research. RMCG has a focus on the IberoAmerican region. Bulletin of Geosciences focuses on the geology of the Czech Republic.

Bulletin of Geosciences - Fulltext v77+ (2002+); ISSN: 1214-1119.

Revista Mexicana de Ciencias Geologicas - Fulltext v3(2)+ (1979+); ISSN: 1026-8774.

SciELO releases two more Open Access journals

SciELO (Scientific Electronic Library Online) continues to promote the dissemination of Latin American research. The newest print journals to launch Open Access editions, both previously of limited circulation, are Tempo Social and Agora: Estudos em Teoria Psicanalitica.

Tempo Social - Fulltext v16(2)+ (November 2004+); Print ISSN 0103-2070.

Agora: Estudos em Teoria Psicanalitica - Fulltext v7(2)+ (July/December 2004+); Print ISSN: 1516-1498.

Useful data from OA repositories

Two recent sources have separately described a promising way to use OpenURL link resolvers and OA repositories to help libraries generate data on the journals that their users cite, search, read, and publish in. The first is the August 22 open letter from several UK scholars (Tim Berners-Lee et al.) rebutting the ALPSP objections to the draft RCUK policy. The second is an October 7 AmSci posting by Tim Brody.

We know that OA archiving can improve an article's citation impact, which can improve the impact factor of the publishing journal. What's new is that if we draw the right kinds of data from OA repository traffic, then libraries will be able to make more intelligent subscription decisions, not just more intelligent cancellation decisions. Journals can help generate these benefits by encouraging authors to deposit their work in OA repositories, not just on personal web sites. Finally, the data could help measure impact at the article level, which helps authors and readers more than measurements at the journal level.

Excerpt from the open letter:

"[P]ublishers and institutional repositories can and will easily work out a collaborative system of pooled usage statistics, all credited to the publisher's official version....The easiest thing in the world for Institutional Repositories (IRs) to provide to publishers (along with the link from the self-archived supplement in the IR to the official journal version on the publisher's website -- something that is already dictated by good scholarly practice) is the IR download statistics for the self-archived version of each article. These can be pooled with the download statistics for the official journal version and all of it (rightly) credited to the article itself....All these statistics and benefits are there to be shared between publishers, librarians and research institutions in a cooperative, collaborative atmosphere that welcomes the benefits of self-archiving to research and that works to establish a system that shares them among the interested parties. Collaboration on the sharing of the benefits of self-archiving is what learned societies should be setting up meetings to do -- rather than just trying to delay and oppose what is so obviously a substantial and certain benefit to research, researchers, their institutions and funders, as well as a considerable potential benefit to journals, publishers and libraries....Librarians' decisions about which journals to renew or cancel take into account a variety of comparative measures, citation statistics being one of them (footnote 2). Self-archiving has now been analysed extensively and shown to increase journal article citations substantially in field after field; so journals carrying self-archived articles will have higher impact factors, and will hence perform better under this measure in competing for their share of libraries' serials budgets...

Excerpt from Tim Brody's posting:

I can have a lot of fun with hypothetical scenarios, in particular how open access could provide more in depth usage and impact analytical tools for librarians. The situation now is that journal usage metrics are being used because it provides easy access to comparative institution-specific information for user's online 'reading'. (Although is there evidence to show cancellation based on usage stats?) So let us suppose that an institution's authors are self-archiving 90%+ their own material into the institutional archive. The librarian can then discover which journals their authors are : (1) publishing in, (2) editing, (3) citing. An OpenURL resolver is used that points users automatically at the journal version (where subscribed), or an author self-archived version (where available). That resolver will provide the institutional manager with information on usage, journal interconnectedness etc. The resolver naturally aggregates open access and subscription content usage. Perhaps attempts by users to access an unsubscribed journal will be drawn to the librarian's attention. The institutional archive records logs of who accesses its content, and provides those logs to a 3rd party service that aggregates data across commercial and open access sources. The service then provides summary reports, as a comparative 'web impact' metric for journals (/authors/institutions/publishers).
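
Brody's scenario can be sketched in a few lines. This is a hypothetical illustration, not a real OpenURL resolver: the function names, the record fields, and the example URLs are all assumptions. It shows the routing rule he describes (journal version where subscribed, self-archived version where available) and the usage log that falls out of it.

```python
from collections import Counter

def resolve(article, subscribed_journals, oa_copies, usage_log):
    """Return the best link for an article and record the outcome in usage_log."""
    journal = article["journal"]
    if journal in subscribed_journals:
        usage_log["subscribed", journal] += 1
        return article["publisher_url"]
    oa_url = oa_copies.get(article["id"])
    if oa_url is not None:
        usage_log["open_access", journal] += 1
        return oa_url
    # No access at all: the kind of request a librarian may want to see.
    usage_log["unavailable", journal] += 1
    return None

def unmet_demand(usage_log):
    """List journals that users asked for but could not reach."""
    return sorted(j for (outcome, j), n in usage_log.items() if outcome == "unavailable")

# Hypothetical data for one institution.
usage_log = Counter()
subscribed = {"Journal A"}
oa_copies = {"b1": "https://eprints.example.org/b1"}
articles = [
    {"id": "a1", "journal": "Journal A", "publisher_url": "https://publisher.example.org/a1"},
    {"id": "b1", "journal": "Journal B", "publisher_url": "https://publisher.example.org/b1"},
    {"id": "c1", "journal": "Journal C", "publisher_url": "https://publisher.example.org/c1"},
]

links = [resolve(a, subscribed, oa_copies, usage_log) for a in articles]
print(links)
print(unmet_demand(usage_log))
```

Because every request passes through the same function, the log covers subscription and open-access usage together, which is the aggregation Brody has in mind.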

The last-mile problem for knowledge

AskPhilosophers is a new Q&A site for philosophy. Readers send in philosophical questions, a moderator screens out the cranks and pornographers, and philosophers from a hand-picked panel answer them. The questions are sorted into 20 categories or sub-topics, which users can follow individually or in a mix. All the content is OA. The site even has an RSS feed.

Comment. It's simple but it works. I like it, and not just because it's in my own field. I like it because it's the most promising format I've seen for solving what could be called the last-mile problem for knowledge. Lots of dedicated researchers do lots of difficult research, which is then written up, vetted, published, and disseminated. This knowledge makes it from the ether to the mind of the researcher, then to paper or disk, then to a publisher, and then to the library or internet. But it very rarely jumps the last gap to the curious person who wants to know what it's all about. There are lots of reasons, including the cost of access and the scarcity of time. But an important part of the problem is that this knowledge is usually intelligible only to specialists, excluding both lay readers and professional researchers from other fields. What we badly need is a service to connect these large bodies of knowledge to curious minds --to solve the last-mile problem. Research publications typically leave the gap unbridged. Listservs either let in too much spam at the query end or let out too much self-righteous gas at the answer end. General Q&A sites don't attract enough knowledgeable people as question-answerers. And lay summaries of cutting-edge research don't necessarily answer the questions that curious minds want to ask. What I like about AskPhilosophers is that it's question-driven, professionally staffed, and moderates both the input and the output. Serious questions will get through and, when they do, they will receive serious answers. I'd love to see every discipline set up something similar. Right now, I'd spend a lot of my time at AskGeologists, AskMathematicians, and AskNutritionists. Scholars: start your engines.

New OA research repository at University College London

From a University College London press release, dated today:
A new web database for entering and viewing UCL academic staff research publications details online is now available to view on the UCL website. The database [is] named MyOPIA (MySQL Online Publications Index Administration)....Research publications data has been formally collated across UCL since 1997 and MyOPIA allows both academic staff publications from before this date and those published while employed outside of UCL to be added to the system. This will allow academic staff to have a complete personal listing of all their publications on MyOPIA, which is also accessible to the public....MyOPIA is also connected to UCL’s Eprints system, managed by UCL Library Services, which allows researchers to submit full-text copies of their research papers to an online archive. Once submitted to Eprints, the papers will automatically appear in the research publications database.

More on the threat to OA in Canada

Nadya Bell, Amendments to copyright law could cost universities, Manitoban Online, October 7, 2005. Excerpt:
Universities could have to pay for students and professors to use free Internet sites if an amendment to the copyright act passes in the House of Commons. Bill C-60 is intended to adapt Canadian copyright to the Internet and regulate things like music sharing and website use. Under the proposed bill, Internet services that would be free to use at home would require copyright royalties to be used in the classroom or for homework. Opposition MPs and education advocates are calling on the government to allow schoolteachers and professors an exemption from copyright restrictions....“We’re already paying a lot of money to copyright,” said [Steve] Wills [manager of legal affairs for the Association of Universities and Colleges of Canada]. “Adding to the fees would be particularly galling in the case of publicly-available material on the Internet.”...Wills said in one case a professor was quoted $66 per minute for a video clip he wished to use in class, but under American law a professor would have free access to the same material.

More on the Rowman & Littlefield opt-out

Brad Hill has an interesting new detail on the Rowman & Littlefield opt-out decision: the publisher "is withdrawing from Google Print For Publishers, over the Google Print For Libraries policy."

(PS: Remember that Google Print for Publishers seeks publisher permission first while Google Print for Libraries doesn't.)

OA digital texts talking to each other

Gregory Crane, Reading in the Age of Google, Humanities, September/October 2005. Crane is the editor in chief of the OA Perseus Digital Library. Excerpt:
[I]n the Phaedrus, Plato has Socrates commenting that written words are like statues that may imitate life but have no life of their own. He stresses the inert quality of written language: "Writing says one single thing --whatever it may be-- the very same thing forever."...Twenty years ago, Marvin Minsky, a proponent of artificial intelligence, responded to this ancient challenge and imagined a time when people could not imagine a library in which the books did not talk to each other....Google, Amazon, and other companies mine data, analyzing our queries and making inferences about our goals, using as much information as they have to help us spend money. Many of these same techniques can, however, help us learn. In the Perseus Digital Library, we already have the beginnings of new reading environments that help us understand complex documents in a variety of languages. For people studying Plato, for example, the Perseus Digital Library can assemble a range of materials relevant to the Phaedrus, including a Greek text, English translation, and a list of documents that comment on the opening section of the dialogue. The reader can customize the display by explicitly asking for original source text in Greek, choosing a translation font, and making other decisions about what should be displayed. The reader can ask a question about a particular Greek word, and the system can personalize its response: it recognizes that the reader is looking at a dialogue of Plato and highlights all citations to Plato in the online lexicon entry. These electronic actions are simple in nature but profound in their implications. Many different books are, in effect, having a conversation among themselves and deciding how best to serve the human reader.

Jacob Neusner opts out

Vincent Kiernan, Academic Press and Prolific Author Tell Google to Remove Their Books From Its Scanning Project, Chronicle of Higher Education, October 7, 2005 (accessible only to subscribers). Excerpt:
A well-known scholar and his publisher have demanded that Google withdraw his books from the digital archive that the Internet-search company is compiling from the holdings of five university and research libraries. "The basic problem is copyright violation," said Jacob Neusner, a research professor of theology at Bard College, who has written [or edited] more than 900 books...In an interview on Thursday, Mr. Neusner said that he had asked Google to remove his works from its Google Library project, but Google had insisted that he fill out a separate form for each of his books. That was wrong, said Mr. Neusner, because under copyright law it is Google's responsibility to seek permission to use a copyrighted work. So the Rowman & Littlefield Publishing Group, which has issued many of his books, took up the banner and has insisted that all of its works be removed from Google Library as well. Jed Lyons, president of Rowman & Littlefield, said that his company had not requested a royalty from Google for using the works. Nor will he. "We think it's unfair and arrogant and disrespectful of publishers' and authors' rights, and we don't want to do business with an organization that thumbs its nose at publishers and authors," he said. But Google, he said, is seeking to change his mind about withdrawing the works. "They're trying to convince us it's a mistake."

(PS: It's Jacob Neusner's loss. We know from the Authors Guild lawsuit that he's not the only author who would rather dig in his heels than find readers and buyers. If any of our searches would have pointed to his work, then it's also our loss.)

More on the OCA

Max Chafkin, Yahoo Takes Friendly Approach to Book Digitization, Sidesteps Google Uproar, The Book Standard, October 06, 2005. Excerpt:
The consortium, says OCA founder Brewster Kahle, could eventually scan millions of books. “We’ve been trying to digitize materials for years,” said Kahle, adding that publishers will eventually be invited to submit copyrighted works, which will be made available on a more limited basis. “The breakthrough is that we are doing this in the open --everybody’s shoulders drop, the lawyers go back to their cubicles, and we’re free to get things done.” The other breakthrough, say librarians for both universities, is scale. Prior to linking up with Kahle, the UC library system estimated the cost of scanning and archiving documents at $20 per page, while the University of Toronto, which has already scanned small portions of its collection, estimates that cost at $1. By contrast, Kahle’s technology costs only 10 cents per page. “It’s the production level,” says University of Toronto chief librarian Carole Moore. “Before, we weren’t doing it on any mass scale.” While the project was widely interpreted in initial reports as a rejoinder to the Google project, Google Director of Content Partnerships Jim Gerber said..., “I don’t think [the projects] are competitive at all,” said Gerber, adding that he sees OCA “as additive and beneficial” both to the publishing community and to Google’s mission of indexing the world’s information....While Yahoo’s actual dollar contribution to OCA pales in comparison to the tens of millions of dollars Google is poised to spend on its scanning project, the softball approach to digitization could have two potential benefits. First, by partnering with an innocuous project like the OCA, Yahoo can sit back and let Google slog through the legal muck of digitizing copyrighted books, something that both sides agree will eventually become the norm. Second, the fact that the OCA will allow anyone to host and index public-domain works may serve to undercut Google’s effort to protect --and profit from-- its digital copies.

OA to articles and data increases impact

Kristina Fister, At the frontier of biomedical publication: Chicago 2005, BMJ, October 8, 2005. Excerpt:
Last month the fifth congress on peer review and biomedical publication was held in Chicago. The presentations highlighted that we still have plenty of room to improve the quality of published research....Smaller journals may have to adopt other strategies to raise their impact factor. A study by Sahu and colleagues suggested that open access might be a powerful means for small journals to increase their visibility, citations, and consequently impact factor. Citations of articles published in the Journal of Postgraduate Medicine between 1990 and 1999 rose significantly after the journal went open access in 2001. Half of the articles were first cited only after open access was introduced....Apart from improving the quality of published literature, better reporting should speed up the advent of trial banks --open access electronic knowledge bases that can capture in detail aspects of trial design, execution, and results in a form that computers can understand. Decision support systems can then use these data more selectively, providing clinician friendly computer assistance for critical appraisal and evidence based practice. Sim reported that trialists found it easier to enter their data into the trial bank than to write a traditional research paper, and that readers found it easier to extract information about the trial --surely a sign that the days of journals reporting trials are numbered.

Bringing peer review to preprint archives

Marko A. Rodriguez, Johan Bollen, Herbert Van de Sompel, The Convergence of Digital-Libraries and the Peer-Review Process, a preprint forthcoming from the Journal of Information Science. (Thanks to Charles W. Bailey, Jr.)
Abstract: Pre-print repositories have seen a significant increase in use over the past fifteen years across multiple research domains. Researchers are beginning to develop applications capable of using these repositories to assist the scientific community above and beyond the pure dissemination of information. The contribution set forth by this paper emphasizes a deconstructed publication model in which the peer-review process is mediated by an OAI-PMH peer-review service. This peer-review service uses a social-network algorithm to determine potential reviewers for a submitted manuscript and for weighting the relative influence of each participating reviewer's evaluations. This paper also suggests a set of peer-review specific metadata tags that can accompany a pre-print's existing metadata record. The combinations of these contributions provide a unique repository-centric peer-review model that fits within the widely deployed OAI-PMH framework.

Cost-recovery instead of OA at GPO and LOC

The U.S. Government Printing Office (GPO) and the Library of Congress Cataloging Distribution Service (CDS) have decided not to allow open access to the latest edition of the Library of Congress Subject Headings (e-LCSH). From the announcement (October 5):
Note that the 28th edition (2005) of LCSH was distributed by FDLP in paper this year based on the current selection profiles. Please note that the Library of Congress Subject Headings is a CDS sale product for which costs must be recovered to sustain its continued availability. This electronic version is being made available to the Federal Depository Library Program with the condition that the files NOT be redistributed or made accessible outside the premises of participating FDLP libraries. If downloaded to a local server, the e-LCSH files must be placed on a location that is not accessible to Web crawlers or to users outside the premises of the FDLP library. To download the review copy of the e-LCSH, please go [here].

Thanks to James Jacobs at Free Government Information for the alert and also for this comment:

This is an excellent (though sad and ironic) example of the promise of digital information being crippled by contract for economic reasons. Where digital information holds the promise of being easily copied, re-distributed, and re-used, we see instead extreme restrictions being imposed on the information because "costs must be recovered". The restrictions bear repeating so that we can imagine the future of a world without digital deposit or a world with DRM locked down deposit or a world where use is limited not by copyright, but by contract: [1] Files may "NOT be redistributed", [2] Access only on "the premises", [3] Digital access hidden from web crawlers, [4] Digital access prohibited by users outside the library. The true "Luddites" are those that impose restrictions on access to government information rather than envisioning and enabling the possibilities created by digitization of information.

(PS: Of course costs must be recovered. But this is taxpayer-funded information. The cost-recovery model envisioned here is to charge taxpayers twice. This model is not only unfair to taxpayers, who have already paid once, but thwarts the public purpose in producing the information in the first place. Instead of being available to all taxpayers with a need to use it, the information will be available only to the subset who pass a means test.)

Lessig on the CC

Lawrence Lessig recaps the story of Creative Commons, and kicks off its fund-raising campaign, in a lengthy posting to the CC blog. Excerpt:
We stole the basic idea [for CC] from the Free Software Foundation -- give away free copyright licenses. Because copyright is property, the law requires that you get permission before you "use" a copyrighted work, unless that use is a "fair use." The particular kind of "use" that requires permission is any use within the reach of the exclusive rights that copyright grants. In the physical world, these "exclusive rights" leave lots unregulated by copyright. For example, in the real world, if you read a book, that's not a "fair use" of the book. It is an unregulated use of the book, as reading does not produce a copy (except in the brain, but don't tell the lawyers). But in cyberspace, there's no way to "use" a work without simultaneously making a "copy." In principle, and again, subject to fair use, any use of a work in cyberspace could be said to require permission first. And it is that feature (or bug, depending upon your perspective) that was the hook we used to get Creative Commons going.

ALPSP meeting with the RCUK

On September 16, the ALPSP met with representatives of the RCUK to discuss publisher objections to the draft OA policy. The ALPSP has publicly disclosed this much about the results of the meeting:
We are reassured that RCUK have agreed to explain to grant recipients why publishers might find it necessary to impose an embargo or time limit for deposit of articles in order to protect subscription and licence sales, and also to insist that such embargoes must be observed; we have offered to help with drafting the wording for this. We are also pleased to know that RCUK will be consulting publishers over the specification of the research which will be conducted over the next two years, to evaluate the likely effects of the policy (although papers arising from research funded after the beginning of 2006 are unlikely to have been published by the review date of 2008); we hope that the research will be sufficiently objective to ensure that publishers do provide data about the effects, if any, on downloads, subscription/licence sales, and other measures of journal sustainability. RCUK plan to hold a workshop for societies in the early part of next year, and ALPSP has offered to help in any way that might be required.

The ALPSP minutes of the meeting are available to members only.

(PS: It looks like the RCUK will not close the "copyright loophole" in the current draft, which allows publishers to impose embargoes. Instead, it may even let publishers reword it to suit themselves.)

Updated Dworaczek index

Marian Dworaczek has updated his Subject Index to Literature on Electronic Sources of Information. The October 1 edition indexes 2,157 separate works.

How good is Wikipedia?

There's an interesting new Slashdot thread on the quality of Wikipedia.

Thursday, October 06, 2005

Making ER materials OA for the public

Klaus Graf, Electronic Reserve and Open Access, Archivalia, October 7, 2005. Excerpt:
Copyright law requires that an ER must be restricted to students and staff. Even if the ER is in the same repository as the OA eprints (this is the case e.g. in Essen-Duisburg) web users without a specific account cannot view the course materials....ERs contain both copyrighted modern works and Public Domain (PD) materials which were scanned for classroom use....Administrators and staff of ERs should give the general public access to PD documents. Administrators should encourage staff members to do so, and inform them about the legal framework and copyright issues (e.g. in Germany a work is PD if the author is 70 years dead). Concerning the copyrighted material (modern articles and book chapters) there is also a way to support OA. When preparing a course ER scholars can ask the authors for permission to make the materials available freely. In the US it is likely that the rights holder is the publisher. If publishers agree with OA (a lot of them do so) there is no legal problem to put OA versions in the web. Administrators of OA repositories should allow moving stuff from ERs (i.e. from mostly non-affiliated authors) into the archive. In the case of an unified system one has only to set access rights for the public. What is the advantage for the authors if their works are put into the OA part of an ER? They don't have to scan the documents and upload them to the repository. A permission request to an author can educate that author about OA, who may then be interested to know more about OA. Sending some permission mails is not really a lot of work. Conclusion: Administrators and staff of ERs should support OA by asking for permission to make OA versions of ER materials available.

Survey of what libraries are doing with institutional repositories

Elizabeth Winter is running a survey on institutional repositories. From her request for responses:
My colleague, Tim Daniels, and I are conducting a survey of librarians on the subject of institutional repositories, and we would be grateful for your participation. **Your institution DOES NOT have to have an institutional repository in order for you to participate.** We hope to learn some specifics about what libraries are doing with institutional repositories, and will be incorporating the results of this survey into a presentation for a conference this fall. The survey will only take 5-10 minutes to complete, and will be available [online] until Wednesday, October 19th. Your participation is, of course, voluntary, and we are not collecting any information that will identify your responses with you personally. We will be glad to share the aggregate results with you (just send me an email if you're interested: ewinter@gsu.edu). If you have any questions, please contact me.

More on the RCUK policy

The Dangers of Open Access, RCUK Style, Research Fortnight, October 3, 2005. An unsigned comment, accessible only to subscribers, critical of the draft RCUK OA policy. For some quoted selections, and direct rebuttals, see Stevan Harnad's response.

More integrated OA databases coming

From an Indiana University press release, dated yesterday:
Medical scientists must sift through and analyze mammoth amounts of data to find ways to treat disease, and an Indiana University School of Informatics-led team has been assembled to help them develop new discoveries. The School has been awarded a two-year $500,000 grant from the National Institutes of Health to establish the Chemical Informatics and Cyberinfrastructure Collaboratory, and it brings together experts in informatics, medicine, computer science, chemistry, biology and from IU’s Pervasive Technology Labs (PTL). Chemical informatics is the application of computer technology to chemistry in all of its manifestations, particularly in the drug-manufacturing industry. The group seeks to devise an integrated cyberinfrastructure composed of diverse and easily expandable databases, simulation engines and discovery tools such as PubChem, the NIH’s small molecule chemical and biological database. They will use emerging high-capacity computer networks and data repositories and develop grid and Web technology for chemistry research.

More on ACS v. PubChem

Emma Marris, Chemical Reaction, Nature, October 6, 2005 (accessible only to subscribers). Excerpt:
The American Chemical Society (ACS) is the world’s largest scientific society....The society owes most of its wealth to its two ‘information services’ divisions — the publications arm and the Chemical Abstracts Service (CAS), a rich database of chemical information and literature. Together, in 2004, these divisions made about $340 million — 82% of the society’s revenue — and accounted for $300 million (74%) of its expenditure....Although the ACS is a non-profit organization, the information-services divisions are increasingly being run like businesses. Any net revenue is naturally fed back into the society’s other activities, but the business-like attitude is making some ACS members uneasy. A small but vocal group of critics fears that business priorities are supplanting the goal laid out in the society’s charter: “to encourage in the broadest and most liberal manner the advancement of chemistry and all its branches”....An ongoing dispute between the ACS and the US National Institutes of Health (NIH) reflects some of the problems. The NIH has recently unveiled a freely accessible database called PubChem, which provides information on the biological activity of small molecules. The ACS sees this as unfair competition to the fee-based CAS because it is taxpayer-funded, and the society wants the database restricted to molecules that have been screened by NIH centres. A few ACS members argue that the society is being unduly aggressive in protecting CAS and ought not to be challenging the scope of a database that could be a useful and free resource for chemists. For the record, Nature’s sister journal Nature Chemical Biology links all of its articles to PubChem. “I am growing increasingly upset with their direction,” says Chris Reed, an inorganic chemist at the University of California, Riverside, and one of the more outspoken critics of the ACS. 
“They have a culture of a for-profit corporation.”...Steve Heller, who lives in Silver Spring, Maryland, is part of an e-mail listserver community that is a source of lively discussion on this issue. Heller is a retired chemist and ACS member who also serves on an NIH advisory board on PubChem. “It seems as if those members of the ACS who see and know what is going on — and it is not a very large number — are very upset that the management and staff are taking a position without any consultation with the membership or discussion with experts in the field, and doing things that are not in the interest of their members, who want [PubChem] for free,” he says.

OCA gets it almost right

Preston Gralla, Yahoo Gets Book-Scanning Right...Almost, Networking Pipeline, October 5, 2005. Excerpt:
The Yahoo-led project to scan books and library material [called the Open Content Alliance or OCA] and make them available online is on target, unlike the wrong-headed Google initiative that will lead to massive copyright violations. Despite a few minor problems with the Yahoo program, Google should learn from its competitor and follow the same rules....There are only a few drawbacks to the plan. First is that the material will be made available in Adobe Acrobat format, rather than as text. Acrobat is a notoriously finicky format, and the Acrobat reader has probably crashed more computers than anything this side of Windows. It's big, it's ugly, and it's a resource hog. People should have the option of viewing in plain text. Second is that all the work in the archive, regardless of copyright, will be made fully available as Acrobat files, so it can be easily printed out. This is great for public domain works, but not so great for copyrighted works. Copyright holders justifiably won't want their entire works made available this way, and few will probably want to participate. Yahoo should have a two-tiered program --- snippets for copyrighted works; full online access for the rest.

Comment. Three quick replies. (1) Gralla is too hasty in concluding that Google's opt-out policy violates copyright. See my defense of it from last week's issue of SOAN. (2) I wholeheartedly share Gralla's preference for plain text over PDF. The fact that Adobe is a partner in OCA doesn't mean that OCA has to lock up the content in this annoying format. Users should have a plain-text option. (3) Gralla may be right that the full-text-or-nothing plan will lead many copyright holders to choose nothing. But the solution isn't to limit copyright holders to snippets. OCA can enlarge the menu and offer copyright holders full-text, snippets, or nothing. Many publishers will choose full-text, just as many publishers are volunteering their books to the Google Publisher program.

Mandating OA: In what kind of repository?

Dorothea Salo, Heard 'Round The World, Caveat Lector, October 3, 2005. Excerpt:
For the first time, a certain class of researchers must provide open access to their research results as a condition of their grant. The huge UK funder Wellcome Trust made deposit in PubMed obligatory as of yesterday. We here in the States had a golden opportunity to fire the open-access shot heard ’round the world: the NIH chewed on policy for nearly a year. We backed down. Wellcome Trust didn’t. Good for Wellcome Trust, and I hope to see a troop of funders fall into line behind them. That said --you’d think this would help me and the repository I manage, but it doesn’t....The Wellcome Trust grant agreement mandates PubMed, not just open access. They don’t positively forbid grantees to deposit somewhere else, but they don’t consider that a substitute for PubMed deposit. So I’m out in the cold, basically. The deeper question is which repositories are trustworthy enough to be viable substitutes for PubMed. Wellcome Trust understandably and correctly doesn’t want researchers slapping their stuff on their own websites and calling that a repository. (Why not? Well, because real repositories make guarantees about bit preservation and URL non-breakage that ordinary websites don’t. 404s aren’t acceptable in this business.) Nor, sadly, are all actual repositories likely to make it, long-term, because not everyone who has opened a repository quite realizes what a commitment they should be making to it. The answer may lie in repository certification. It’s terribly hard for an entity like Wellcome Trust to define just now which repositories are acceptable for deposit. (Mandate software platform? Sure, but the software platform is only one small part of the story.) Once repositories can be certified as trustworthy under a central definition, it becomes easy. So as much as I disagree with parts of NARA-RLG’s recommendations, I’m very glad they exist. I want a piece of the mandated-OA action, I do, and certification seems likely to be my path to it.

CLA Info Commons Interest Group now online

Heather Morrison, Info Commons.ca website and listserv, Imaginary Journal of Poetic Economics, October 5, 2005. Excerpt:
The Canadian Library Association's newly formed Information Commons Interest Group's website is now live, and our listserv is up and running and open to all!...Projects in progress: [1] Copyright in Libraries: the Digital Conundrum (Proposal for CLA Preconference), [2] wiki setup, [3] Drafting response to SSHRC Consultation on Open Access.

More on the impact advantage of OA

Stevan Harnad, How to compare research impact of toll- vs. open-access research, Open Access Archivangelism, October 4, 2005. Excerpt:
[Sally Morris objected:] "The problem is, there is no evidence of correlation between citations and the return on research expenditure."

[Harnad replied:] Citations are one direct, face-valid measure of return on research expenditure. Research is funded in order to be applied and built-upon, i.e., to be used; citations are an index of that usage. Uncited, unused research may as well not have been conducted, and represents no return on the research investment. Whatever increases usage and citations, increases the return on the research investment. Any loss of such a potential increase is a loss of potential return on the research investment. Self-archiving increases citations 50%-250%. Hence the failure to self-archive loses 50%-250% of the potential return on the research investment....

[Morris:] "Clearly, we are a long way off being able to analyse whether or not self-archiving (or any other form of open access) does or does not contribute to these objective output measures."

[Harnad:] I thought the question was about whether citation counts are correlated with these measures. We already know that self-archiving is correlated with increased citation counts.

(PS: For the studies showing the correlation to which Harnad refers, see Steve Hitchcock's excellent bibliography.)

Open-source submission tool for ETD repositories

VALET is a new, open-source submission tool for Fedora-based ETD repositories. From the September 27 press release:
VTLS Inc. has been collaborating with the NDLTD project at Virginia Tech, the FEDORA Project, and the Australian Research Repositories Online to the World (ARROW) Project, led by Monash University in Australia, to develop VALET for ETDs [Electronic Theses and Dissertations]. This open-source product is simple, flexible, adaptable and easy to implement. A typical process allows for thesis submission by students, editing and approval by faculty, approval by the graduate school and final deposit into a FEDORA-based, institutional repository. The institution can configure the number of steps in the process and the details of each step. The software minimizes errors and offers instant, form-based validation. It also offers multi-level security for students, faculty and administration. When a thesis enters the repository, the software automatically creates standardized metadata. It is preconfigured to allow users to choose Dublin Core or ETD-MS, but can support other metadata standards or schemas, such as MARCXML. VALET helps streamline the submission process while increasing the quality of the final ETD resource. While the initial version supports submission via a Web interface, the next version will also support submission via e-mail. FEDORA is packaged with VALET.

Scirus will index ETDs

From a Reed Elsevier press release, dated yesterday:
Elsevier today announced a landmark partnership between Scirus, its free science-specific search engine, and the Networked Digital Library of Theses and Dissertations (NDLTD) to add the extensive collection of [open-access] theses and dissertations of its member institutes to Scirus. In addition to indexing the content on Scirus.com, Scirus will power a search service on the repository's site. The service will ensure this content will be easier to find on both the NDLTD and Scirus websites. The launch of the service was announced this week in Sydney at NDLTD's annual conference ETD2005. "Until now, theses and dissertations have not been fully leveraged by postgraduate students and researchers in their work because these documents have been difficult to find and retrieve," said Deborah Kahn, an associate at Electronic Publishing Services, Ltd. To combat this trend, Scirus has indexed over 200,000 theses and dissertations, in more than twelve languages...."With its particular expertise in indexing for scientific and research content, Scirus is a logical partner for NDLTD," said Edward Fox, executive director of NDLTD and professor of computer science at Virginia Tech. "Building on their impressive history of providing scientists and students with the information they need for their research, Scirus now also supports NDLTD's goals of enhanced access to scholarship worldwide. We are looking forward to expanding our collection and partnership in the future."

Amherst joins ATA

The future of OA to biodiversity data

Roger Harris, To Be Free, or Not To Be, American Scientist, November-December, 2005. Excerpt:
Imagine walking into your downtown library and finding that you can't check out a book without paying a fee. What you took for granted as a free service, you now have to pay for. A similar situation may soon face biologists who study biodiversity, the variety and number of species....Today, biodiversity databases are growing and struggling for funds --which may come in the form of private investment that could transform what is now an open, public resource reliant on government and nonprofit funding....Biodiversity databases, each with its own way to codify, organize and search data, have proliferated as experts in various taxonomic groups have built catalogs to meet their specific needs. (An example is the well-known FishBase.) The Catalogue of Life Programme is the biggest and boldest attempt to integrate these databases. It is a joint agreement between Species 2000 (acting as a coordinating umbrella organization), the Global Biodiversity Information Facility (GBIF) and the Integrated Taxonomic Information Systems (ITIS). ITIS, the main U.S. contributor, is in turn a partnership of federal agencies and nonprofit organizations (themselves collaborations!) including NatureServe, the U.S. Geological Survey, the Smithsonian Institution and the National Biological Information Infrastructure. The organizational layers illustrate the complexity and cost of developing gigantic data sets as well as the extent of public-agency involvement....Stuart Pimm of the Nicholas School of the Environment and Earth Sciences at Duke University agrees: "So many of the data are collected by state and federal agencies, there is enormous public pressure to keep access open." A hint of private interest in the growing databases came in January 2004, when Thomson Scientific, the world's largest information corporation, acquired Biosis, known for indexing and abstracting life-sciences journals. 
Biosis managed the Zoological Record, whose computers had hosted the Species 2000 project to that point. With the acquisition, Thomson now hosted the Species 2000 database, an arrangement that continues. Although Jim Pringle, Thomson's vice president of development, says the company does not have definite plans to privatize biodiversity data, Thomson promptly applied to become a member of Species 2000. Frank Bisby, executive director of Species 2000, said the Species 2000 directors "took advice … and decided that [it] was not appropriate for a subsidiary of a major multinational."

More on ALPSP's objection to Google's opt-out policy

Somehow I forgot to blog Danny Sullivan's interview with Sally Morris on the ALPSP's objection to Google Library's opt-out policy. It appeared in SearchDay for August 30. Sally Morris is the Chief Executive of the ALPSP. (Thanks to Gary Price.) Excerpt:
The ALPSP put out a statement (PDF format) last week with this key highlight that caught my eye: "Google Print for Libraries is a very different matter. We firmly believe that, in cases where the works digitised are still in copyright, the law does not permit making a complete digital copy for such [indexing] purposes." I asked Morris: "Google...has indexed nearly 1,000 pages from the ALPSP web site. My assumption is that the ALPSP never overtly asked for these pages, all of which are copyrighted, to be digitized and included in Google. Despite this, I've never heard your organization complain about such indexing....In short, why is opt-out OK when it comes to web content but not OK when it comes to [other] published works?"

Morris replied: "[Y]ou're right, in principle Google should seek opt-in permission before indexing freely available web pages, too...However, I think the issue is much more acute where the content is not made freely available by its copyright owner - which is, of course, the case for all the in-copyright content Google are planning to digitise from libraries."

I wasn't convinced on the "freely available" front and sent this follow-up: "Why is publishing a book not making content freely available? If I go into a library, I've got plenty of content for free. That's exactly why Google has gone into the libraries....I don't know of any library being sued for allowing people to borrow books, which arguably goes directly to the potential earnings a publisher could make....In contrast, Google is not making the full text of books available as a library does. If anything, libraries are far greater infringers than Google and have been so longer. Why aren't libraries being targeted?"

Morris replied: "A published book is sold - to the individual or to the library. Lending it out does not contravene copyright. To my mind, making a digital copy of the whole thing does. We are not saying that increasing visibility via Google Print is a bad thing - I think those of our members who participate in the Google Print for Publishers program (or who otherwise allow Google to index their closed content) are generally pleased with the increased hits, though I'm less clear whether they are in fact seeing increased sales. All we're saying is that the method of achieving it seems to us clearly to break copyright laws - and we'd like to work with Google to find an acceptable way of getting publishers' opt-in."

[Sullivan again:] And I guess all I'm saying is that those publishers, if they try to push this angle with Google via a lawsuit, had better be prepared for explaining why they've never complained about having their web sites indexed by Google for years without permission. Moreover, woe to the publisher or member of a publishing group that is ever found during legal disclosure to have complained about not being indexed better on Google. You can't enjoy years of free traffic from a source, then suddenly decide that copyright law is now different just because the words appear in print, rather than on the web. One interesting solution will be to see if Google simply goes out and buys a copy of every book it wants to offer in its virtual library.


Wednesday, October 05, 2005

On the length of the UK term of copyright

Suw Charman, Should the term of copyright protection be extended or shortened in the UK? Open Rights Group, October 1, 2005. Blog notes on a panel discussion among Lawrence Lessig, John McVay, and Adam Singer, moderated by John Howkins. (Thanks to QuickLinks.)

Searching Medline via PubMed

E. Motschall and Y. Falck-Ytter, Searching the MEDLINE Literature Database through PubMed: A Short Guide, Onkologie, September 2005.
Abstract: The Medline database from the National Library of Medicine (NLM) contains more than 12 million bibliographic citations from over 4,600 international biomedical journals. One of the interfaces for searching Medline is PubMed, provided by the NLM for free access via the Internet. Also searchable with the PubMed interface are non-Medline citations, i.e. articles supplied by publishers to the NLM. Direct access to an electronic full text version is also possible if the article is available from a publisher or institution participating in Linkout. Some publishers provide free access to their journals. Other journals require an online license and are fee based. The following example demonstrates some of the most important search functions in PubMed. We will start out with a fast and simple approach without the use of specific searching techniques and then continue with a more sophisticated search that requires the knowledge of Medline search functions. This example will show how the application of Medline search tools and how the use of the controlled vocabulary of ‘Medical Subject Headings’ (MeSH) will influence the results in comparison with the fast and simple approach. Let’s try to find the best evidence to answer the following question: Is a 30-year-old man with typical acid reflux symptoms for many years (gastroesophageal reflux disease, GERD) more likely to develop esophageal cancer than people without reflux symptoms? This question can be split into several components: - a patient with reflux symptoms (GERD), - esophageal cancer: etiology, risk, - study design for etiology studies: cohort studies, case-control studies.
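
(PS: For readers who want to try the two approaches the abstract contrasts, here is a minimal sketch, not taken from the article. It builds a plain keyword query and a MeSH-based query for the reflux question and wraps the latter in a URL for NCBI's E-utilities ESearch service. The particular MeSH headings chosen, the helper names `mesh`, `any_of`, `all_of`, and `esearch_url`, and the `retmax` default are my assumptions; the authors' own search strategy may differ. Nothing is fetched over the network.)

```python
from urllib.parse import urlencode

# NCBI E-utilities ESearch endpoint (the URL is only built here, never requested).
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def mesh(term):
    """Tag a heading so PubMed searches it as a MeSH term (hypothetical helper)."""
    return f'"{term}"[MeSH Terms]'


def any_of(terms):
    """Combine alternatives with OR, parenthesized so they group correctly."""
    return "(" + " OR ".join(terms) + ")"


def all_of(parts):
    """Require every component of the question with AND."""
    return " AND ".join(parts)


# The fast and simple approach: free-text keywords only.
simple_query = "reflux esophageal cancer"

# The more sophisticated approach: one MeSH component per part of the question.
# These headings are my guesses at suitable terms, not the article's choices.
mesh_query = all_of([
    mesh("Gastroesophageal Reflux"),             # the patient with GERD
    mesh("Esophageal Neoplasms"),                # the outcome of interest
    any_of([mesh("Cohort Studies"),              # study designs suited to
            mesh("Case-Control Studies")]),      # etiology questions
])


def esearch_url(query, retmax=20):
    """Return an ESearch URL for the query against the PubMed database."""
    return ESEARCH + "?" + urlencode({"db": "pubmed", "term": query, "retmax": retmax})


print(simple_query)
print(mesh_query)
print(esearch_url(mesh_query))
```

Pasting either printed query into the PubMed search box should let you compare the two result sets by hand, which is the comparison the guide sets out to illustrate.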

Another publisher on Google Print

Karen Christensen, Google and the library, another installment in the EPS debate on Google Print, October 4, 2005. Karen Christensen is CEO of the Berkshire Publishing Group. Excerpt:
Our problem is that the guys and girls at Google don’t really get books. They want to believe that books are just primitive webpages, simply more information to be organized for the benefit of everyone....But websites are built for the web. Books were not written for Google Library. Forcing publishers and authors to opt-out, instead of opt-in, is not fair. It’s coercive....Librarians, unfortunately, don’t understand the rights of the creators and producers of books. Most librarians do not understand the work and expense, the expertise and talent, involved in creating the publications they buy. And quite a few believe that information should be free --unless it is only available through them. Besides that, Google has an unhealthy fascination for librarians: they are (rightly) terrified by the fact that students go to Google instead of to them, but they can’t take their eyes off it. Google is taking advantage of librarians by making them partners in a process that undermines the sources of information and knowledge that their institutions and communities depend on. As a result, authors and publishers can easily be made to look obstructive and mean-spirited....It’s a good thing the Google lawsuit isn’t going to be decided by a public referendum, because we authors would lose hands down. I’ve taken to asking people whether, if it were possible, they would be happy if they knew Google was going to scan, store, and index copies of all their personal photographs and diaries, photos of the interior of their house and their closets, all without permission? (And use that content to make money.) Our challenge is to show people just what it takes to create and publish a book and that intellectual creation merits every bit as much protection as physical property. And we need to talk about this in simple terms. 
When Google says it will take and hold and use content that does not belong to them, without asking permission, they are coming awfully close to breaking their own rule, “Don’t be evil.”

Comment. Three quick replies. (1) For a defense of Google Library's opt-out policy, see my article in this week's SOAN. (2) The analogy to personal photos and diaries is very bad. We don't make them hoping to bring them to the largest possible audience. We don't hope to make money from them. We don't welcome free advertising for their contents. But book authors do all of these things. (3) Publishers who don't want to look "obstructive and mean-spirited" should stop using the false and grasping comparison of intellectual property to physical property. Physical property doesn't enter the public domain after a fixed term of years, and non-owners have no fair-use rights over it. Intellectual property is only quasi-property that every country on Earth treats very differently from physical property.

More on applying trade embargoes to science

John Miller, US societies reverse rules on Iranians, TheScientist, October 4, 2005. Excerpt:
Two American academic societies have reversed their policies toward Iranian scientists. One, the American Institute of Aeronautics and Astronautics (AIAA), has decided to no longer prohibit Iranian authors from publishing in its journals, while the American Concrete Institute (ACI) has decided to install a new ban barring Iranian students from taking part in an annual engineering competition they routinely enter each year....AIAA enacted the ban because the board feared it might be violating US embargo law....In September 2003, the U.S. Treasury Department's Office of Foreign Assets Control (OFAC) ruled that the little-known embargo law prohibited the Institute of Electrical and Electronics Engineers (IEEE) from editing manuscripts from all embargoed countries, leaving it with no choice but to publish them unedited or reject them. Subsequently, a few other scientific societies stopped publishing Iranians after the IEEE decision out of fear that OFAC would also charge them with a crime. A consortium of academic publishers and societies led by the Association of American Publishers pressed OFAC for over a year to drop the embargo, but to no avail. Last October, the consortium sued the agency, and last December, OFAC reversed its decision, granting a general license to all US publishers to edit and publish material from embargoed nations. Marc Brodsky, CEO of the American Institute of Physics and a central figure in the publishers' lawsuit, told The Scientist that OFAC's reversal means AIAA never needed to ban Iranian articles....Last January, the American Concrete Institute (ACI) decided to ban Iranian students from taking part in its annual international student engineering competition after OFAC ruled that certification courses ACI had been offering to Iranian professionals were illegal, because they provided a service. 
Students from Cuba and Sudan were also banned from the competition, although none have entered the contest, according to William Tolley, ACI's executive vice president. Tolley told The Scientist that since his organization had never known it was violating OFAC's rules until the agency began investigating, it decided to temporarily exclude Iranian students from the competition while it asked OFAC whether their participation was legal. However, Tolley said he has written to OFAC four times, most recently in September, and he still doesn't have an answer.