Open Access News

News from the open access movement


Saturday, May 20, 2006

More on the FRPAA

Britt Peterson, Taking aim at scientific journals, Seed, May 19, 2006. (Thanks to William Walsh.) Excerpt:
You might think that the results of publicly-funded taxpayer research would be freely available to the citizens who footed the bill in the first place, but you would be wrong --and perhaps in the mood to remedy the situation. That's the logic that motivated John Cornyn (R-TX) and Joe Lieberman (D-CT) to introduce...the Federal Research Public Access Act of 2006...."Tax payer-funded research should be accessible to tax payers," said Sen. Lieberman in a press statement at the bill's introduction....

Many journal publishers say that open access of the sort laid out in the Cornyn-Lieberman bill would make subscription-based publications redundant, rendering moot the valuable process of selection, editing and peer review for which publishers are currently responsible. "You can't throw the baby out with the bathwater," said Rene Olivieri, CEO of Blackwell Publishing...."There needs to be an income stream from the core scientific community, the libraries, the research institutions, and let's not forget, a lot of the subscriptions are paid for by corporations and scientific laboratories within the private sector. If you give it away for free the income stream dries up. The system of control and value-adding just withers away."

Olivieri served as co-author on a study released last week --sponsored by Blackwell but carried out by independent researchers-- that found scientists rank lack of access 12th in a list of annoyances contributing to a lack of productivity; red tape and lack of funding topped the charts....

According to Gunther Eysenbach, a professor in the department of health policy at the University of Toronto, the main weakness of the proposed Federal Research Public Access Act is its inefficiency. It would force authors who have already published their work in open access journals, like the family of journals published by the Public Library of Science (PLoS), to go through the motions of republishing in the federal repositories....For Eysenbach, who recently published an article in PLoS about the benefits of open access publishing, the speed of accessibility is far more important to scientific process than any concerns relating to the viability of the scientific publishing industry. "Open access really accelerates the scientific process," he said. "I know few people who wouldn't prefer to have a cure for cancer or for AIDS in 15 years instead of in 20 years."

Comment. Olivieri is assuming (1) that FRPAA will undermine subscriptions, and (2) that without subscription-journals, nobody would perform peer review. There's no evidence for either contention and good reason to doubt both. For more details, see my 10-point rebuttal to the AAP's objections to FRPAA. Also see my comments on the report Olivieri co-authored last week.

Eysenbach is right that when articles are OA from journals, there's little or no urgency for them to be OA from repositories as well. However, there are still some reasons to deposit them in repositories, e.g. for the security reasons that lead PLoS and BMC to deposit their OA articles in PMC, or for the processing and integration with OA databases provided by the NIH. If FRPAA isn't revised to handle cases in which articles are published in OA journals, then repository managers who think it's important to have certain articles on deposit can harvest them from the OA journals in which they were published.

What publishers should and shouldn't fear

Tim O'Reilly, Publisher, be very, very afraid? O'Reilly Radar, May 18, 2006. Excerpt:

I just saw a printed copy of the New York Times Magazine issue that contains Kevin Kelly's brilliant essay What Will Happen To Books? (It's called "Scan This Book!" in the online version.) Kevin did an amazing job of bringing the potential of the universal electronic library to life, and highlighting both the issues and the opportunities. So I was really disgusted to see the cover treatment of the article bearing the subtitle, "Reader, take heart! (Publisher, be very, very afraid.)"

This is yellow journalism, pandering to fear of the future. Publishers need to get with the opportunity, not be afraid of it! As I've argued previously, in essays like Piracy is Progressive Taxation, the role of "publishing" is rediscovered in each new medium after a period in which everyone argues that the playing field has been leveled once and for all.  I was writing about music and film being threatened by peer-to-peer networks, but the same argument applies to books:

The music and film industries like to suggest that file sharing networks will destroy their industries. Those who make this argument completely fail to understand the nature of publishing. Publishing is not a role that will be undone by any new technology, since its existence is mandated by mathematics. Millions of buyers and millions of sellers cannot find one another without one or more middlemen who, like a kind of step-down transformer, segment the market into more manageable pieces....

Those of us who watched the rise of the Web as a new medium for publishing have seen this ecology evolve within less than a decade. In the Web's early days, rhetoric claimed that we faced an age of disintermediation, that everyone could be his or her own publisher. But before long, individual web site owners were paying others to help them increase their visibility....

The means by which aggregation and selection are made may change with technology, but the need for aggregation and selection will not....For publishers, the question is whether they will understand how to perform their role in the new medium before someone else does. Publishing is an ecological niche; new publishers will rush in to fill it if the old ones fail to do so.

In the five years since I wrote that essay, things are turning out much as I predicted....Let’s be clear: Kevin doesn’t make the arguments that publishers are dead -- I’m just complaining about the cover treatment of the article. In fact, Kevin gives a compelling description of the new kinds of curation that publishers will need to perform. There is a lot to learn in the new world, but the biggest fear that publishers should be thinking about is the fear that they will be displaced by new publishers who are better at mastering the rules of business than they are....

World eBook Library

The World eBook Library and Project Gutenberg are sponsoring the World eBook Fair. During the month-long fair (July 4 - August 4, 2006), users will have free online access to 330,300+ PDF eBooks. (Thanks to ResourceShelf.)

Comment. If you're wondering why one month of free online access is something special when Project Gutenberg offers free online access with no time limits, the answer is that the World eBook Library is offering a one-month waiver of its membership fee ($8.95/year). I don't normally blog limited-time offers of free access; they're not free services so much as ads for fee services. But I made an exception in this case because of the Project Gutenberg connection.

Educating the non-scientific public about science

Liza Gross, Scientific Illiteracy and the Partisan Takeover of Biology, PLoS Biology, May 2006. This article isn't about OA but it has implications for OA. Excerpt:
Jon D. Miller, who directs the Center for Biomedical Communications at Northwestern University Medical School,...has devoted his 30-year career to studying public understanding of science and technology and its implications for a healthy democracy....Since 1979, he says, the proportion of scientifically literate adults has doubled --to a paltry 17%. The rest are not savvy enough to understand the science section of The New York Times or other science media pitched at a similar level. As disgracefully low as the rate of adult scientific literacy in the United States may be, Miller found even lower rates in Canada, Europe, and Japan --a result he attributes primarily to lower university enrollments. Scientific literacy doesn't call for a deep understanding of Maxwell's equations or Hardy–Weinberg equilibrium, but it does require a general understanding of basic scientific concepts and the nature of scientific inquiry....One-third of Americans think evolution is “definitely false”; over half lean one way or another or aren't sure. Only 14% expressed unequivocal support for evolution --a result Miller calls “shocking.”...

Given the partisan attack on evolution and stem-cell research, he thinks scientists need to learn more about how the political process works. They need to be willing to run for the school board, write $500 or even $5,000 checks to support moderate candidates, and defeat Christian right-wing candidates. “Scientists need to become involved in partisan politics and to oppose candidates who reject evolution or attack scientific research,” he says. “It takes time, money, and paying attention to the issues.”...And as Miller's research shows, when you get away from the religiously charged issues, there is an even greater opportunity to increase scientific literacy....When Americans are diagnosed with cancer or some other life-threatening disease, “the vast number of these people go online and learn more science in the next 12 months than a typical undergraduate will ever learn. It is impressive how much people can learn with the proper motivation. We need to get people to be savvy about how to find the information and make sense of it....There's a lot of work to be done for us to tell people what we do, why we do it, and why it's important....We in the scientific community have to treat them seriously, talk to them, and make our arguments. This is a great opportunity for us.”

Comment. My question is, What role can open access play in this? I'm not so optimistic as to think that simply making primary science easily available online will do much to foster scientific literacy and scientific knowledge among non-scientists, let alone convert creationists to evolutionists. Easy access completes the puzzle when there is antecedent interest and background, and we need help from teachers, journalists, and politicians to create that interest and background. For the same reason, however, I'm not so pessimistic as to think that OA will make no difference.

There are two mistakes to avoid here. One is to think that OA has no role to play in helping non-scientists understand science. We can call this the Royal Society mistake, after the RS's recent report on educating lay readers about science that doesn't even mention OA. The other mistake is to think that the overriding purpose of OA is to educate lay readers. No OA advocates believe this, but some publisher-opponents of OA either believe it or pretend to believe it in order to set it up as a straw man and knock it down. (The most recent example is the American Society of Human Genetics, as quoted in the NYTimes for May 8.) To avoid both mistakes we have to accept that the problem and solution are both complicated. OA will play a role in public education about science --it's neither irrelevant nor sufficient-- and the size of that role is up to all of us.


Friday, May 19, 2006

National OA initiative in Sweden

Sweden has launched a national OA initiative whose goal is "to promote maximum accessibility and visibility of works produced by researchers, teachers and students at Swedish universities and university colleges." From the site:
Objectives:
  • To promote co-ordination and development of standards and tools for electronic publishing at Swedish universities and university colleges
  • To promote a rapid growth of the volume and diversity of material in academic repositories
  • To promote access to and use of content in academic repositories and Open Access journals
  • To secure long-term access to digital publications and other material in academic repositories
  • To develop quality standards for content and services in academic repositories
  • To support publishing in Open Access journals and the migration of Swedish scientific journals to an Open Access model...

Funding of projects will be awarded from BIBSAM's budget for development projects (at present 6 million SEK/year). Co-financing of larger projects will be sought from other sources. The programme is led by a Steering Committee consisting of representatives from universities, research organisations, university libraries, and the National Library....

The two-year program (2006-07) is organized by BIBSAM, the National Co-ordination and Development program of the National Library of Sweden. BIBSAM also funded or co-funded ScieCom, the DOAJ, and the SVEP project. Also see the project page in Swedish.

PS: Kudos to all involved. The objectives are just right and BIBSAM has an excellent track record in coordinating successful projects.

New OA journal on oncogenomics

Translational OncoGenomics is a new peer-reviewed, open-access journal from Libertas Academica. (Thanks to Marcus Zillman.) From the site:
The primary mission of Translational OncoGenomics is to provide an open-access, peer-reviewed, rapid-publication forum to assist in the dissemination of novel genetic, epigenetic and molecular pathway information related to clinical cancer. The journal is designed to meet the scientific and public need for such a forum in order to process the accelerating acquisition of genomic data resultant from recent technological advances in and international programmatic commitments to this area of research. Particularly encouraged is the submission of papers with a translational connection to human cancer from basic science to ethical considerations related to the application of oncogenomic discoveries for diagnostic and prognostic purposes in clinical trials and for anti-cancer drug development. An important objective will be to contribute information published in this forum to comprehensive internationally-accessible genomic databases in order to foster the identification of molecular targets for the therapy of specific cancers.

Research 2.0

Open Access, Participation Literacy, May 19, 2006. An unsigned blog post. Excerpt:

It is self-evident that all information cannot be free....[But] there are some forms of information...[that can be free] - academic information. With academic information and knowledge, I mean information and knowledge produced in research by government financed resources. To this category I count most information and knowledge produced by universities and other forms of higher education institutions. I do not count information and knowledge produced by private companies. The form of information and knowledge produced by companies such as Microsoft and Sony belongs to another discussion....

Many universities have built their own publishing environments. The reason is not only because they want the information to be free. It is because they have realised that the business model in the academic publishing industry is out of date. A university produces large amounts of high quality information and knowledge and much of that information and knowledge is collected by publishing companies, printed on paper and/or locked in expensive digital suites and sold back to the university in the form of very expensive Journals and database subscriptions. The only reason this business model still works is because the academic norm is very conservative. The model is strongly linked to academic quality and ranking system. I do not think most researchers are so conservative though; the conservation mechanism mostly lies with the research funding and career system in the academic society....

The first point to make for a research 2.0 concept would be to free the academic information and knowledge from commercial slavery - if you publish an article in a journal, or likewise, always keep the right of reasonable usage, like a creative common license. In a connected research environment, we cannot make valuable information invisible.

And from a second post to the same blog today:

I suggest a research 2.0 concept to include: [1] Open access to information created by public authorities (Universities and the like), [2] Open Peer Review, [3] Collective Intelligence in research environments, [4] The Web as platform (paper journals is not of much use in the Web 2.0 era, only e-information can be true objects to collective intelligence).

German bill supports OA

Klaus Graf, Bundesrat für wissenschaftsfreundlicheres Urheberrecht, Archivalia, May 15, 2006. Klaus quotes key excerpts from a new bill (Entwurf eines Zweiten Gesetzes zur Regelung des Urheberrechts in der Informationsgesellschaft) before the Bundesrat, the upper house of Germany's parliament, that would make German copyright law more science-friendly.

Three sections of the bill support OA in different ways. I'd like to summarize them without misleading anyone, but my German and Google's English aren't good enough for that. If anyone can translate the key sections or point to English translations online, I'd gladly blog them.

Harnad reply to Eysenbach

Stevan Harnad, Confirming the Within-Journal OA Impact Advantage, Open Access Archivangelism, May 18, 2006. A reply to Gunther Eysenbach's reply to Stevan's review of Eysenbach's article in PLoS Biology. There aren't too many layers of this dialog to follow, but there are too many for me to excerpt the latest one without omitting key points or quoting at great length. I hope you'll read it all first-hand.

Intro to Connotea and tagging for scientists

Ben Lund, Social Bookmarking For Scientists - The Best Of Both Worlds, a paper delivered at XTech 2006. Also see his slides. A good introduction to Connotea.

SPARC and Bioline working together on OA

SPARC has announced a partnership with Bioline International. Excerpt:

SPARC (Scholarly Publishing and Academic Resources Coalition) today announced a partnership with Bioline International, an online publishing service that provides open access to peer-reviewed research journals published in developing countries.

Founded in 1993, Bioline helps journals from developing countries reap the benefits of open access. By providing a platform for the free and open distribution of these scientific publications, otherwise largely invisible to the international scientific community, Bioline seeks to fulfill the promise of open access publishing.  Evidence is mounting that Bioline’s efforts are bearing fruit. The use of publications on Bioline’s platform has increased dramatically over the past several years, with downloads of articles from the service’s 50-plus journals now averaging in excess of 200,000 per month. Bioline studies suggest that increased awareness and usage translate into improved journal quality by inspiring higher submission rates - especially from international authors - and improved citation rates.

“We see the advent of open access to research information as a huge boost in strengthening science in less developed nations,” said Leslie Chan, Associate Director of Bioline. “Without international visibility, researchers in poorer nations will remain isolated, partnerships will not be formed, and the knowledge generated in these regions will remain unrecognized.”

The OA impact advantage and South African journals

Sophie Hebden and Christina Scott, Scientific papers on internet making impact: study, South African Broadcasting Corp., May 19, 2006. Excerpt:

Scientific papers freely available on the internet make a bigger impact than many people realise, according to a new study....The findings will strengthen calls for more online scientific journals to switch to the open-access model and make research freely available. Journal subscriptions are too expensive for many scientists in developing countries, making open-access their sole means of keeping up to date with research in the rest of the world. The author of the study, published this week in the prestigious Public Library of Science online series Biology, concludes that "open-access is likely to benefit science by accelerating dissemination and uptake of research findings"....

[Gunther] Eysenbach found that open-access papers were twice as likely as other papers to be cited 4-10 months after publication. This increased to three times as likely 10-16 months after publication. More surprisingly, the study found that articles published as open-access from the start on had a higher impact than articles published as non-open-access, which researchers had 'self-archived' on other websites....

Meanwhile, South African research journals have been urged to dramatically increase their visibility - to policymakers, taxpayers who often fund the research, and readers across the developing world - by creating open-access internet editions as soon as possible. The Academy of Science of South Africa, led by University of Pretoria vice-principal Robin Crewe, made the call after an inquiry that found that in the past 14 years, one-third of South African journals have not had a single paper cited by their international counterparts....

The academy's executive officer Wieland Gevers, who led the investigation...says the government's system for subsidising journals must be reformed to improve their quality and visibility. Currently, the department of education pays universities R84 000 each time a government-accredited journal publishes a paper by one of their academics, regardless of the journal's international standing. Gevers says the department should divert $165 of the subsidy to the journals, to allow them to fund online and open-access editions....Adi Paterson of the department of science and technology, which commissioned the study, welcomes the report as a basis for strengthening "incentives to support high-quality research publications" and to "forge a low cost open-access approach to the publishing of publicly funded research". - SciDev.Net


Thursday, May 18, 2006

Endocrine Society offers free online access

The Endocrine Society is offering free online access "to patient information from [its] journals." From today's press release:
Today, The Endocrine Society, publishers of four top-ranked, peer-reviewed medical journals, announced a new initiative that will give patients with endocrine disorders immediate access, through the Society's online journals, to cutting-edge research and patient information. Working with The Hormone Foundation, the Society's public education affiliate, the program not only connects the public to the latest scientific research but also includes materials designed specifically for patients, creating a one-stop resource....

[T]he Society's model pairs the science with the patient-centered information. Patients who search Google for these diseases will be directed to articles published by the Society's premier journal, The Journal of Clinical Endocrinology & Metabolism, where they will have free, unlimited access to current content that is ordinarily available only by subscription to scientists and medical libraries.

"We've found that endocrine patients, because of the chronic nature of their conditions, are more involved in diagnosis and treatment decisions than patients with acute illnesses," said Dr. [Lisa] Fish [Chair of The Hormone Foundation Committee]....

The Endocrine Society hopes that its continued leadership in this area will inspire other not-for-profit and commercial publishers to provide open access to patients to published medical research.

Comment. I want to praise the society for what it's doing, but I have to fault this press release for not explaining what it is. I've spent 45 minutes trying to figure out what the society is offering for free that it didn't formerly offer for free, and I still don't think I've found it. The society's four journals still charge subscriptions and are not converting to OA. When patients ask for information, they are told to request articles by email or wait for the 12-month embargo to end.

The society does offer one kind of significant content free online, and without an embargo, although the press release doesn't mention it. If you look at the TOC from the current issue of The Journal of Clinical Endocrinology & Metabolism, you'll see that the author's manuscript is OA, while the copy-edited version is limited to subscribers. We learn elsewhere on the site that these author manuscripts are made available from the moment of acceptance.

This is a welcome step and I applaud it. But I'm still not sure I've found what the announcement is announcing. First, it seems that the society offered OA to these manuscripts before today's announcement. Second, the announcement refers to "materials designed specifically for patients" but doesn't link to that special material or describe it any further. The web site links to patientINFORM, which fits this general description, but neither the announcement nor the web site tells us what the Endocrine Society has done to provide content through patientINFORM. Finally, the announcement makes this strong claim: "Patients who search Google for these diseases will be directed to articles published by the Society's premier journal, The Journal of Clinical Endocrinology & Metabolism, where they will have free, unlimited access to current content that is ordinarily available only by subscription to scientists and medical libraries." The content "ordinarily available only by subscription" seems to be the copy-edited manuscripts, but those are under a 12-month embargo. I don't want to overlook what the society is offering and would appreciate any help.

Launch of PLoS Clinical Trials

PLoS Clinical Trials has officially launched. From the site:

PLoS Clinical Trials (eISSN 1555-5887) is an international, peer-reviewed, open-access journal published online by the Public Library of Science (PLoS). The journal welcomes articles from around the globe reporting results of randomized trials in all fields of healthcare....PLoS Clinical Trials is run as a partnership between its in-house PLoS staff, and international Advisory and Editorial Boards, ensuring fast, fair and professional peer-review.

PLoS Clinical Trials aims to broaden the scope of clinical trials reporting by publishing the results of randomized trials in humans from all fields of healthcare. The journal's scope includes trials designed to assess the effects of different ways of treating, diagnosing, screening for, and preventing disease. Trials within the journal's scope may address any type of intervention relevant to healthcare. This may include, for example, reports of late phase II or phase III studies of pharmaceutical products, or medical devices, but also trials examining surgical therapies, mechanisms of health service delivery, and behavioral, lifestyle, psychological and educational interventions. We publish articles reporting the main results of a trial as well as interim, planned follow-up, and secondary analyses. In order to maximize the number of trials whose results are available in the public domain, publication decisions will not be affected by the direction of results, size or perceived importance of the trial. Confirmatory studies are welcome.

Each published paper in PLoS Clinical Trials will be linked to its corresponding entry in the relevant registry. PLoS is collaborating with Global Trial Bank (GTB), a non-profit subsidiary of the American Medical Informatics Association, to ensure that trial results are captured and stored in a computer-readable, standardized format. Once GTB is operational, results data from trials published in PLoS Clinical Trials will be coded and entered into the GTB database for open-access searching, browsing, and data-mining. Reciprocal links will be created between papers in PLoS Clinical Trials and the corresponding entries in GTB. More details on this collaboration will be available soon, both here, and on the GTB Web site....

All works published in PLoS Clinical Trials are open access. Everything is immediately available without cost to anyone, anywhere --to read, download, redistribute, include in databases, and otherwise use-- subject only to the condition that the original authorship is properly attributed. Copyright is retained by the authors. Click here for more information on our license. Publishing costs are offset by a publication fee charged to authors. PLoS waives the fee for authors with insufficient funds. The ability to pay is not known to editors or reviewers, and never affects the decision whether to publish an article.

Also see the press release.

OAI-compliant archiving software from LANL

The Los Alamos National Laboratory has released aDORe Archive, open-source, OAI-compliant archiving software. From the site:
The aDORe Archive is a write-once/read-many storage approach for Digital Objects and their constituent datastreams. The approach combines two interconnected file-based storage mechanisms that are made accessible in a protocol-based manner. First, XML-based representations of multiple Digital Objects are concatenated into a single, valid XML file named an XMLtape. The creation of indexes for both the identifier and the creation datetime of the XML-based representation of the Digital Objects facilitates OAI-PMH-based access. Second, ARC files, as introduced by the Internet Archive, are used to contain the constituent datastreams of the Digital Objects in a concatenated manner. An index for the identifier of the datastream facilitates OpenURL-based access. The interconnection between an XMLtape and its associated ARC file(s) is provided by conveying the identifiers of these ARC files as administrative information in the XMLtape, and by including OpenURL references to constituent datastreams of a Digital Object in the XML-based representation of that Digital Object stored in the XMLtape. The aDORe Archive allows for the storage of multiple XMLtapes and ARC files through the introduction of OAI-PMH compliant XMLtape and ARCfile registries.
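For readers who want a concrete picture of the storage idea in the excerpt, here is a rough in-memory sketch in Python. It is not aDORe's code or API. The class names (ArcFile, XmlTape), the method names, and the record format are all hypothetical, and plain Python lists and byte buffers stand in for the real XML and ARC files. It only illustrates the two mechanisms described above: records appended to a tape with an identifier index and a datetime index, and datastreams concatenated into one container with an identifier index.

```python
from bisect import bisect_left, bisect_right, insort
from dataclasses import dataclass, field


@dataclass
class ArcFile:
    """Hypothetical stand-in for an ARC file: datastreams concatenated in one buffer."""
    arc_id: str
    data: bytearray = field(default_factory=bytearray)
    index: dict = field(default_factory=dict)  # datastream id -> (offset, length)

    def append(self, stream_id: str, payload: bytes) -> None:
        # Write-once: a stored datastream is never overwritten.
        if stream_id in self.index:
            raise ValueError("datastream already stored")
        self.index[stream_id] = (len(self.data), len(payload))
        self.data.extend(payload)

    def get(self, stream_id: str) -> bytes:
        # Identifier lookup, the kind of access the excerpt ties to OpenURL.
        offset, length = self.index[stream_id]
        return bytes(self.data[offset:offset + length])


@dataclass
class XmlTape:
    """Hypothetical stand-in for an XMLtape: XML records appended in sequence."""
    arc_ids: list                                  # administrative info: associated ARC files
    records: list = field(default_factory=list)    # the concatenated XML representations
    by_id: dict = field(default_factory=dict)      # object identifier -> position on tape
    by_date: list = field(default_factory=list)    # sorted (creation datetime, position)

    def append(self, object_id: str, created: str, xml: str) -> None:
        # `created` is assumed to be an ISO 8601 UTC string, which sorts lexicographically.
        if object_id in self.by_id:
            raise ValueError("object already stored")
        position = len(self.records)
        self.records.append(xml)
        self.by_id[object_id] = position
        insort(self.by_date, (created, position))

    def get_record(self, object_id: str) -> str:
        # Lookup by identifier, similar in spirit to an OAI-PMH GetRecord request.
        return self.records[self.by_id[object_id]]

    def list_records(self, start: str, until: str) -> list:
        # Inclusive datetime-range harvest, similar in spirit to OAI-PMH ListRecords.
        lo = bisect_left(self.by_date, (start,))
        hi = bisect_right(self.by_date, (until, float("inf")))
        return [self.records[pos] for _, pos in self.by_date[lo:hi]]
```

In this sketch a record on the tape would carry the identifier of its datastream, so a harvester could fetch the XML by identifier or date range from the tape and then pull the referenced bytes from the ARC container by identifier. The real software serves these lookups through OAI-PMH and OpenURL, which the sketch does not attempt to implement.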

Also see yesterday's press release.

PS: aDORe clearly has special features that set it apart from other archiving packages. To someone more technically proficient than I, the site may suggest the special uses aDORe supports that the other packages don't. But I'm still trying to figure them out.

First Monday Openness presentations

Many of the presentations from the 10th First Monday conference, Openness: Code, science and content (Chicago, May 15-17, 2006), are now online, and the rest should be online shortly. (Thanks to Jim Campbell.)

Coming from Microsoft: Windows Live Book Search

Microsoft to expand search offerings with Windows Live Book Search, LiveSide, May 18, 2006. Excerpt:
The latest [Microsoft] service on the cards is Windows Live Book Search, previously announced as MSN Book Search at the end of last year.  Windows Live Book Search is an online search service for book content, providing readers with tools for discovering and evaluating books for purchase. As with all the search products that are being released under the Windows Live brand, we're expecting it to be available as another tab on the traditional Windows Live Search toolbar, along with the recently launched Windows Live Academic Search and Product Search.

At the same time, Book Search provides publishers with a new way to connect with potential customers. For the smaller publishers, there are plans to provide a self-service publisher portal (unsurprisingly called Windows Live Publishing Portal,) allowing them to add their books to the index over the internet before shipping them to one of the globally distributed scanning vendors.

So how does this differ from Google Book Search? The key part here seems to be the content Microsoft is looking to index. As their original press announcement states, they are looking to index public domain and non-copyrighted print material first, shown by their intent to join the Open Content Alliance (OCA). Of course including copyrighted content will be important too, but the recent issues Google has had with publishers should see Microsoft trying to avoid the same pitfalls.

We're expecting this service to be launched as a beta in the near future, with all the indicators pointing to the potential indexing of other print material such as magazines and newspapers in the future.

Chris Sherman building on Kevin Kelly

Chris Sherman, Building the Universal Library, Search Engine Watch, May 18, 2006. Excerpt:
What will it take for Google or another search engine to truly assemble a library of all of the world's information? A thought-provoking essay by Wired magazine's "senior maverick" [Kevin Kelly] takes a fascinating look at the challenges....

[Kelly] says these [book digitization] projects are scanning about a million books a year. Although this sounds like an impressive pace, it amounts to just 5% of all books currently in print. Fortunately, much of the new information created by humans is now in digital format, so it can more easily be included in the Universal Library without the extensive physical effort of scanning books. And let's not forget the web. Although the search engines have become fairly proficient at creating comprehensive indexes of the surface web, they're still missing massive amounts of content located in databases or other dynamic sources (the Invisible web) --not to mention web pages that have disappeared. "The grand library naturally needs a copy of the billions of dead Web pages no longer online and the tens of millions of blog posts now gone--the ephemeral literature of our time." Including this "ephemeral literature" could prove to be a major challenge. Various studies have put the "half-life" of an average web page at just under two years, with the half-life of a typical web site being just over two years. The most complete publicly accessible archive of the web, the Internet Archive, contains just a fraction of all content that has been posted to the web --some 55 billion pages in all.

But I think it's a fair bet to say that Google and Yahoo haven't thrown away the pages they've crawled through the years. And there's a precedent for digital restoration on a massive scale: Google's painstaking effort to build an archive of the Usenet. Assembling archives stored on magnetic tape, CD-ROM and other sources, Google restored a comprehensive archive of Usenet, dating back to 1981, and made this available to users in December 2001. Although still not totally complete, the renamed Google Groups now likely contains more than 99 percent of all Usenet postings ever made. It's not unthinkable that Google and Yahoo, the longest surviving crawler-based engines, could collaborate to restore a comprehensive archive of the web. Surely there are data archives from search engines now long-gone that could also be mined to build out an archive....
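
A rough way to read the half-life figures quoted above is the minimal sketch below. It assumes simple exponential decay with a constant half-life of two years, which is a simplification of my own and not something the cited studies are claimed to show; the function name and the default value are illustrative only.

```python
def fraction_surviving(years: float, half_life_years: float = 2.0) -> float:
    """Estimate the share of web pages still online after `years`.

    Assumption (not from the article): pages disappear at a constant
    rate, so the surviving share halves every `half_life_years`.
    """
    if half_life_years <= 0:
        raise ValueError("half_life_years must be positive")
    return 0.5 ** (years / half_life_years)


if __name__ == "__main__":
    # Under the assumed two-year half-life, print the surviving share
    # after 2, 4 and 10 years.
    for years in (2, 4, 10):
        print(years, fraction_surviving(years))
```

Under that assumption only about 3 percent of pages posted ten years ago would still be online, which suggests why an archive assembled after the fact would have to draw on stored crawls rather than the live web.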

More on Google's book digitization program

Leslie Walker, Google's Goal: A Worldwide Web of Books, Washington Post, May 18, 2006. Excerpt:

It’s odd to hear Vinton Cerf, regarded as one of the founding fathers of the Internet, gush over ink-on-paper books. The electronic pioneer and computer scientist, who now works as Google’s chief Internet evangelist, is also a bibliophile...These days, Cerf is busy promoting Google’s plan to marry his two passions -- books and the Internet -- by digitizing millions of library books. He recently dropped by my office to explain the controversial plan and talk about its implications for book lovers.

As Cerf talked about his personal book collection and the limitations of having knowledge fixed on paper, he got me thinking about how reading will be transformed when static libraries join the more dynamic world of cross-referenced knowledge on the Web...."Think for a moment about the dead-tree problem," he said. "When you stand in your own personal library looking for something and you realize that A, you can’t remember which book it was in, and B, there’s no way you can go through manually looking at all the pages, then you think, ’God, I wish all this stuff was online.’ "...

Google is not alone in trying to digitize library books. Yahoo, Microsoft and other Internet players have joined a collaborative effort called the Open Content Alliance, which is planning to digitize not only library books but other types of multimedia, as well, making them all accessible on the Web....

Cerf thinks [the five] publishers [suing Google] fail to appreciate that Google probably will help them sell more books by making them searchable. Helping people locate a book and know what’s in it, he said, are key steps toward getting them to buy it. And for many books that are available for sale, Google provides links to Amazon.com and other online sellers. Google does not sell books.

For now, Google is showing no ads alongside search results involving books from libraries, only books provided by publishers. In those cases, publishers are receiving a share of the ad revenue. Google also recently announced it will soon allow publishers and copyright holders to sell full electronic access to books through Google book search, either by letting people read the text online or downloading copies. Google will take a 30 percent commission on any fees publishers collect.

What Google has not announced, but is likely to one day, are ways it might help publishers and authors enhance pages from printed books once they are online. Cerf refers to this as "books that talk to each other," an idea to make them more like the rest of the Web where pages are cross-linked and visitors can annotate and tag text as is done with Web logs. "Because the Internet is a computing environment, a software environment, it’s possible to create a much richer kind of information than what we are typically accustomed to in books," Cerf said. Digitized books, he said, can be searched and updated easily, linked to related material, and enhanced with audio and video. But they can also be changed, which means that the book you read a year ago may look different the next time you consult it.

More on govt agencies selling data rather than giving it away

Michael Cross, Companies House holds all the cards, The Guardian, May 18, 2006. Excerpt:
Companies House is one of the umpires of British capitalism. The agency, based in Cardiff, runs the official register of UK businesses and their shareholders. Its database is the first port of call for anyone checking corporate bona fides.

Unlike most umpires, however, Companies House also competes in the game it supervises. As it adopts new technology, the agency is moving up the value chain in a growing market for electronic business information. Other firms fear that the state monopoly's entrepreneurial zeal, driven largely by government policy, could put them out of business.

The case of Companies House illustrates some of the conflicts faced by government trading funds required to fund essential services by selling information-based services. Guardian Technology's Free our Data campaign argues the two roles should be separated in the interests of nurturing a knowledge economy....

Advertising the institutional repository

The University of Michigan libraries issued a kind of advertisement or press release last week to encourage faculty to deposit their research in Deep Blue, UM's OA institutional repository. Excerpt:

Your work: cited more, safe forever.

The University of Michigan has more than 150 years of experience and expertise in presenting and preserving the world's best research and creativity. With Deep Blue, the UM Institutional Repository, we now have a place specifically for our faculty work. Faculty create it, deposit it online, and decide who should have access. We take care of the rest, for free.

Use it to connect with other scholars: In a cross-disciplinary study, when compared to articles that require paid access, those in systems like Deep Blue "...have consistently more citations, the advantage varying from 25%-250%."[1]

Ask your librarian or send a message to deepblue@umich.edu to get started. For more information about Deep Blue, see http://deepblue.lib.umich.edu/about/

1. Based on a study of 1,307,038 articles published from 1992-2003 in biology, psychology, sociology, health, political science, economics, education, law, business, and management. (Hajjem, Harnad, and Gingras, "Ten-Year Cross-Disciplinary Comparison of the Growth of Open Access and How It Increases Research Citation Impact." IEEE Data Engineering Bulletin, Vol. 28 No. 4, December 2005, 8pp.)

More on Turkey's OA and IR Working Group

Bülent Karasözen, Ilkay Gürbüz-Holt, and Cem Coskun, Open Access and Institutional Repositories: Recent Developments in Turkey and SPARC, a presentation delivered at The Role of the Academic Libraries in the Preservation of Cultural Heritage: Institutional Repositories (Thessaloniki, May 8-9, 2006). Self-archived May 17, 2006.
Abstract: In this work, establishment of Ankos Open Access and Institutional Repositories Working Group, its goal, objectives and work; initiatives taken on open access in Turkey; cooperative work with SPARC are examined.

Helping authors retain the rights needed for OA archiving

John Ober, Facilitating open access: Developing support for author control of copyright, College & Research Libraries News, April 2006. (Thanks to Information Overload.) Excerpt:

Advancing the creation, dissemination, and preservation of knowledge is the nominally shared philosophy of all the stakeholders in scholarly communication systems, publishers included. Indeed it is the stated life’s work or mission of some, especially of scholarly societies. But increasing shareholder value, or, for many societies, supporting all of their good works on the backs of publishing revenues, too often trumps the philosophy. So the question becomes this: Through what logic and what mechanisms, if any, can copyright be managed to truly support that underlying philosophy?...

The need for global balance between monetary incentives to publish and the ensuing societal benefits is not much on the minds of scholars when they consider the terms of a publication agreement. But when scholars are, in fact, tempted to consider managing their copyright, what prompts them is the rationale that “retaining copyright can increase the amount of and the forms of dissemination of my scholarship, which leads to its greater use, impact, and resulting rewards.”

Libraries should be clear and honest about the logic of our advocacy, too, which seems to be: Faculty copyright retention is a necessary precondition for developing new forms of dissemination that (possibly) allow restructuring of some of the economic patterns to be more sustainable. Or, more bluntly, copyright retention and subsequent grants of use (might) reduce/remove (some) economic barriers to acquiring content for research/teaching....

The tool currently at the heart of scholars’ copyright management is the publication agreement/contract they sign with the publisher. It makes sense to encourage and guide authors to amend or replace the copyright language in those contracts or, perhaps even more effectively, to replace the entire contract with one designed around clear rights statements. Through their licenses, the Creative Commons has given us mechanisms to declare and attach rights to material. SPARC and the Science Commons are now extending that work by providing a model publishing contract addendum that leverages the directness and simplicity of the Creative Commons terms.  But it’s a tough slog to get these addenda and alternative publication agreements used. One response is to surround the model addenda with other copyright management infrastructure. Components of that infrastructure include, at the minimum:  [1] Extending our understanding of current faculty attitudes and behavior toward copyright; [2] A proactive campaign to educate and reach out to scholars, particularly one focused on their own self-interest in copyright management; and [3] Crucially, an explicit place to exercise the retained rights and provide unfettered access to scholarship, i.e., an institutional repository (IR) (or, failing that, assistance in depositing work in disciplinary repositories, such as PubMed Central, arXiv, CogPrints, and the like).

Turkish OA brochure

Turkey's Open Access & Institutional Repository Working Group has published a brochure on OA (in Turkish). One of its authors, Ilkay Holt, describes it on OA Librarian:

[The brochure will] guide researchers, academics, and librarians on what is open access; what is not open access; how to create open access: green and gold ways; open access and copyright; impact of open access and royalty free publication; why and how to support open access; open access in international context; towards to national open access movement and open access activities in Turkey. It is similar in format to the SPARC OA Brochure. The ANKOS OA Brochure will be distributed to the university administrations, libraries, and research centers in Turkey to create awareness on the subject.

The Academic Invisible Web

Dirk Lewandowski and Philipp Mayr, Exploring the Academic Invisible Web, a preprint self-archived May 17, 2006.
Abstract: Purpose: To provide a critical review of Bergman’s 2001 study on the Deep Web. In addition, we bring a new concept into the discussion, the Academic Invisible Web (AIW). We define the Academic Invisible Web as consisting of all databases and collections relevant to academia but not searchable by the general-purpose internet search engines. Indexing this part of the Invisible Web is central to scientific search engines. We provide an overview of approaches followed thus far. Design/methodology/approach: Discussion of measures and calculations, estimation based on informetric laws. Literature review on approaches for uncovering information from the Invisible Web. Findings: Bergman’s size estimation of the Invisible Web is highly questionable. We demonstrate some major errors in the conceptual design of the Bergman paper. A new (raw) size estimation is given. Research limitations/implications: The precision of our estimation is limited due to small sample size and lack of reliable data. Practical implications: We can show that no single library alone will be able to index the Academic Invisible Web. We suggest collaboration to accomplish this task. Originality/value: Provides library managers and those interested in developing academic search engines with data on the size and attributes of the Academic Invisible Web.

From the body of the article:

Library collections and databases with millions of documents remain invisible to the eyes of users of general internet search engines. Furthermore, ongoing digitization projects are contributing to the continuous growth of the Invisible Web. Extant technical standards like Z39.50 or OAI-PMH (Open Archives Initiative – Protocol for Metadata Harvesting) are often not fully utilized, and consequently, valuable openly accessible collections, especially from libraries, remain invisible....

There are different models for enhancing access to the AIW, of which we can mention only a few. The four systems to be described [in this article] have a common focus on scholarly information, but the approaches and the content they provide are largely different. [1] Google Scholar and [2] Scirus are projects started by commercial companies. The core of their content is based on publishers’ repositories plus openly accessible materials. On the other hand, [3] Bielefeld Academic Search Engine (BASE) and [4] Vascoda are academic projects where libraries and information providers open their collections, mainly academic reference databases, library catalogues plus free extra documents (e.g. surface web content). All systems use or will use search engine technology enhanced with their own implementations (e.g. citation indexing, specific filtering or semantic heterogeneity treatment)....

[T]he AIW is very large and...its size is comparable to the indices of the largest general-purpose Web search engines. Therefore, only a co-operative approach is possible. We conclude that existing search tools and approaches show potential to make the AIW visible. What we do not see is a real will for lasting collaboration among the players mentioned.

Comment. There's a lot here for friends of OA to think about. One lesson is that an OA article can still be invisible in the relevant sense (not indexed by all or most search engines) if it has no incoming links, if it's in a file format most search engines ignore, or if it's in a relational database for which access requires filling out an interactive form. Most OA content is visible in this sense, but not all of it is. We can do better, both by making existing OA content more visible and (of course) by making more content OA.

See my tips (co-written with Google) on how to facilitate Google-crawling of OA repositories and my tips on how to make visible OA content even more visible or discoverable.
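
For readers who want to see what exposing a repository through OAI-PMH (mentioned in the excerpt above) involves on the harvester's side, here is a minimal sketch. It only builds a standard ListRecords request URL and makes no network call. The base URL is a hypothetical placeholder and the helper name is my own; the verb and the argument names (metadataPrefix, set, from) come from the OAI-PMH protocol, and oai_dc is the Dublin Core format every OAI-PMH repository is required to support.

```python
from typing import Optional
from urllib.parse import urlencode


def oai_list_records_url(
    base_url: str,
    metadata_prefix: str = "oai_dc",
    set_spec: Optional[str] = None,
    from_date: Optional[str] = None,
) -> str:
    """Build an OAI-PMH ListRecords request URL (no request is sent).

    `base_url` is whatever endpoint a repository advertises; the one
    used below is a made-up example, not a real service.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        params["set"] = set_spec      # optional: restrict to one set
    if from_date is not None:
        params["from"] = from_date    # optional: records changed since this date
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # Hypothetical endpoint, for illustration only.
    print(oai_list_records_url("https://repository.example.org/oai",
                               from_date="2006-05-01"))
```

A harvester would fetch that URL, parse the XML response, and follow any resumption token to page through the rest of the records. A repository that answers such requests is much easier for academic search services to index than one reachable only through an interactive search form.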

New issue of AHDS Newsletter

The Spring issue of the AHDS Newsletter is now online. This issue has a list of UK digitization projects that will lead to OA collections, a list of OA collections recently deposited with the AHDS, and a very interesting report on the Arts and Humanities e-Science Support Centre, a project to promote the methods of e-Science in the arts and humanities.

Wednesday, May 17, 2006

New OA database on molecular imaging

MICAD (Molecular Imaging & Contrast Agent Database) is a new OA database from the NIH. For details, see today's press release.

New search engine for OA medical research

Healthnostics has launched MedBioAccess, a search engine for the Collection of Biostatistics Research Archive (COBRA), an OA repository from Bepress. For more details, see yesterday's press release.

More on the Eysenbach study

Sophie Hebden, Open-access research makes a bigger splash, SciDev.Net, May 17, 2006. Excerpt:

Scientific papers published in online journals that are open-access have a bigger impact and are cited more frequently than papers readers must pay for, according to a new study [by Gunther Eysenbach]. The findings will strengthen calls for more online scientific journals to switch to the open-access model and make research freely available. Journal subscriptions are too expensive for many scientists in developing countries, making open-access their sole means of keeping up to date with research in the rest of the world....

Gunther Eysenbach, a health policy specialist at the University of Toronto, Canada, monitored the number of times each of 1,500 papers published in the Proceedings of the National Academy of Sciences were cited in later studies. The journal has a ’hybrid’ publishing model, meaning that authors can choose to pay a US$1,000 fee to publish their papers for immediate free access on the journal’s website. All other papers become open-access six months after publication. Eysenbach found that open-access papers were twice as likely as other papers to be cited 4-10 months after publication. This increased to three times as likely 10-16 months after publication. More surprisingly, the study found that articles published as open-access from the start on had a higher impact than articles published as non-open-access, which researchers had ’self-archived’ on other websites. Eysenbach says this could be because few scientists search the Internet for an article if they have encountered problems viewing it on the journal website.

Subbiah Arunachalam of the M. S. Swaminathan Research Foundation in India says, however, that self-archiving is the best publishing model, even though many journals waive the open-access fee for developing country authors. "I believe that open-access archiving is a better option because it would allow us to achieve 100 per cent open-access more quickly," he said in an interview published on 10 May on technology journalist Richard Poynder’s web blog, Open and Shut?

New OA journal on the environment

Environmental Research Letters is a new peer-reviewed, open-access journal from the Institute of Physics. From today's announcement:

From climate change to waste management and renewable energy sources, environmental science has always covered issues that affect everyone.  Although specialist publications for the different branches of environmental science exist, there is still a need for a central source for all environmental research. The Institute of Physics recognises this and is launching Environmental Research Letters (ERL), the first open-access journal that will cover the whole of environmental science.

ERL will serve the entire environmental science community, including both specialist researchers and the wider public, by providing free access to wide-ranging content on topics extending across environmental science. The journal will offer a combination of research letters, commentaries, job and other advertisements, reviews and news items. ERL will be completely free to read online and published authors in the journal will be required to pay an article publication charge....

Tim Smith, publisher at Institute of Physics Publishing said “...We believe that ERL’s topical coverage together with the general public’s increasing awareness of issues relating to the environment makes open-access an appropriate publishing model for the journal. ERL will be primarily an online publication, but people will be able to subscribe to a print version in 2007. The first content will be available in October 2006.”

Maximizing ROI in research

The RCUK has published a brochure, Adding Value: How the Research Councils Benefit the Economy.

In the blurb on its what's new page, the RCUK says the new brochure "shows how the Research Councils work collectively and individually to get the best returns from their investments in research" (my emphasis).

Comment. I have no doubt that the Research Councils benefit the UK economy --indeed, the world economy-- through the research they fund. But they do not maximize the return on their investment in research until they provide open access to the results.

More on the Humboldt OA declaration

Richard Seitmann, Open Access: Mit Hochschul-Publikations-Servern aus der Zeitschriftenkrise, Heise Online, May 16, 2006. On the recent meeting of the German University Rectors Conference (Hochschulrektorenkonferenz, HRK) at Humboldt University Berlin and the supportive views of HRK's General Secretary, Christiane Ebel Gabriel, on OA and Humboldt's new OA declaration. (In German.)

PS: See my 5/12/06 blog posting on the Humboldt OA declaration.

Willinsky podcast on OA

Here's a 60-minute podcast of John Willinsky's keynote on OA at the conference, Learning Free of Boundaries (Okanagan, British Columbia, May 4-5, 2006). (Thanks to Jim Sibley.)

Ian Russell moves from Royal Society to ALPSP

Ian Russell, the Head of Publishing for the Royal Society, has been appointed the new CEO of the Association of Learned and Professional Society Publishers (ALPSP). Russell will take over on October 1. For more details, see today's press release.

Jan Velterop on FRPAA

Jan Velterop, On the bill, The Parachute, May 16, 2006. Excerpt:
The Cornyn-Lieberman Bill, a.k.a. the Federal Research Public Access Act or FRPAA, has evoked some strong reactions. Many - perhaps most - publishers are dismayed; many - perhaps most - open access advocates are delighted.  Yet I'm afraid I see the FRPAA as a bit of a dog's dinner. Fish nor fowl. The six months' embargo is a perilously short period of time for most publishers to recoup their costs via subscriptions. And it is useless as a stimulus to the development of sustainable open access publishing.

Of course, I know of assertions that a six months' embargo poses no threat to subscriptions; even that immediate open self-archiving is safe. The example ('evidence') invariably given is that of physics and the effect ArXiv has had on subscription [at least so far]. Evidence? Perhaps. But without a mechanism, or even hypothesis, that might possibly be seen as explaining the phenomenon. Not even like evidence that extremely diluted potions still seem to provide a cure for some diseases in some people. For that we have at least a hypothesis: placebo-effect. Even if publishers do not entirely reject the evidence, they simply cannot afford to bank on its broad and sustained validity. Hence, publishers' anxiety.

The only reason why there possibly is a six months' embargo in the Bill is a realisation that publishers need to be able to recoup the money they put into publishing. Given that Messrs. Cornyn and Lieberman realise this, it would have been better to require immediate open access and to acknowledge that publishing is part of doing research, and therefore the cost of publishing part of the cost of research, thereby stimulating publishers to seriously develop open access publishing models based on article processing charges....

The whole world of scientific and scholarly research benefits from having robust and reliably sustainable open access publishing structures. Politicians do, too, because society as a whole does, too. And yes, publishers do, too.

Comments. Just two quick responses.

  • I agree with Jan that FRPAA should have provided that the cost of (OA) publishing is part of the cost of research and hence that grantees may use grant funds to pay processing fees charged by OA journals. I listed this as one of three weaknesses in the bill in my article on it earlier this month.
  • However, I disagree with Jan that we have no hypotheses to explain why high-volume OA archiving in physics has not caused cancellations of physics journals and why (after more funder mandates) we may see the same phenomenon in other disciplines. Here are seven hypotheses. (1) ArXiv is mostly preprints, not postprints, and libraries want to provide access to postprints. (2) The CURES Act and FRPAA would mandate OA to the author's final version of the manuscript, not the published edition (which may contain extensive copy-editing and mark-up), and libraries want to provide access to the published version. (3) CURES and FRPAA allow a six-month delay before the OA edition must be released, and libraries want to provide immediate access. (4) CURES and FRPAA only mandate OA for articles based on research funded by certain agencies, virtually no journals limit themselves to those articles, and libraries want to provide access to all the articles in a journal, not just the subset funded by certain agencies. (5) CURES and FRPAA only mandate OA to research articles, not to review articles, editorials, letters, news, and other journal content, and libraries want to provide access to these other kinds of content as well as the research articles. (6) Some publishers report that OA archiving soon after publication actually increases subscriptions and submissions. (7) University libraries want to provide access to journals in which their own faculty publish (or, they are under pressure to do so even if they don't want to).

Dorothea Salo on the Eysenbach study

Dorothea Salo, That's the stuff, Caveat Lector, May 16, 2006. Excerpt:

So a couple months after the kerfuffle about how to explain citation advantages for open-access articles, a new study comes out saying “no, no, it really is the open-access advantage.”

This study went flat-out to deal with other explanatory possibilities. They stuck with articles in one journal (PNAS) to cancel out journal-prestige issues. They stuck with newly-published articles, to cancel out author-vanity effects. They multiply-regressed their data until the spreadsheets cried for mercy to account for career length and similar author-prestige measures (though I would like to see more tests on OA and non-OA articles by the same authors, just for fun).

And guess what. Even taking all that into account, there’s still a significant and measurable advantage for open access. Ha bloody ha with knobs on, as Bertie Wooster would say.

There’s one joker in the abstract for repository-rats, though (added emphasis mine): “Articles published as an immediate OA article on the journal site have higher impact than self-archived or otherwise openly accessible OA articles.”  I believe that, actually, especially for newly-published articles. It’s just plain easier to find an article via a publisher’s website than on the open Web....I also believe, though, that this advantage is likely to thin out over time; articles that have been out longer get found any number of ways, Google not least. And obviously the publisher’s website is less salient for journals that are widely available in article databases (which PNAS isn’t, I don’t think) or for disciplines that have one or two well-known and well-used disciplinary repositories.

Even so, we repository-rats have got to get busy on better open-access linking and discovery mechanisms. We’re cursed blessed with highly heterogeneous repository content, which makes us dubious search destinations; the chances that a given repository has an item of interest to a specific person is really pretty small. Metadata sharing, harvesting and metasearch perforce have to be the answer, but the current state of the art in harvesters marketh not disciplinary boundaries --which makes the search engines frustrating to use for real researchers trying to find work within their disciplines. And we have to mark peer-reviewed, published-elsewhere articles better. The way we do it now (we mostly don’t) is just plain bloody broken.

Charging readers, paying authors, in micropayments

Cell Science is a journal whose business model includes micropayments paid by readers and paid to authors. (Thanks to LIS News.) From the journal page on micropayments and royalties:

In May 2006 Cell Science became the first scientific Journal to introduce the payment of royalties to authors. At the crest of the emerging wave of ‘open access’ online Journals, Cell Science has marked its 2nd anniversary with the first introduction of a micropayment system to make scientific research more widely available, affordable and rewarding to authors. Although only a recent entrant into the global Scientific, Technical & Medical (STM) publishing market with an estimated annual turnover of $11bn (2004, Simba), Cell Science aims to transform the way in which scientific journals are funded to enfranchise authors and reduce the cost burden to readers and institutions....

[T]he new Cell Science initiative to pay authors royalties [encourage academics to publish in OA journals]. Given that the prices of scientific periodicals are currently often inversely related to their scientific impact, the introduction of micropayment models might provide a mechanism to allow the revenue of a journal to be coupled to its demand....

The problem is exactly how to remove the onus of hefty library subscription payments without merely offloading the financial costs of publishing onto the shoulders of the scientific authors themselves as is presently the case. It is all well and good to relieve the burden of publication costs from wealthy institutions, but requiring authors to pay for publication within open access journals is hardly a fair or ideal alternative, and this especially disadvantages those researchers from Africa, Eastern Europe, Central and South America....

So what alternatives are there to the traditional subscription-based and the pay-to-publish open access models within the Brave New World of the Internet? Apparently overlooked by the Wellcome Trust, amongst others, is a third publication model which answers many of the shortcomings of both the current models. The solution, first instituted by Cell Science, is a micropayment funded model, wherein a small revenue is generated every time an article or edition is accessed by an end user. In volume such micropayment royalties would cover not only publication costs, but also the payment of royalties and third party payment processing fees. Journals would rise and fall according to fairer market principles, with more popular Journals collecting higher revenues and paying out more in royalties to the most popular authors. Good research would be readily available, affordable and accessible, and would be paid dividends. Excellent research might conceivably even repay itself from publication revenues. Libraries would no longer have to pay vast sums to subscribe to bundles of journals that few people read....Authors would consider levels of royalty payments in their assessment of which Journal to publish in, and not solely impact factors or prestige. A truly competitive market for STM publishing could be created, with those Journals which offer less interesting fare or smaller royalties falling by the wayside....

Although its modus operandi is more akin to that of a co-operative than of a corporation, for the convenience of accounting and raising investment, Cell Science was founded in 2002 as a Private Limited Company with a capital outlay of no more than $30,000. Funded through limited advertising and micropayments, it was intended that its editors would be paid in the form of dividends and its authors in the form of royalties....

Cell Science has initially offered scientific reviews from leading International authorities, although from 2007 it is expected that the full publication of original research findings will commence. Cell Science is intended ultimately to provide a flexible level of publication for a wide range of scientific correspondence, from articles to short dispatches. In addition scientific correspondence, in support or contradiction of papers previously published, will be welcomed, as debate is the engine of scientific advance. It will be interesting to see how Cell Science fares within an intensely competitive market place, and whether it will successfully form the crest of a wave of economic change within the well-heeled and comfortable enclave of scientific publishing.

Comments. Here are some first thoughts on this interesting model.

  1. I hope Cell Science implements this plan. Creative thinking about business models and real-world experimen