Open Access News

News from the open access movement


Saturday, June 09, 2007

UK govt report calls for OA to more UK public data

Call to open up public data use, BBC News, June 8, 2007.  Excerpt:

...Commissioned by the Cabinet Office, the report [The Power of Information by Tom Steinberg and Ed Mayo]...said that...much more could be done to open up access to official information.

It said public data should be published in open formats to encourage use....

It noted the many cases in which public sector information already generated huge amounts of business in the UK....

The benefits of making it easier to use and share official information were not just financial, said the report.

For example, in Los Angeles sharing reports about the food safety records of restaurants has led to a significant drop in food poisoning cases....

But the report took the government to task for not putting in place policies that make the most of web opportunities.

The authors recommended that the government work more closely with existing sites and communities that share official aims; do more to help innovators use public data and work to ensure people know what to do with public data and how to get at it.

Among 15 specific recommendations the report said the government should not set up its own sites if existing web communities do a good job of getting information to people.

It also said it should speed up efforts to put data in open formats and publish under terms that let people freely use it....

Open courseware at MIT and beyond

Kim Thomas, MIT sets learning free, Information World Review, June 4, 2007.  Excerpt:

In the late 1990s, when everybody wanted to take advantage of the money-making opportunities offered by the internet, Massachusetts Institute of Technology (MIT) decided that it, too, wanted a slice of the action. MIT was, and still is, one of the most prestigious universities in the world. Couldn’t it use some of the intellectual property it was creating on its campus to generate some additional revenue?

A committee of faculty members looked at the issue but decided, after careful consideration, that the internet didn’t offer much of an opportunity to make money, after all. Why not, the committee suggested, focus instead on the university’s core mission: “to advance education and serve the world”?

That refocus led MIT to a radical new proposition. In the words of Anne Margulies, now executive director of MIT Open CourseWare (OCW), “They decided that the best way the internet could be used to fulfil that mission would be to give the materials away.” ...

In 2002, MIT launched its Open CourseWare site with materials from 32 courses in 16 academic departments. This has since expanded to include materials from 1,550 courses, including not just course notes but audio and visual materials too. By the end of this year, material from all 1,800 courses run by MIT will be available online.

According to Margulies, the facility has proved immensely popular. In January this year, the site received 1.5 million visits from all over the world, 60% of which came from outside the US, with India and China the biggest users. MIT is now partnering with other organisations to translate its OCW materials: about 400 courses are currently available in other languages besides English.

Where MIT has led, the rest of the world has been keen to follow. There are now 60 higher education institutions offering open courseware programmes, including several universities in the US, China and Europe. These institutions, as well as around 60 more that are at the planning stage, have formed the Open Courseware Consortium to share ideas and experiences....

The most striking feature of the open courseware movement is that it requires a fundamental switch in institutions’ approach to information, one shared by the open access movement in scholarly publishing. You might expect academics to be keen to protect and copyright course materials that are the product of years of research and teaching experience. Yet all members of the open courseware consortium make their course materials available under a Creative Commons licence, in which people can use the materials freely, provided they accredit the institution as the source of the material and do not use it for profit....

Perhaps it is this global reach of the open courseware movement that offers the most radical challenge to the traditional localised method of delivering education. Some of the Open Courseware Consortium’s members are experimenting with new models. Universia, for example, is a collaboration between a number of Spanish and Latin American universities, funded by the Bank of Santander. Its country-specific web portals, which offer information about higher education, such as available courses and grants, already attract six million visitors each month. It was ideally placed, therefore, to provide open courseware, and for the past four years has been translating MIT’s open courseware into Spanish and putting it online. As a result, many of the member universities, says Pedro Aranzadi, Universia’s managing director, saw the benefits of putting their own courseware online, and Universia has just launched 10 open courseware sites supplied by Spanish universities....

“If you’re dealing with a knowledge society, the best thing you can do is make knowledge freely available,” Kirschner says.

Call for papers on OA

OCLC Systems & Services has posted a call for papers for an upcoming issue devoted to OA.  Proposals are due by July 1, 2007.

Elsevier opens IJSS backfile after two-year embargo

Elsevier's International Journal of Solids and Structures has decided to make its back issues OA after a two-year embargo.  Details from Zhigang Suo on the iMechanica blog:

At a meeting of the Editorial Board of IJSS, on Sunday, 3 June 2007, in Austin, Texas, the representatives from Elsevier, the publisher of IJSS, told the members of the Board that all articles published in IJSS will be freely accessible 24 months after publication.  The first of these articles will become available in October 2007....

All articles published in IJSS, dating back to Volume 1 in 1965, are available as part of the engineering backfile package.  This package is a one-off purchase with no annual fee....

Open access is likely the future mode of publishing.  The question is how we get there....

OA at the U of Oldenburg

There's a short note about OA at the University of Oldenburg in the new DINI report, Changing Infrastructures for Academic Services: Information Management in German Universities (undated but announced this week).  Even though the report is online under a CC license, the PDF is locked (why?) and I only have time to rekey this paragraph from p. 347:

The strategic goal of the University of Oldenburg to extend its research profile requires [it] to disseminate the research results into the scientific communities in the best possible fashion by documenting and, after quality assurance, publishing them. IBIT supports scientists and researchers by providing a platform and services for electronic publishing to make their works openly accessible worldwide. The University of Oldenburg commits itself to the international movement for open access to scientific and scholarly information.

PS:  This is the first I've heard of an OA policy at Oldenburg.  Does the word "requires" in the first sentence refer to a publish-or-perish policy or to an OA mandate?  If the former, then what form does Oldenburg's commitment to OA take?  If anyone can shed light on this, please drop me a line or post a note to SOAF.

Open data on real-time web activity

Akamai has opened up its real-time web monitor, formerly restricted to paying customers.  The monitor shows real-time traffic volume, latency times, and attack frequencies around the world.  (Thanks to Glyn Moody.)

More on Nature Precedings

Pedro Beltrão, Nature Precedings, a pre-print server for biomedical research, Public Rambling, June 7, 2007.  Excerpt:

...I have been participating in the beta for some months now and as it is mentioned in the editorial it will be openly available starting next week. All documents are citable (have DOIs), are not peer-reviewed (in the formal sense) and are archived under a Creative Commons license (derivatives allowed). The site has the community features (tagging/commenting/rating/RSS feeds) that you would expect and that will hopefully allow for requesting and providing comments on early findings. In summary, a nicer version of arXiv for biomedical research.

I think this is great news that serves on one hand to improve access to research (open access by pre-print archiving) and increase the openness of research. This can provide a place for independent time-stamping of early findings and could be improved (hopefully with community feedback) until it is appropriate for formal submission to a peer-reviewed journal.

A framework for open science (in biology) can now go from blogs/wikis to pre-print server to peer-reviewed journals. Many ideas might die along the way and many collaborations might form by connecting early findings in an unexpected way.

Of course if you are in maths/physics you have arXiv and you are probably wondering what is taking us biomedical researchers so long to get into this.

Update. Also see Bryan Vickery's thoughts on Nature Precedings, mixed with his reminiscences of Elsevier's now-defunct Chemistry Preprint Server. Excerpt:

Just like the CPS, submissions will be screened for appropriateness, but no judgement will be made by the internal editors about the quality of the work. This is the job of the community.

The next step is, obviously, to allow authors of research articles posted to Nature Precedings to submit these directly to a peer-reviewed journal, and for that journal to be able to link directly back to previous versions of the manuscript.

PhysMath Central, from BioMed Central, does exactly that. Authors can upload their arXiv manuscript directly to the online system. See Chris' recent blog posting "How to submit your arXiv manuscript to PMC Phys A."

Nature is certainly taking what Web 2.0 has to offer, and making it reality. I wish them luck.

Nature launches three OA resources

Community service: Introducing three free-access websites for research networking and outreach, Nature, June 7, 2007.  Excerpt:

...[Nature's dual mission "to help scientists communicate with each other and to communicate science to wider audiences"] applies to two websites to be launched this week: Nature Reports Climate Change and Nature Reports Stem Cells. Aimed at researchers and at anyone else who is interested, both give an editorial perspective of their fields through a combination of original journalism and commissioned comment, alongside archived material from other Nature publications. Both sites also facilitate community interactions through blogs....

These sites will develop further by way of community interactions and applications in the coming months. The original content of both is freely accessible.

Also free is a very different website to be launched next week: Nature Precedings. As its title implies, this site will enable researchers to share, discuss and cite their early findings. It provides a lightly moderated and relatively informal channel for scientists to disseminate information, especially recent experimental results and emerging conclusions. In this sense, it is designed to complement traditional peer-reviewed journals, allowing researchers to make informal communications such as conference papers or presentations more widely available and enabling them to be formally cited. This, in turn, allows them to solicit community feedback and establish priority over their results or ideas.

Intended to cover biomedicine, chemistry and the Earth sciences, the site will host a wide range of research documents, including preprints, unpublished manuscripts, white papers, technical papers, supplementary findings, posters and presentations. All submissions will be reviewed by staff curators and accepted only if they are considered to be legitimate scientific contributions of likely interest to others in that field. No judgement is to be made about the quality or uniqueness of the work, and submissions are not subjected to peer review before they are released. Because of this, accepted submissions will usually be published within one working day, and no charge is made to either authors or readers.

Nature Precedings will make full use of participative features such as tagging, voting and commenting to facilitate the discovery of especially interesting and relevant content. We anticipate that the content will be mirrored by academic partner organizations, several of whom have been involved with us in developing this service. As well as allowing it to become incorporated into the substantial information hubs already provided by these organizations, this federated approach will also help to ensure the long-term availability of the content — and act as a practical guarantee of the Nature Publishing Group's pledge not to charge readers for access.

PS:  Kudos to Nature for adding these OA resources to its lengthy list of earlier OA projects and experiments.

Update. See Nature's June 8 press release on Nature Precedings:

...It is anticipated that the content will be mirrored at one or more partner organisations. This federated approach will ensure the long-term availability of the content, and effectively guarantee that the service will remain free and open....

Graham Cameron, Associate Director of the EBI, said, "This is a great step forward in the open sharing of the findings of science. It will...facilitate connections to our databases and allow the application of our state-of-the-art text-mining tools."

"Science progresses through the open exchange and reuse of ideas and data, but within a system that provides proper credit for their originators," said John Wilbanks, Executive Director of Science Commons. "Creative Commons licenses can help to achieve just that, and we are delighted they have found yet another scientific use in Nature Precedings." ...

Representatives of [the project partners, NPG, the British Library, the European Bioinformatics Institute (EBI), Science Commons, and the Wellcome Trust] will form a Precedings Advisory Committee, where they will be joined by a group of senior practicing scientists....

Update. Also see the Science Commons blog post on Nature Precedings. Excerpt:

This is the biological equivalent of the physics arXiv, but with a critical improvement. Placing pre-prints online solves the problem of an individual’s ability to access an article. But in the absence of an explicit copyright license, it’s unclear what that individual can actually do with the downloaded file. Nature’s choice to use CC-BY is a validation of the need to grant rights in advance to users, and of the CC-BY license in a truly Open Access service.

What to deposit where

Stevan Harnad, The University of Leicester Archive -- and What to Deposit Where, Open Access Archivangelism, June 8, 2007. 

Summary:  Prof. Andrew Colman of the University of Leicester inquires about (1) whether to deposit the publisher's PDF or the author's final refereed draft, and (2) whether a central archive would be more visible and accessible than an Institutional Repository (IR) like the Leicester Research Archive. The reply is quite straightforward: 

The default version to deposit is the author's final refereed draft, as it is the one with the fewest publisher restrictions and it fulfills all the usage needs of all (otherwise access-denied) users; the publisher's PDF should only be deposited if the publisher agrees. 

The locus of the deposit should definitely be the author's own IR. In the OAI-interoperable era, the distributed IR metadata are harvested by central harvesters. Creating and depositing in separate central archives for each discipline and combination of disciplines is not the coherent and systematic way to ensure that all institutional research output is made OA. 

The Leicester Research Archive's policy -- deposit the author's postprint, except if the publisher explicitly allows the proprietary PDF to be deposited, and deposit in the author's own IR, rather than central repositories -- is hence correct, but if the IR does not wish to remain near-empty, as now (only 320 deposits in its first year), deposit needs to be mandated, not just invited. I strongly recommend adopting the Immediate-Deposit/Optional-Access (ID/OA) Mandate....
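
The harvesting model described above, in which deposits stay in each author's own IR and central services collect only the metadata, can be illustrated with a short sketch. It assumes the standard OAI-PMH `ListRecords` verb and the `oai_dc` metadata prefix; the repository URL, record identifiers, titles, and function names are hypothetical, and the sample responses are built inline rather than fetched, so this is an illustration of the idea and not any particular harvester's implementation.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespaces defined by OAI-PMH 2.0 and Dublin Core.
OAI_NS = "http://www.openarchives.org/OAI/2.0/"
NS = {"oai": OAI_NS, "dc": "http://purl.org/dc/elements/1.1/"}


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build the ListRecords request a harvester would send to one IR."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def sample_response(identifier, title):
    """Return a minimal, hypothetical ListRecords response for one record."""
    return (
        '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">'
        "<ListRecords><record>"
        f"<header><identifier>{identifier}</identifier></header>"
        "<metadata>"
        '<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" '
        'xmlns:dc="http://purl.org/dc/elements/1.1/">'
        f"<dc:title>{title}</dc:title>"
        "</oai_dc:dc></metadata>"
        "</record></ListRecords></OAI-PMH>"
    )


def parse_records(xml_text):
    """Extract (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(xml_text)
    pairs = []
    for record in root.iter(f"{{{OAI_NS}}}record"):
        identifier = record.findtext(
            "oai:header/oai:identifier", default="", namespaces=NS
        )
        title = record.findtext(".//dc:title", default="", namespaces=NS)
        pairs.append((identifier, title))
    return pairs


def harvest(responses):
    """Merge metadata from several distributed IRs into one central index."""
    index = []
    for repository, xml_text in responses.items():
        for identifier, title in parse_records(xml_text):
            index.append(
                {"repository": repository, "identifier": identifier, "title": title}
            )
    return index


if __name__ == "__main__":
    # Hypothetical endpoint; no request is actually sent.
    print(list_records_url("https://ir.example.edu/oai"))
    responses = {
        "ir-a": sample_response("oai:ir-a.example.edu:1", "Hypothetical postprint A"),
        "ir-b": sample_response("oai:ir-b.example.edu:7", "Hypothetical postprint B"),
    }
    for entry in harvest(responses):
        print(entry)
```

The point of the design is that the central index holds only pointers and descriptions: each full text remains in the institution's own archive, which is why a local deposit loses nothing in visibility.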

Plans for a Canadian PMC

Dean Giustini, PubMedCentral Canada? Open Medicine blog, June 4, 2007.  Excerpt:

...According to Open Medicine sources, CISTI and the Canadian Institutes of Health Research (CIHR) are in discussions regarding the creation of a Canadian version of PubMed Central International (PMCI).  A vision for this initiative includes building a platform and services for peer-reviewed Canadian health research to form a fully distributed, international open repository. Its launch will coincide with the implementation of CIHR's Policy on Access to Research Outputs.

CISTI and CIHR welcome a dialogue with organizations interested in partnering on this initiative.  The fit with organizational mandates and/or objectives, as well as any unique and relevant expertise would be starting points for discussion....

Notes on the SSP conference

An anonymous blogger has posted some notes on the SSP conference, Imagining the Future: Scholarly Communication 2.0 (San Francisco, June 6-8, 2007).  The post on Day One is password protected.  Excerpt from Day Two:

...I also spoke with Melody Merin, Managing Editor of the news magazine of the American Association of Pharmaceutical Scientists.  AAPS is a small (13,000 members, <50 staff) organization headquartered in Arlington, VA.  They produce three journals, all open access.  Melody is in the process of developing the business case for taking their news magazine online.  She is interested in hearing ideas, recommendations, pitfalls, lessons learned, etc....

Morning Keynote

Dr. Larry Sanger, founder and executive of the Citizendium project, offered insight into alternative forms of collaboration between scientists in support of scholarly publishing.  Citizendium takes Wikipedia to the next level by adding “gentle expert oversight” and requiring contributors to use their real names.  As of June 5, 2007, Citizendium contains approx. 2,000 articles (only 20 of which are actually “approved”) produced by 1,700 authors and managed by 24 editors.

Following are some key bullets from his rather informative presentation:

  • “Academic publishers can survive in the new digital era only if they become something other than academic publishers.”
  • Lessons learned from previous initiatives to build online collections: ...
    • Use an open content license - content should still be there, even if the managing organization goes under, content will survive....

Dr. Sanger suggests four possible business models for selling “free” information:

  1. Advertising - increasingly lucrative but still some ethical concerns; however, consider that the old model (newspapers, TV) also includes advertising which has always been acceptable; make sure not to compromise content.
  2. Pay-to-Play - the contributor pays the publisher for the right to participate; not a popular model.
  3. Premium content - offer most base content for free, but charge price for deep, premium content.
  4. Patronage or “content brokering” - publisher is the middle man and is connected with large network of scholars, institutions, etc.; publisher solicits donations from individuals and institutions to sponsor the creation and maintenance of content; many librarians have tried this before but in vain; it is hard to get funding established to begin with but even harder to keep it coming for ongoing maintenance of the content; may want to consider tapping into foreign audiences....

Comment.  I'm glad to hear an AAPS insider say that all three of the AAPS journals are OA.  But the AAPS web site says that only two of its journals (AAPS Journal and AAPS PharmSciTech) are OA and that access to the third (Pharmaceutical Research) is limited to members of the society.  I'd welcome clarification of this; but either way, I applaud the AAPS for its commitment to OA.

A blog on OA in ophthalmology and dermatology

OATES : Open Access To Eye and Skin is a new blog by Sara Kuhn.  From the site:

A subject guide to open access scholarly publishing in ophthalmology and dermatology....

The Open Access to Eye and Skin blog is meant to assist scientists, doctors, researchers and librarians in finding, publishing, and discussing scholarly open access research in the global fields of ophthalmology and dermatology (as well as various cancers related to these fields). Please feel free to comment on the resources and on the state of open access in eye and skin research. This blog is for you.

PS:  I love the fact that OA is now big enough to call forth specialized resources like this one.  I also love the way Sara has posted TOC feeds from the major OA ophthalmology and dermatology journals directly to the blog sidebar -- something that more general resources (like OAN) could never do.  Every research niche should have a blog like this one.

Need for informal communication to supplement formal publication

Aaron Rowe, New Stem Cell Journal in Print this Month, Wired Science, June 8, 2007.  Excerpt:

Cell Stem Cell [coming this summer] is a big deal because it's published by Cell Press, a highly respected family of journals often cited in the media....

What disappoints me about the new Cell Stem Cell website is that it does not contain any message boards, a feature that open access journals like Chemistry Central have included from their inception. This would allow scientists to ask the authors of a paper questions, the research community to add conclusions that the authors did not think of, and the public to react to the societal implications of the work. Chemists and biologists have lagged far behind engineers, legal scholars, and politicians in embracing blogs, wikis, and message boards. This is partially the fault of the publishers that have a stranglehold on scientific communication and little motivation to change.


Friday, June 08, 2007

22 Iranian OA journals

The Tehran University of Medical Sciences (TUMS) publishes 22 journals, all of them OA.  (Thanks to Caroline Sutton.)  From the TUMS Publications page:

...[A] simple...online system for TUMS journals was established by TUMS Central Library....This system permits researchers to have Free Access to all of full text articles published by TUMS journals....

Most of these journals are abstracted and indexed in famous international databases. Some of these are published in English and some in Farsi language with English abstract....

New OA journal on public health communication

Cases in Public Health Communication and Marketing is a new peer-reviewed OA journal from the George Washington University School of Public Health and Health Services.  It's edited and run by graduate students at GWU's Public Health Communication & Marketing program.  The inaugural issue is now online.  (Thanks to R. Craig Lefebvre.)

More on OA for ETDs

Peter Murray-Rust, Electronic Theses (ETD2007), A Scientist and the Web, June 8, 2007.  Excerpt:

I am honoured to be asked to speak at the meeting next week in Uppsala on electronic theses (The Power of the Electronic Scientific Thesis)....Some snippets:

Yet our own work in the SPECTRa project has shown that 80% (or more) of scientific data is never published….Electronic theses have the power to change all this. The thesis has several major advantages over current methods of publication

  • the author and/or its institution retain complete control over the copyright of the work and are not forced to hand it over to the publisher
  • there is a strict quality control system of internal and external examiners. The candidate has to convince them that the data are fit for purpose.
  • the student cannot be “lazy” about the means of authoring. If a university insists on XML then the student will have to do it.
  • an electronic thesis can (and I argue must) be openly available in an institutional repository.
  • an unlimited amount of supporting data can be copublished.

There are technical and socio-political barriers....

My utopian vision is that students prepare their thesis in XML....

I also prepared a “manifesto” for the JISC meeting - it overlaps with the rules but adds

  • Theses must be born-digital (i.e. NOT PDF)
  • Domain ontologies must be used
  • All data must be included in theses
  • Data must be validated before submission
  • Theses must be openly exposed to data and metadata crawlers...

My interpretation of Open Access is strict BOAI (Budapest Open Access Initiative):

By “open access” to this literature, we mean its free availability on the public internet, permitting any users to read, download, copy, distribute, print, search, or link to the full texts of these articles, crawl them for indexing, pass them as data to software, or use them for any other lawful purpose, without financial, legal, or technical barriers other than those inseparable from gaining access to the internet itself.

Unfortunately it is common practice for many at the JISC meeting to talk of “Open Access” when they mean “Toll Free”. I asked several organizers of thesis repositories specifically whether my robots could download these “Open Access” theses, text-mine them, and publish the results. In all cases I was told that for existing theses this was not allowed. However most agreed that born-digital theses had the opportunity for authors to make their theses fully Open.

The single most important rule, therefore, is that authors should be very strongly encouraged to make their theses fully Open under the BOAI and given the technical and legal tools to do so....

PS:  Hear, hear.  For a supporting argument, see my article from last July, Open access to electronic theses and dissertations.

MRC data access policy

The UK Medical Research Council (MRC) has added a data access policy to its larger open access policy.  Excerpt:

Every year, the MRC invests around £500 million of public money in research, the primary output of which is data. We need to make better use of the research opportunities that such a diversity, richness and quantity of publicly-funded research data provide. One of the best ways of achieving this is to ensure that data are properly preserved for sharing and informed use beyond the originating research teams....

MRC data sharing policy and access principles promote new and extended use of MRC-funded data for high-quality, ethical research. Responsible sharing of data allows testing of new hypotheses and analyses, linkage and pooling of datasets, and validation of research findings. These activities not only reduce duplication of data creation but also enhance the long-term scientific value of existing data. This benefits the wider research community and generates new opportunities for advancement towards the longer-term goal of improving human health.

Below are general principles governing access to, and use of, all MRC-funded research data.

1. MRC research data are publicly-funded and, as a public good, must be made available for new research purposes in a timely, responsible manner.

2. Governance of researcher access to MRC-funded research data must balance the interests of data creators, custodians, users and data subjects.

3. Access policies and practices for individual MRC-funded datasets must be transparent, equitable, practicable and provide clear decisions consistent with MRC data sharing policy.

4. Access to, and use of, MRC-funded data must comply with statutory and other regulatory requirements, and good research practice....

Study-specific data access arrangements must embody the four principles and may draw on the supporting guidance provided here as required.

General guidance on implementing MRC data access principles sets out key elements of good practice in key areas:

General guidance for all MRC-funded research data ...

To complement MRC Access Guidance, the following will soon be available:

• MRC Guides on Curation of Research Data

• MRC Data and Tissues Toolkit

• Data sharing documents – example material...

Comparing PublicationsList and the Depot

The blogger behind Disruptive Library Technology Jester has written a detailed comparison of PublicationsList and the UK Depot.  (Thanks to Charles Bailey.)

Lorcan Dempsey on the CIC-Google deal

Lorcan Dempsey, Systemic change: CIC and Google, Lorcan Dempsey's weblog, June 6, 2007.  (Thanks to Charles Bailey.)  Excerpt:

Today Google and CIC announce an agreement to digitize ten million volumes across the CIC libraries....The CIC announcement is interesting for several reasons:

  • It is a shared effort across a major group of libraries with significant collections. There appears to be strong CIC institutional commitment. Of course, CIC has a history of collaboratively sourced activities and this 'pooling' model makes increasing sense given the necessary policy and service challenges that need to be addressed....For some things, scale matters.
  • The libraries have a shared approach to managing the digital copies based on shared infrastructure at the University of Michigan, and serving them up to their user communities....
  • Google recently advertized for somebody to work on collection development and we seem to be seeing a stronger focus in this area. Collecting areas of importance within each library [pdf] have been identified for attention.

This initiative in turn prompts some more general thoughts about access:

  • One of the most valuable features of the Google initiative is that it digitizes book content, allowing fine-grained discovery over topics, people, places and so on. Of course this presents interesting questions about indexing, retrieval, ranking, and presentation but the advantage of having this access seems clear. It drives use and sales, and it supports enquiry. Without it, the book literature is less accessible than the web literature.
  • However, as we are beginning to see on Google Book Search, we are really going beyond 'retrieval as we have known it' in significant ways. Google is mining its assembled resources - in Scholar, in web pages, in books - to create relationships between items and to identify people and places. So we are seeing related editions pulled together, items associated with reviews, items associated with items to which they refer, and so on. As the mass of material grows and as approaches are refined this service will get better. And it will get better in ways that are very difficult for other parties to emulate.
  • Currently this material is made available within the Google destination site. Google is an advertizing engine and its approach depends on aggregating attention for adverts. This approach may be difficult to deploy within a more 'data services' approach where others - especially the partners - have remixable access to content and services. However, the 'utility' value of this resource will be diminished if it is not made available in this way....(See the related discussion about the search API.)
  • This type of access seems especially important for the partner libraries. In the early days of this activity there was some discussion of the types of services which would be built on top of the digitized books by the libraries. However, it is difficult, and maybe not very sensible, for the libraries to individually invest in some types of service development. An important factor here is that they cannot benefit from the network effects that arise in larger collections and so are limited in the range of service that they could individually develop. This points again to issues of collaborative sourcing.

For me, the CIC announcement moves the conversation about mass digitization to another level. The Google relationship with libraries has seemed like an interesting initiative. But it now seems plausible to think that we are looking at systemic change in how we engage with particular classes of material. Which in turn will cause us to look at the way in which the systemwide library resource is organized. It touches on so much.

  • Disclosure, discovery, delivery.....
  • Collective collection....
  • Copyright....
  • Knowledge organization....
  • Preservation....

OA book on OA self-archived

Last year's book edited by Giandomenico Sica, Open Access, Open Problems (Polimetrica, 2006), was not only OA from birth, thanks to the publisher, but has now been self-archived in E-LIS, also thanks to the publisher.  The book contains essays by Antonella De Robbio, Takashi Kunisawa, Derek Law, Paul Uhlir, and myself.

PS:  Kudos to Polimetrica.  I applaud OA publishers who archive their works in independent OA repositories, assuring authors and readers alike that the works will remain OA no matter what happens to the publisher.

Old dogs, new tricks

A post from Read Doug's Mind:

I was speaking with a History prof the other day and he said one of those things that makes me happy.  We were discussing the availability of articles online (and the open access publishing business), and he made the comment that Google had revived a number of his old articles.  Seems that these articles had pretty much run their course as far as being used through the conventional indexes, but with the advent of google searching for such things (I assume that google scholar was working for him), people were once again finding his work.  His only regret was that the titles of his old stuff didn’t have very google-friendly titles. It is always amazing to hear such comments from the very people who claim that the Internet, and new modes of research, mostly reveal how much crap is out there.


Thursday, June 07, 2007

Internet & Society 2007 podcasts

Podcasts of the major talks at the Berkman Center's Internet & Society 2007 (Cambridge, Massachusetts, May 31 - June 1, 2007) are now online.

PS:  I was scheduled to facilitate a session with Stuart Shieber on what universities can do to promote OA, but fell ill and couldn't make it.  I thank Stuart for taking the session alone on short notice.

OA plus print-on-demand for Emory rare books

Emory University is digitizing thousands of its rare public-domain books and will make them available in OA and print-on-demand editions.  The Emory library houses more than 200,000 public-domain books.  For details, see yesterday's press release.  (Thanks to Charles Bailey.)

More on Innocentive

Tracey Caldwell, R&D finds answers in the crowd, Information World Review, June 4, 2007.  Excerpt:

With it taking anything up to 15 years and the help of a multimillion-pound budget to bring a single product on to the market, pharmaceutical and biotech companies are understandably eager to ensure that their scientists receive the best R&D support possible. But as information professionals and researchers in R&D know only too well, the solution to a research problem cannot always be found in-house....

A crowdsourcer is a business that has created a global, web-based scientific community whose scientists and professionals can be challenged to solve other companies’ R&D problems. So far, chemicals and life sciences have been the main users of crowdsourcers, offering rewards of up to $1m if they are successful. Innocentive, set up by drug giant Eli Lilly in 2001, is one such crowdsourcer, and other sites, such as Nine Sigma and Yet2.com offer similar models.

There is no doubt that crowdsourcing has resulted in solutions to problems that would not have been found otherwise, but this is an evolving model that has to work hard to address concerns about commercial sensitivities, intellectual property rights and scientists’ need to build reputations, even open access to scientific collaboration....

Companies that regularly post challenges include Boeing, Dow Chemical, Eli Lilly, and Procter & Gamble.

In a report, entitled “The Value of Openness in Scientific Problem Solving”, Karim Lakhani, a lecturer in technology and innovation at MIT, found that the broadcast of problematic information to outside scientists resulted in a 29.5% resolution rate for scientific problems that had previously remained unsolved....

But there is a fear that financial incentive-based initiatives could end up diverting open scientific collaboration into closed communications....

Comment.  OA research already takes full advantage of Linus' Law that, given enough eyeballs, all bugs are shallow.  Non-OA research must settle for a lesser degree of this problem-solving power through limited and strategic sharing with motivated problem-solvers.  As organized by Innocentive, it works well enough to attract some major corporations.  But the better it works, the more we should remind ourselves how much better unfettered sharing can work.

Federated searching of US government OA databases

Drew Robb, Exploring the deep web, GCN, June 4, 2007.  Excerpt:

For the past decade, the Energy Department’s Office of Scientific and Technical Information in Oak Ridge, Tenn., has been using the Internet to speed research processes.

“When we first started posting information on the Web in 1997, we relied on search engines provided by the database vendors,” said OSTI Director Walt Warnick. “It soon occurred to us that it would be helpful to provide our patrons with the ability to search across multiple databases at one time.”

That led the agency to install federated search software....In April 1999, OSTI launched the EnergyFiles site, providing access to over 500 DOE databases and sites. That was followed in 2002 by Science.gov, which allows a single query to pull data from 30 scientific research databases at 12 federal agencies. February 2007 saw the release of Science.gov 4.0 with greatly enhanced relevance ranking. OSTI is now working to expand the system to include government research sites worldwide....

Google may dominate the search market, but it has two major shortcomings. The first is that it barely accesses what is known as the deep Web....“In 2000/2001 we did some analysis and realized that the quantity of documents from these deep-Web databases was far bigger than what everyone was calling the Internet,” said Jerry Tardif, vice president at search firm Bright Planet.  Tardif estimated that the deep Web is several hundred times the size of the surface Web....Others give a lower figure....But whatever that size, if you are only using Google or Yahoo, you are missing most of what is out there.

“Google makes search look simple, but in fact, search is not simple, particularly when completeness is important,” said David Fuess, a computer scientist at Lawrence Livermore National Laboratory’s Nonproliferation, Homeland and International Security (NHI) directorate.

The other problem is information overload. Public search engines may be fine for locating a hotel in Singapore, but not for professional research.

Federated search engines address both of these problems....

 “Science.gov is mostly [research and development] findings,” Warnick said....[I]t gives searchers in-depth access to research papers from CENDI (originally the Commerce, Energy, NASA, Defense Information Managers Group), an interagency working group of senior scientific and technical information (STI) managers from a dozen agencies, including DTIC, the National Agricultural Library, the National Library of Medicine and the National Science Foundation. Together, CENDI members control more than 95 percent of the federal R&D budget, so accessing their databases provides a near-comprehensive overview of federally funded research. OSTI also hosts several other federated search sites including E-Print Network and Science Accelerator.

DTIC has its own federated search engine — STINET (Science and Technical Information Network) Federated Search — specializing in providing research information to the Defense Department community.

“Our customers wanted to come to a single site and search for scientific information from both the DTIC and our sister organizations in other federal agencies,” said Ricardo Thoroughgood, chief of the STINET Management Division. “Initially, it was an internal DOD resource, but we shut down that site and made it available to the public with all unclassified and unlimited information, so that data is readily available to the public through the STINET databases.” ...
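The federated search idea in this excerpt (one query sent to several databases, results merged into a single ranked list) can be sketched in a few lines. This is only an illustration under loud assumptions: the two in-memory "databases", their records, and the term-count scoring are all hypothetical, and nothing here reflects how Science.gov or STINET is actually implemented.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "databases"; real sources would be remote search services.
SOURCES = {
    "energy_db": [
        {"title": "Solar cell efficiency limits", "text": "solar photovoltaic efficiency"},
        {"title": "Wind turbine siting", "text": "wind energy siting study"},
    ],
    "health_db": [
        {"title": "Solar exposure and vitamin D", "text": "solar exposure health vitamin"},
    ],
}


def search_source(name, records, query):
    """Query one source; the score is the number of query terms found in a record."""
    terms = query.lower().split()
    hits = []
    for rec in records:
        haystack = (rec["title"] + " " + rec["text"]).lower()
        score = sum(1 for term in terms if term in haystack)
        if score > 0:
            hits.append({"source": name, "title": rec["title"], "score": score})
    return hits


def federated_search(query, sources=SOURCES):
    """Send one query to every source in parallel and merge the hits into one ranked list."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(search_source, name, recs, query)
                   for name, recs in sources.items()]
        merged = [hit for future in futures for hit in future.result()]
    # Highest score first; ties broken by title so the order is deterministic.
    return sorted(merged, key=lambda hit: (-hit["score"], hit["title"]))


results = federated_search("solar efficiency")
for hit in results:
    print(hit["score"], hit["source"], hit["title"])
```

The toy scoring function stands in for the relevance ranking the article mentions; a real system would also have to normalize scores that come back from very different search engines.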

Summary of the Manchester repository conference

JISC has issued a summary on the now-concluded conference, About Digital repositories: Dealing with the digital deluge (Manchester, June 5-6, 2007).  Excerpt:

A major conference on digital repositories took place this week in Manchester, attracting nearly 200 delegates from around the UK.

The conference began on Tuesday with an overview from Rachel Bruce, JISC programme director, who explained that although the conference marked the end of JISC’s Digital Repositories programme, this in no way meant the end of JISC’s work and investment in this area....

Andy Powell of the Eduserv Foundation gave the first keynote presentation on the ‘Repositories Roadmap’, a vision and forward plan for the establishment and development of repositories in the UK covering the period 2006 to 2010. He said that the report originally suggested that the main challenges were in the areas of policy. However, he continued, “getting the technology right can have a huge impact on policy, culture and working practices.” ...

The vision for 2010 refers to the wish that a “high percentage of newly published scholarly outputs [be] made available on terms of open access” ...The question now, as far as these goals are concerned, said Andy Powell, is increasingly “not if, but when…” The situation now might therefore require us to set a more ambitious target than that of a “high percentage”, he said....

Dr Keith Jeffrey of the Science and Technology Facilities Council gave the second keynote address. The benefits of open access repositories, he claimed, include faster “research turnaround”, improved quality for the originators of research as colleagues were able to review the research more easily, as well as improved quality for the community in general. They also support innovation, he continued, improve education and public engagement with science and research and enhance an institution’s standing.

In conclusion he said that the development of repositories and the wider access to research outputs they enabled should not be delayed by commercial interests.

Dr Jeffrey then launched the Depot, a national repository open to all UK authors to submit their research papers and other outputs into. Claiming that the Depot marked an “important milestone” in the development of a national infrastructure for repositories, Dr Jeffrey explained that the Depot constituted a national facility or set of services, including a reception service which redirects authors to an institutional repository where one exists, as well as ingest, storage, transfer and access services for the depositing of research outputs, principally post-prints....

The second day of the conference began with a keynote presentation by Professor Drummond Bone, Vice Chancellor of the University of Liverpool and President of Universities UK who began by saying that Universities UK was “firmly behind” JISC’s approach to the development of open access repositories, suggesting that repositories were “vital to universities’ economies and to the UK economy as a whole.” ...

Further details of the conference, including presentations, will be available shortly.

Never mind

Two weeks ago I blogged a Tom Matrullo interview with JSTOR's Bruce Heterick, in which Heterick said, or seemed to say, that JSTOR was considering OA:

...[T]he goal of open access is very much on [JSTOR's] mind.  “It’s not a question of if we should do it but when we can do it and not devolve our preservation goals,” [Heterick] says....

Matrullo has now blogged Heterick's clarification:

It isn’t really the case that JSTOR is thinking about “open access” as much as I was carrying forward the notion that JSTOR is always trying to “open access” more broadly to other communities (e.g. secondary schools, public libraries, developing nations). That is an important part of JSTOR’s mission (to extend access as broadly as possible), so perhaps I should have used the phrase “broaden access” instead of “open access” to avoid the confusion with much more highly-publicized “open access movement” (OA).

PS:  All right, noted.  But why isn't JSTOR considering OA?  Why not OA for the sufficiently old issues of participating journals on which JSTOR has already amortized its investment?  JSTOR is a non-profit corporation.

Handbook on OA from the German UNESCO Commission

The German UNESCO Commission (Deutschen UNESCO-Kommission or DUK) has published an OA handbook, Open Access: Chancen und Herausforderungen - ein Handbuch (Open Access: Opportunities and Challenges - A Handbook), June 6, 2007.

Edited by Barbara Malina, the volume contains separate sections by 38 authors spread over five chapters: 

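  1. Definition und Ursprung von Open Access (Definition and origin of open access)
  2. Drei Publikationsmodelle stellen sich vor (Three publication models introduce themselves)
  3. Aspekte der Realisierung von Open-Access-Modellen (Aspects of implementing open access models)
  4. Politische Perspektiven (Political perspectives)
  5. Internationaler Kontext (International context)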

PS:  Chapter 5 includes a short section (pp. 121-125) by me on OA in the US, an abridgement of my longer piece in Neil Jacobs (ed.), Open Access: Key strategic, technical and economic aspects, Chandos, 2006.  Thanks to Philipp Disselbeck for translating it into German.

12 research universities join Google Library project

All 12 universities in the Committee on Institutional Cooperation (CIC) have joined the Google Library project.

From Google's press release (June 6, 2007):

The number of libraries participating in the Google Book Search Library Project just got a whole lot bigger with today's addition of the Committee on Institutional Cooperation (CIC). The CIC is a national consortium of 12 research universities, including University of Chicago, University of Illinois, Indiana University, University of Iowa, University of Michigan, Michigan State University, University of Minnesota, Northwestern University, Ohio State University, Pennsylvania State University, Purdue University and the University of Wisconsin-Madison. Google will work with the CIC to digitize select collections across all its libraries, up to 10 million volumes....

Google will provide the CIC with a digital copy of the public domain materials digitized for this project. With these files, the consortium will create a first-of-its-kind shared digital repository of these works held across the CIC libraries. Both readers and libraries will benefit from this group effort:

  • The shared repository of public domain books will give faculty and students convenient access to a large and diverse online library before housed in separate locations.
  • This new collaboration will enable librarians to collectively archive materials over time, and allow researchers to access a vast array of material with searches customized for scholarly activity.

For books in the public domain, readers will be able to view, browse, and read the full texts online. For books protected by copyright, users will get basic background (such as the book's title and the author's name), at most a few lines of text related to their search, and information about where they can buy or borrow a book.

"This library digitization agreement is one of the largest cooperative actions of its kind in higher education," said CIC chairman Lawrence Dumas, provost of Northwestern University. "We have a collective ambition to share resources and work together to preserve the world's printed treasures."

Two CIC member universities are already working with Google Book Search, the University of Michigan and the University of Wisconsin-Madison, and this new agreement will complement the digitization work already taking place....

More from the CIC press release (June 6, 2007):

...[S]aid Mark Sandler, director of the CIC’s Center for Library Initiatives[,] “We have a remarkable opportunity not only to preserve what easily could be lost, but to make the entirety of our print collections more accessible than ever through a simple computer search.”

Google will have the opportunity to scan some of the most distinctive collections from the CIC’s holdings, now over 75 million volumes. The collections are comprehensive and global in scope, such as Northwestern’s Africana collection and the University of Chicago’s renowned South Asia holdings. The collective library holdings also underscore the Midwest foundation of the CIC universities....

Also see the CIC's collection of related links, such as an FAQ and highlights of the member libraries.

More from Dan Carnevale's story in today's Chronicle of Higher Education (accessible only to subscribers):

Adam M. Smith, product management director for Google Book Search, said the goal of Google Book Search "is to create a repository of books that allows users to search the full text of those books as easily as they search Web pages today." ...

Sanford G. Thatcher, president elect of the Association of American University Presses, has been a vocal critic of Google's digitization of copyrighted works. But he is also director of Penn State University Press -- and Pennsylvania State University is a member of the Committee on Institutional Cooperation. He said he applauds Google's effort to preserve public-domain books and to make them widely accessible....[He gives this deal] "two and a half cheers." He added: "I simply reserve the one-half cheer for the fact that they are including some copyrighted material in here."

Comments.

  • This is the first consortium to join the Google Library project all at once, unless you count the 10-campus University of California system (which joined in August 2006).  It's a huge jump for the scope of the project.
  • See the June SOAN for some recent CIC activities in support of OA, particularly its provost letter in support of FRPAA (July 2006) and its author addendum (May 2007), which has now been adopted by three CIC institutions.

OpenDemocracy shifts to OA

OpenDemocracy Campaigns to Support Open Content, iCommons blog, June 6, 2007.  Excerpt:

Here at openDemocracy we publish around 20-25 articles a week, and given our high standards of production and editorial, the cost of sustaining that level is well beyond what we make from the standard operations of the site. We had tried subscription models in the past, but now we have moved over to publishing the overwhelming majority of our articles under a CC licence, this is no longer appropriate. Luckily, however, we have a large, enthusiastic and loyal readership who understand that in order for us to continue publishing, they need to support our model, as well as a number of philanthropic individuals and foundations who have supported openDemocracy since its inception.

We’ve shied away from a Kottke-esque micropayments model, mainly as we have yet to see any sustainable implementations of this in similar businesses to ours....Instead, we run regular readership campaigns....

But why do our readers give so much to access content that is ‘free to the world’? They value our independence enormously and respect us for our transparency and honesty in requesting funds and the day to day operations of our organisation and they are realising enough real value from our free content that they want to ensure our business is sustainable....

PS:  I call this the Public TV model of OA:  "It's free but please pay anyway."  It won't work for every OA provider but I'm glad it's working for OpenDemocracy.


Wednesday, June 06, 2007

OA thesis on university presses

Heinz Pampel, Universitätsverlage im Spannungsfeld zwischen Wissenschaft und Literaturversorgung. Eine kritische Bestandsaufnahme, Diploma thesis, Bibliotheks- und Medienmanagement, Hochschule der Medien Stuttgart, 2007.  Self-archived May 23, 2007.  (Thanks to Klaus Graf.)  In German but with this English-language abstract:

Departing from the serials crisis, this paper seeks to point out the areas of conflict, which are dominated by three principal agents: the scientific community, publishing houses and libraries. In order to facilitate interaction between these agents the open access movement has been formed, which receives considerable support from the librarianship. Abutted to the Anglo-American university presses, it has been postulated in the scientific community that German universities engage in publishing activities too. The focal point of this work is - apart from an assessment of the current situation and an overview of the Anglo-American university presses - a critical account of the German university press. On the basis of qualitative interviews different publishing houses are critically assessed in terms of their services offered. In a short excursus this paper describes subject-related publishing activities in the context of the open access movement. In conclusion this paper will expound the problems of German university presses – and formulate an outlook as to what may be the future course of the industry.

Presentations on open data

Most of the presentations from the IASSIST meeting, Building Global Knowledge Communities with Open Data (Montreal, May 15-18, 2007), are now online.

The benefits of open licensing

Candace Hare, Copyfight: Creative Commons, Open Licensing, Bringing Information to the People (and Letting Them Use It), Dalhousie Journal of Information & Management, Winter 2007.  (Thanks to Charles W. Bailey, Jr.)

Abstract:   This article looks at some of the current issues regarding copyright laws and open licensing. The "copyfight" is a response to the increasingly strict copyright laws instituted in North America and internationally, and includes such projects as Creative Commons copyright labelling to promote the sharing and remixing of creative works. More political efforts are also undertaken to bring free information to those who need it, such as citizens of the developing world who could benefit from knowledge held under copyright in developed countries.

Why academic women should embrace OA

Why Women in Academia Should Embrace Open Access, Co-Action, undated but apparently released in the past couple of days.  Excerpt:

INCREASED VISIBILITY
As digital scholarship advances, the individual article is becoming as important as the journal it is published in. While studies show that women generally publish fewer articles than their male counterparts, it is also known that they tend to receive more citations for the articles they do publish. An important factor for an article to be cited is that it is accessible to a large pool of readers....

IMMEDIATE IMPACT
...Women already receive more citations than their male counterparts, and - as compared to work deposited in repositories - articles that are formally published in Open Access journals have been shown to receive the greatest number of citations. With the widest dissemination possible, imagine the impact for women researchers of publishing in Open Access journals!

VIRTUAL NETWORKING
Several studies indicate that factors external to the academic setting make it difficult for some women to fully participate in international networks necessary to build an academic career. However, in the context of e-science, networking is now increasingly taking place in new ways. Virtual networks and global collaboration through interactive databases and other free web resources such as Open Access publication outlets, provide new and important opportunities for women to participate....

SOLIDARITY WITH DEVELOPING COUNTRIES
...By publishing in Open Access journals or depositing their work in Open Access repositories, women scientists from developed parts of the world can support their fellow women scientists from poorer countries and truly engage in the widest possible dialogue within their research field. In addition, as several studies indicate, collaboration greatly increases publishing productivity – a problem that women researchers from all over the world need to address.

ROLE MODELING
Beyond lab work, grants, and tenure there is now also focus on who controls the publication process. Scholarly publishing is dominated by men; most leading journals have male editors. Role modeling has been identified as an important element in strengthening the presence of women in academia....

EMPOWERMENT THROUGH OWNERSHIP
By publishing in Open Access journals, authors do not sign away the rights to their work. This means that they can take ownership of their research results. Moreover, Open Access materials can be freely downloaded by others and used in educational and other contexts, again making women’s work more visible.

NEW FORMS OF IMPACT
Increasingly, various bodies demand that the outcome of scholarly work should reach beyond academia to the broader community. Open Access fosters more rapid dissemination of results from the “ivory tower” to practitioners, industry, patients, consumers and others.  By taking advantage of Open Access publishing channels, women scientists have a greater chance of contributing to developments outside of academia and generating an impact more quickly in these other arenas.

CAREER BEYOND THE UNIVERSITY
More attention is being drawn to the research of women scholars beyond the university setting. By publishing in Open Access journals, women scholars can ensure that their work is widely accessible not only to fellow academics but also to industry representatives and policy makers. This increases the likelihood for more women to be recruited to industry and public sector positions....

More on the OAI-ORE standard

Carl Lagoze and Herbert Van de Sompel, Compound Information Objects: The OAI-ORE Perspective, Open Archives Initiative, May 28, 2007.

Compound information objects are aggregations of distinct information units that when combined form a logical whole.  Some examples of these are a digitized book that is an aggregation of chapters, where each chapter is an aggregation of scanned pages; a music album that is the aggregation of several audio tracks; an image object that is the aggregation of a high quality master, a medium quality derivative and a low quality thumbnail; a scholarly publication that is the aggregation of text and supporting materials such as datasets, software tools, and video recordings of an experiment....

Several information systems, such as repository and content management systems, provide architectural support for storage of, identification of, and access to compound objects and their aggregated information units, or components....In most systems, the components of an object may vary according to semantic type (article, book, video, dataset, etc.), media type (text, image, audio, video, mixed), and media format (PDF, XML, MP3, etc.). Depending on the system, components can themselves be compound objects – allowing recursive containment of compound objects. Also, components may vary in network location....
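As a rough illustration of the recursive containment described above, here is a minimal Python sketch of a compound object whose components can themselves be compound objects and can live at different network locations. The class name, field names, and example URLs are all hypothetical; this is not the OAI-ORE data model, only a picture of the structure the paragraph describes.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    """A unit of information; it is a compound object when it has parts."""
    identifier: str              # network location (hypothetical example URLs below)
    semantic_type: str           # e.g. "book", "chapter", "page"
    media_format: str = ""       # e.g. "image/tiff"; empty for pure aggregations
    parts: list = field(default_factory=list)

    def leaves(self):
        """Return the non-compound components, walking the containment recursively."""
        if not self.parts:
            return [self]
        found = []
        for part in self.parts:
            found.extend(part.leaves())
        return found


# A digitized book: an aggregation of chapters, each an aggregation of scanned pages.
book = Component("http://example.org/book", "book", parts=[
    Component("http://example.org/book/ch1", "chapter", parts=[
        Component("http://example.org/book/ch1/p1", "page", "image/tiff"),
        Component("http://example.org/book/ch1/p2", "page", "image/tiff"),
    ]),
    Component("http://example.org/book/ch2", "chapter", parts=[
        # Components may vary in network location.
        Component("http://other.example.net/scans/p3", "page", "image/tiff"),
    ]),
])

leaf_ids = [component.identifier for component in book.leaves()]
print(leaf_ids)
```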

Unfortunately, the manner in which information systems publish compound objects to the web is frequently less-than-perfect and, without commonly accepted standards, ad hoc.  In many cases, advanced functionality provided by individual information systems is lost when publishing compound objects to the web.  Frequently the exposure to the web is targeted towards human users rather than machine agents.  The structure of the compound object is embedded in “splash” pages, user interface “widgets” and the like. This approach can leave the essential structure of compound objects opaque to machine-based applications such as crawlers, search engines, and networked desktop applications....

The absence of these standards affects the functionality of a number of existing and possible web services and applications.   Crawler-based search engines might be more useful if the granularity of their result sets corresponded to compound objects (a book or chapter, in this example) rather than individual resources (single pages). The ranking algorithms of these search engines might improve if the links among the components of a compound object were treated differently than links to the object as a whole, or if the number of in-links to the various component resources was accumulated to the level of the compound object instead of counted separately.  Citation analysis systems would also benefit from a mechanism for citing the compound object itself, rather than arbitrary parts of the object....
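The ranking point above, accumulating in-links at the level of the compound object rather than counting each component separately, can be shown with a small Python sketch. The membership table and link list are invented for illustration, and the choice to ignore links between components of the same object is an assumption about one way to "treat them differently", not something the paper specifies.

```python
from collections import Counter

# Hypothetical mapping from component resources to the compound object they belong to.
MEMBER_OF = {
    "/book/p1": "/book",
    "/book/p2": "/book",
    "/album/t1": "/album",
}

# Hypothetical web links as (source, target) pairs.
LINKS = [
    ("/blog/a", "/book/p1"),
    ("/blog/b", "/book/p2"),
    ("/blog/c", "/book"),
    ("/book/p1", "/book/p2"),   # internal link between two pages of the same book
    ("/blog/a", "/album/t1"),
]


def inlinks_per_resource(links):
    """Count in-links separately for every individual resource."""
    return Counter(target for _, target in links)


def inlinks_per_object(links, member_of):
    """Accumulate in-links to the level of the compound object."""
    totals = Counter()
    for source, target in links:
        target_object = member_of.get(target, target)
        source_object = member_of.get(source, source)
        # Assumption: links among components of one object do not count as in-links.
        if source_object == target_object:
            continue
        totals[target_object] += 1
    return totals


per_resource = inlinks_per_resource(LINKS)
per_object = inlinks_per_object(LINKS, MEMBER_OF)
print(per_resource)
print(per_object)
```

Counted per resource, no single page of the book has more than two in-links; rolled up, the book as a whole has three, which is the kind of difference the authors suggest could matter to a ranking algorithm.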

A core goal of OAI-ORE – Object Reuse and Exchange – is to develop standardized, interoperable, and machine-readable mechanisms to express compound object information on the web.  The OAI-ORE standards will make it possible for web clients and applications to reconstruct the logical boundaries of compound objects, the relationships among their internal components, and their relationships to the other resources in the web information space. This will provide the foundation for the development of value-adding services for analysis, reuse, and re-composition of compound objects, especially in the areas of e-Science, e-Scholarship, and scholarly communication, which are the target applications of ORE....

OA/TA competition helps users

Péter Jacsó, Trends in Professional and Academic Online Information Services, a paper presented at INFORUM 2007 (Prague, May 22-24, 2007).  The full presentation is available as text or slides.  (Thanks to Kidney Notes.)

Abstract:   Since the 1970s there has been significant competition among the major players within the traditional, subscription-based professional and academic online information services arena. Since the late 1990s some of the mushrooming (partially) open access information services which originally targeted the huge market of casual Internet users also entered the arena and posed a new challenge for the rival incumbents. The increasingly fierce competition enhances the academic and professional information services with increasingly better features. Traditional information retrieval techniques, such as citation-based searching, multiple database searching, ranking of result sets by multiple data elements, are being adopted and chiseled by the developers of the open access services. They create databases using autonomous citation indexing technology, metasearch tools for federated searching, sophisticated clustering algorithms for grouping results into topical and other subsets, link resolvers for guiding the users to complete documents, and various tools for visualizing information. Many also license content from the traditional services. In turn, these refined retrieval tools and resource enhancements keep showing up in the fee-based, professional and academic online information services, deployed in metadata-rich, highly structured mega-databases which have served academic, public, school and special libraries for decades. There are novel academic and professional information retrieval services, mashing up the best of both worlds, interlinked through one click or few clicks to deliver the most pertinent primary documents with informative metadata for the patrons. 
We, the end-users, will all be the beneficiaries of this rivalry - or rather, "co-opetition" - enjoying the much improving solutions for resource discovery and result refinement, increasing recall and precision at our whim, reaping the benefits from the synergy of on-the-fly mash-ups of data from different sources in the process of searching - to turn it into finding and instant gratification.