Open Access News

News from the open access movement


Friday, April 24, 2009

New chemistry data repository project

Peter Murray-Rust, CLARION - our chemical data repository project, A Scientist and the Web, April 24, 2009.

We were very pleased to be told recently that we had been awarded a grant from JISC for repository enhancement. It’s CLARION (Chemical Laboratory repository In/Organic Notebooks) ...

We believe that most chemistry data in most departments is valuable to science. ... These facts are – largely – reproducible so that the same substance in different laboratories will give “the same analytical data” (crystal structure, spectra, composition). ...

And the data are born-digital. They come out of machines as reproducible numbers. The semantics are not always explicit but they can usually be added if done by the author. But all too often the data are emitted as unsemantic PDF, printed on paper, scribbled with pencil, covered with coffee-mug rings and then published as some ugly bitmap. The poor reader then has to measure the peaks with a ruler.

I repeat. We are in the twenty-first century and we still use rulers.

That’s because the data publication process is not yet developed. Perhaps I should say data publication culture, because the tools are all there. We’ve done this for the whole of the department’s crystal structures and put them in a repository (C3DER).

The structures are not yet all exposed as we need agreement with the researchers. I’m sure this will be readily forthcoming – many have said it gives them a warm fuzzy feeling to make their data available. Usually it has to be done after publication (we don’t expect everyone to adopt Open Notebook yet) and this needs culture and process.

So an important part of CLARION will be developing the means for working with scientists to expose their data at the appropriate time. CLARION will expand to include a variety of spectral data, both from central analytical services and from individual labs.

Another key aspect of CLARION is that we shall be integrating it with a commercial electronic laboratory notebook (eLNb). We’re in the process of evaluating offerings and expect to make an announcement soon. This will be a key opportunity to see how feasible it is to integrate a standard system with the needs of a departmental repository. ...

Update. See also the project blog.