Open Access News

News from the open access movement


Tuesday, January 10, 2006

Making text and data visible together

Leigh Dodds, The Modern Palimpsest, Lost Boy, December 16, 2005. (Thanks to Richard Ackerman.) Excerpt:
The following is a brief summary of a talk I gave recently at the Ingenta Publisher Forum on the 28th November. The slides are available as a PowerPoint presentation. In the presentation I tried to highlight some of the possibilities that could become available if academic publishers begin to share more metadata about the content they publish, ideally by engaging with the scientific community to expose "raw" data and results. The conceit around which I hung the presentation was the suggestion that the scientific paper is the modern equivalent of a palimpsest. A palimpsest...is a scroll or manuscript that has been written on, had its text scraped off, and then reused....A great deal of success has been made in extracting the original texts from these works....The underlying text is known as the scriptio inferior, and may actually be more valuable than the more visible content. I likened the process of authoring a scientific paper to that of the creation of a palimpsest. Starting from original research results and working through the synthesis of a cogent explanation of the results or discovery, at each step the content becomes more abstracted from the original results, the previous work being "lost" to the reader. Data is presented in pre-analysed forms and is not amenable to reuse. Like the palimpsest, the raw data has not really been lost; it's just not (easily) accessible to the reader. If the scriptio inferior, the underlying data, were made available to the reader, then a lot of interesting possibilities arise....In my presentation I tried to stick to a pragmatic and practical line and demonstrate the possibilities by referring to actual examples. I ended up pointing to three:...iSpecies is a nice example of a science "mashup" that illustrates an alternative search interface for finding related content. I used the false results that can appear when performing simple keyword searches to reinforce the need for standard identifiers.
(The need for a common, scoped identifier for authors is a particular hobby horse of mine.) I also showed the excellent HubMed as an example both of how an alternative user interface can be better than the original and of how content can be "enriched" by mixing in other sources. The "terms" feature, which dynamically links keywords in an abstract through to a number of data sources, demonstrates this very well. I used the fact that material can be sourced from user-contributed sources such as Wikipedia to promote the idea that content needn't be fixed at the point of publication but can be annotated after the fact.