Open Access News

News from the open access movement


Friday, March 11, 2005

Text mining requires text access and more

Dietrich Rebholz-Schuhmann, Harald Kirsch, Francisco Couto, Facts from Text -- Is Text Mining Ready to Deliver? PLoS Biology, February 15, 2005. Excerpt: 'Could we automatically analyse new scientific publications routinely to extract facts, which could then be inserted into scientific databases? Could we tag gene and protein names, as well as other terms in the document, so that they are easier to recognise? How can we use controlled vocabularies and ontologies to identify biological concepts and phenomena? Fortunately, there are many groups that are now seeking to answer these questions, precisely with a view to extracting facts from text....For all automated information-extraction methods, it is obvious that access to literature is crucial. Electronic access has, of course, already had a huge impact, but the structure and organisation of manuscripts could also be improved. For example, semantic tags could be integrated into the text.'