Open Access News
News from the open access movement
Hans-Michael Müller, Eimear E. Kenny, Paul W. Sternberg, Textpresso: An Ontology-Based Information Retrieval and Extraction System for Biological Literature, PLoS Biology, November 2004. From the abstract: "We have developed Textpresso, a new text-mining system for scientific literature whose capabilities go far beyond those of a simple keyword search engine. Textpresso's two major elements are a collection of the full text of scientific articles split into individual sentences, and the implementation of categories of terms for which a database of articles and individual sentences can be searched. The categories are classes of biological concepts (e.g., gene, allele, cell or cell group, phenotype, etc.) and classes that relate two objects (e.g., association, regulation, etc.) or describe one (e.g., biological process, etc.). Together they form a catalog of types of objects and concepts called an ontology. After this ontology is populated with terms, the whole corpus of articles and abstracts is marked up to identify terms of these categories. The current ontology comprises 33 categories of terms. A search engine enables the user to search for one or a combination of these tags and/or keywords within a sentence or document, and as the ontology allows word meaning to be queried, it is possible to formulate semantic queries. Full text access increases recall of biological data types from 45% to 95%." From the main text: "Access to the full text of articles is critical for sufficient coverage of facts and knowledge in the literature and for their retrieval (Blaschke and Valencia 2001); our results confirm these findings." (PS: Text-mining has spectacular potential for helping researchers find what they need, understand it, and stay abreast of new developments. As this article confirms, its powers are greatly enhanced if it has access to full-text literature. 
Enlarging the body of OA literature will give a boost to text-mining and improvements in text-mining will give a boost to OA.)
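To make the abstract's description concrete, here is a minimal Python sketch of the general idea: split full text into sentences, tag each sentence with ontology categories, then search by category and/or keyword within a sentence. It is an illustration only, not Textpresso's implementation. The three-category `ONTOLOGY`, its terms, and the function names `split_sentences`, `tag_sentence`, and `search` are all assumptions made for this example; the real system uses 33 categories and a far richer lexicon.

```python
import re

# Hypothetical toy ontology (category -> terms). Assumed for illustration;
# this is not Textpresso's actual lexicon.
ONTOLOGY = {
    "gene": {"lin-12", "glp-1"},
    "regulation": {"regulates", "represses", "activates"},
    "phenotype": {"sterile", "lethal"},
}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tag_sentence(sentence):
    """Return the set of ontology categories whose terms occur in the sentence."""
    tokens = set(re.findall(r"[a-z0-9-]+", sentence.lower()))
    return {cat for cat, terms in ONTOLOGY.items() if tokens & terms}


def search(corpus, categories=(), keywords=()):
    """Return (doc_id, sentence) pairs matching every category and keyword.

    corpus maps a document id to its full text. A sentence matches when it
    carries all requested category tags and contains all requested keywords
    (case-insensitive substring match).
    """
    wanted = set(categories)
    hits = []
    for doc_id, text in corpus.items():
        for sentence in split_sentences(text):
            lowered = sentence.lower()
            if wanted <= tag_sentence(sentence) and all(
                k.lower() in lowered for k in keywords
            ):
                hits.append((doc_id, sentence))
    return hits
```

With a corpus such as `{"doc1": "The mutant is sterile. LIN-12 activates glp-1 in some cells."}`, a call like `search(corpus, categories=["gene", "regulation"])` returns only the second sentence, since only it mentions both a gene term and a regulation term. Matching at the sentence level is what makes the query behave semantically rather than as a plain keyword lookup, and it also hints at why full text matters: abstracts alone contain far fewer sentences to tag.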