Philip Resnik has developed software that can tell, with 90% accuracy, whether one web document is a translation of another. The software can be used to learn, from millions of documents, subtleties of language and translation that are extremely difficult to program directly. This may lead to translation software with a grasp of idiom and context unrivaled by software built from top-down rules and vocabulary lists.
PS: This is a perfect example of how free online texts provide free online data to sophisticated software, and actually stimulate the development of software that would never be developed if the data were priced or hidden behind passwords.
Posted by Peter Suber at 6/24/2002 10:15:00 AM.