Open Access News
News from the open access movement
Did you ever wonder how many web documents contain a given word, or how a given word ranks in frequency of web usage? The UC Berkeley and Stanford University Digital Library projects now have the data to answer your questions. For example, "FOS" occurs in 23,873 web documents and ranks 47,010th in frequency. The Berkeley and Stanford teams have created a free, searchable online front end to their database and offer the data files as free downloads. The database is derived from a January 2001 archive of 88 million web pages, but will apparently be regenerated from newer and larger archives in the future. The Berkeley/Stanford purpose was to gather the data needed to bypass dead links and search for documents by their lexical signatures.