October 28, 2021 07:03 am

Giant, Free Index To World's Research Papers Released Online

In a project that could unlock the world's research papers for easier computerized analysis, an American technologist has released online a gigantic index of the words and short phrases contained in more than 100 million journal articles -- including many paywalled papers.

Nature reports: The catalogue, which was released on October 7 and is free to use, holds tables of more than 355 billion words and sentence fragments listed next to the articles in which they appear. It is an effort to help scientists use software to glean insights from published work even if they have no legal access to the underlying papers, says its creator, Carl Malamud. He released the files under the auspices of Public Resource, a non-profit corporation in Sebastopol, California that he founded.

Malamud says that because his index doesn't contain the full text of articles, but only sentence snippets up to five words long, releasing it does not breach publishers' copyright restrictions on the re-use of paywalled articles. However, one legal expert says that publishers might question the legality of how Malamud created the index in the first place. Some researchers who have had early access to the index say it's a major development in helping them to search the literature with software -- a procedure known as text mining. [...]

Computer scientists already text mine papers to build databases of genes, drugs and chemicals found in the literature, and to explore papers' content faster than a human could read. But they often note that publishers ultimately control the speed and scope of their work, and that scientists are restricted to mining only open-access papers, or those articles they (or their institutions) have subscriptions to. Some publishers have said that researchers looking to mine the text of paywalled papers need their authorization.

And although free search engines such as Google Scholar have -- with publishers' agreement -- indexed the text of paywalled literature, they only allow users to search with certain types of text queries, and restrict automated searching. That doesn't allow large-scale computerized analysis using more specialized searches, Malamud says.
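
To make the idea of an index of "sentence snippets up to five words long" listed next to the articles they appear in more concrete, here is a minimal Python sketch of that kind of lookup table. It is only an illustration under stated assumptions: the article IDs, the sample text, and the function names `ngrams` and `build_index` are all hypothetical, and nothing here reflects how Malamud actually built or formatted his index.

```python
from collections import defaultdict


def ngrams(text, max_n=5):
    """Yield every run of 1..max_n consecutive words in text, lowercased."""
    words = text.lower().split()
    for n in range(1, max_n + 1):
        for i in range(len(words) - n + 1):
            yield " ".join(words[i:i + n])


def build_index(articles, max_n=5):
    """Map each word or short phrase to the set of article IDs containing it."""
    index = defaultdict(set)
    for article_id, text in articles.items():
        for gram in ngrams(text, max_n):
            index[gram].add(article_id)
    return index


# Hypothetical article IDs and text, made up for illustration only.
articles = {
    "paper-1": "gene expression in yeast cells",
    "paper-2": "protein folding and gene expression",
}
index = build_index(articles)

# Look up which articles mention a phrase without storing their full text.
print(sorted(index["gene expression"]))
```

Because no stored phrase is longer than five words, a table like this lets software find which articles mention a term, though a real index at the scale described would need far more careful tokenization and storage.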



Original Link: http://rss.slashdot.org/~r/Slashdot/slashdot/~3/7w3pNp8na1k/giant-free-index-to-worlds-research-papers-released-online
