Your Web News in One Place

Help Webnuz

Referal links:

Sign up for GreenGeeks web hosting
November 24, 2017 12:00 am

More Than Half of GitHub Is Duplicate Code, Researchers Find

Richard Chirgwin, writing for The Register: Given that code sharing is a big part of the GitHub mission, it should come at no surprise that the platform stores a lot of duplicated code: 70 per cent, a study has found. An international team of eight researchers didn't set out to measure GitHub duplication. Their original aim was to try and define the "granularity" of copying -- that is, how much files changed between different clones -- but along the way, they turned up a "staggering rate of file-level duplication" that made them change direction. Presented at this year's OOPSLA (part of the late-October Association of Computing Machinery) SPLASH conference in Vancouver, the University of California at Irvine-led research found that out of 428 million files on GitHub, only 85 million are unique. Before readers say "so what?", the reason for this study was to improve other researchers' work. Anybody studying software using GitHub probably seeks random samples, and the authors of this study argued duplication needs to be taken into account.

Read more of this story at Slashdot.


Original Link: http://rss.slashdot.org/~r/Slashdot/slashdot/~3/noJQizu7mcw/more-than-half-of-github-is-duplicate-code-researchers-find

Share this article:    Share on Facebook
View Full Article

Slashdot

Slashdot was originally created in September of 1997 by Rob "CmdrTaco" Malda. Today it is owned by Geeknet, Inc..

More About this Source Visit Slashdot