Your Web News in One Place

Help Webnuz

Referal links:

Sign up for GreenGeeks web hosting
August 12, 2018 04:16 pm

Researchers Use Machine-Learning Techniques To De-Anonymize Coders

At the DefCon hacking conference on Friday, Rachel Greenstadt, an associate professor of computer science at Drexel University, and Aylin Caliskan, Greenstadt's former PhD student and now an assistant professor at George Washington University, presented a number of studies they've conducted using machine learning techniques to de-anonymize the authors of code samples. "Their work could be useful in a plagiarism dispute, for instance, but it could also have privacy implications, especially for the thousands of developers who contribute open source code to the world," reports Wired. From the report: First, the algorithm they designed identifies all the features found in a selection of code samples. That's a lot of different characteristics. Think of every aspect that exists in natural language: There's the words you choose, which way you put them together, sentence length, and so on. Greenstadt and Caliskan then narrowed the features to only include the ones that actually distinguish developers from each other, trimming the list from hundreds of thousands to around 50 or so. The researchers don't rely on low-level features, like how code was formatted. Instead, they create "abstract syntax trees," which reflect code's underlying structure, rather than its arbitrary components. Their technique is akin to prioritizing someone's sentence structure, instead of whether they indent each line in a paragraph. The method also requires examples of someone's work to teach an algorithm to know when it spots another one of their code samples. If a random GitHub account pops up and publishes a code fragment, Greenstadt and Caliskan wouldn't necessarily be able to identify the person behind it, because they only have one sample to work with. (They could possibly tell that it was a developer they hadn't seen before.) Greenstadt and Caliskan, however, don't need your life's work to attribute code to you. It only takes a few short samples.

Read more of this story at Slashdot.


Original Link: http://rss.slashdot.org/~r/Slashdot/slashdot/~3/VOGjSbm_J1M/researchers-use-machine-learning-techniques-to-de-anonymize-coders

Share this article:    Share on Facebook
View Full Article

Slashdot

Slashdot was originally created in September of 1997 by Rob "CmdrTaco" Malda. Today it is owned by Geeknet, Inc..

More About this Source Visit Slashdot