September 27, 2018 12:58 pm PDT

Hate-speech detection algorithms are trivial to fool

In All You Need is Love: Evading Hate Speech Detection, a Finnish-Italian computer science research team describes its work on evading hate-speech detection algorithms; the research will be presented next month in Toronto at the ACM Workshop on Artificial Intelligence and Security.

As Big Tech has gotten bigger and taken on an outsized role in some of the most toxic and dangerous trends in world affairs -- from institutional misogyny to acts of genocide -- there have been louder and louder calls for the platforms to police their users' conduct and speech.

The platforms have responded by promising a mix of algorithmic systems and human moderation, and have fielded some of these algorithms and even made them available for testing by the likes of the team behind this research.

Their findings replicate the filter-evasion findings from other domains, where even the most sophisticated systems can be beaten with trivial countermeasures (for example, Chinese image censorship can be defeated by simply flipping a banned image).

The team discusses several tactics of varying effectiveness, but the most promising and easiest to implement was simply adding the word "love" to a hateful message while running the "hate" words together in camel case (e.g., "MartiansAreDisgustingAndShouldBeKilled love").
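To make the tactic concrete, here is a minimal sketch of that transformation (not the researchers' actual code, just an illustration of the idea): collapse the message into camel case so word-level classifiers no longer see the individual tokens, then append an innocuous word to shift the overall sentiment score.

```python
def evade(message: str, padding: str = "love") -> str:
    """Sketch of the paper's simplest evasion: run the words of a
    message together in camel case, then append an innocuous word.
    Word-based detectors stop seeing the hateful tokens, and the
    appended word nudges sentiment-based scoring toward 'benign'."""
    camel = "".join(word.capitalize() for word in message.split())
    return f"{camel} {padding}"

print(evade("martians are disgusting and should be killed"))
# MartiansAreDisgustingAndShouldBeKilled love
```

The trick works because many detectors tokenize on whitespace and look up words in a hate lexicon or embedding table; a camel-cased run is a single out-of-vocabulary token, and the appended "love" drags any bag-of-words sentiment signal toward neutral.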

Yesterday, I wrote about Corynne McSherry's "five lessons from the copyright wars" for people who want the platforms to take a more active role in policing user speech.

This paper raises a sixth lesson: "The filters are unlikely to prevent the kind of activity you're worried about." In the same way that YouTube's Content ID filters are routinely subverted by copyright infringers, we should expect any kind of hate-speech filter to be easy for dedicated harassers to evade, meaning that the people it will be most effective against are those who are caught accidentally -- say, because their discussion of how traumatic it was to be subjected to harassment is algorithmically mistaken for harassment itself.


Original Link: http://feeds.boingboing.net/~r/boingboing/iBag/~3/sAR-EsnBGGU/ha-ha-only-serious.html
