October 16, 2019 12:10 am

OpenAI's AI-Powered Robot Learned How To Solve a Rubik's Cube One-Handed

Earlier today, San Francisco-based research institute OpenAI announced that it had taught a robotic hand to solve a Rubik's cube one-handed. "Lost in the shuffle is just what is new here, if anything, and what of it may or may not be machine learning and artificial intelligence -- the science in other words," writes Tiernan Ray via ZDNet. An anonymous Slashdot reader shares an excerpt from his report:

The real innovation in Tuesday's announcement, from a science standpoint, is the way many versions of possible worlds were created inside the computer simulation, in an automated fashion, using an algorithm called ADR. ADR, or "automatic domain randomization," is a way to reset the neural network at various points based on different appearances of the Rubik's cube and different positions of the robotic hand, and all kinds of physical variables, such as friction and gravity. It's done by creating thousands of variations of the values of those variables inside the computer simulator while the neural network is being trained.

ADR is an algorithm that changes the variables automatically and iteratively, as the policy network is trained to solve the Rubik's cube. The ADR, in other words, is a separate piece of code that is designed to increase random variation in training data to make things increasingly hard for the policy neural network. Using ADR, the real-world Dexterous Hand can adapt to changes such as when it drops the cube on the floor and the cube is placed back in the hand at a slightly different angle. The performance of the Dexterous Hand after being trained with ADR is vastly better than without it, when only a handful (sorry again for the pun) of random variants are thrown at it using the prior approach of manually crafted randomness, the authors report.

What's happening, they opine, is the emergence of a kind of "meta-learning." The neural network that has been trained is still, in a sense, "learning" at the time it is tested on the real-world Rubik's cube. What that means is that the neural network is updating its model of what kinds of transitions can happen between states of affairs as events happen in the real world. The authors assert that they know this is happening "inside" the trained network because they see that after a perturbation -- say, the Dexterous Hand is hit with some object that interrupts its effort -- the robot's activity suddenly plunges, but then steadily improves, as if the whole policy network is adjusting to the changed state of affairs.
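
The excerpt describes ADR as a separate piece of code that keeps widening the random variation of simulator variables such as friction and gravity while the policy trains. The following is a minimal Python sketch of that idea, not OpenAI's implementation. The class names (`ParamRange`, `SimpleADR`), the success-rate threshold, the relative step size, and the rule of widening every range at once are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class ParamRange:
    """Sampling interval for one simulator variable, centred on a nominal value."""
    nominal: float
    half_width: float = 0.0  # starts at zero: no randomization yet

    @property
    def low(self) -> float:
        return self.nominal - self.half_width

    @property
    def high(self) -> float:
        return self.nominal + self.half_width

    def sample(self, rng: random.Random) -> float:
        return rng.uniform(self.low, self.high)

    def widen(self, rel_step: float) -> None:
        # Grow the interval by a fraction of the nominal magnitude.
        self.half_width += rel_step * abs(self.nominal)


class SimpleADR:
    """Hypothetical, simplified stand-in for automatic domain randomization."""

    def __init__(self, nominal, rel_step=0.1, threshold=0.8, seed=0):
        self.ranges = {name: ParamRange(value) for name, value in nominal.items()}
        self.rel_step = rel_step    # assumed widening step per update
        self.threshold = threshold  # assumed success rate that triggers widening
        self.rng = random.Random(seed)

    def sample_env(self):
        """Draw one randomized set of simulator variables."""
        return {name: r.sample(self.rng) for name, r in self.ranges.items()}

    def update(self, success_rate: float) -> bool:
        """Widen every range if the policy is doing well; report whether it did."""
        if success_rate >= self.threshold:
            for r in self.ranges.values():
                r.widen(self.rel_step)
            return True
        return False
```

In a training loop, one would call `sample_env()` to configure each simulated episode and `update()` with the policy's recent success rate, so the environments become harder only once the policy copes with the current ones.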

Slashdot

Slashdot was originally created in September of 1997 by Rob "CmdrTaco" Malda. Today it is owned by Geeknet, Inc.
