Researchers from the Singapore University of Technology and Design (SUTD) have created new software, built around reinforcement learning and phase-change memory, that is designed to learn complex movement design.
Previous work has applied this kind of deep learning to games such as chess and Go, but the SUTD team instead exposed its D-PPO algorithm to the rigors of Street Fighter II: Champion Edition. The researchers trained their SF-R2 AI player through two consecutive days of play against the computer before letting it loose on a human participant, whom the AI-powered system beat comfortably.
The work has implications for movement science more broadly, according to the research paper, and could feed into improvements in robotics and autonomous vehicles, for example. It paves the way for broadly applicable training in fields where machines observe human norms, then attempt to replicate and outperform them.
Ready Pl-AI-yer One
One of the main ways AI researchers have measured the effectiveness of the systems they’ve built is by letting them compete with human players in different kinds of games. This has been happening for some time.
In 2017, DeepMind’s AlphaGo beat the world’s number-one human Go player, having first defeated a professional player, Fan Hui, in 2015. In June of the same year, Microsoft’s AI achieved the world’s first perfect Ms. Pac-Man score, and in August an OpenAI bot beat the best Dota 2 players of the time.
This latest milestone, besting a human Street Fighter player, was made possible by reinforcement learning as well as phase-change memory. First developed by HP, this is a form of nonvolatile memory that stores data by using electrical pulses to switch regions of chalcogenide glass between amorphous and crystalline states. It’s much faster than commonly used flash memory.
“Our approach is unique because we use reinforcement learning to solve the problem of creating movements that outperform those of top human players,” principal investigator Desmond Loke told TechXplore. “This was simply not possible using prior approaches, and it has the potential to transform the types of moves we can create.”