Today in “computer beats human at thing computers could not previously beat humans at” news: Google DeepMind has bested StarCraft 2 pros at their own game. “AlphaStar” was unveiled on a livestream last night, in a show revolving around matches against top StarCraft pros Grzegorz “MaNa” Komincz and Dario “TLO” Wünsch. All the games AlphaStar won were actually prerecorded, because GOOGLE ARE COWARDS COME FIGHT ME.
Notably, the AI was playing the same version of StarCraft that you or I could boot up right now – unlike OpenAI’s Dota bots, which failed to beat Dota 2 pros last year at a cut-back version of the game. That’s 2-0 to Google, who also beat the Go world champion back in 2016.
Still, it’s important to bear in mind that when one of the AI’s superhuman advantages was disabled, it lost. The final, live match was played against a version of AlphaStar that couldn’t zoom out to view more of the map at once than its human rival could.
Here’s DeepMind’s rundown of AlphaStar, though remember to read every claim in there through furrowed eyebrows. Same goes for the stream below.
Take this claim, for instance:
“In its games against TLO and MaNa, AlphaStar had an average [actions per minute] of around 280, significantly lower than the professional players, although its actions may be more precise. This lower APM is, in part, because AlphaStar starts its training using replays and thus mimics the way humans play the game. Additionally, AlphaStar reacts with a delay between observation and action of 350ms on average.”
That’s a vital and somewhat impressive note – the AI didn’t win by leveraging superhuman speed. That little aside about its actions being more precise, though, strikes me as a big deal. Superhuman microplay undermines the idea that AlphaStar won through out-thinking its human opponent.
I also reached out to AI researcher Vanessa Volz, who raised this very valid point: “In some instances (like Stalker and Drone over-production), AlphaStar was playing a strategy that was unfamiliar to the pro, who thus had difficulties to react. Therefore, it is not clear whether that part would have been out-thinking or rather out-surprising the human player.”
While it’s important to bear those limitations in mind, this is still a neat accomplishment. I won’t go into all the details behind how the neural network wrapped its circuits around StarCraft’s complexity, but here’s an overview:
“AlphaStar’s behaviour is generated by a deep neural network that receives input data from the raw game interface (a list of units and their properties), and outputs a sequence of instructions that constitute an action within the game.
“AlphaStar also uses a novel multi-agent learning algorithm. The neural network was initially trained by supervised learning from anonymised human games released by Blizzard. This allowed AlphaStar to learn, by imitation, the basic micro and macro-strategies used by players on the StarCraft ladder.”
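For a feel of what “learning by imitation” means, here’s a deliberately tiny Python sketch. It is not DeepMind’s method: AlphaStar trains a deep neural network on real replays, whereas this swaps in a plain lookup table, and the situation and action labels are made up for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical replay data: (what the player saw, what the player did).
# Real replays are vastly richer; these labels are invented for the example.
replays = [
    ("enemy_at_gate", "build_defence"),
    ("enemy_at_gate", "build_defence"),
    ("enemy_at_gate", "attack"),
    ("low_minerals", "build_worker"),
    ("low_minerals", "build_worker"),
    ("idle_army", "attack"),
]


def fit_imitation_policy(pairs):
    """Tally which action humans chose in each observed situation."""
    counts = defaultdict(Counter)
    for observation, action in pairs:
        counts[observation][action] += 1
    # The policy simply copies the most common human action per situation.
    return {obs: tally.most_common(1)[0][0] for obs, tally in counts.items()}


policy = fit_imitation_policy(replays)
print(policy["enemy_at_gate"])
```

The table can only parrot situations it has already seen, which is roughly why a real system would want a network that generalises instead.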
They later got AlphaStar playing games against different versions of itself, with its evolving strategies mirroring those of humans, as “cheese strats” succumbed to more even-handed approaches.
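The way one-trick strategies wash out under self-play can be shown with a toy, assuming we shrink StarCraft down to rock-paper-scissors. The sketch below uses fictitious play, a textbook technique in which each new version picks the best counter to the mix of all previous versions. It is a stand-in, not the multi-agent algorithm DeepMind describes, and every name in it is hypothetical.

```python
MOVES = ["rock", "paper", "scissors"]
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}


def payoff(mine, theirs):
    """+1 for a win, -1 for a loss, 0 for a draw."""
    if mine == theirs:
        return 0
    return 1 if BEATS[mine] == theirs else -1


def best_response(history):
    """Pick the move that scores best against all past versions combined."""
    def score(move):
        return sum(payoff(move, past) * n for past, n in history.items())
    return max(MOVES, key=score)  # ties go to the earlier entry in MOVES


def self_play(rounds):
    # The first "agent" is a one-trick cheese: it only ever plays rock.
    history = {"rock": 1, "paper": 0, "scissors": 0}
    for _ in range(rounds):
        history[best_response(history)] += 1
    return history


history = self_play(3000)
total = sum(history.values())
frequencies = {move: history[move] / total for move in MOVES}
print(frequencies)
```

Early on the population is lopsided, since everything is busy countering the all-rock opener. Over thousands of rounds the mix drifts towards an even three-way split, which is the even-handed approach for this particular game.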
Does this have applications outside of video games? Google sure think so:
“The fundamental problem of making complex predictions over very long sequences of data appears in many real world challenges, such as weather prediction, climate modelling, language understanding and more. We’re very excited about the potential to make significant advances in these domains using learnings and developments from the AlphaStar project.”
Plausible. Definitely plausible.