In a little over four weeks, five AI bots will play a team of pro players at The International. OpenAI’s bots have learnt to play Dota 2 almost from scratch, building up an understanding of the game using one of the most advanced reinforcement learning techniques AI researchers have yet devised.
It’s an impressive achievement, regardless of whether the bots manage to crush their flesh-and-blood opponents. But don’t get too carried away. Last week I told you about Mike Cook’s blog post, where he highlighted both the technical and conceptual limitations surrounding OpenAI’s bots. Mike is best known for Angelina, the game-making AI that he’s been tinkering with for the past eight years – but no AI problem can escape his expert gaze. So I called him up and asked if he could explain some more.
RPS: First of all, what have OpenAI accomplished with their Dota bots? What’s impressive about them?
Mike Cook: So OpenAI had already made a big splash last year by being able to train a bot to play 1v1 mid, which is a special game mode within Dota – it’s kind of like what the penalty shootout is to football – and they completely defeated a bunch of professional players, at least on the big stage. Then a year passed and they said they wanted to take a look at 5v5 Dota, so closer to the actual game that you would play if you fired up Dota. About a month ago they announced that they’d developed bots that could beat humans – pretty good humans – at a particular kind of 5v5 matchup.
The interesting things are what restrictions were there, and what help the bots got from OpenAI. But in general this is still a pretty impressive achievement, because Dota 2 is an extremely complex video game – any kind of progress here is pretty impressive even if there are caveats attached.
RPS: So that’s the next question – what are those caveats?
Mike: So, the original announcement included a range of restrictions, most importantly restricting the heroes the AI could choose and the items they could build. The AI was restricted to a mirror matchup of five heroes, and a prepared list of items they’d purchase – I believe just in sequence, regardless of what was happening in-game. There were also some mechanics the bots were told not to use: wards, which are items that grant vision in an area, and attacking Roshan, an optional objective that gives teams a big advantage.
These restrictions and adjustments are a big deal. To put the hero choice in perspective, there are billions of possible combinations of the 115 heroes in a 5v5 match. If you’d asked me last week what the chances were of them scaling up from a fixed matchup with simple heroes, I’d have said it was years away. But now, this week, OpenAI have announced that they’re lifting some of these restrictions – there are now 18 heroes involved, and while it’s not clear how hero selection works, this is still a huge complexity jump for the system to make in just a month. They’ve also added back in some mechanics like ward usage and Roshan, which means bots will have to make richer tactical decisions. Warding is especially significant because good ward spots are more or less universal, which means the bots have a chance to change how even you and I play Dota 2, if they find a new place to put wards that is hard to predict and provides useful vision.
RPS: You were confident the bots would win before. Has this latest announcement from OpenAI changed that?
Mike: One of OpenAI’s co-founders tweeted at me that the progress made in the last month wasn’t something they were totally confident about, they were just eager to get their bots playing ‘real DOTA’ as soon as possible. If that’s true, it’s really exciting, because it means they don’t really know how well their bots will adjust to this new, more challenging problem. There are still a lot of question marks about the game setup for me: are the human players restricted to the same items as the bots, for instance? Will they get to draft heroes themselves and possibly counterpick the bots? Overall, my money’s still on the bots. In particular, if you look at the heroes added to the game, a few stand out as heroes that a bot could really dominate with. One of the heroes in the pool, Slark, has an ability that makes him untargetable, and another ability that purges any negative spell effects applied to him. This hero needs a very specific and focused style of play to control, and the game format may mean that the human players lack the items or the heroes to counter him. I wouldn’t be surprised if we see a few godlike Slark games from the AI.
RPS: So I’ve seen headlines about master Go players reevaluating their assumptions about the game after AlphaGo, one of Google’s AI projects, beat the world champion relatively recently. Do you expect to see similar headlines coming out of The International?
Mike: Possibly – there is evidence that OpenAI’s bots have done this in the past! Last year when they were playing 1v1, because these 1v1 match-ups are so intense and so repetitive, and played with the same kinds of very focused criteria by very high-skilled people, the bots were already able to play in a way that kind of threw off the human players – and some of the pro players have said that it changed their attitude towards 1v1. Now I don’t know to what degree they’re exaggerating this, because the 1v1 meta changes drastically – the way the mid lane was played last year has already completely disappeared compared to how it’s played now, so if the 1v1 bots did have an impact I don’t know if that impact has lasted. But there is evidence that at least pro players feel like they gain something by practising against these bots. Even if the bots had their flaws, there were things that they were revealing.
Now there have already been some unusual things noted about the way the bots play, which some analysts have attributed to a very high level of game understanding. So Blitz, I think his name is William Lee, is quoted in the OpenAI articles – he was kind of an adviser to them I think – and he commented that the bots showed a certain degree of high-level reasoning. They would group up and avoid parts of the map where they couldn’t get an advantage, and they’d move to the opposite side of the map and try and push an advantage there. Obviously I haven’t seen the bots play and I’m not as good at analysing Dota 2 as Blitz is, but there are aspects of that that make me think this is a human over-interpreting what the bots are doing rather than the bots actually having understood something fundamental about Dota. That’s based on the assumption that Blitz has only watched a few games, so at the moment it feels a bit early to say. It’s difficult to tell whether this is kind of a glitch in the system, because we saw glitches with their 1v1 bot last year, or whether they have unlocked something deep in the bowels of Dota 2.
RPS: So other than purely how they perform, is there a way of measuring the extent to which the bots understand the game?
Mike: That’s an excellent question! It’s really difficult, because normally we trust analysts and pro players to tell us what other people are doing. So when I watch the International I’m relying on those experts behind the desk to appreciate these fine decisions being made that are so fine-grained I might not even be able to tell they exist.
Now with the bots, it’s really difficult. If they beat the human team next month, that will be a big deal. And they could beat them in the most ludicrous way possible – any kind of victory will be fine. In a sense it doesn’t actually matter how they play; the fact that they win will be enough – it will be enough for the audience, it will be enough for OpenAI. However, one of the things that happened last year, once some of the pro players were allowed repeated attempts against the 1v1 bot, was that they started to realise that the bot’s knowledge was very brittle. So even though it had appeared to be extremely good at this particular thing that it had trained for, as soon as it encountered something unexpected or was forced to play in an unusual way, this knowledge kind of broke down. So there, the question of whether they really understand the game is much harder – they understand certain bits of the game better than others. You can kind of imagine it a bit like a river, I guess. The deepest part of the river is the channel that the river has carved out over time, and that’s the bit the bots feel most comfortable in. The parts closer to the edges of the river might only get carved out occasionally, when there’s a flood or something like that – those are areas where the bots have explored less. And the more that they’re pushed into those areas, the less impressive they’ll be.
But it’s really hard. I’d argue they’ve already exhibited a lot of understanding of Dota – for instance, understanding that creep blocking is an advantage. Creep blocking is not a rule or a concept in the game; it was a bug in the original Warcraft 3 mod, because of the way NPCs pathfind through a level, and people realised that if they messed up the pathfinding they’d find themselves in a more favourable situation. So this is a human-invented concept that had emerged out of an unexpected part of the game, and for the AI to reinvent this and rediscover that it’s worthwhile, that’s already showing that it understands some aspects of the game – which is really cool! But we have to be careful, because kind of like my example with Blitz, just because they’re charging along one part of the map and ignoring this other side of the map, there could be a billion reasons for that. So some things we can feel confident about and say yes, they’ve understood something here, they’ve achieved something and this is important. Other parts we might need a hundred or a thousand more games with a lot more variables to really appreciate whether or not they know what they’re doing.
RPS: So this might be a hard question to answer, but how long do you reckon it’ll be before they can compete without any restrictions?
Mike: It’s really tough. I can honestly say, the progress made in the last month is not something I ever expected, and I’m a bit blown away by it. There’s still a big hill left to climb, and I’m looking forward to reading more technical details about what they got up to in the past few weeks to make this progress, but perhaps we’re only a few years away from Dota 2 being something AI can trivially beat humans at. It’s easy to think about science, and AI specifically, as a smooth gradient of continuous progress, but the reality is more bumpy – sometimes you hit a roadblock and it takes years to overcome, but then there are months where everything changes and huge progress is made. I think it’s safe to say OpenAI just had one of those months. Whether the future continues that way, it’s tricky to say, but the team seem confident. Predictions are fun, I’ll call it: The International 2022 will be won by bots.
RPS: See that raises an interesting question, because I know you said in your blog post you had misgivings about “AI as spectacle”. To what extent is this a spectacle and to what extent is it a useful apparatus to do AI research?
Mike: So this is also something we see in academia, where someone will publish a paper on a topic where there are clear open questions that they could have answered, but answering those questions is not novel enough to produce more publications or get more funding, and so those questions go unanswered. So I don’t want to make it sound like this criticism is unique to non-publicly funded research, but there is this problem of needing to grab headlines.
There’s a fine line between science communication and, as you said, AI as spectacle, so once we’re kind of encouraged to do the latter it becomes harder to be diligent and honest and make sure everything is laid out openly and that we don’t try and distract or mislead people.
I’m not necessarily saying this is done intentionally or maliciously – it can happen by accident, it can happen by being too enthusiastic. It can happen by being too excited by the areas you’ve made progress in to want to talk about the areas that you struggled with. Like, it’s not interesting to talk about your struggles, you talk about the things you kick ass at! But I think it is really problematic, especially with a topic like AI that I would say is poorly understood right now. There is so much misinformation out there that you have to be super, super careful when communicating. I think their communication has improved over the last 12 months, but overall this still feels like… like they’re aware that when they beat humans next month, they’ll get loads of headlines that say they’ve solved Dota 2. They know that’s going to happen, and I don’t think they particularly mind. I’m not necessarily saying they’re aiming for it, but I think they’re fine with that happening. They’re not going to go out and email people and tell them not to do it.
That said, you asked if this is a good way to do AI research. This is a very difficult AI problem, they’ve used their kind of public image to get leverage with Valve, to get access to this thing which not many other people could do. Academic labs could not afford the kind of technology they have either, so they are in a unique situation and they’re using that unique situation to do something impressive. But my gut feeling is that these things do more harm than good in the long run, and I say that as someone who is thoroughly looking forward to these matches. I can’t wait to see the bots run all over these humans! But from a science communication standpoint it worries me a little bit, yeah.
RPS: So in your blog post when you talked about reexamining the idea of humans playing against computers and what we read into it, I imagine that’s what you had in mind… but are there more useful milestones, and ones that can be as effectively communicated to the public?
Mike: So I think… the thing I’m about to say, I’d imagine, is quite difficult for OpenAI to do – I imagine there are restrictions on how they could use their API or how they could distribute their bots and their technology. But a major decision I made at the start of my PhD was to make sure that everything I produced, people could download and interact with. Having people able to touch the thing you’ve made is really important. It makes people understand it better, because they can perform their own tests. They don’t need to be scientists, but they can still have an idea about what will happen and then try it out – just like anyone who’s messed with the AI in a video game will know that it’s fun to kind of have a hypothesis and test it.
The second thing is that it builds trust. So right now we have lots of unanswered questions about how OpenAI’s bots work. If I could sit down and play a game against them now, it might increase my confidence in certain aspects of the bot if I could see that ‘oh, it responds to this thing in this way’. Or there are just facts I don’t know – like I don’t know if the humans were restricted to the same items that the bots were. I have no idea, I assume they were but it doesn’t say. So having people able to try out the thing for themselves builds trust and confidence in that way. So that would have been a change that I would have made.
In terms of changing the whole project, I think building AI that has to deal with humans in a non-competitive way would have been interesting. So for instance an OpenAI bot that trains and tutors people would have been much more interesting, I think, than a competitive bot. If you beat humans in a game of Dota 2 that’s it – people focus on the win. Whereas if you can perform a kind of Dota 2 ‘My Fair Lady’ scenario of taking a team of human 2K [low-skill bracket] gamers and getting them to enter the International qualifiers and qualify, that’s a huge deal. And arguably it’s better for Dota 2! So one of the things I’ve tried to underline to people is that OpenAI is exciting, but it’s not gonna particularly improve the bots that you play against in the game. Whereas getting AI to do coaching and training – that’s really interesting to me, and that’s something more people can engage with. It forces journalists and other AI communicators to think more carefully about how they evaluate it.
Again, this is not a dig at journalists, because it’s not on journalists to fix this problem – it’s on us as researchers and academics. So that’s one of the angles I talked about; I think a different kind of problem can help, one with a less objective kind of goal state. But this has always been the problem with AI research: we’ve tended to pick problems that have objective, clear black-and-white goals, because they’re easier to build systems for, and easier to evaluate whether you’ve reached the goal or not. Like, OpenAI are gonna have a very obvious results section to their paper next month if they beat a team live on stage. Whereas trying to say whether or not you made a thousand people better at Dota 2 is really subtle, it’s really complex. But I think we should be bolder, and arguably corporations and companies that have lots of funding have the most privilege to be bolder in the kinds of problems we look at.
But that’s not gonna happen, probably, because these kinds of problems are very appealing, and arguably AI researchers generally believe that these will lead to AI advances in the real world. Whether or not they will is an open question, but they believe that it’s kind of good for general AI to play and win at these games. And you know, it is entertaining. I really can’t underline this enough: I’m super excited to see the bots play! But yeah, I think that different problems, different ways of evaluating, and letting people touch them… those are the two things. Let people touch the things you build, and have more nuanced goals that make you think about evaluating them.
RPS: I did just want to touch on one more thing, which does steer us back towards competition. But another limit you raised, something to bear in mind that makes their achievements less impressive, is that an AI is reading the map at intervals that are far shorter than a human would be capable of. So I was wondering… how practical would it be to shape an AI with the limitations of a human?
Mike: So yeah, that’s really interesting. There are two ways we think about doing that. One is to give the bots restrictions that feel human-like to us. So an obvious one would be finding out what the reaction speeds of top Dota 2 professionals are, and limiting the software’s ability to react to that limit. But of course even that doesn’t solve the problem, because right now they’re being given extremely precise measurements of the world, so they know that there’s 1.36 seconds left before this character can move again when they cast a spell or whatever. And they’re getting these updates every 0.8 seconds, or less than that. So even if you gave them kind of artificial limitations, that wouldn’t necessarily solve the problem.
Another approach, which DeepMind took with their Atari games, and which we’ve seen recently with Doom as well I think, is to actually have them read pixels off the screen – that’s how machine learning systems learnt to play all those Atari games and Doom, by looking at the game in the same way that a human would. Now OpenAI mention this explicitly in their blog post, and they say that basically it would be too high a computational load to read the Dota 2 screen, and I kind of agree with them for some things. Reading the screen is closer to what humans do – you’re looking at the same user interface that humans have and you’re using the mini-map – but there are some things that humans kind of get for free. So reading pixels off the screen is one approach that might make us feel like it was closer to human, but it would be such a huge task that I’m not sure they would even have got to the stage that they’re at here without exponentially increasing the amount of computation available.
So that could work here, and you could argue that bots that play Go or chess see representations of the board which are much closer to how humans see the board. When I look at a chess board I know exactly where the king is. There’s no ambiguity, no measurement that needs to take place. The problem is that even when reading pixels off a screen, a system will always be able to have a more refined view of the world than I do. For instance it will always be able to keep an eye on the mini-map in the corner of the screen; it will never get distracted by something happening elsewhere on the screen once it gets really good at screen reading. So even reading pixels off a screen doesn’t necessarily solve the problem, because there’s just a fundamental difference. With Dota there’s information overload going on, whereas with chess the information about the game state is very low – what matters is your thoughts about what decision to make next.
I think that when people think about AI playing Dota, that’s the bit they’re interested in – they’re not interested in watching it look at the chessboard and learn what all the pieces look like, they’re interested in how it decides to make the next move. I don’t think they’re even interested in using a spell perfectly or last-hitting creeps; these are all skills where it’s not surprising that an AI can learn them. I think what people are looking for is the surprising and creative play, which you maybe only get two or three moments of in a whole match. But they’re the things that people talk about later, the things that people clip and highlight and put on YouTube and Twitch. They don’t really care so much whether the bots beat the humans or lose to them, they just want that one moment they can clip and share where something unusual happens, something absolutely extraordinary.
RPS: But do you expect us to see that at the International?
Mike: I expect there to be at least half a dozen times where the human player is absolutely sure they’ve got a kill on a bot, so they chase it under the towers and the bot leads them on a merry chase that ends with the human player getting killed. I think that’ll happen like six or seven times in the match, and it’ll get funnier the more it happens. We saw this in the 1v1 – it kind of makes it look like the human players are idiots, or that they’re underestimating the bots, and they kind of are underestimating them I guess… but one of the advantages these match-ups have, and one of the reasons it’s not always a great evaluation of your bot to just put it up against a human and see if it wins or not, is that people start playing against a bot with all these preconceptions about how a bot can work.
So right now Beyond The Summit, which is an esports studio, are running something called Bot TI. The way this works is as a knockout cup, and each round is five AI-controlled heroes against five AI-controlled heroes. They don’t play a game of Dota 2, they just run at each other and fight, and whoever is left standing wins. It’s very, very funny and silly, and it reveals a lot of slapstick about the way that bots are currently coded in Dota, but what I find interesting watching it is listening to analysts predict how a bot will behave. Because fifty percent of the time they have a really good understanding, and the other fifty percent of the time they just have these vague preconceptions about how code should work or how bot logic should work – and these are the same kind of preconceptions you see when players come up against things like OpenAI’s bots for the first time.
It’s really good fun. It doesn’t necessarily tell you how good the bots are, but it is really funny, it’s always very silly, and I think that’s one of the great achievements of [OpenAI’s] project. It was a great achievement of it last year, and it will be this year. And that’s why Valve want it on stage. Whether or not it changes AI forever or even Dota forever, it will definitely be a very very funny thing to watch.
RPS: Thanks for your t-
Mike: There was one last thing I wanted to mention! The history of Dota 2 is basically a series of people discovering unusual things about this mod that has now become a separate game. There are mechanics that only exist because someone decided to try a thing out a year ago and now it’s become something that everyone does, and these things come in trends and waves and people find new secrets. Every time the game is patched it will cause some kind of weird collision between two rules that emerges in some kind of exploit or something. That’s one of the joys of Dota 2, and one of the things I’ve been reflecting on lately is that if OpenAI actually solves Dota 2 or becomes good enough to play it perfectly, there’s this point in the future where you might just have every secret of this game laid out in front of you for everyone to see. And that day is a really exciting one to think about, but it’s also extremely sad.
Because some Go players reacted very happily to AlphaGo beating Lee Se-dol, but others kind of saw it as the death of Go in some ways. Not in that there were no secrets left in Go, but that something fundamental had been kind of cracked open by this thing happening – and I’ve been thinking about why we even want bots to do this. So I’m looking forward to this International, but maybe five years from now I’ll be kind of dreading it!
RPS: Thanks for your time.