How Spirit hopes to bring humanity to AI


Sitting in a Bath tearoom, having just wiped the lemon curd from my fingers, I was tasked with interrogating a robot about a murder. The interrogation scene was the GDC demo for Spirit AI [official site] – middleware geared towards bringing more expressive characters to games as well as building safer and more inclusive online environments. Both hinge on the same set of technologies: each looks at language to understand interactions, but one uses that understanding to build meaningful encounters with AI characters while the other uses it to keep an eye on how players are behaving towards one another.

I was sitting with Mitu Khandaker, creative director at Spirit AI. You might remember her as The Tiniest Shark, developer of Redshirt, or, if you’re in academia, as an assistant arts professor at NYU Game Center with a PhD in games and the aesthetics of interactivity. She watched me play through the demo, using natural language to try to figure out how a man called Martin died and whether the robot was culpable. The demo was built by Bossa Studios, makers of Surgeon Simulator, and it gave me a limited amount of time to chat with the robot – me typing, the robot speaking into the tearoom in a Scottish-accented female voice – before asking for my verdict on her guilt.

So far I had made the robot shy by asking if she was lying about not being able to kill humans, and she had reiterated her prime directive. As I probed for information, the AI character walked a difficult tightrope: concealing some things in order to stay true to her character while volunteering others so that, as a player, I didn’t get frustrated or end up without any threads to tug.

For example, I asked who did kill Martin; she didn’t have that information and so couldn’t answer directly. But instead of just saying “I don’t know”, she told me she thought the murder was related to his work and was almost certainly a human action. That gave me a direction in which to move.


The system isn’t perfect at this point, but it is intriguing and nuanced. When the robot mentioned a woman named Alicia by name, I asked what Alicia does – meaning for a living – and the robot answered with what Alicia was last doing in terms of her actions in this scenario. It also seemed flummoxed by some of my phrasing, such as “why do X get cross with Y?” There was also a moment when I was trying to pin down motive and the robot seemed to think I was flirting.

But there is also a real sense of information unfurling in a way I’m not used to with games. The robot was generally capable of keeping track of who or what I meant when I started using deictic words like “she” or “they” or “there”, which impressed me. You can also tell – from a few formal errors like missing capital letters at the start of sentences – that the AI is not following a script but is drawing conversational elements together to create our interaction based on what seems relevant and whether I’ve triggered particular beats.

At the end of the first playthrough of the demo I’ll admit I was none the wiser as to the murder, but I had another round and found out a lot of helpful circumstantial things. Sergei was strong enough to have hit Martin as hard as was needed to kill him and Alicia wasn’t, and Sergei couldn’t access the room in which Martin was found. Alicia DID have access to that room AND the robot was covered in blood. Oh, and carrying a body wouldn’t interfere with the prime directive.

I decided that Alicia and Sergei probably conspired to kill Martin with Sergei doing the actual killing and Alicia providing access to a location Sergei couldn’t have reached. The pair then had the robot carry the body from where the murder happened to the location in which it was found.


I feel like I let Angela Lansbury and Hercule Poirot and Columbo down, though. They would have solved the whole thing rather than relying on circumstantial evidence for their conclusions. I also managed to phrase my answers at the end of the demo so unclearly that the robot thought I was accusing her of the murder anyway. I was accusing her of being an accessory after the fact who had had her memory wiped! Justice is difficult.

We only had a short amount of time to look at what’s happening under the hood for this specific scenario, but from what I saw and what Mitu pointed out, the system sets out a framework within which the AI can improvise, acting and reacting dynamically instead of following one of those conversation trees where every branch must be scripted.

Through the system, a writer or narrative designer can highlight important pieces of information for the character to convey, topics to keep nudging the interaction back towards, story beats to hit and so on, rather than laying down a specific path. That’s also where the personality of an AI begins to be created – via how the character responds and prioritises, their motivations and their agenda.
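As a rough illustration of that authoring model – entirely hypothetical, since Spirit AI hasn’t published its internals, and all names here are invented – you could imagine story beats as scored, prioritised units the character picks from dynamically, rather than as branches in a tree:

```python
# Hypothetical sketch (not Spirit AI's actual API): the character chooses its
# next line by scoring authored "beats" against what the player raised and
# against its own agenda, instead of walking a fixed conversation tree.
from dataclasses import dataclass

@dataclass
class Beat:
    line: str
    topics: set       # what the beat is about
    priority: int     # how much the character wants to land this beat
    delivered: bool = False

def next_beat(beats, player_topics):
    """Prefer undelivered beats that overlap the player's topics,
    falling back to the character's own priorities."""
    candidates = [b for b in beats if not b.delivered]
    if not candidates:
        return None
    best = max(candidates, key=lambda b: (len(b.topics & player_topics), b.priority))
    best.delivered = True
    return best

beats = [
    Beat("The murder must relate to his work.", {"martin", "motive"}, 2),
    Beat("I cannot harm a human. It is my prime directive.", {"robot", "guilt"}, 3),
]
# Asking about the robot's guilt surfaces the prime-directive beat first.
print(next_beat(beats, {"robot", "lying"}).line)
```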

“We’re not trying to recreate humans or anything,” says Mitu. “In fact, in a nutshell, with the character engine what we’re doing is letting you, if you’re a narrative designer or a writer, be creative and craft very specific stories but allow characters to behave dynamically within those stories.”

There are obviously other aspects to a character – their body language, their gait, their specific vocabulary, their understanding of personal space and so on – but those would be the domain of the developers as normal. Which of those elements impact the interactions would also vary from game to game. Within the system you can also do things like define how assertive a character is when they speak. “We can plug in any amount of game state information into our system,” says Mitu. “We already do proximity and we’re about to do gestures in VR.”
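To make that “plug in any game state” idea concrete, here’s a minimal, invented sketch – the signal names and thresholds are my own assumptions, not Spirit AI’s – of how raw game-state signals might be mapped to a social reading the dialogue system can react to:

```python
# Hypothetical sketch: translating game-state signals (proximity, a VR
# gesture) into a coarse appraisal the character's dialogue can respond to.
# All thresholds and signal names are invented for illustration.
def appraise(signals):
    """Return a social reading of the player from raw game-state signals."""
    if signals.get("looming_behind"):
        return "threatened"           # standing over the character in VR
    if signals.get("distance_m", 10.0) < 0.5:
        return "crowded"              # leaning in reads as aggressive
    return "neutral"

print(appraise({"distance_m": 0.3}))
print(appraise({"looming_behind": True}))
```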

Here you can start to see how the game works, with a timeline of events establishing one set of knowledge for the AI.

In fact, at one point I leaned in towards the robot during my interrogation scene via a keyboard command and she read that as an aggressive action. Mitu tells me that in the room-scale VR version you can stand behind the robot and loom over her which she perceives as threatening.

I start to worry about the impact this newfound awareness could have on my habit of squat-thrusting my way through exposition dialogue or trying to clamber on furniture when some authority figure is telling me about a mission. Imagine a Mass Effect where instead of staring into the middle distance and banging on about some family thing the NPC suddenly stops and says “Excuse me, that’s my bed! Stop jumping on it with your grubby boots when I’m telling you a harrowing tale.”

Mitu is interested in that too: “One of the things I’m really excited about when this stuff gets out in the wild is players doing the things they’ve always done and playing round with boundaries but having characters suddenly understand!”

Understanding is at the root of everything Spirit AI’s developers are working on. As Mitu said, the goal isn’t to create believable humans via AI, it’s about players being able to express themselves in a naturalistic way and having the game systems understand them, thus helping create more meaningful interactions. As to the specific nature of those interactions, Mitu says everything is customisable:

“Characters have a script space and a knowledge model which is a repository of information they have about the world and the things they know. We don’t assume anything. There are a bunch of defaults we give just so you know how to use the tool but you can customise everything.”

There is still a lot of writing involved, but the toolset is supposed to make that a bit more manageable, and it can also generate similar lines of dialogue to ease the process. You also need to set up the knowledge models for the different entities in the world.

The synonyms column here shows how the entities start to get rounded out so that the AI can parse meaning in a more natural and flexible way.

For example, you can have a company as an entity and the company can have employees which is another type of entity. A person would be a third entity but they can also be an employee – all things you can define and relate. Thus you can create the idea of a company with employees or you could create the idea of a nest of ravenous ants on an alien world made from ice cream. (No, YOU dropped an ice cream on the pavement earlier and are now preoccupied with the ants scaling its heights while you write a feature.)
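A minimal sketch of what such a knowledge model might look like – the class names, relation labels and synonym handling here are my own invention, based only on what the article describes (typed entities, relations between them, and the synonyms column pictured above):

```python
# Hypothetical sketch of a knowledge model: typed entities, relations between
# them, and synonym sets so the parser can match natural phrasings.
class Entity:
    def __init__(self, name, etype, synonyms=()):
        self.name, self.etype = name, etype
        self.synonyms = {name.lower(), *[s.lower() for s in synonyms]}
        self.relations = {}            # relation label -> set of entities

    def relate(self, relation, other):
        self.relations.setdefault(relation, set()).add(other)

# A company entity, and a person who is also an employee of it.
acme = Entity("Acme Robotics", "company")
sergei = Entity("Sergei", "person", synonyms=["the engineer"])
sergei.relate("employee_of", acme)

def resolve(mention, entities):
    """Match a player's phrasing to an entity via its synonym set."""
    for e in entities:
        if mention.lower() in e.synonyms:
            return e
    return None

print(resolve("the engineer", [acme, sergei]).name)   # -> Sergei
```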

But alongside the framework for creating AI game characters, the Spirit team are employing that tech to help developers grow non-toxic communities. During her first conversation with Steve Andre (CEO and founder of Spirit AI, formerly of IBM, where he worked alongside the team developing Watson, the computer system which won a dedicated edition of the US game show Jeopardy!), Mitu asked: if the Spirit AI system could understand language, then why not use it to try to tackle harassment? Thus the other main facet of the work at Spirit AI: Ally.

Ally is a different tool and I’ll quote from the official site to summarise what the company’s goals are with it:

“Ally underpins the duty of care games studios and social platforms have towards their communities, helping to safeguard against abusive behaviour, grooming and other unwanted activity – all on the basis of listening to the player about what makes them feel safe.”

To start dealing with abuse the system has to be able to recognise when it’s happening. One traditional method is to scan for key words or phrases – slurs are usually at the top of the list, with other, more specific terms and behaviours following. One of the more memorable attempts at countering abuse recently was in Overwatch where “GG EZ” (an eye-rolly bit of bad manners indicating the opposition didn’t offer much of a challenge) gets substituted in chat for lines of Blizzard’s choosing like “I’m wrestling with some insecurity issues in my life but thank you all for playing with me.”

The game pictured here is an off-the-shelf MMO – the experiment was to see if the Ally chatbot could be plugged into it rather than the game content itself.

Ally isn’t looking to do that. For one thing, trolls will note which words are banned and then work around them. Instead it’s looking for patterns of behaviour. That’s not to say it doesn’t look at the language being used, but it tries to see it in context. By that I mean it’s also looking at whether another person in the chat is expressing discomfort. That could be something like typing “leave me alone”, but it can also see when interactions are largely one-way. So if a player is spamming someone else with group invites or unwanted attention and the recipient is trying to ignore it, the system can register that. It can then combine that discomfort with instances of sexualised or racial language (or whatever else has been happening).
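The kind of contextual check being described could be sketched roughly like this – a toy version under my own assumptions (invented phrase list and thresholds), nothing like Ally’s actual implementation:

```python
# Hypothetical sketch: flag a conversation when messages flow mostly one way
# AND/OR the recipient signals discomfort, rather than keyword-matching alone.
DISCOMFORT_PHRASES = {"leave me alone", "stop", "go away"}

def should_flag(messages, sender, recipient, min_one_way=5):
    """messages: ordered list of (author, text) tuples."""
    sent = sum(1 for author, _ in messages if author == sender)
    replies = sum(1 for author, _ in messages if author == recipient)
    one_way = sent >= min_one_way and replies == 0
    discomfort = any(
        author == recipient and any(p in text.lower() for p in DISCOMFORT_PHRASES)
        for author, text in messages
    )
    return one_way or discomfort

# Six unanswered group invites register even without any banned words.
chat = [("troll", "join my group")] * 6
print(should_flag(chat, "troll", "newbie"))
```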

Something developers can do with this information is use it to activate an NPC interaction. So perhaps a character from the game or just a community moderation bot would then send a message – “I noticed X was happening, are you okay?” From here the player could indicate whether they wanted to block the other player but also whether the system had misread the situation. Perhaps in this context the language and the lack of response were friendly teasing so you could specify that in this instance or with this particular player it was fine. Or even that it just wasn’t bothering you. The system can then learn from those responses and apply your preferences in the future.
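That check-in loop – ask the affected player, then remember their answer for next time – might be sketched like so. The function names and the stored preference values are my invention; the flow (intervention message, player response, learned per-player preference) is what the article describes:

```python
# Hypothetical sketch of the intervention loop: the bot checks in with the
# affected player, and their answer becomes a per-pair preference that
# suppresses future check-ins for pairings the player says are fine.
preferences = {}   # (player, other) -> "fine" | "block"

def check_in(player, other, incident):
    """Return the bot's check-in message, honouring any earlier answer."""
    if preferences.get((player, other)) == "fine":
        return None    # the player said this pairing is friendly teasing
    return f"I noticed {incident}, {player} - are you okay?"

def record_answer(player, other, answer):
    preferences[(player, other)] = answer

print(check_in("newbie", "troll", "repeated group invites"))
record_answer("newbie", "troll", "fine")
print(check_in("newbie", "troll", "repeated group invites"))  # suppressed
```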

The developer could also set up Ally so that it talks to the player instigating the harassment to figure out what’s going on on that side of the interaction.

The big thing here is the understanding that boundaries aren’t fixed: what is okay in one scenario might not be in another. According to a study Mitu references from Data & Society, intervention is the most effective method of dealing with online harassment, but it usually relies on a robust moderation and community management team to make those interventions happen.


Spirit are working with some live games at the moment to feed data to the system and make improvements based on what developers and players actually need. Mitu tells me one of those games deals with 1,400 support tickets a day. That requires a mixture of processing power and emotional labour depending on the scenario. What Ally would help to do is take on some of that labour – more than is taken on by a mute button or kicking players into the low priority queue for abandoning games.

The system uses predictive analysis, so it can also flag up potential problems even if it doesn’t understand them. For example, if it picks up a pattern of players showing discomfort, it can flag what’s causing it so a human moderator can tag in and figure out what’s going on. It would thus theoretically be able to pick up on the workarounds trolls might be using, or see when offensive memes flared up, even if it didn’t know what the content actually meant.
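A toy version of that idea – surfacing an unfamiliar term for human review purely because it keeps co-occurring with discomfort, with no understanding of what it means – might look like this (everything here is my own invented illustration):

```python
# Hypothetical sketch: flag tokens the system can't parse when they recur
# across conversations where someone expressed discomfort, so a human
# moderator can investigate a workaround or meme the system doesn't know.
from collections import Counter

def suspicious_tokens(incidents, known, min_count=3):
    """incidents: lists of tokens from flagged conversations.
    Return unknown tokens that recur at least min_count times."""
    counts = Counter(t for msg in incidents for t in msg if t not in known)
    return [t for t, c in counts.items() if c >= min_count]

known = {"hello", "gg", "thanks"}
incidents = [["gg", "xqzt"], ["xqzt", "hello"], ["xqzt"]]
print(suspicious_tokens(incidents, known))   # the unfamiliar term surfaces
```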

It wouldn’t replace a ban hammer, nor is it intended to, but it would offer a broader and more nuanced toolset when dealing with problems in player communities.

On the flip side it could also be set up to pick out when players were being helpful, maybe teaching newbies how to play or keeping a team’s morale up. Instead of punishments it could be set up to mete out rewards.

One thing I’d want to keep an eye on is the temptation to silo players off into toxic pools where they only get to play with one another. That might work for some games, or at least contain the problem, but I’d prefer set-ups that focus on integrating groups of players and incentivising positive interactions, in the hope that this promotes positivity (or at least reduces harassment) elsewhere online. I’d also be interested in the potential to plug the Ally toolset into things like comment sections online.

Mitu touched on that a little in our interview, noting the capacity for Ally to help developers set the tone for their community.

In terms of the murder mystery proof of concept demo I was playing, there’s enough there and the interactions worked well enough that I’m excited to see what happens next, especially when more elements get added in – animation, detailed art, different scenarios and games. But Ally is the tool I’m really hoping to see develop. I regularly interact with games deemed by many to be the most toxic in PC gaming (although they are also responsible for me meeting lifelong friends). The prospect of systems better set up to deal with the nuances of online harassment AND community building seems like something special.


  1. Drib says:

    I like the idea of talking with a robot to solve mysteries or something. Using a robot is good too, because it cuts down a bit on the uncanny valley issue.

    I’m not so sure about automated chat management, I could see it being annoying. “Are you okay? Your teammate is loitering near you” every five minutes, or something. It’d be a balancing act, at best.

    • GenialityOfEvil says:

      They should hire clippy.
      “Hey, it looks like you’re trying to forge the Doomhammer of Margalath. Would you like some help with that?”

      • Drib says:

        Are you okay? You seem to be making absurd marketing pitches on a site where no one is interested. Do you need me to block you?

  2. seroto9 says:

    Hands up who had to look up ‘deictic’? Not seen the word used in this context before.

  3. Ben King says:

    I never thought about just how much I really wanted basically conversational AI until my GF and I put 150hrs into Skyrim between us. To spend so much time in a big beautiful world but have such limited character interaction galled me. It was just about at that point I realized for the first time that there was far more that I wanted out of games narrative interaction than an occasionally branching story and canned responses. On the flip side i can imagine a world where low cost and even slightly more competent conversational bots turns the Internet into a far more painful place to negotiate. Capably skimming chat channels for patterns of abuse is in some ways even more interesting than the conversational part, but it’s still all really neat. I still haven’t looked up diectic yet… Pip will definitely do some vocabularies expanding with her work today.

    • Ben King says:

      Edit expired, I mean “deictic.” And I learned it. Thanks pip, thanks Internet!

  4. garfieldsam says:

    As a developer I am stoked by the idea of conversational AI but I see that in turn requiring so many resources to implement that it might require the entire game to be built around it as a mechanic or requiring a giant EA-size team. Otherwise there are so many potential issues that it could hurt the design by exploding the number of variables needing to be accounted for, like usability, giving the player a clear direction, etc.

    • GenialityOfEvil says:

      E.g. No Man’s Sky

    • LewdPenguin says:

      That was pretty much what I was thinking as I read it, a neat bit of tech that has the potential to produce 1 or 2 really amazing games where a developer focuses on making full use of it, but which otherwise will be regarded as too complex, time consuming and expensive to make use of compared to the now standard short branching dialog tree leading to a Good or Bad (and maybe Neutral) ending to that conversation popup.

  5. brucethemoose says:

    “I’d also be interested in the potential to plug the Ally toolset into things like comment sections online.”

    So what shall the official RPS comment bot be called?

    I don’t suppose it can be a tribute to a game, can it? Like Shodan or Legion?

  6. DEspresso says:

    The Secret to any interrogation is to pretend to be leaving the room only to turn around and surprise them with an innocent ‘One more thing though’.

    Or hitting the suspect with a phonebook.

    • Okami says:

      Or you could combine the two methods.

      ‘One more thing though.’ *SMACK*

      • TheMightyEthan says:

        I want to thank you, I had a good laugh at that mental image.

  7. Ny24 says:

    As of yet I was never impressed by these kind of games, but in general I find them very interesting. This particular one reminds me a little bit of a game I did 2015 for my master’s: Subject. In it you play some kind of translator between two identities which also adjust their trust and behavior according to your actions. It’s not that complex and far from a finished project. For me it was just a little tool to test narrative in games, but it’s free if you want to give it a try.

    Also I’m sure the first thing that comes out of this is a Turing Test game in which you play multiplayer and need to guess if it’s a “Robot” or “Human” you’re speaking with.

  8. Ninja Dodo says:

    I am really intrigued by this project but one big question I have is: Will they be doing more with animation (as it sounds like there is none currently)? Because it seems like a huge missed opportunity if text and voice are the only way the characters communicate. A system like this would be the perfect place to also experiment with dynamic gesturing and facial expression. Movement and body language is essential to creating complete living breathing characters…

    It sounds like they’re focusing more on the dialogue right now, though “we’re about do gestures in VR” sound promising, but it also sounds like they’re focusing on the player side there and not so much on the character visually emoting in return? I really hope they branch out into animation as well.

    Either way, I will be following the development of this project with great interest.

    • Ben says:

      “There are obviously other aspects to a character – their body language, their gait, their specific vocabulary, their understanding of personal space and so on but those would be the domain of the developers as normal. In terms of which of those elements impact the interactions, that would also vary from game to game.”
      Sounds like the Spirit AI would tell the dev systems how the AI character is feeling and the dev systems would then do what they wanted with animation accordingly…

      • Ninja Dodo says:

        Yeah, I saw that bit and I understand that is where they’re at right now, but if they have no intention of exploring animation at all (even in the long term) that’s disappointing, because that side is as much in need of further development and standardization as dynamic dialogue and is an absolutely essential part of creating expressive characters.

  9. shevtsov200 says:

    I can’t find anything in google, will this demo be available to mere mortals?