Premature Evaluation: Bot Colony

Every week we send Brendan to investigate the seedy underworld of early access. This time, interrogating the robots of Bot Colony [official site].

Bot Colony, according to its own boast, is “the first video game featuring intelligent conversation as its key gameplay feature”. You speak into a microphone and ask questions of the robot characters, or give them commands. You might remember Chris attempting to put books on shelves using this voice recognition feature. You might also remember him failing miserably. Well, it’s been some time since then and we thought we’d give it another shot in this video special. Read on and watch my own doomed attempt to communicate with mankind’s newest mistake.

This time, the robot has the added challenge of interpreting a Celtic accent, a task at which voice recognition is famously terrible. I am used to changing my vowels into a mock English for the benefit of confused administration staff who don’t understand when I say that I was born in “aidy-aid”. And there is definitely some of that here. But I was not prepared for the level of misunderstanding these robots would have at the words “name” or “who” and their lack of comprehension of individual letters of the alphabet. At the bottom of the screen you can see where my words were interpreted.

This is supposed to be a training sequence, the first of three planned missions, but sadly I would not get any further. The robot has witnessed a break-in. A spy has stolen an important piece of tech (an advanced sensor) from a Japanese scientist’s home. You need to question the bot and piece together the exact order of events through its responses. Sometimes you get video records of what the bot has seen, but often he will just rattle off some facts in a hugely specific fashion. The robot’s name is Jimmy. He is the worst domestic robot I have ever met.

As you can see, the interrogation was almost entirely fruitless. Voice recognition and AI responses are not really at a stage where you can have an “intelligent conversation” with a machine, no matter what the game’s blurb might say. Even Alexa and Siri fall to pieces under pressure. However, the experience of repeating the word “who” into a microphone countless times has made me sympathise with all future homicide detectives who bring in a robotic witness. It’s true that you can edit the phrase you’re saying in the typebox, or just type all your questions from the start. But that kinda defeats the purpose and appeal of the game. It’s a mystery about chatting to robots, not texting them.

I’m not sure what the thinking was behind having so many Japanese names in the story, either. Surely an English-language machine would be more comfortable with ‘John’ or ‘Sarah’, rather than ‘Masaya’ and ‘Ayame’ (no matter what demands the plot has)? And surely the player, more likely to be English-speaking than Japanese-speaking, would be equally glad of simple, recognisable names in a conversation that requires clarity. It’s just one of the world’s details that feels counterproductive.

Before you begin, Windows sets up your voice recognition. You have to speak certain lines into the microphone as part of a voice test that lasts about fifteen minutes, much of it instructions on how to treat the robot (instructions which hint at the low level of cognition your robot is really going to have). But seeing that one test was clearly not enough, I exited out and tried to give the machine more to work with. That’s when you see me bumping out to desktop, whereupon I read out the game’s convoluted back story in my best storytime voice and hoped that it would be enough. It was not enough.

I’m sure this is working more confidently for some than for myself. Aside from the accent issue, more voice tests are supposed to lead to better results. But each test takes a long time and there’s no guarantee that three or four or five tests will make the interpreter as clever as it would need to be to follow directions that are more complex than “pick up X” or “walk forwards two metres”. I also realised with hindsight that many words are probably saved in the robot’s thick skull as American. So when I said, “turn on the tap”, I really ought to have said “turn on the faucet”. Although there is the added trouble of not knowing which objects in the house can be interacted with and which are purely environmental. It’s all very messy.

The truth is, I went in knowing that it would be janky. But I didn’t expect it to be this janky. And I wanted to be surprised at the machine’s awareness. This comes across sometimes, when you insult it, for instance, and it detects the hostility (sorry, Jimmy). But mostly the game is a victim of a multi-layered problem. First, the voice recognition can be laughable. Second, the robot’s interpretation of your questions and commands is as overly strict and non-malleable as you’d expect. Third, the actual “game” itself doesn’t help matters. Trying to collect video recordings isn’t very exciting and putting pieces of furniture back into their pre-thievery positions is no more fun than when you played ‘spot the difference’ on the back of your Honey Cheerios box.

No. It is a game wholly reliant and marketed on its technology, which feels so much more absurd the more you attempt to communicate with the machine. I suspect it will be better for people with hours of voice recognition testing behind them (not to mention the exact accent the robot was designed for), and there is also an element of learning the right commands and sticking to that cheatsheet of robot lines. It’s also admirable, from a technology perspective, to pursue projects like this. The ideal – to talk to a robot and have it understand you – is one worth chasing. But with a learning process as daft as this, I couldn’t face going through it all when the story and surrounding tasks are so wafer thin. That, and Jimmy really, really annoyed me.

Bot Colony is on Steam for £10.99/$14.99. These impressions are based on build 1631672.

36 Comments

  1. Artiforg says:

    Jenny Tattersall

    I feel your pain Brendy but that was really funny.

  2. caff says:

    The video is just brilliant. Well worth a watch!

    • alison says:

      Eye conker.

    • tigerfort says:

      It competes well with this piece from the excellent Rab Florence (latterly of this parish) in terms of Speech Wreck Ignition software failing to cope with Celtic accents. I’m actually not sure which I laughed at more.

  3. Rumpelstiltskin says:

    Voice recognition was a terrible idea IMO. Not only is it bound to be massively frustrating, it’s also even more embarrassing than doing full-body VR. They should have polished the text parser instead.
    That said, it is indeed rather unfair to expect a robot to know that “name” and “knee-ehm” are the same thing.

    • caff says:

      But it does make for a brilliant video :)

    • jeremyalexander says:

      It might be an idea where technology definitely needs to catch up and we might not see it for 10 or 20 years, but I can’t see any possible argument for it being bad for games. Being able to talk naturally to companions and npcs instead of these awful interfaces and clunky movement mechanics would seem like a dream come true. It’s going to happen someday, just maybe not in our lifetimes.

      • TechnicalBen says:

        20 years ago there were games like this… and we were waiting for the tech to catch up… LOL.

  4. rymm says:

    youtube’s autocaptions seem to be better at understanding you than that poor jimmy. flipping synths

  5. BenAttenborough says:

    I think the question is WHO did it?

  6. poliovaccine says:

    Safe to say it doesn’t pass the Voight-Kampff..

  7. racccoon says:

    This would have driven me nuts unless ..maybe I was drunk or it was a DRUNK hISTORY version! It would have made it more funny.. even though the video was hilarious, I really felt your frustration so bad I could hit the screen! an uninstall seems the cheapest option though.

  8. April March says:

    If I wasn’t such a lazy bastard I would most certainly create a poem from Brendan’s misheard prompts.

  9. geldonyetich says:

    Well, your microphone was overexposed, and you have a bit of an accent… but even so, the voice recognition was far worse than I would expect under those circumstances.

    I’m not sure why they didn’t just use the voice recognition technology you can find in your average smart phone. That’s considerably better.

    • Catchcart says:

      Because that’s the property of Apple/Google and probably really, really expensive to license if at all possible and runs on hardware at Apple/Google HQ rather than on your meagre smartphone processing power?

      • snv says:

        Brendan used the Microsoft voice recognition which is very bad.
        I am using Voice Attack (which uses the microsoft engine too) for my Elite Dangerous gaming and am regularly appalled at how often it fails. Especially after i have experienced how much better the voice recognition in my phone is — which by the way works offline and does not require google-server-processing power, just a little download less than 30 MB.

    • Harlander says:

      The training screens that appeared in the video looked vaguely familiar. Is this just using the built-in Windows speech recognition?

  10. Bull0 says:

    You’d think in the case of a robot witnessing a crime, questioning by a detective would be unnecessary – just download their audio/video from the event in question. Weird premise really.

  11. Ross Angus says:

    Now we’ve seen Brendy’s desktop, the question on my lips is “why Halo?” It doesn’t seem to have been covered by “have you played” yet

  12. Biggus_Dikkus says:

    time to retire Jimmy

  13. MajorLag says:

    You know, it sounds like the devs could have seen pretty early on that this wasn’t ever going to work that well, at which point they probably should have embraced it. Make the game a completely non-serious one about trying to use shitty voice recognition to get a robot to do simple tasks. Make the things the player has to say things that everyone knows will be difficult for a computer to parse in non-hilarious ways. Make it the QWOP of voice recognition, in other words.

    Sadly they took themselves too seriously instead.

    • Bull0 says:

      I’d play that.

    • Sin Vega says:

      Also give the robot a gun and no interest in asking for clarification

    • bonuswavepilot says:

      That could be fun! If you also made a note of which things were annoying the player (by the presence of certain words, or an increase in volume) you could bloody-mindedly insist on misinterpreting things in an infuriating way.

      Would be good in some kind of setup where there is a wide range of words that can have an effect on the world so that the misinterpreted stuff can have a catastrophic effect…

  14. BotColony says:

    It’s unfortunate Brendan ignored this advice from the game’s announcement on Steam link to steamcommunity.com:
    We recommend that you play through typing. First, you’ll need to ask LOTS of questions, so you run the risk of getting hoarse if you play through voice. … [Understanding] is difficult enough as is – without having to also deal with the errors introduced by speech-to-text.

    Brendan glosses over where the effort went (developing Natural Language Understanding). He writes:
    It’s true that you can edit the phrase you’re saying in the typebox, or just type all your questions from the start. But that kinda defeats the purpose and appeal of the game. It’s a mystery about chatting to robots, not texting them.

    Really? So building software that understands (typed) questions doesn’t count? His comment trivializes Natural Language Understanding technology as a whole (NLU works on text – and that’s where the bulk of the development effort has gone!) and makes the whole experience hinge on how well speech-to-text works for a player. I don’t feel Brendan engaged with the game meaningfully. Why doesn’t his video show ANY of Jimmy’s videos (there are 14 of them), or Jimmy placing objects (18 of those). Or any gameplay in Arrival? To make the video devoid of anything positive or impressive about the game and dump on it on account of speech-to-text? It’s strange his review is all negative and there’s nothing redeeming. His predecessor at Rock, Paper, Shotgun found positives in an earlier build link to rockpapershotgun.com
    Now, after a lot MORE work, it’s all doom?
    While this review is by no means a glowing one, it is more even-handed and addresses more meaningful issues in NLU that we plan to address link to gamespew.com

    While speech-to-text IS important to the experience, it is a technology we integrate, not make. We can easily put a check on how well someone does on speech-to-text, and only enable playing through voice if you achieve, say, 90% precision. Getting ‘fun’ mileage out of the Bot Colony speech-to-text was done before, see
    link to reddit.com

    We’re committed to continuing to develop Bot Colony and are looking forward to engaging with the community in a constructive dialogue.

    • Graham Smith says:

      Hey!

      Our writers often disagree with one another. Brendan obviously had a harder time with the game than Chris did.

      I think if you include a feature in the game, it’s fair to criticise it, even if you recommend that people don’t use it. I especially think it’s fair to criticise when the game is sold off the back of that feature – your recommendation not to use it is buried in the middle of an announcement most players won’t read, while the Steam store page features reference to voice commands prominently while specifically invoking films about talking to AI like Her, 2001, etc. People are going to buy the game expecting that its advertised features will function, and in our writers’ experiences they frequently do not.

      Thanks for your comment – it’s always nice when developers take the time to stop by and talk to the community.

  15. BotColony says:

    Speech-to-text aside: imagine a car reviewer who reviews a novel ABC model car. ABC drives differently from other cars – but our reviewer decides to dedicate the entire review to the car’s remote keyless system that he had trouble with. That’s what Brendan’s review felt like to me. Sure, you need it to get the car going, but it’s not the car. I agree speech-to-text is important to the experience, but why not evaluate the game first by playing it in the recommended way?
    Brendan referred to using Japanese names as ‘unproductive’. If he’d dug a bit into the background, he’d know the game is based on a Japanese-themed sci-fi novel
    link to amazon.ca
    I agree it would be easier to call Ayame Alice and Masaya Matt, but I’d have to throw the story away. In spite of him calling it ‘a convoluted back story’, many people like Bot Colony. Aside from being a decent novel, it introduces the 5th law of robotics (do as people do) and sets tangible benchmarks for future robots’ verbal abilities.

    • Bull0 says:

      If you don’t recommend people use the speech recognition you should probably say that on the store page. Your analogy is poor.

      • BotColony says:

        Some people get better results – this reviewer said it works ‘surprisingly well’
        link to gamespew.com .
        FYI, I sent a link to Brendan’s review to Microsoft yesterday, requesting access to this link to theverge.com .
        Seen this way, the review may actually do some good.
        To address a previous comment you made
        You’d think in the case of a robot witnessing a crime, questioning by a detective would be unnecessary – just download their audio/video from the event in question. Weird premise really.
        Jimmy didn’t witness any crime, just mundane events that explain why the house was empty. He has MANY video observations (14), most of them anodyne. The idea is to put together in your mind the 5 critical videos that answer why the house was empty.

  16. Eikenberry says:

    this reminds me of an experimental “play” called Facade released way back in 2005… you could type out your conversations fairly freeform, and the characters would react accordingly. Led to some fairly hilarious situations. Still a free download! Shame it was never developed further.

    link to interactivestory.net

  17. statistx says:

    I checked the tags but couldn’t find any article about Bot Colony’s history.

    Did they come back from the dead?

    I remember them a few years back with an early access and high expectations, only to die cause of budget issues and offering some small thingies like an ebook for people who bought into it and now they are back and kicking?

  18. vahnn says:

    In the game’s defense, your mic settings were awful. The mic was crackling many times (gain and/or sensitivity too high) and there was a lot of breath and blowing noises (mic too close, high sensitivity again). Plus you were talking too quickly. You need to have definite breaks between words, because even good voice recognition still sucks. And “neh-yuhm?” That’s not how you pronounce “name!”