Eurogamer Statosity: 7/10

Of course, the average for the writers doesn't mean much without looking at the games they had to review.

The scores, of course, don’t matter. BUT THEY CLEARLY DO. Inspired by the analysis of Pitchfork’s scoring, Tom Armitage applies a similar methodology to Eurogamer. Go and admire his results here. It’s fun to click around madly, though the average score – 6.821 – will get quoted all over the place, I’m sure. What games scored lowest? See what the RPS writers did – John, Alec, Jim, Quinns, Tim and I. AND NOTE I GOT THE LOWEST AVERAGE OF US ALL. They’re all soft, I tell you.

58 Comments

  1. Ginger Yellow says:

    “So Kieron, for instance, didn’t review many games, but was so consistent in his lower-than-average scores that he had a reasonably negative influence.”

    You should put “a reasonably negative influence” on your business card.

  2. JohnArr says:

    Apparently Quinns is the most deviant, once you put a numerical value on it. Work harder Gillen!

  3. Gap Gen says:

    I dunno how meaningful the 6.8 figure is, assuming people start quoting it. The more typical score is around 7-8; it’s the tail of poor-to-average games that brings the mean average down – so, for example, for two games that got 9 and one that got 3, the mean average score is 7, even though it’s not the typical score that a game gets.

    The modal score (or even the median) is probably more meaningful here; mean averages tend to get skewed easily in a lot of cases.
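A quick check of the three-game example above (two 9s and one 3), as a minimal sketch using Python's standard `statistics` module. The variable names are illustrative only.

```python
from statistics import mean, median, mode

# The three hypothetical review scores from the comment: two 9s and one 3.
scores = [9, 9, 3]

mean_score = mean(scores)      # (9 + 9 + 3) / 3
median_score = median(scores)  # middle value of the sorted list [3, 9, 9]
modal_score = mode(scores)     # most frequent value

print(mean_score, median_score, modal_score)  # 7 9 9
```

The single low score drags the mean down to 7, while the median and mode both stay at 9, which is the "typical" score in this toy sample.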

    • Meat Circus says:

      The Median score is 7/10, which is pretty close to the mean, suggesting very little skew. And if you look at the chart, you can see there *is* no long tail. To within a small statistical margin of error, ALL games get a score of 6-9.

      Which, of course, makes a mockery of a mark out of ten. But we all know this, INCLUDING EUROGAMER. They just retain the scores because they start flamewars in the comments, driving page impressions.

    • Gap Gen says:

      Sure, but the modal value is 8 (leaning towards 7). My point being that a typical game is 7-8, rather than just shy of 7. Of course, you’re right that scores are of debatable usefulness, with some out-there games being quite hard to pin down to a given number, even if you understand what the numbers themselves signify (which is part of the problem with Metacritic – 3/5 stars is a pretty solid score for films but translating it to 60% compares pretty poorly to other % scoring systems).

    • itchyeyes says:

      @Meat Circus

      On the point of the average score making a mockery of the “out of ten” system, I used to agree with you there, but lately I’ve changed the way I think about this. The fact of the matter is there are games that make full use of the range of scores (there are 9 games on the EG list here that scored a 2). There just aren’t that many of them. Most games that are of such poor quality either get shut down before ever making it to store shelves, or aren’t really intended for the kind of audience that reads reviews, and hence don’t get reviewed.

      So merely through the virtue of making it to store shelves and onto a review publication’s radar, most games generally make it past some sort of quality hurdle. But there are still rubbish games that slip through, and there needs to be some method of distinguishing an absolutely garbage game that never should have even been made from something that, while maybe utterly broken, is still passably entertaining.

  4. d. says:

    Looks like their scoring scale is definitely not linear.
    Shouldn’t the average game get 5.0?

  5. Adam says:

    I wouldn’t say so d, 5.0 is the score used to mark a game as average, but that doesn’t mean the average score should be that. Or something. I know what I mean.

  6. Ian says:

    Quinns’ one has an awesome pie chart.

    • Man Raised By Puffins says:

      Ditto Tim’s pie chart, which seems to have been broken by his flagrant awesomeosity.

  7. Meat Circus says:

    I think we should talk about Kieron’s relentless negativity.

  8. Meat Circus says:

    Those Eurogamer scores in full:

    1 – Does not exist
    2 – Shit
    3 – Shit
    4 – Shit
    5 – Shit
    6 – Shit with a large advertising budget
    7 – Playable
    8 – Playable
    9 – Playable
    10 – Playable with a large advertising budget

    • mrmud says:

      World of Goo being one of 4 games that got a 10 kind of ruins that logic.

    • Meat Circus says:

      I don’t think a one-off Walker-orchestrated Indie insurrection belies the point that in general, to get a Eurogamer 10/10 needs the backing of a AAA hype-machine.

    • Lilliput King says:

      “one off” here referring to one of four.

      I dunno, seems fairly important.

      I haven’t played Chinatown Wars, but Uncharted 2, SF4 and WoG really deserved their ten out of tens. Not as if the scoring is particularly consistent (apart from consistent 7’s), but even so.

  9. Acosta says:

    Meat Circus looks like he wants to raise a point, but I’m not sure. Could you make five more posts stating your position in order to clarify it?

  10. The Sombrero Kid says:

    0-5 is reserved for gradations of bad or wholly unenjoyable. Since it’s very difficult to make a piece of entertainment wholly unenjoyable by mistake, games are rarely awarded those scores. If we were rating cakes made by top chefs on taste, with 5 being indifferent, 0 being completely disgusting and 10 being delicious, very few would get 0-5, and not because the scale is non-linear.

    • bill says:

      I like it. I shall appropriate it and use it every time someone complains about review scores.

  11. Stijn says:

    Oh Eurogamer, you capitalist SCUM

  12. Tei says:

    Information: You can make games in the 0-5 range. These games probably never end up in the hands of professional reviewers, and this is why we see the scale “start” at 6.
    It’s like, I don’t know, the military. There are rules about size, constitution, weight, etc. to enter the army. So you see people of height X and more, with the rare case of some guy who is smaller because he has lost his legs.

    • Alexander Norris says:

      There’s a super simple solution: rate games from -5 to 5. Suddenly, 0 becomes average and 5 becomes really good, and for the very rare case where a game is so shit that it deserves a score below 5, well, it’s all the more humiliating to get a negative score.

      Then watch as the mother of all PR apocalypses happens when publishers complain that their utterly mediocre game got a zero (A ZERO!?!1!?).

    • Psychopomp says:

      I’d say that if you *have* to use scores, a 1-5 scale is the way to go. No ambiguity, no confused masses. Everyone knows a 3/5 is average, 4/5 is damn good, and 1/5 is an atrocious pile of shit. For some reason, morons see 89% and lower as utter shit, not worth their time.

    • bookwormat says:

      There’s a super simple solution: rate games from -5 to 5

      There’s an even better, simpler solution: Don’t rate games at all. It is meaningless and irritating anyway, no matter what scale you’re using.

  13. Robin says:

    It seems odd to me that anyone would embark on this exercise if they’d read the text of more than one Eurogamer review.

    It would make some kind of sense to apply stats to an organ like Gamespot, PCZone* or Giant Bomb where the scoring system is clearly defined and consistency is taken into consideration above individual writers’ whims.

    Eurogamer’s approach isn’t wrong (I would rather read a passionately written review giving a game a 10 than a stuffy Edge review that’s terrified to mark anything that isn’t ‘culturally recognised’ – i.e. fashionable and/or saturation marketed in the specialist press – more than an 8), but it’s not attempting to be consumer information.

    An analysis of the words/phrases each writer used most frequently would be more interesting.

    *at least, when I last read it about 5 years ago.

  14. Hypocee says:

    I say it everywhere – this is a correct situation, an artifact of the fact that outlets tend to review games that anyone on Earth gives a shit about. Generic Gears of War Clone III not only can, but must be compared to Supermarket Checkout Sub-Flash-Quality ‘Arcade’ Games Collection. If your average is 5, then either a] you are obsessively reviewing every single SKU of entertainment software produced on the planet, in which case well done you, or b] you don’t have the range to truly express the horror of the few real stinkers that bubble up to your critical gaze.

  15. RLacey says:

    Nice average, Mr Gillen. Did you pick your last review score in order to get such a pleasingly fraction-free number?

  16. Alexander Norris says:

    Guys, we’ve all missed the very important bit here.

    One Life Left are now statistically correct in 100% of cases (maybe (according to very fuzzy maths (performed by a drunken toddler (at the very least, they’re justified)))).

  17. Buckermann says:

    AND NOTE I GOT THE LOWEST AVERAGE OF US ALL. They’re all soft, I tell you.

    Or maybe your supervisors (or whatever they are called in New Gaming Journalism) hate you so much that you only get the Worst Games Evar to review.
    But probably not.

  18. Lewis says:

    Meat Circus and others: What you’re arguing for, then, is scoring on a bell curve, I take it. As in, the majority of games get a five. A slightly smaller amount get fours and sixes. The tiny minority get tens and ones. The problem with this is that it means A) constantly re-evaluating your scoring system, which means a 7 one month might not signify an equivalent quality game the next month, and B) that you’re assuming the concept of an “average game” means something statistically, measurably average. As someone above rightly pointed out, as long as the full scoring system is being used, it means games exist to fill those roles, and as such it makes sense. If only one game drops a decade that deserves a one, for example, does that mean all other games should be marked lower? In that case, what does that utterly terrible one-worthy game get now? A half? Where do we go from there?

    It’s a year of scoring. Assuming the average will or should fall bang-on five is preposterous. Almost as ludicrous as suggesting games’ advertising budgets “buy” tens at Eurogamer. Worth mentioning, I’m sure, that Eurogamer’s editorial staff, writers and freelancers have absolutely no access to advertising information to avoid any conflict-of-interest. Their accounts folk are the only people who have access to that.

    • Meat Circus says:

      I envision a new system, wherein three experienced game journalists plus John Walker whisper whether or not they like a game to the leader of the autobots, who then delivers an aggregated verdict via his thumb of justice.

      I may patent the idea.

    • PleasingFungus says:

      Poor John Walker. Isn’t it enough that he’s a terrible healer, without you hurling this kind of abuse upon his prostrate* body?

      *John Walker may or may not be prostrate at time of writing. But it reads better if he is.

    • Hypocee says:

      In reply to Lewis or not, as RPS’ software decides:

      It’s a year of scoring. Assuming the average will or should fall bang-on five is preposterous. Almost as ludicrous as suggesting games’ advertising budgets “buy” tens at Eurogamer. Worth mentioning, I’m sure, that Eurogamer’s editorial staff, writers and freelancers have absolutely no access to advertising information to avoid any conflict-of-interest. Their accounts folk are the only people who have access to that.

      Doop de doo

    • Lewis says:

      What’s your point? That Stuart Campbell thinks there’s a chance that Eurogamer might have done a dodgy deal four years ago, but there’s no evidence?

      Either way, I’m talking about Eurogamer at the moment. I know that’s Tom’s policy. When Kristan was at the helm, it may have been different. No idea.

  19. LewieP says:

    Does anyone really like review scores?

  20. Hypatian says:

    Looking at the percentiles is perhaps the most interesting way to look at this data:

    Score – Percentile (percentage of all other games it’s better than)
    10 – 99%
    9 – 89%
    8 – 60%
    7 – 33%
    6 – 19%
    5 – 10%
    4 – 5%
    3 – 2%
    2 – 0%
    1 – 0%

    In short: If a game scores 7, then two games out of every three are better than it. If it scores 8, then it’s better than 6 out of every 10 other games. The median score is between 7 and 8, and that’s the point at which a game is “average” (i.e. better than half of the other games on the market).

  21. Hypatian says:

    Gah. Off by one error. Wish I could edit these things. Seven is the median. 33% of games are *worse* than 7, i.e. if a game scores 6, 2/3 of games are better than it, and if it scores 7, it’s better than 1/3 of games (and at least as good as 2/3 of games).
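The "percentage of games worse than this score" reading in the correction above can be sketched as follows. This is a minimal illustration, not the method used for the original table: `percent_worse` is a hypothetical helper, and the `toy_counts` histogram is made-up data chosen only to keep the arithmetic easy to follow.

```python
def percent_worse(counts):
    """For each score, return the percentage of games with a strictly lower score."""
    total = sum(counts.values())
    below = 0  # running count of games scoring lower than the current score
    result = {}
    for score in sorted(counts):
        result[score] = 100 * below / total
        below += counts[score]
    return result

# Hypothetical histogram (score -> number of games), NOT Eurogamer's data.
toy_counts = {5: 10, 6: 20, 7: 40, 8: 30}

toy_percentiles = percent_worse(toy_counts)
print(toy_percentiles)  # {5: 0.0, 6: 10.0, 7: 30.0, 8: 70.0}
```

Counting only strictly lower scores is what avoids the off-by-one: a game is not "better than" the other games that share its score.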

  22. Jakkar says:

    Alec’s selections are doing little to inspire my confidence in his tastes. Or did he simply draw the short straw for last year’s reviews?

    Mm. Some good nostalgia in those lists.

    And Darkfall. I still weep bloody tears for the pain that game put me through. And the £100 my lady spent getting us both a copy. We played a few days, it was enough for a lifetime.

    Mortal Online; you hold all my hopes and dreams, except the small few being fondled by APB.

  23. Jimbo says:

    I believe Metacritic came up with ~7 when they released similar data. I don’t like that 50+% of reviewed games always end up crammed into 30% of the review scale – it means the scale/s have too much emphasis on categorizing bad games, which is pointless. In practice, 1-5 might as well be replaced with ‘NO’.

    I think review scales need to be calibrated to better reflect ‘reviewed games’ rather than ‘all games’, so that 5.5/10 = an average ‘reviewed game’.

  24. Hobbes says:

    “Wrangler” surely? And yes, this was meant to be a reply.

  25. Guncry says:

    I stopped paying attention to review scores of games that I’m interested in years ago.

    If I like the sound of the basic setup / premise / mechanics then I’ll check out a few reviews, mostly to see what the pitfalls are – if I think I can live with what the reviewer saw as negative, then I’ll probably pick it up. These days it’s easy to check out gameplay videos to see what the title is like “in the flesh” too, which always helps.

    As indicated by the comments here, scores alone are far too misleading and inconsistent… not only across publications but even between reviewers working for the same publication. That’s just the nature of the beast and it isn’t really anyone’s fault, but arbitrary scores ultimately just don’t offer much merit to me.

    • D says:

      Until you buy a 360 and suddenly have half a million reviews/videos to browse through. Then ‘sort by score’ sounds like a good start. You might not ever do so of course, but they do have a usefulness, is my point.

  26. Phil says:

    I think games should be scored along the lines of the Michelin star system – average games get no stars at all, a very good game in its category gets one star, and a truly exceptional game gets three.

    This would keep metacritic happy too, since it can be easily converted to the standard 7-10 scale…

  27. Bret says:

    I kinda like having a score.

    I mean, sure, the review is the important part, and some games get, say, sevens but the text gives the idea “Geeze. This? This is my kind of game.” (Like most non-Kieron reviews of EDF 2017, say.)

    But a score helps make dillydallying clearer sometimes. It conveys the general impression, and for some reviews, it really helps deal with nitpicking in an odd way. I mean, if a review catalogs faults, and ends on a ten from a reasonably reliable journalist? Says the fun, or the art, or whatever totally overwhelms the flaws. Not always good, but sometimes useful.

    Also, I like brightly colored numbers. They make me clap my hands in childlike glee.

  28. VelvetFistIronGlove says:

    I like the RPS verdict approach—all four entities in the hivemind would play the game, then have a combined discussion about what they liked/disliked about it. At the end, each would give it a thumbs up or thumbs down. Since by now I have a reasonable idea of how much each entity’s game taste overlaps with my own, the result was exceedingly useful (e.g. Kieron gave this a thumbs up but John thumbs down? Probably not my cup of tea).

  29. hoff says:

    This further proves that an “average” score in gaming journalism is an 8 or 7, while a 5 is already an insult. Maybe it’s something psychological and there’s no way around it. But I wonder, if a 5 were the average score given, how the additional refinement in scores would allow better comparison. Especially since EG decided not to use 1-100 but 1-10 measurements.

  30. Pod says:

    No sign of good old 73%?

  31. TeeJay says:

    Looking at the 769 “PC games” in the Eurogamer database:

    10 = 8 games
    9 = 95 games
    8 = 173 games
    7 = 203 games
    6 = 134 games
    5 = 83 games
    4 = 38 games
    3 = 26 games
    2 = 7 games
    1 = 3 games

    average score = 6.8
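The 6.8 figure can be reproduced from the listed counts with a weighted mean. The sketch below reads the ten rows above, in order, as scores 10 down to 1; the variable names are illustrative. Note that the listed counts sum to 770 rather than the quoted 769, so one row may be off by one, which does not change the rounded average.

```python
# Score -> number of PC games, taken from the ten rows listed above
# (read in order as scores 10 down to 1).
counts = {10: 8, 9: 95, 8: 173, 7: 203, 6: 134, 5: 83, 4: 38, 3: 26, 2: 7, 1: 3}

total_games = sum(counts.values())
weighted_sum = sum(score * n for score, n in counts.items())
mean_score = weighted_sum / total_games

# The most common score is the one with the largest count.
modal_score = max(counts, key=counts.get)

print(total_games)           # 770
print(round(mean_score, 1))  # 6.8
print(modal_score)           # 7
```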

    Cf. Eurogamer readers’ scores (total = 1190 ie. also includes older games): average = 7.2

    Excluding all games that have less than 10 readers’ scores (total = 526): average = 7.7

    These break down as follows:
    9.00+ = 16 games
    8.00 – 8.99 = 177 games
    7.00 – 7.99 = 247 games
    6.00 – 6.99 = 70 games
    5.00 – 5.99 = 11 games
    4.00 – 4.99 = 4 games
    3.00 – 3.99 = 1 game
    (how to compare these reader averages with review scores? eg x>9.00 = 9 or 10? Or 9.5>x>8.5 = 9, etc.)

    The higher readers’ scores could suggest that not many people end up playing the *worst* games and that people tend to end up buying->playing->listing->scoring games they already have *some* preference for. Good games motivate more readers to actually give scores than bad ones do.
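On the question of how to map a reader average onto a whole-number review score, the two options floated above (cut at whole numbers, or cut at the half-points) can be sketched like this. Both helpers are hypothetical, and the sample averages are illustrative values rather than figures from the database.

```python
import math

def bin_floor(avg):
    """Cut at whole numbers: 8.00 to 8.99 all count as 8."""
    return math.floor(avg)

def bin_nearest(avg):
    """Cut at half-points: 8.5 up to (but not including) 9.5 counts as 9."""
    return math.floor(avg + 0.5)

# Illustrative reader averages only.
for avg in (8.4, 8.9, 9.2):
    print(avg, bin_floor(avg), bin_nearest(avg))
# 8.4 8 8
# 8.9 8 9
# 9.2 9 9
```

The choice matters most near the top of the scale: under the whole-number cut no reader average below 10.00 can ever map to a 10, whereas the half-point cut lets anything from 9.5 upward count as one.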

  32. TeeJay says:

    Biggest mismatches between Eurogamer critics’ scores and readers’ scores (10 or more votes)?

    Mafia: EG=4 Readers=8.4
    Another World (15th Ann.): EG=5 Readers=8.0
    Mount & Blade: EG=5 Readers=8.0
    Postal 2: EG=3 Readers=6.0

    Boiling Point: EG=9 Readers=6.2
    Star Wars Galaxies: EG=7 Readers=4.4
    Sudden Strike 2: EG=8 Readers=5.5
    Spore: EG=9 Readers=6.6

    Also agreement:
    Half-Life 2: EG=9 Readers=9.1
    Far Cry: EG=8 Readers=8.0
    Team Fortress 2: EG=9 Readers=8.9