Wednesday, July 21, 2010

Is The Godfather IMDb's real #1?

Inception is #3 on IMDb's list of top movies. Toy Story was #6 a few weeks ago, but its rank has sunk to #8 and will probably continue to plummet. Avatar started at #28 and has sunk out of the top 100. The Dark Knight, most famously, was briefly #1. There's got to be something wrong with how IMDb uses ratings for new movies. What is it?

It's called selection bias. The pool of people who vote on a movie right when it comes out isn't representative of the typical IMDb user who will eventually vote on it. Early voters tend to be people who were so hyped about the movie that they saw it opening night or opening weekend. The result is that Inception's 40,000 votes so far come from people very likely to enjoy the film. (IMDb corrects for this somewhat by filling in the "missing" votes that would get Inception up to 200k with an "average" opinion, but the trend shows they clearly don't compensate enough.)
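To make that correction concrete, here is a minimal sketch of the padding idea as I described it above, not IMDb's actual code. The 200k target is the figure from the paragraph above. The function name, the "average" opinion of 7.0, and the raw 9.3 rating are made-up placeholders for illustration.

```python
def padded_rating(raw_mean, votes, target_votes=200_000, prior_mean=7.0):
    """Pad a film's votes up to target_votes with votes at prior_mean.

    Assumption: target_votes and prior_mean are illustrative defaults,
    not values confirmed by IMDb.
    """
    # Films that already have enough votes get no padding at all.
    missing = max(target_votes - votes, 0)
    return (raw_mean * votes + prior_mean * missing) / (votes + missing)


# Hypothetical new release: 40,000 early votes averaging 9.3.
new_release = padded_rating(9.3, 40_000)

# Hypothetical older film that is already past the target.
old_film = padded_rating(9.0, 300_000)
```

The fewer real votes a film has, the harder its rating gets pulled toward the "average" opinion. How strong that pull is depends entirely on the target and the prior, which is where undercompensating for opening-weekend fans can creep in.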

There are other biases in the IMDb formula, though, that can throw the ranks out of whack. One is that it doesn't take into account the age distribution of voters. In general, 18-29 year olds give movies higher ratings than older people (45+), and the two groups have different tastes. 18-29 year olds give substantially higher ratings than the 45+ group to The Shawshank Redemption (9.4 vs 8.7) and The Dark Knight (9.1 vs 7.7). In contrast, both age groups give similar ratings to The Godfather and Avatar. So I asked whether IMDb might have two rankings backwards once you adjust for age demographics: Shawshank vs The Godfather, and The Dark Knight vs Avatar.

Short answer: yes. Based on these data, if you put a gun to my head and asked which movie in each pair would have the higher rating if the entire country watched both, my guesses would be The Godfather (by a big margin) and Avatar (by the slimmest of margins).

Another bias to explore is gender bias. IMDb voters are overwhelmingly male, but the U.S. population (and movie-goers in general) isn't. Can that help explain why The Blind Side has a 4.4 on Netflix and an A on Yahoo! Movies but doesn't even make the IMDb Top 250? Sort of. The average rating from women is 8.2 vs 7.7 from men, and most of the ratings are from men. But giving the two groups equal weight only bumps The Blind Side up to a 7.9, not enough to make the Top 250 cut. The bias affecting The Blind Side is probably the selection bias in who visits the website (hipsters who don't like mainstream movies or football much?). The Notebook, though, is easily booted from the Top 250 by gender bias. It has a phenomenal 8.6 rating from females and a 7.8 from males, averaging to 8.2 and a would-be spot somewhere around #120. That is probably still too low, though, given its 4.2 on Netflix from 5.5 million ratings.
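The equal-weight adjustment above is just a reweighted average of the per-group means. Here is a rough sketch of that arithmetic, using the rounded female and male averages quoted above. The function name and the 50/50 target shares are my own illustrative choices.

```python
def reweighted_mean(group_means, target_shares):
    """Average per-group mean ratings using target population shares.

    group_means:   mean rating per group, e.g. {"female": 8.2, "male": 7.7}
    target_shares: desired weight per group; normalized here, so they
                   don't have to sum to 1.
    """
    total = sum(target_shares.values())
    return sum(group_means[g] * target_shares[g] for g in group_means) / total


equal = {"female": 0.5, "male": 0.5}  # assumption: a 50/50 audience

blind_side = reweighted_mean({"female": 8.2, "male": 7.7}, equal)
notebook = reweighted_mean({"female": 8.6, "male": 7.8}, equal)

print(f"The Blind Side: {blind_side:.2f}")
print(f"The Notebook:   {notebook:.2f}")
```

With these rounded inputs The Blind Side lands at about 7.95, in line with the roughly 7.9 quoted above, and The Notebook comes out at 8.2. Swapping in different shares (say, the actual gender mix of movie-goers) only means changing the `equal` dictionary.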

Update: Here's another interesting story about the Godfather and Shawshank rankings.
