Using a pair of Python scripts, I scraped the user rating distributions of over 34,000 IMDb films released between 1915 and February 2017. This included just under 5.8 billion individual ratings on IMDb’s 1 to 10 scale for all movies in that timespan. Rating distributions are useful for assessing not just the quality of a film, but also the level of audience polarization. Recent films tend to have richer data than classic cinema, as modern audiences are more likely to rate recent releases.
Within these films, ratings of 7 and 8 are the most common. This preference for mid-range, above-average numbers stands in contrast to 5-point rating scales, which are known to produce a disproportionate number of 1s and 5s (see: Amazon.com product reviews).
Most films show roughly bell-shaped distributions that skew slightly toward the higher range of ratings, but it’s also common for films to produce tails at the top and bottom ends of the rating range. The six types of distributions are illustrated below.
Type 1 – Elite Films
Elite films tend to show a strong skew toward 10, with 10 being the most popular rating. These films include award-winning classics such as Schindler’s List as well as audience favorites like Lord of the Rings and the original 1977 Star Wars. Note, however, that even among elite films, a rating of 1 tends to be more common than a 2, 3, or 4.
Type 2 – Above Average Films
Type 3 – Below Average Films
The most common type, these films peak at lower numbers like 4, 5, and 6 while still showing a roughly normal distribution. Note the slight uptick at 10: even these bad movies tend to have their fans.
Type 4 – Dumpster Fires
Type 5 – Polarizing Mediocrity (the Twilight Curve)
These films tend to draw strong positive and negative reactions from diehard fans and haters, yet a large number of viewers respond with a shrug and a “meh.” Twilight is the most notable example of this type, as all five films in the series fit the rating pattern.
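One hedged way to flag this pattern in code is to look for upticks at both 1 and 10 alongside a peak in the middle of the scale. The sketch below is a heuristic of my own for illustration, not the method used to sort the films in this post; the function name `has_tails_and_middle_peak` and the sample counts are hypothetical.

```python
def has_tails_and_middle_peak(counts):
    """Heuristic check for tails at 1 and 10 plus a hump in the middle.

    counts[i] is the number of votes for rating i + 1.
    """
    if len(counts) != 10:
        raise ValueError("expected ten vote counts, one per rating")
    low_tail = counts[0] > counts[1]     # more 1s than 2s
    high_tail = counts[9] > counts[8]    # more 10s than 9s
    # Tallest bin among ratings 3 through 8 must rise above the
    # valleys at 2 and 9 that separate it from the tails.
    middle_peak = max(counts[2:8])
    has_hump = middle_peak > counts[1] and middle_peak > counts[8]
    return low_tail and high_tail and has_hump


# Made-up vote counts, for illustration only.
sample = [15, 5, 6, 9, 14, 16, 13, 8, 5, 9]
print(has_tails_and_middle_peak(sample))
```

The thresholds here are the simplest possible comparisons; real data would likely need a minimum vote count and some tolerance for noise.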
Type 6 – U-Shaped Distributions (The Curve of Unintentional Comedy)
A bimodal distribution with peaks at the extreme ends is a strong indicator of unintentional comedy: films so poorly executed that they can end up being as entertaining as good films. The Room is the prototypical example of this effect. The film’s creator aimed for a serious piece of melodrama, but the execution was so poor that the result was comedy. U-shaped distributions are the ultimate “love it or hate it” category, and the best unintentional comedies in this category tend to leave viewers conflicted, as the film is simultaneously awful and highly entertaining.
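A U-shaped curve can be flagged with a very simple test: both extremes must outnumber every rating in between. The sketch below assumes that definition, which is my own simplification and not a rule stated by IMDb or used to build the categories above; the function name `is_u_shaped` and the sample counts are hypothetical.

```python
def is_u_shaped(counts):
    """Return True when the 1s and the 10s each exceed every middle bin.

    counts[i] is the number of votes for rating i + 1.
    """
    if len(counts) != 10:
        raise ValueError("expected ten vote counts, one per rating")
    smaller_extreme = min(counts[0], counts[9])
    tallest_middle = max(counts[1:9])    # ratings 2 through 9
    return smaller_extreme > tallest_middle


# Made-up vote counts, for illustration only.
love_it_or_hate_it = [40, 5, 2, 1, 1, 1, 2, 3, 5, 40]
ordinary = [1, 1, 2, 5, 10, 20, 30, 20, 8, 3]

print(is_u_shaped(love_it_or_hate_it))
print(is_u_shaped(ordinary))
```

A stricter variant could require the two extremes to hold some minimum share of all votes, which would filter out films with only a handful of ratings.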
More analysis on this data and an interactive visualization to follow.