Sentiment distribution among reviewers matters to readers

This page is part of a global project to create a better online reviews system. If you want to know more or give your feedback, write to us at [email protected] and we’ll grab a beer ;)

Beyond the average rating, readers often examine the distribution of ratings to gain additional statistical insights, such as variance. This distribution is typically U-shaped.

Amazon provides the distribution of ratings for all products.
Example: a business with an average rating of 4/5 could be the result of:
  • 10 reviews of 4 stars
  • 7 reviews of 5 stars, 1 review of 3 stars, and 2 reviews of 1 star

In each case, the perception is different. In the first scenario, the service seems consistently good but improvable. In the second, it appears more sporadic, mostly good but with some significant dissatisfaction. This information helps users make more informed decisions.
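To make the contrast concrete, here is a minimal Python sketch (standard library only) that computes the mean and variance for both scenarios. The averages are identical, but the variance exposes the difference:

```python
from statistics import mean, pvariance

# Two rating sets with the same 4/5 average but a very different spread.
consistent = [4] * 10                # 10 reviews of 4 stars
polarized = [5] * 7 + [3] + [1] * 2  # 7 five-star, 1 three-star, 2 one-star

for name, ratings in (("consistent", consistent), ("polarized", polarized)):
    print(f"{name}: mean={mean(ratings):.1f}, variance={pvariance(ratings):.2f}")

# consistent: mean=4.0, variance=0.00
# polarized: mean=4.0, variance=2.60
```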

This impacts the user’s willingness to check individual reviews. According to a study¹, when rating dispersion is low, the incentive to read individual reviews decreases due to the principle of least effort. Conversely, when rating dispersion is high and average ratings are less trusted, the incentive to read individual reviews increases due to the principle of sufficiency.

Showing the distribution of ratings is necessary because the average rating alone is not sufficient. However, a U-shaped distribution can occur for two different reasons:

  • Diverging opinions on the same criterion. There’s a controversy.
  • The product is good in one aspect but poor in another, and most people agree on it.

Users cannot discern these nuances without delving into the comments, a task they typically reserve for the Confirmation stage due to the abundance of options and the time required to read reviews.
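One way to tell the two causes apart is sketched below, under the assumption that per-category ratings are available (the category names and the variance threshold are made up for illustration): if variance is low within every category, people agree on each aspect and the split is between aspects; if variance is high within a single category, there’s a genuine controversy on that criterion.

```python
from statistics import mean, pvariance

# Hypothetical per-category ratings, one dict per review; the category
# names and the 1.0 variance threshold are illustrative assumptions.
reviews = [
    {"cleanliness": 5, "noise": 1},
    {"cleanliness": 5, "noise": 2},
    {"cleanliness": 4, "noise": 1},
    {"cleanliness": 5, "noise": 2},
]

def diagnose(reviews, threshold=1.0):
    """Label each category as consensus or controversy from its rating variance."""
    for category in reviews[0]:
        ratings = [review[category] for review in reviews]
        variance = pvariance(ratings)
        label = "controversy" if variance > threshold else "consensus"
        print(f"{category}: mean={mean(ratings):.1f}, "
              f"variance={variance:.2f} -> {label}")

diagnose(reviews)
# cleanliness: mean=4.8, variance=0.19 -> consensus
# noise: mean=1.5, variance=0.25 -> consensus
```

Here both categories show consensus, so the polarized overall ratings come from a listing that is clean but noisy, i.e. the second cause; a high variance within a single category would instead point to the first.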

💡
Exploration
  • “X people have a converging opinion.” To improve clarity, we should categorize reviews and indicate how many reviewers share similar or differing opinions. Building on the design proposed in “All reviews don’t count the same,” here’s an example for an Airbnb review from a guest complaining about noise:
  • A design proposition for Airbnb.
  • Replace the average rating with a sentiment overview by category. This idea keeps resurfacing because the need for a sentiment distribution indicates that the average rating is often too vague. The second reason for high variance, where “The product is good in one aspect but poor in another, and most people agree on it,” clearly calls for a category-based review system. This would allow readers to understand key details at a glance.
  • Disclose the number of upvotes and downvotes per category instead of an average rating. Alongside the above suggestion, providing the number of upvotes and downvotes for each category would offer direct insight into the variance and indicate whether there’s a controversy on a given criterion (see the sketch after this list). Readers could then decide to explore further if a particular criterion is important to them.
  • A design proposition for Airbnb, replacing the overall rating with upvotes & downvotes per category.
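As a rough sketch of how the last two ideas could be backed on the data side (the vote format, the category names, and the one-third controversy rule are all assumptions for illustration), simple per-category tallies are enough to surface both the up/down counts and a controversy flag:

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class CategoryTally:
    upvotes: int = 0
    downvotes: int = 0

    @property
    def controversial(self) -> bool:
        # Illustrative rule: flag the category when the minority side
        # holds more than a third of the votes.
        total = self.upvotes + self.downvotes
        return total > 0 and min(self.upvotes, self.downvotes) / total > 1 / 3

def tally(votes):
    """Aggregate (category, is_upvote) pairs into per-category tallies."""
    tallies = defaultdict(CategoryTally)
    for category, is_upvote in votes:
        if is_upvote:
            tallies[category].upvotes += 1
        else:
            tallies[category].downvotes += 1
    return tallies

# Example: guests voting on an Airbnb listing's criteria.
votes = ([("cleanliness", True)] * 9 + [("cleanliness", False)]
         + [("noise", True)] * 5 + [("noise", False)] * 4)

for category, t in tally(votes).items():
    flag = " (controversial)" if t.controversial else ""
    print(f"{category}: {t.upvotes} up / {t.downvotes} down{flag}")

# cleanliness: 9 up / 1 down
# noise: 5 up / 4 down (controversial)
```

A reader who cares about noise would see at a glance that opinions split on that criterion, and could dig into those reviews specifically.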

¹ Soyeon Lee, Saerom Lee, and Hyunmi Baek, “Does the dispersion of online review ratings affect review helpfulness?”, 2021.

➡️ Next up: Suggestive opinions & choosing out of spite