Satisfaction vs. Performance: understand the difference

This page is part of a global project to create a better online reviews system. If you want to know more or give your feedback, write to [email protected] and we’ll grab a beer ;)

Like any metric, the satisfaction rate (often reflected as an average rating) offers only a partial view, simplifying complex data and overlooking other critical factors. Consider the Gross Domestic Product (GDP): while it measures a country's economic power, it doesn't account for all aspects [1].

As discussed in “Expectations, subjectivity, standards & risks,” a user's rating is based on personal expectations, which may differ from those of other readers. Even when expectations align, other customers' satisfaction does not guarantee that a product or service will meet a given reader's specific needs in the long term.

Here’s an example from a delivery service company: Managers were receiving numerous complaints about late deliveries, even though they were meeting their standards. This issue persisted for months, prompting them to pressure their team to reduce delays, creating an unhealthy work environment. Eventually, they discovered that users were seeing incorrect delivery times on the app, leading to false expectations. “Satisfaction with delivery time,” based on these expectations, is not the same as the actual “delivery time.”

Potential customers consider the satisfaction rate, but they understand it might be subjective and somewhat manipulated. Thus, they naturally weigh other factors when choosing a product or service, such as price, availability, features, etc.

The risk lies in overestimating the importance of the satisfaction rate in user choices. Research shows that while it does drive sales, platforms should avoid overemphasizing it, as this could overshadow other crucial information and reduce overall performance.


It's natural to seek a single, comprehensive indicator, but the more restricted the metric, the more precise it is. Conversely, the broader the metric, the blurrier it becomes. The 5-star rating is no exception. By aggregating diverse opinions with varying criteria and contexts, its relevance diminishes. However, having too many indicators can also confuse users. A balance must be found.

Clarity about what an indicator reveals is crucial: each indicator should come with a clear statement of what it measures. A statement such as “The 5-star rating represents the average satisfaction of customers” doesn't feel accurate, because it might not reflect the real information users are providing (see “The question asked matters”) and because not all customers are represented (see “Unrepresentative set of customers”).

Some businesses use multiple indicators to display overall satisfaction:

Movie-rating websites like IMDb show two ratings: one from professional critics and one from individuals. This provides more information but can still be unclear about each point of view. I myself tend to think professional critics might focus on aspects like decor, purpose, and staging, while spectators might care more about the storyline, acting, and scenery. But is that even true? Movie reviews are an interesting category, because they are inherently subjective; each critic and spectator has their own tastes, resulting in high variance. Thus, the average rating becomes less meaningful.
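
To make the point about variance concrete, here is a minimal sketch in Python. The two lists of ratings are invented for illustration and do not come from any real website. They share the same average, yet one describes a consensus and the other a split audience, which is exactly what the average alone hides.

```python
from statistics import mean, pstdev

# Hypothetical ratings on a 1-5 scale (made up for this example).
consensual_movie = [3, 3, 3, 3, 3, 3]   # everyone finds it "okay"
divisive_movie = [1, 1, 1, 5, 5, 5]     # people either hate it or love it


def summarize(ratings):
    """Return the average rating and its spread (population standard deviation)."""
    return {"average": mean(ratings), "spread": pstdev(ratings)}


consensual = summarize(consensual_movie)
divisive = summarize(divisive_movie)

# Both movies have the same average, but very different spreads.
print(consensual)
print(divisive)
```

Showing a measure of spread next to the average is only one possible way to expose this; a histogram of ratings would carry the same information.
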
💡 Exploration
  • Provide satisfaction rates for several categories. Categories and criteria have been frequently mentioned in other sections, especially “Categorization and subjectivity.” Airbnb implements this but faces issues like the lack of reference for the average rating in each category, making comparisons difficult. The overall rating still dominates.
    [Figure: Airbnb’s average ratings per category.]

    We could imagine removing the overall rating and keeping only the share of positive sentiment per category:

    [Figure: A simple design proposition.]

    There are additional issues with this design:

    1. It’s more information for the reader to process. Three indicators might already be the maximum. On Airbnb, we could imagine merging “accuracy” and “location,” and “check-in” and “communication.”
    2. It’s harder to compare listings. These indicators should be available right from the list of options.
    3. If only one indicator is low, the overall consideration may decrease (see “Suggestive opinions”).
    4. Most importantly, maintaining an average rating at the category level faces the same challenges as at the global level. To name a few: unclear scale, lack of nuances, threshold of consideration, and volatility.
  • Labels are another way to summarize information, as proposed for other issues.
  • Airbnb could create labels, as strong and valuable as the “Superhost” badge.
  • Show other indicators. Businesses should be able to share additional indicators on public platforms to provide a fair overview of the user experience, allowing customers to rely on other parameters besides the satisfaction rate.
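
The “share of positive sentiment per category” idea explored above can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the sample reviews, the category names, and especially the rule that 4 and 5 stars count as “positive.” It is not how Airbnb or any other platform computes its indicators.

```python
from collections import defaultdict

# Hypothetical reviews: each one maps a category to a 1-5 rating.
# A reviewer may skip a category (see the last review).
reviews = [
    {"cleanliness": 5, "location": 4, "check-in": 2},
    {"cleanliness": 4, "location": 3, "check-in": 5},
    {"cleanliness": 5, "location": 5, "check-in": 4},
    {"cleanliness": 2, "location": 4},
]

POSITIVE_THRESHOLD = 4  # assumption: 4 and 5 stars count as positive


def positive_share_by_category(reviews, threshold=POSITIVE_THRESHOLD):
    """Return, for each category, the fraction of ratings at or above the threshold."""
    positive = defaultdict(int)
    total = defaultdict(int)
    for review in reviews:
        for category, rating in review.items():
            total[category] += 1
            if rating >= threshold:
                positive[category] += 1
    return {category: positive[category] / total[category] for category in total}


shares = positive_share_by_category(reviews)
for category, share in shares.items():
    print(f"{category}: {share:.0%} positive")
```

Note that the choice of threshold is itself a design decision with the same “threshold of consideration” problem mentioned above: moving it from 4 to 5 would change every share.
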

[1] “Tous Notés”, Pierre Bentata, 2023.
