When calculating the number of reviews needed to achieve a stable average rating, we use the following formula:

$$n = \frac{d}{\Delta} - 1$$

where $n$ is the number of existing reviews, $\Delta$ is the difference between the new average and the old average when a new rating is left, and $d$ is the difference between the newest rating $r$ and the old average $a$, with $d = |r - a|$.
We assume that an average is accurate when each new rating doesn’t change the average by more than 0.1.
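As a sketch of where such a formula can come from, here is a short derivation. The notation is an assumption made for this sketch: $n$ existing reviews with old average $a$, a new rating $r$, $d = |r - a|$, and $\Delta$ the resulting shift in the average.

$$a' = \frac{n a + r}{n + 1}, \qquad \Delta = |a' - a| = \frac{|r - a|}{n + 1} = \frac{d}{n + 1}$$

$$\Delta \le 0.1 \iff n \ge \frac{d}{0.1} - 1$$

The shift caused by one new rating shrinks in proportion to the number of reviews, so the required count depends only on how far the new rating can sit from the current average.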
With an average rating of 5/5 stars and a new rating of 1/5 (edge case, the largest possible difference of 4 stars), we get the following result: $\frac{4}{0.1} - 1 = 39$.
In this case, we would need at least 39 reviews for the average to be stable enough.
If we choose an average of 4/5 stars (industry average) and a new rating of 1/5 stars (a difference of 3 stars), the calculation gives $\frac{3}{0.1} - 1 = 29$ reviews.
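The two cases above can be checked with a few lines of Python. This is a minimal sketch, not the original calculation: the function name `reviews_needed` and its parameters are hypothetical, and it assumes the threshold applies to the shift caused by a single new rating. Exact fractions are used to avoid floating-point rounding in the division.

```python
from fractions import Fraction
import math


def reviews_needed(old_average, new_rating, max_shift):
    """Smallest number of existing reviews n such that one new rating
    moves the average by at most max_shift, i.e. n >= |r - a| / max_shift - 1.
    (Hypothetical helper, written for illustration.)"""
    a = Fraction(str(old_average))
    r = Fraction(str(new_rating))
    shift = Fraction(str(max_shift))
    if shift <= 0:
        raise ValueError("max_shift must be positive")
    d = abs(r - a)  # distance between the new rating and the old average
    return max(0, math.ceil(d / shift - 1))


print(reviews_needed(5, 1, 0.1))  # edge case: 5-star average, 1-star rating -> 39
print(reviews_needed(4, 1, 0.1))  # 4-star average, 1-star rating -> 29
```

Changing `max_shift` shows how quickly the requirement grows when a tighter threshold is chosen.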
I ran a test in a Google Sheet with random ratings between 1 and 5. In this test, the average appears to stabilize after 21 reviews.
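A similar experiment can be sketched in Python. This is not the original spreadsheet: it assumes uniformly random integer ratings from 1 to 5, and the function name `last_unstable_review` is hypothetical. It reports the last review that moved the running average by more than the threshold. The result depends on the random seed, so a single run is only indicative.

```python
import random


def last_unstable_review(ratings, max_shift=0.1):
    """Return the 1-based index of the last review that moved the running
    average by more than max_shift (0 if no review did)."""
    total = 0.0
    last = 0
    for i, rating in enumerate(ratings, start=1):
        new_total = total + rating
        if i > 1:
            old_avg = total / (i - 1)
            new_avg = new_total / i
            if abs(new_avg - old_avg) > max_shift:
                last = i
        total = new_total
    return last


rng = random.Random(0)  # fixed seed so the run is repeatable
ratings = [rng.randint(1, 5) for _ in range(200)]
print(last_unstable_review(ratings, 0.1))
```

Running it with several seeds gives a spread of values rather than a single number, which is a useful reminder that one spreadsheet run is a single sample.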
If we instead consider stability to occur when the average changes by less than 0.05 (because a score of 4.76 dropping by 0.05 would round from 4.8 to 4.7 on Google), stability occurs after 30 reviews.
To go further: in reality, ratings are not random and are generally consistent with the average rating. We would need probabilistic methods (e.g., Bayesian inference) to obtain a more accurate result.
The actual number of reviews needed is probably around 20.
An interesting piece of text to read: