TransWikia.com

Rankings: How to deal with low number of reviews for user generated content website?

Community Building Asked by anvd on September 3, 2021

I have a community based website and I am receiving some complaints about the way I make the rankings of content.

The focus of the platform is user-generated reviews. The problem is that I use an arithmetic average, so a product with one 5-star review ranks higher than a product with 100 4-star reviews.

This is a problem because the top of the ranking is filled with products that have only one or two reviews. The easy solution is to exclude products with fewer than X reviews, but I am not sure how other platforms deal with this scenario. Any ideas?

4 Answers

I recently watched a video by 3Blue1Brown in which he explained a really easy method: add a 1-star review and then calculate the average. For products with many reviews, this won't change the average much, but for products with very few reviews, it will decrease the average significantly.
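A minimal Python sketch of this idea, applied to the example from the question. The function name `padded_average` and the `phantom_rating` parameter are made up for illustration; the only thing taken from the answer is the rule "add one 1-star review, then average".

```python
def padded_average(star_ratings, phantom_rating=1):
    """Average star rating after adding one phantom review.

    phantom_rating defaults to 1 star, following the answer above.
    Both names are hypothetical and chosen only for this sketch.
    """
    return (sum(star_ratings) + phantom_rating) / (len(star_ratings) + 1)


# The scenario from the question: one 5-star review vs. 100 4-star reviews.
single_five = padded_average([5])        # (5 + 1) / 2
many_fours = padded_average([4] * 100)   # (400 + 1) / 101

# With the phantom review, the well-reviewed product ranks higher.
print(single_five, many_fours, many_fours > single_five)
```

A product with no reviews at all would get the phantom rating itself as its score, which may or may not be what you want for brand-new items.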

Answered by Joy Jin on September 3, 2021

If it's media sharing, have people share their favorites of free and non free media. They can't complain and they can communicate through it.

To make it smoother you could have unlinked folders for said free media, then have a link to where they would get that material for free so you don't have to gum up with a search bar, do the same for the paid version because they're both able to get the same material.

Then maybe you could have them post what their ranking is personally in their own folders, with just the names of the files correlating to the form of entertainment, then you could just take the whole and the number one could be determined? That would be some kind of ranking system as well.

Answered by Hephestaclyse on September 3, 2021

Developing an algorithm for this will depend on understanding what the various results of a 5-star rating mean. People are not consistently rational, and their use of rating systems reflects this. People tend to 5-star rate things as follows ("product" can mean a material thing, site content, or whatever):

One Star: I'm mad. Either the product is awful, or the person offended me, or I'm in a really bad mood. In any case, I need to vent.

Two Stars: The product is bad. Possibly with some minimal redeeming quality. Or, I understand that people value the content of a 2-star review more than that of a 1-star review, and I want people to read my review. But this thing isn't good.

Three Stars: It had some value. Something about it was lame. I'm not angry. I would buy (or read) something else next time if I have the option, but I might buy (or read) something like this again if I have no other options.

Four Stars: It's good. Or, It's great, but I reserve 5 stars for things that are perfect. Or it's great, but had one problem that detracted a bit from it. Or it just worked fairly well. Either way, I am basically happy.

Five Stars: It's great. Or, it's good and this is my default if you don't anger me. Or it did everything it was supposed to so I can't justify taking a star away.

What you'll notice looking at these is that there is little to no meaningful difference between 1 and 2 stars, or between 4 and 5 stars. In fact, 2-star reviews often indicate a higher likelihood of a bad product than 1-star reviews. On Amazon, for example, you will often find 1-star reviews written as "Box smashed by UPS and couldn't use it." They are often triggered by extraneous anger rather than the product itself. For content, this might manifest as "Good article, but I hate this guy."

And 5-star reviews are often the lazy default. A 5-star review might mean "great product", or it might mean "I plugged it in and it worked, and I don't think about things much." Or even "Hmm... A little weak, but my grandson wrote it." A 4-star review, by contrast, reliably means "I really considered this and found it to be good." So while in most statistical situations a reduction in granularity detracts from the quality of the result, in this case it improves it. I suggest treating each review as tri-state: 1-2 stars means negative, 3 stars means nothing, and 4-5 stars means positive.

Having gotten to this point, I find the ratio of positive to negative reviews to be most predictive of quality. The number of reviews needed to achieve full confidence is often around 20, though in cases where there is no financial incentive to cheat, this could be as low as 5.

The following generic code produces a number between -100 and +100, with a new product (no reviews) starting at 0. The maximum deviation from zero is limited based on the number of reviews.

int fullConfidence = 20;       // Review count for full confidence
float confidenceFactor = 1.0f;

// This is a flat progression. Optionally scale the confidence factor on a curve.
if (reviewCount < fullConfidence) {
    // Divide as floats: the integer expression 1 / fullConfidence would truncate to 0.
    confidenceFactor = (float)reviewCount / fullConfidence;
    // Also optionally require a minimum number of reviews before any impact:
    // confidenceFactor = (float)(reviewCount - min) / (fullConfidence - min);
}

// Get percent of positive and percent of negative reviews here.
float qualityRating = (percentPositive - percentNegative) * confidenceFactor;
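For anyone who wants something to run directly, here is a rough Python sketch of the same approach, including the tri-state bucketing described earlier. The function name `quality_rating` and its parameters are assumptions made for this example, and it uses the flat progression only (no minimum review count).

```python
def quality_rating(star_ratings, full_confidence=20):
    """Score between -100 and +100; a product with no reviews scores 0.

    star_ratings is a list of integers from 1 to 5. The names here are
    hypothetical; the logic follows the tri-state idea described above.
    """
    review_count = len(star_ratings)
    if review_count == 0:
        return 0.0

    # Tri-state bucketing: 1-2 stars negative, 3 stars ignored, 4-5 positive.
    positive = sum(1 for stars in star_ratings if stars >= 4)
    negative = sum(1 for stars in star_ratings if stars <= 2)
    percent_positive = 100.0 * positive / review_count
    percent_negative = 100.0 * negative / review_count

    # Flat progression: confidence grows linearly until full_confidence reviews.
    confidence_factor = min(1.0, review_count / full_confidence)

    return (percent_positive - percent_negative) * confidence_factor


# One 5-star review is heavily discounted; 100 4-star reviews get full weight.
print(quality_rating([5]), quality_rating([4] * 100))
```

Note that 3-star reviews still count toward `review_count`, so they raise confidence without moving the score in either direction. Whether that is desirable is a design choice.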

Good luck.

Answered by HumanJHawkins on September 3, 2021

The key concept here is that more votes should increase your confidence in the ranking. There are a lot of rather complicated formulas you can use, but the general principle was explained by HeapUnderflow on forum.codidact.org:

I’ll summarize what you would have seen. The Wilson score is a prediction of how likely a vote is to be an upvote, based on the votes so far, so it comes out as a number between 0 and 1, exclusive. The ratio of upvotes to downvotes is the most important factor, but it will give a slightly higher score for 20:10 than for 10:5 because more data means more confidence.

Consensus at the Codidact forum was that a simple formula for up/down-voting that gives more confidence for more votes is

(upvotes + 2) / (upvotes + downvotes + 4)
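As a hedged illustration, the formula above can be sketched in Python next to the lower bound of the Wilson score interval, which is another commonly used variant. The function names and the default `z = 1.96` (roughly 95% confidence) are assumptions for this example, not something taken from the Codidact discussion. The simple formula is what you get from the center of the Wilson interval when z is set to 2.

```python
import math


def simple_score(upvotes, downvotes):
    """The simple formula quoted above: (upvotes + 2) / (upvotes + downvotes + 4)."""
    return (upvotes + 2) / (upvotes + downvotes + 4)


def wilson_center(upvotes, downvotes, z=2.0):
    """Center of the Wilson score interval; equals simple_score when z = 2."""
    total = upvotes + downvotes
    return (upvotes + z * z / 2) / (total + z * z)


def wilson_lower_bound(upvotes, downvotes, z=1.96):
    """Lower bound of the Wilson score interval (z = 1.96 is an assumed default)."""
    total = upvotes + downvotes
    if total == 0:
        return 0.0
    p_hat = upvotes / total
    z2 = z * z
    margin = z * math.sqrt(p_hat * (1 - p_hat) / total + z2 / (4 * total * total))
    return (p_hat + z2 / (2 * total) - margin) / (1 + z2 / total)


# Same 2:1 ratio, but the item with more votes scores a little higher.
print(simple_score(20, 10), simple_score(10, 5))
print(wilson_lower_bound(20, 10), wilson_lower_bound(10, 5))
```

With no votes at all, `simple_score` returns 0.5, so unrated items start in the middle rather than at either extreme.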

But the Wilson score is only designed for up/down rating. I've found a couple of questions on the SE network concerning star ratings, and you might be able to search for more. Sorry that I can't give you more specific advice.

Answered by curiousdannii on September 3, 2021
