# Comparing percentages based on a Likert scale by year

Cross Validated Asked by Chris Beeley on November 26, 2020

I’m reviewing an analysis someone else has done on some Likert scale data. They’ve assigned each point on the scale 1-5 (1 = bad, 2 = poor etc.), found the average score in each area, and then converted to a percentage (by multiplying by 20) to give a percentage of total score (100% being the best, 20% being the worst).

I’m okay with this, but then they’ve computed a significance test as if the percentages were actual proportions, as if they’d gone out and asked people “Do you own your own home? Yes/No”. They’ve used a method similar to the one described here:

https://www.dummies.com/education/math/statistics/how-to-compare-two-population-proportions/

I want to tell them that this is a completely invalid way of analysing the data, and they’ve ignored the variance in the scores by collapsing everything into a percentage. I feel they should use ordinary t-tests on the data to determine significant difference. But I’m doubting myself. Any thoughts appreciated.
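To make the concern concrete, here is a sketch of the two test statistics side by side. The notation is introduced here for illustration and is not taken from the analysis under review: $p_{ki} = x_{ki}/5$ is respondent $i$'s rescaled score in group (year) $k$, with mean $\bar p_k$, sample variance $s_k^2$ and sample size $n_k$. It also assumes the analyst used the usual pooled two-proportion statistic, with $\hat p = (n_1 \bar p_1 + n_2 \bar p_2)/(n_1 + n_2)$.

$$
z = \frac{\bar p_1 - \bar p_2}{\sqrt{\hat p\,(1-\hat p)\left(\dfrac{1}{n_1}+\dfrac{1}{n_2}\right)}}
\qquad\text{versus}\qquad
t = \frac{\bar p_1 - \bar p_2}{\sqrt{\dfrac{s_1^2}{n_1}+\dfrac{s_2^2}{n_2}}}
$$

The numerators are the same; only the standard error differs. The $z$ statistic plugs in the Bernoulli variance $p(1-p)$, which is appropriate only when each response is 0 or 1. The Welch $t$ statistic uses the spread actually observed in the scores. Because a rescaled score lies in $[0.2, 1]$, a score with mean $\mu < 1$ has variance at most $(1-\mu)(\mu-0.2)$, which is strictly less than $\mu(1-\mu)$, so the Bernoulli formula will generally overstate the variability of each group.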

The following R session should help. It runs an unpaired (Welch) t-test, with paired = FALSE, on the raw scores and then on two rescaled versions of them.

> #create random numbers between 1-5
> x = round(runif(10, 1, 5), 0)
> x
[1] 3 3 2 4 1 1 3 4 4 4
> y = round(runif(10, 1, 5), 0)
> y
[1] 2 2 3 4 5 2 1 3 5 4
>
> #Perform T-test
> t.test(x,y, paired = FALSE, conf.level = 0.95)

Welch Two Sample t-test

data:  x and y
t = -0.34757, df = 17.681, p-value = 0.7323
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-1.410481  1.010481
sample estimates:
mean of x mean of y
2.9       3.1

>
> #multiply by 20 to put the scores on a 20-100 scale [does not make them true percentages]
> x1 = x*20
> x1
[1] 60 60 40 80 20 20 60 80 80 80
> y1 = y*20
> y1
[1]  40  40  60  80 100  40  20  60 100  80
>
> #perform t-test on new data
> t.test(x1,y1, paired = FALSE, conf.level = 0.95)

Welch Two Sample t-test

data:  x1 and y1
t = -0.34757, df = 17.681, p-value = 0.7323
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-28.20962  20.20962
sample estimates:
mean of x mean of y
58        62

>
> #divide by 5 to express each score as a proportion of the maximum score
> x2 = x/5
> x2
[1] 0.6 0.6 0.4 0.8 0.2 0.2 0.6 0.8 0.8 0.8
> y2 = y/5
>
> #perform t-test on new data
> t.test(x2,y2, paired = FALSE, conf.level = 0.95)

Welch Two Sample t-test

data:  x2 and y2
t = -0.34757, df = 17.681, p-value = 0.7323
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-0.2820962  0.2020962
sample estimates:
mean of x mean of y
0.58      0.62


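Irrespective of the scale, you end up with the same conclusion: the t statistic, degrees of freedom and p-value are identical in all three runs, and only the means and the confidence interval are rescaled. So, technically, converting the scores to a percentage should not affect a t-test.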
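As a rough check on the session above, the sketch below re-enters the same ten scores per group and compares two things: whether the Welch t statistic changes under rescaling, and how the Welch standard error compares with the one a pooled two-proportion z-test would use if the mean scores were treated as yes/no proportions. The variable names (`t_raw`, `se_welch`, `se_prop` and so on) are made up for this example, and the pooled formula is an assumption about the method the question describes, not something confirmed by the source.

```r
# Scores copied from the console session above
x <- c(3, 3, 2, 4, 1, 1, 3, 4, 4, 4)
y <- c(2, 2, 3, 4, 5, 2, 1, 3, 5, 4)

# Welch t statistic under three linear rescalings of the same scores
# (t.test defaults to paired = FALSE and var.equal = FALSE)
t_raw  <- unname(t.test(x, y)$statistic)
t_x20  <- unname(t.test(x * 20, y * 20)$statistic)
t_div5 <- unname(t.test(x / 5, y / 5)$statistic)

# TRUE if rescaling leaves the t statistic unchanged
same_t <- isTRUE(all.equal(t_raw, t_x20)) && isTRUE(all.equal(t_raw, t_div5))

# Standard error the Welch test uses on the 0-1 scale
se_welch <- sqrt(var(x / 5) / length(x) + var(y / 5) / length(y))

# Standard error of a pooled two-proportion z-test, ASSUMING the mean
# scores are (incorrectly) treated as proportions of yes/no answers
p_pool  <- (sum(x / 5) + sum(y / 5)) / (length(x) + length(y))
se_prop <- sqrt(p_pool * (1 - p_pool) * (1 / length(x) + 1 / length(y)))

print(same_t)
print(c(se_welch = se_welch, se_prop = se_prop))
```

With these particular scores the proportion formula gives a standard error of roughly 0.22 against roughly 0.12 for the Welch test. Rescaling leaves the t-test untouched, but swapping in the proportion formula does change the result, so the choice of test matters even though the choice of scale does not.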

Answered by Not_Dave on November 26, 2020
