Box-Cox data transformation to enable linear regression

Question

I am performing multiple linear regression to predict a score (the dependent variable) from several categorical variables. My dependent variable has a skewed distribution with a large number of zero values but no negative values.
Can I use a Box-Cox transformation in this scenario?
I tried to run it in R, but got this error message:
"Error in boxcox.default(linreg1) : response variable must be positive"

Vivek · Answer

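The standard (one-parameter) Box-Cox transformation requires strictly positive values, which is why you get that error. It can still be used with zeros if the data are shifted first, as in the two-parameter version. I hope you are using boxcoxfit() in the geoR package, which supports this shift through its lambda2 argument.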
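As a rough illustration of how a shift lets Box-Cox accept zeros, here is a minimal sketch of the two-parameter form in plain Python. The function name shifted_boxcox, the shift of 1.0, the exponent 0.5, and the sample scores are all made up for illustration; this is not the geoR implementation, and in practice the parameters would be estimated from the data.

```python
import math

def shifted_boxcox(y, lam, shift=1.0):
    """Two-parameter Box-Cox: apply the power transform to (y + shift).

    Hypothetical helper for illustration only; lam and shift are chosen
    by hand here rather than estimated by maximum likelihood.
    """
    out = []
    for v in y:
        z = v + shift
        if z <= 0:
            # Box-Cox is only defined for strictly positive inputs
            raise ValueError("y + shift must be strictly positive")
        if lam == 0:
            out.append(math.log(z))           # limiting case as lam -> 0
        else:
            out.append((z ** lam - 1.0) / lam)
    return out

# Made-up, zero-heavy scores standing in for the dependent variable
scores = [0, 0, 0, 1, 2, 5, 12, 40]
transformed = shifted_boxcox(scores, lam=0.5, shift=1.0)
print(transformed)
```

With shift=0 the same call fails on the zeros, which mirrors the "response variable must be positive" error in the question.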
However, you can also address the skewness with other transformations, such as:

Square root transformation. It is defined at zero, but it is often not strong enough to deal with high levels of skewness.
log(x+1) transformation. It is also defined at zero and is a widely accepted way of transforming skewed features.
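To see how the two options compare, the sketch below computes a simple moment-based skewness before and after each transformation. The scores list is invented for illustration and the skewness helper is a hand-written population estimate, so treat the numbers as indicative only.

```python
import math

def skewness(xs):
    """Population skewness: third central moment divided by the
    second central moment raised to the power 1.5."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5

# Made-up, right-skewed scores with many zeros
scores = [0, 0, 0, 0, 1, 2, 3, 5, 8, 20, 60]

sqrt_scores = [math.sqrt(s) for s in scores]   # square root transformation
log_scores = [math.log1p(s) for s in scores]   # log(x + 1), defined at zero

for name, values in [("raw", scores), ("sqrt", sqrt_scores), ("log(x+1)", log_scores)]:
    print(f"{name:>9}: skewness = {skewness(values):.2f}")
```

On this sample the square root reduces the skewness and log(x+1) reduces it further, but neither removes the spike of zeros, which simply maps to a spike at zero on the transformed scale.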

Also, I don't understand why you are transforming the dependent variable. I agree with @dave about the normality assumption in regression. Linear regression assumes normally distributed errors, not a normally distributed dependent variable.
