TransWikia.com

Which values to assign to a quantified dummy variable

Economics Asked by twoRay on July 15, 2021

I am working on a data set from Kaggle (https://www.kaggle.com/spscientist/students-performance-in-exams) about how student performance relates to some explanatory variables, such as whether the school provides lunch, gender, etc.

I have changed all the variables to quantitative dummy variables, and the variable I am concerned about in this post is educ_par (education of the parents). It can take on 6 different values, which refer to:

  1. if both parents have a bachelor's degree

  2. if some went to college

  3. if both of them have a master's degree

  4. if both have an associate's degree

  5. if both have a high school diploma

  6. if some have a high school diploma

Since I only have dummy variables, my teacher suggested that I include at least one quantitative variable so that I can interpret the results better. I therefore want to change educ_par into the parents' years of education. However, I run into one main problem.

Since this is a fake dataset and the variables are not clearly defined, I do not know what values to assign. For example, take the value 5, meaning both parents have a high school degree, and 6, meaning only some have a high school degree. In case 6, I do not know what education the other parent has. Also, suppose both went to high school, which may mean a total of 12 + 12 = 24 years of education for both parents, whereas if some went to college, this may mean 15 + 0 = 15 years of education for both parents, since we do not know what the other parent did. Assigning values like this could imply that both parents having a high school diploma affects their children's grades more than only one parent having a master's degree.
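To make the arithmetic above concrete, here is a minimal sketch of the summing scheme. The years per parent are guesses for illustration only, and `ASSUMED_YEARS` and `total_years` are hypothetical names. The dataset itself does not say how many years each level takes or what the second parent completed.

```python
# Years of schooling per parent for two of the educ_par codes.
# ASSUMPTION: every number here is a guess for illustration; the dataset
# does not state years of schooling or what the other parent completed.
ASSUMED_YEARS = {
    5: (12, 12),  # both parents finished high school
    2: (15, 0),   # "some went to college": other parent unknown, set to 0
}


def total_years(code):
    """Sum the assumed years of schooling of both parents for one code."""
    parent_a, parent_b = ASSUMED_YEARS[code]
    return parent_a + parent_b


both_high_school = total_years(5)
some_college = total_years(2)
print(both_high_school, some_college)
```

Under these assumed numbers the "both high school" household gets the larger total, which is exactly the ordering problem described above. It comes from the 0 filled in for the unknown parent, not from anything in the data.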

So my question is: how should I quantify this variable so that I do not run into the problem described above? May I simply leave out the cases in which only some parents have a specific degree, to simplify it?

Also, how does including a quantitative variable instead of the dummy variable affect my regressions? Or would they just be the same?
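As a rough illustration of why the two specifications generally differ, the sketch below fits both to a toy sample. The scores are invented for this example and do not come from the Kaggle data, and `dummy_fit` and `linear_fit` are hypothetical helper names. A regression with one dummy per category predicts each category's own mean, while a single numeric regressor forces the predictions onto a straight line in the code.

```python
from statistics import mean

# Hypothetical (educ_par code, score) pairs, invented purely for illustration.
data = [(1, 50), (1, 54), (2, 70), (2, 74), (3, 60), (3, 64)]


def dummy_fit(rows):
    """A regression on one dummy per category predicts each category's mean."""
    groups = {}
    for code, score in rows:
        groups.setdefault(code, []).append(score)
    return {code: mean(scores) for code, scores in groups.items()}


def linear_fit(rows):
    """OLS with the code as a single numeric regressor: score = a + b * code."""
    xs = [code for code, _ in rows]
    ys = [score for _, score in rows]
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    return a, b


group_means = dummy_fit(data)
intercept, slope = linear_fit(data)
for code in sorted(group_means):
    # Columns: code, dummy-model prediction, numeric-model prediction.
    print(code, group_means[code], intercept + slope * code)
```

In this toy sample the dummy specification reproduces the group means 52, 72 and 62, while the numeric specification predicts 57, 62 and 67. The two only coincide when the category means happen to lie on a line in the assigned numbers, so the results depend heavily on which numbers are assigned.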

One Answer

It's generally not good practice to go from categorical/dummy variables to quantitative variables, as csilvia noted, because categorical/dummy variables have less information than their quantitative analogs. None of the variables in your dataset are quantitative except for math, reading, writing, and total, which I'm assuming are outcomes of interest, not inputs.

If your teacher wants you to include quantitative variables, I'd go back to the original dataset and see if there's anything quantitative included there rather than trying to make a categorical variable fit as a quantitative one.
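One way to check for quantitative columns is to test which columns parse as numbers in every row. This is only a sketch: the sample below is made up and stands in for the real file, and the column names and the `numeric_columns` helper are assumptions, not part of the dataset's documentation.

```python
import csv
import io

# ASSUMPTION: a tiny made-up sample standing in for the real file;
# the column names and values are hypothetical.
SAMPLE = """gender,lunch,educ_par,math,reading
female,standard,some college,72,81
male,free/reduced,high school,65,58
"""


def is_number(text):
    """Return True if the text parses as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def numeric_columns(csv_text):
    """List the columns whose value parses as a number in every row."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return [name for name in rows[0] if all(is_number(r[name]) for r in rows)]


print(numeric_columns(SAMPLE))
```

A column that passes this check is not automatically quantitative, since numerically coded categories such as educ_par would pass too once recoded, so the check is best run on the original file rather than on the recoded one.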

Answered by Amaan M on July 15, 2021
