# Combining categorical and continuous features for neural networks

Cross Validated Asked by 3michelin on August 5, 2020

Is it OK to combine categorical and continuous features into the same vector for training deep neural networks? Say there is a categorical feature and continuous feature that I want to feed into a deep neural net at the same time. Is this the way to do it?

categorical feature (one-hot encoded) = [0,0,0,1,0]
continuous feature (number) = 8
final feature vector passed into neural network = categorical feature vector CONCATENATE continuous feature = [0,0,0,1,0,8]


In short: is it OK to have a one-hot encoding and a continuous feature together in one feature vector?
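For concreteness, the concatenation described above can be sketched in NumPy. This is only an illustration: the helper `one_hot` and the variable names are hypothetical, and the category index 3 out of 5 categories is taken from the example vector in the question.

```python
import numpy as np

def one_hot(index, num_categories):
    """Hypothetical helper: vector of zeros with a 1 at position `index`."""
    vec = np.zeros(num_categories, dtype=np.float32)
    vec[index] = 1.0
    return vec

# Categorical feature: category 3 out of 5 (as in the question's example).
categorical = one_hot(3, 5)

# Continuous feature: a single number, wrapped in a length-1 array.
continuous = np.array([8.0], dtype=np.float32)

# Final feature vector: one-hot part followed by the continuous value.
features = np.concatenate([categorical, continuous])
```

Here `features` has length 6: five one-hot entries followed by the raw continuous value.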

Yes, that is one typical way of doing it. However, you should standardize your features so that gradient descent is not hampered by inputs on very different scales and so that regularization penalizes all weights comparably. One option is to standardize only the numerical features and then concatenate the one-hot vectors; the other is to concatenate first and standardize all columns together. As far as I can see, there is no consensus on which of the two is preferable.
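The two standardization options can be sketched as follows. This is a minimal illustration, not a recommendation for either variant: the toy batch, the `standardize` helper, and the small `eps` guard against division by zero are all assumptions made for the example.

```python
import numpy as np

# Hypothetical toy batch: 4 samples, a category index in {0, 1, 2},
# and one continuous value per sample.
cat_idx = np.array([0, 2, 1, 2])
cont = np.array([[8.0], [2.0], [5.0], [1.0]])

# One-hot encode by selecting rows of the identity matrix: shape (4, 3).
one_hot = np.eye(3)[cat_idx]

def standardize(x, eps=1e-8):
    """Column-wise z-score; eps (an assumption) avoids division by zero."""
    return (x - x.mean(axis=0)) / (x.std(axis=0) + eps)

# Option A: standardize only the continuous column, then concatenate.
x_a = np.concatenate([one_hot, standardize(cont)], axis=1)

# Option B: concatenate first, then standardize every column together,
# so the one-hot columns are rescaled as well.
x_b = standardize(np.concatenate([one_hot, cont], axis=1))
```

In both variants the continuous column ends up with zero mean and unit variance; they differ only in whether the one-hot columns stay as 0/1 or are also rescaled. In a real pipeline the mean and standard deviation would normally be computed on the training set only and reused for validation and test data.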

Correct answer by gunes on August 5, 2020

Yes, this is absolutely standard.

Answered by Sycorax on August 5, 2020
