Using word embeddings with additional features

Question

I have a set of queries for a classification task using scikit-learn's GradientBoostingClassifier. I want to enrich the model by feeding it additional features along with the GloVe embeddings. How should I approach scaling in this case? GloVe is already well scaled, but the additional features are not.
I have tried StandardScaler, but this reduced performance compared with using GloVe alone. The problem may be with the feature itself, but I would like your opinion on scaling strategies when combining GloVe with a dummy variable.

Julio Jesus · Accepted Answer

My first comment is that tree-based models are not sensitive to feature scale: each split only compares a feature against a threshold, so a monotonic rescaling leaves the trees essentially unchanged. Scaling should therefore not affect the model's performance, so, as you suggest, the problem is most likely with the feature itself.
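To check the scale-insensitivity claim on your own data, you could train the same classifier on raw and standardized versions of the extra features and compare predictions. The sketch below is only an illustration: the random `glove_like` matrix stands in for real GloVe vectors, and the two extra features (one dummy variable, one large-valued numeric column) and the labels are made up.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n, dim = 300, 50

# Assumption: random values standing in for real GloVe query embeddings.
glove_like = rng.normal(scale=0.4, size=(n, dim))
# Assumption: two hypothetical extra features, a 0/1 dummy and an unscaled numeric one.
extra = np.column_stack([rng.integers(0, 2, n), rng.normal(1000, 250, n)])
# Synthetic labels that depend on both the embedding and the extra features.
y = (glove_like[:, 0] + extra[:, 0] + 0.002 * (extra[:, 1] - 1000) > 0.5).astype(int)

train, test = slice(0, 200), slice(200, n)

# Version 1: extra features left on their original scale.
X_raw = np.hstack([glove_like, extra])

# Version 2: extra features standardized (scaler fitted on the training rows only).
scaler = StandardScaler().fit(extra[train])
X_std = np.hstack([glove_like, scaler.transform(extra)])

raw_model = GradientBoostingClassifier(random_state=0).fit(X_raw[train], y[train])
std_model = GradientBoostingClassifier(random_state=0).fit(X_std[train], y[train])

# Fraction of held-out rows on which the two models predict the same class.
agreement = np.mean(raw_model.predict(X_raw[test]) == std_model.predict(X_std[test]))
print(f"prediction agreement: {agreement:.3f}")
```

If the agreement is close to 1 on your real data as well, the drop in performance is unlikely to come from the scaler and more likely from the added feature.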
If you still want to scale all your features, you could use MinMaxScaler with its range set to the minimum and maximum values of the GloVe vectors, so that all features are on the same scale.
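A minimal sketch of that idea, assuming the embeddings and the extra features are already available as NumPy arrays. Here `glove_like` is random data standing in for real GloVe vectors and `extra` holds two hypothetical features; `feature_range` is the actual MinMaxScaler parameter that sets the target interval.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(0)

# Assumption: random values standing in for real GloVe query embeddings.
glove_like = rng.normal(scale=0.4, size=(200, 50))
# Assumption: two hypothetical extra features, a 0/1 dummy and an unscaled numeric one.
extra = np.column_stack([rng.integers(0, 2, 200), rng.normal(1000, 250, 200)])

# Use the overall min and max of the embedding values as the target range.
lo, hi = float(glove_like.min()), float(glove_like.max())

# Each extra column is mapped linearly onto [lo, hi].
scaler = MinMaxScaler(feature_range=(lo, hi))
extra_scaled = scaler.fit_transform(extra)

# Final design matrix: embedding columns followed by the rescaled extras.
X = np.hstack([glove_like, extra_scaled])
print(X.shape)
```

In a real pipeline the scaler would be fitted on the training split only and then applied to validation and test data with `scaler.transform`.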
