TransWikia.com

Data simulation using make_classification in Python

Data Science Asked by Marni on July 3, 2021

I have a question about data simulation in Python. I work on the classification of imbalanced data and want to test how effective different methods are on simulated data. I have seen in various articles and books that the make_classification function is used to generate such data. As far as I understand, the features are then drawn from a normal distribution, so they are continuous rather than discrete. Is such data suitable for classification research with SVMs and decision trees?

One Answer

There is no obstacle to doing this. For example, you can create data with make_classification and compare different algorithms by building models on it. You can also pass a random_state value to obtain the same data each time you call the function. Both SVMs and decision trees can work with continuous data.
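As a rough illustration, here is a minimal sketch using scikit-learn. The sample size, feature counts, the 90/10 class split passed through weights, the class_weight="balanced" setting, and the choice of balanced accuracy as the metric are all assumptions made for the example, not something taken from the question.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Simulate an imbalanced binary problem (assumed split: roughly 90% / 10%).
# random_state fixes the generator, so every call returns the same data.
X, y = make_classification(
    n_samples=2000,
    n_features=10,
    n_informative=5,
    n_redundant=2,
    weights=[0.9, 0.1],
    random_state=42,
)

# A stratified split keeps the class ratio similar in train and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

# Two classifiers that both accept continuous features.
models = {
    "SVM": SVC(class_weight="balanced", random_state=42),
    "Decision Tree": DecisionTreeClassifier(class_weight="balanced", random_state=42),
}

# Fit each model and score it with a metric that accounts for class imbalance.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = balanced_accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: balanced accuracy = {scores[name]:.3f}")
```

Plain accuracy can look good on imbalanced data even when the minority class is mostly missed, which is why the sketch reports balanced accuracy instead. Other metrics, such as F1 or precision and recall on the minority class, could be swapped in.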

Answered by tkarahan on July 3, 2021
