TransWikia.com

Imbalanced classification or Regression? What is the best approach to my A/B testing related problem?

Data Science Asked by Pradical2190 on August 20, 2020

The context of the problem is A/B testing of two new versions of a game. I have a structured dataset (50000 rows x 22 columns) from the game designers covering the two versions of the game, A and B. Each row represents an individual player, and there are several categorical features (c*) and continuous features (n*). One of the important columns is the monetization (in-game purchases) obtained from each player. The end goal is to recommend version A or B to each player so that monetization is maximized.

raw_df_columns: 'id', 'c1', 'c2', 'c3', 'c4', 'c5', 'c6', 'game_version', 'n1', 'n2',
'n3', 'n4', 'n5', 'n6', 'n7', 'n8', 'n9', 'n10', 'n11', 'n12', 'n13',
'monetization'

Issues:

The most important problem is that the version of the game (A or B) in the ‘game_version’ column is not the end label. It is simply the version that was issued to the person. I do not have the required labels directly to model it as a classification problem.

Therefore, my initial approach was to treat it as a regression problem. I ran an exploratory data analysis (EDA), eliminated some non-essential features, and tried to predict ‘monetization’ from the remaining features (except ‘id’, which shows no correlation with it). The prediction accuracy is very good, but the feature ‘game_version’ is more or less ignored in the prediction. With some ML models it enters only through a very simplistic relationship, which I believe is wrong.

Finally, in most cases A is better than B (its monetization exceeds the best of B); only in a small minority (~190 cases out of 50000) is B better than A. This imbalance also makes it very difficult for the ML algorithms to zero in on such cases.
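The post does not say how the ~190 cases were identified. As a rough sketch of one way to look for segments where B out-monetizes A, the snippet below compares mean monetization per version within each level of a categorical feature. The rows, the level names `x`, `y`, `z`, and the choice of `c1` as the grouping column are all made up for illustration.

```python
from collections import defaultdict
from statistics import mean

# Toy rows: (c1 level, game_version, monetization). Values are invented.
rows = [
    ("x", "A", 10.0), ("x", "A", 12.0), ("x", "B", 6.0), ("x", "B", 7.0),
    ("y", "A", 9.0),  ("y", "A", 11.0), ("y", "B", 5.0), ("y", "B", 4.0),
    ("z", "A", 3.0),  ("z", "A", 2.0),  ("z", "B", 8.0), ("z", "B", 9.0),
]

# Collect monetization values per (segment, version) pair.
groups = defaultdict(list)
for level, version, money in rows:
    groups[(level, version)].append(money)

# Mean monetization under B minus mean under A, per segment.
levels = sorted({level for level, _, _ in rows})
uplift = {
    lvl: mean(groups[(lvl, "B")]) - mean(groups[(lvl, "A")])
    for lvl in levels
}

# Segments where B looks better than A on average.
b_wins = [lvl for lvl, diff in uplift.items() if diff > 0]
print(uplift, b_wins)
```

A comparison like this only says something about groups of players, since each individual player was issued a single version.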

Ask:

What is the best approach to take for this problem? How do I obtain the recommended version of the game for a given player?

I can also provide additional details or clarifications if needed. Thanks in advance!
