
Feature engineering: Measuring proportional changes across two items given two conditions

Cross Validated Asked by Comte on February 13, 2021

I was given an equation for measuring changes in a dependent measure (e.g. sales) of two items under two separate conditions (on promotion, i.e. a price drop, or not), and I am trying to work out whether it is effective as an engineered feature.

Concrete example: say I have two soft drink products, Pepsi and Cola, and measure their sales throughout the year. Each product is promoted independently of the other. Via imputation and modelling, we create sales under both promotional conditions for each product. This is a time series, but I don't think that matters for the equation. The result is the table below.

structure(list(
  day = structure(c(1546300800, 1546387200, 1546473600, 1546560000, 1546646400,
                    1546732800, 1546819200, 1546905600, 1546992000, 1547078400),
                  tzone = "UTC", class = c("POSIXct", "POSIXt")),
  PEPSI_BASE = c(16, 68, 92, 73, 60, 58, 77, 60, 56, 46),
  COLA_BASE = c(69, 602, 744, 636, 571, 433, 477, 418, 113, 99),
  PEPSI_PROMOTED = c(245, 9, 16, 114, 8, 18, 138, 138, 471, 459),
  COLA_PROMOTED = c(204, 461, 536, 484, 394, 337, 350, 293, 689, 743)),
  row.names = c(NA, -10L), class = c("tbl_df", "tbl", "data.frame"))

To calculate the interaction of promotions between the two products and conditions, I have been given the following formula.

mag_of_effect = [(Apr - Anp) / (Apr + Anp)] / [(Bpr - Bnp) / (Bpr + Bnp)]

A and B refer to products, e.g. A = Cola, B = Pepsi.
np = sales when not promoted (base)
pr = sales when item is promoted
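
To make the formula concrete, here is a minimal sketch in R that applies it to the table above. It assumes the formula is meant as the ratio of two normalised lifts, i.e. A's (pr - np)/(pr + np) divided by B's; that grouping is my reading, not something stated with the formula. The helper names `norm_lift` and `mag_of_effect` and the new column names are hypothetical.

```r
# Sales columns from the table in the question (day column omitted for brevity).
sales <- data.frame(
  PEPSI_BASE     = c(16, 68, 92, 73, 60, 58, 77, 60, 56, 46),
  COLA_BASE      = c(69, 602, 744, 636, 571, 433, 477, 418, 113, 99),
  PEPSI_PROMOTED = c(245, 9, 16, 114, 8, 18, 138, 138, 471, 459),
  COLA_PROMOTED  = c(204, 461, 536, 484, 394, 337, 350, 293, 689, 743)
)

# Normalised promotional lift: (pr - np) / (pr + np).
# For positive sales this lies strictly between -1 and 1.
norm_lift <- function(pr, np) (pr - np) / (pr + np)

# ASSUMPTION: the formula is the ratio of A's lift to B's lift.
mag_of_effect <- function(a_pr, a_np, b_pr, b_np) {
  norm_lift(a_pr, a_np) / norm_lift(b_pr, b_np)
}

sales$lift_cola  <- norm_lift(sales$COLA_PROMOTED, sales$COLA_BASE)
sales$lift_pepsi <- norm_lift(sales$PEPSI_PROMOTED, sales$PEPSI_BASE)

# A = Cola, B = Pepsi, and the reversed version with A = Pepsi, B = Cola.
sales$cola_over_pepsi <- mag_of_effect(sales$COLA_PROMOTED, sales$COLA_BASE,
                                       sales$PEPSI_PROMOTED, sales$PEPSI_BASE)
sales$pepsi_over_cola <- mag_of_effect(sales$PEPSI_PROMOTED, sales$PEPSI_BASE,
                                       sales$COLA_PROMOTED, sales$COLA_BASE)

print(sales[, c("lift_cola", "lift_pepsi", "cola_over_pepsi", "pepsi_over_cola")])
```

Under this reading, swapping A and B simply gives the reciprocal of the original value, so the reversed version carries the same information. The ratio is also undefined (division by zero) on any day where B's promoted and base sales are equal, which does not happen in these ten rows but could in a full year of data.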

My first question: will this formula effectively tell me how sales of A are influenced by the promotional condition of B, and vice versa? Or do I need to reverse the formula to see the opposing effect, B over A?

My second question: what is the best way to evaluate the effectiveness of this metric? I was thinking of comparing forecasting models that use a simple neural net (e.g. a Jordan or Elman network) with and without the metric. Is there a more direct way of measuring how strongly the resulting metric (magnitude of effect) influences time series forecasts, e.g. with tidymodels, fable, etc.? (I'm predominantly an R user right now.)
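
To illustrate the with/without comparison I have in mind, here is a rough base R sketch. It swaps the neural net for a plain linear model (`lm`) purely to keep the example short. It also makes several assumptions that are not part of the setup above: the forecast target is taken to be `COLA_PROMOTED`, the metric is read as the ratio of normalised lifts, and predictors are lagged by one day so that day t is forecast only from day t-1. The names `norm_lift`, `metric`, `y_lag`, `metric_lag` and `rmse` are hypothetical.

```r
# Sales columns from the table in the question.
sales <- data.frame(
  PEPSI_BASE     = c(16, 68, 92, 73, 60, 58, 77, 60, 56, 46),
  COLA_BASE      = c(69, 602, 744, 636, 571, 433, 477, 418, 113, 99),
  PEPSI_PROMOTED = c(245, 9, 16, 114, 8, 18, 138, 138, 471, 459),
  COLA_PROMOTED  = c(204, 461, 536, 484, 394, 337, 350, 293, 689, 743)
)

# ASSUMPTION: metric = Cola's normalised lift divided by Pepsi's.
norm_lift <- function(pr, np) (pr - np) / (pr + np)
sales$metric <- norm_lift(sales$COLA_PROMOTED, sales$COLA_BASE) /
  norm_lift(sales$PEPSI_PROMOTED, sales$PEPSI_BASE)

# Lag the predictors by one day so the feature is known before the target day.
n <- nrow(sales)
d <- data.frame(
  y          = sales$COLA_PROMOTED[2:n],        # ASSUMED forecast target
  y_lag      = sales$COLA_PROMOTED[1:(n - 1)],
  metric_lag = sales$metric[1:(n - 1)]
)

# Time-ordered split (no shuffling): first 6 rows train, last 3 rows test.
train <- d[1:6, ]
test  <- d[7:9, ]

fit_without <- lm(y ~ y_lag, data = train)
fit_with    <- lm(y ~ y_lag + metric_lag, data = train)

# Out-of-sample root mean squared error on the held-out days.
rmse <- function(fit, newdata) {
  sqrt(mean((newdata$y - predict(fit, newdata = newdata))^2))
}

rmse_without <- rmse(fit_without, test)
rmse_with    <- rmse(fit_with, test)
print(c(without = rmse_without, with = rmse_with))
```

Ten rows are far too few to conclude anything from the two error values; the point is only the shape of the comparison. On the full year, the same idea would be repeated over many rolling forecast origins rather than a single split.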

Thank you for your time
