# Convert a pdf into a conditional pdf such that mean increases and std dev falls

Data Science Asked by claudius on October 30, 2020

Let the success metric (for a business use case I am working on) be a continuous random variable S.
The mean of the pdf of S indicates the chance of success: the higher the mean, the greater the chance of success. Let the standard deviation of the pdf of S indicate risk: the lower the standard deviation, the lower the risk of failure.

I have data, call it X, that affects S. Let X also be modelled as a collection of random variables.

P(S|X) changes with X.
The problem: I want to pick values of X such that P(S|X) has a higher mean and a lower standard deviation than P(S).
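
Written out for the one-dimensional case, the requirement can be stated as follows. This is only a restatement under an assumption: the selected set of X values is taken to be an interval $[a, b]$, and the symbols $a$ and $b$ are introduced here for illustration.

```latex
\text{find } a < b \text{ such that} \quad
\mathbb{E}[S \mid a \le X \le b] > \mathbb{E}[S]
\quad \text{and} \quad
\operatorname{sd}(S \mid a \le X \le b) < \operatorname{sd}(S)
```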

To illustrate, I have taken X to be one-dimensional.
Scatter plot of X (horizontal) against Y (vertical):

You can see that P(S|X) changes for different values of X, as shown in the plot below:

For X between 4500 and 10225, the mean of S is 3.889 and the std dev is 0.041, compared with a mean of 3.7 and a std dev of 0.112 when X is unconstrained.
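
A comparison like this takes only a few lines of R. The following is a minimal sketch on simulated data, not the dataset from the question: the vectors `x` and `s`, the helper `cond_stats`, and the bounds 10050 and 10150 are all assumptions made for illustration.

```r
# Simulated stand-in data (assumption: NOT the data from the question)
set.seed(1)
x <- rpois(500, 1e4)
s <- 3.7 + 0.002 * (x - 1e4) + rnorm(500, sd = 0.1)

# Mean, std dev and sample count of s restricted to lo < x < hi
cond_stats <- function(x, s, lo = -Inf, hi = Inf) {
  keep <- x > lo & x < hi
  c(mean = mean(s[keep]), sd = sd(s[keep]), n = sum(keep))
}

# Side-by-side: no constraint on x versus a chosen range of x
rbind(
  unconditional = cond_stats(x, s),
  conditional   = cond_stats(x, s, lo = 10050, hi = 10150)
)
```

Reporting `n` next to the mean and std dev makes it easy to see when a range contains too few samples to trust.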

Given S and a set of Xs, I want to pick a range of X values such that the resulting distribution P(S|X) has a higher mean and a lower standard deviation. Please help me find a standard technique for this.

I also don’t want to condition on X so tightly that the number of samples is too small to generalise. I want to avoid cases like the leftmost branch of the tree, where the number of samples is 1.

Just apply an optimization to search for the X values that satisfy the criteria you're looking for. Here's a simple demo:

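    set.seed(123)
    mu_x_true <- 1e4
    mu_y_true <- 3.75
    n <- 1e2

    # Simulated data: x is Poisson, y is a squared normal draw
    x <- rpois(n, mu_x_true)
    y <- rnorm(n, sqrt(mu_y_true))^2

    plot(x, y)

    # conditions:
    # E[Y|X] > E[Y]
    # std(Y|X) < std(Y)

    mu_y_emp <- mean(y)
    sd_y_emp <- sd(y)

    # par = c(lower, upper) bounds on x; a smaller objective is better.
    # alpha weights raising the mean against lowering the std dev.
    objective <- function(par, alpha = 0.5) {
      if (par[1] > par[2]) par <- rev(par)
      ix <- which((par[1] < x) & (x < par[2]))
      k <- length(ix)
      if (k == 0) return(1e12)  # penalize empty ranges
      mu_yx <- mean(y[ix])
      sd_yx <- sd(y[ix])

      alpha * (mu_y_emp - mu_yx) + (1 - alpha) * (sd_yx - sd_y_emp)
    }

    init <- mean(x) + c(-sd(x), sd(x))
    test <- optim(par = init, fn = objective)

    # Use the optimized bounds (sorted, since the objective accepts either order)
    best <- sort(test$par)
    ix <- which((best[1] < x) & (x < best[2]))

    mean(y[ix]) > mean(y)
    # TRUE

    sd(y) > sd(y[ix])
    # TRUE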
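
The objective above only changes when a bound crosses a data point, and the question also asks to avoid ranges with too few samples. One way to handle both is to search directly over intervals whose endpoints are quantiles of x and to skip any interval with fewer than a minimum number of points. This is a sketch under assumptions, not part of the original demo: the function name `best_interval`, the `min_n` threshold, and the quantile grid size `n_grid` are hypothetical choices.

```r
# Hypothetical helper: grid search over intervals [lo, hi] whose endpoints
# are quantiles of x, keeping only intervals with at least min_n points.
best_interval <- function(x, y, min_n = 20, alpha = 0.5, n_grid = 25) {
  cuts <- unique(unname(quantile(x, probs = seq(0, 1, length.out = n_grid))))
  best <- list(score = Inf, lo = NA, hi = NA, n = 0)
  for (i in seq_along(cuts)) {
    for (j in seq_along(cuts)) {
      if (j <= i) next
      keep <- x >= cuts[i] & x <= cuts[j]
      k <- sum(keep)
      if (k < min_n) next  # too few samples to generalise
      # Same weighted objective as in the demo: lower is better
      score <- alpha * (mean(y) - mean(y[keep])) +
        (1 - alpha) * (sd(y[keep]) - sd(y))
      if (score < best$score) {
        best <- list(score = score, lo = cuts[i], hi = cuts[j], n = k)
      }
    }
  }
  best
}

# Same simulated data as in the demo
set.seed(123)
x <- rpois(100, 1e4)
y <- rnorm(100, sqrt(3.75))^2

res <- best_interval(x, y, min_n = 20)
res[c("lo", "hi", "n", "score")]
```

The full range of x always scores exactly 0, so the returned score is never positive. A negative score does not by itself guarantee that both conditions hold, because a large gain in one term can offset a loss in the other, so it is worth checking the mean and std dev of the selected subset separately.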


Answered by David Marx on October 30, 2020
