About Murphy's notation: why is $p(y \mid x, \theta)$ a conditional expectation when there is no probabilistic interpretation on $x$ or $\theta$?

Question

In section 1.4.5 of Kevin Murphy's ML textbook, he introduces linear regression, where for a given input $x$ the target $y$ is assumed to be obtained through
$$y(x) = w^T x + \epsilon, \text{ where } \epsilon \sim \mathcal{N}(\mu, \sigma^2)$$
Since $\epsilon$ is a random variable, $y$ is a random variable as well, with its randomness induced solely by $\epsilon$.
However, the author then defines $$p(y \mid x, \theta) = \mathcal{N}(y \mid \mu(x), \sigma^2(x))$$
where $\theta = (w, \sigma^2)$.
First of all, $x$ here is explicitly just a vector, with no assumption that it was generated according to a distribution. Secondly, $\theta$ here is just the parameters of the Gaussian and of the model, so it has no probabilistic interpretation either.
How does the author condition on two deterministic variables?

Sergio · Answer

If $x$ is not stochastic, then $y$ is just an affine transformation of the multivariate normal variable $\epsilon$.
When $x$ is stochastic, no distributional assumption on $x$ is needed. All that is required is that the conditional distribution of $y$ given $x$ is normal; the marginal distribution of $x$ is unrestricted (see Hansen, Econometrics, p. 141).
As to $\theta$, in a frequentist setting parameters are just unknown numbers, and one would rather write $p(y; x, \theta)$, but in a Bayesian setting parameters are random variables. And Murphy is going to introduce Bayesian concept learning...
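To make the fixed-$x$ case concrete, here is a minimal simulation sketch. The values of `x`, `w`, `mu` and `sigma` are hypothetical, chosen only for illustration, and the helper `simulate_y` is not from Murphy's text. It holds $x$ fixed, draws only $\epsilon$, and checks that the resulting $y$ has mean close to $w^T x + \mu$ and standard deviation close to $\sigma$.

```python
import random
import statistics


def simulate_y(x, w, mu, sigma, n_samples, seed=0):
    """Draw samples of y = w^T x + eps with eps ~ N(mu, sigma^2), x held fixed."""
    rng = random.Random(seed)
    # Deterministic part: x and w are plain numbers, not random variables.
    wx = sum(wi * xi for wi, xi in zip(w, x))
    # The only source of randomness is the noise term eps.
    return [wx + rng.gauss(mu, sigma) for _ in range(n_samples)]


# Hypothetical values, for illustration only.
x = [1.0, 2.0]
w = [0.5, -1.0]
mu, sigma = 0.0, 2.0

ys = simulate_y(x, w, mu, sigma, n_samples=100_000)
print(statistics.fmean(ys))   # should be close to w^T x + mu = -1.5
print(statistics.pstdev(ys))  # should be close to sigma = 2.0
```

Nothing in the sketch assigns a distribution to `x`: it only enters through the mean of $y$, which is what writing $x$ to the right of the bar is meant to convey here.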
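As a sketch of the two readings side by side (the prior $p(\theta)$ is a symbol introduced here for illustration, and I assume $\theta$ is independent of $x$):

$$\text{frequentist: } p(y; x, \theta) = \mathcal{N}(y \mid \mu(x), \sigma^2(x)), \qquad \text{Bayesian: } p(y, \theta \mid x) = p(y \mid x, \theta)\, p(\theta)$$

In the first form $\theta$ merely indexes a family of densities for $y$; in the second, $p(y \mid x, \theta)$ is a genuine conditional density because $\theta$ has its own distribution.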
