
What does $\mathbf{w}^{T}\mathbf{x}+w_0$ graphically mean in the discriminant function?

Data Science Asked by user8314628 on January 6, 2021

I found a post explaining the discriminant function in great detail, but I am still confused about the function $g(\mathbf{x}) = \mathbf{w}^{T}\mathbf{x} + w_0$ in 9.2 Linear Discriminant Functions and Decision Surfaces. What does it represent graphically? Could anyone explain it, preferably with figure 9.2?

Does it mean the distance between the origin and the hyperplane?

[Figure 9.2 from the referenced text: the hyperplane $g(\mathbf{x}) = 0$ with points on either side of it.]

2 Answers

Indeed, there is a lot to understand from $g(\mathbf{x}) = \mathbf{w}^{T}\mathbf{x} + w_{0}$.

So let us break down what can be geometrically interpreted:

The set $\{\mathbf{x} \mid g(\mathbf{x}) = 0\}$ is a hyperplane, provided $\mathbf{w} \neq \mathbf{0}$. If $\mathbf{x} \in \mathbb{R}^{n}$, the hyperplane has dimension $n-1$. So for $n=2$ the hyperplane is a line, and for $n=3$ it is a plane.
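A minimal numpy sketch of this zero set, with made-up values for $\mathbf{w}$ and $w_{0}$ (both are assumptions, not taken from the post): in $\mathbb{R}^{2}$, the points satisfying $g(\mathbf{x}) = 0$ form a line.

```python
import numpy as np

# Illustrative (assumed) parameters; w and w0 are not from the post.
w = np.array([1.0, 2.0])
w0 = -3.0

def g(x):
    return w @ x + w0

# In R^2 the set {x : g(x) = 0} is the line x2 = (3 - x1) / 2.
for x1 in (-1.0, 0.0, 2.0):
    x = np.array([x1, (3.0 - x1) / 2.0])
    print(x, g(x))   # g(x) is 0 (up to float error) for every point on the line
```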

The vector $\mathbf{w}$ is the normal vector: it is orthogonal to the hyperplane. So if you take two points $\mathbf{x}', \mathbf{x}''$ on the hyperplane, the vector $\mathbf{x}'' - \mathbf{x}'$ is orthogonal to $\mathbf{w}$.

Thus, in your figure, $\mathbf{w}$ defines the orientation of the hyperplane.

Now $w_{0}$ translates the hyperplane: changing $w_{0}$ shifts the hyperplane along the direction of the normal vector $\mathbf{w}$.
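A short numpy check of this translation, assuming $\|\mathbf{w}\| = 1$ (as in the next step) and made-up values for $w_{0}$: for each $w_{0}$, the point $-w_{0}\mathbf{w}$ lies on the hyperplane, so changing $w_{0}$ slides the hyperplane along the normal direction.

```python
import numpy as np

# Assumed example: a unit normal w (||w|| = 1); the w0 values are illustrative.
w = np.array([0.6, 0.8])

for w0 in (0.0, 1.0, -2.0):
    x_on_plane = -w0 * w     # g(-w0 * w) = w @ (-w0 * w) + w0 = -w0 + w0 = 0
    print(w0, x_on_plane, w @ x_on_plane + w0)   # last entry is always 0
```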

In the following, let's assume $\|\mathbf{w}\| = 1$, which does not change the hyperplane.

We can always find a basis $\mathbf{w}, \mathbf{y}_{2}, \ldots, \mathbf{y}_{n}$ of $\mathbb{R}^{n}$ such that $\mathbf{w} \perp \mathbf{y}_{i}$ for all $i$, that is, $\mathbf{w}$ is orthogonal to every $\mathbf{y}_{i}$.

Now, given $\mathbf{x}$, there are coefficients $\lambda_{1}, \ldots, \lambda_{n} \in \mathbb{R}$ such that $\mathbf{x} = \lambda_{1}\mathbf{w} + \lambda_{2}\mathbf{y}_{2} + \ldots + \lambda_{n}\mathbf{y}_{n}$.

Therefore $g(\mathbf{x}) - w_{0} = \mathbf{x}^{T}\mathbf{w} = \langle \mathbf{x}, \mathbf{w} \rangle = \langle \lambda_{1}\mathbf{w} + \sum_{i=2}^{n} \lambda_{i}\mathbf{y}_{i}, \mathbf{w} \rangle = \lambda_{1}\langle \mathbf{w}, \mathbf{w} \rangle = \lambda_{1}$, and thus $g(\mathbf{x}) = \lambda_{1} + w_{0}$.
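A quick numerical check of this computation, using an assumed orthonormal basis of $\mathbb{R}^{3}$ and made-up coefficients $\lambda_{i}$:

```python
import numpy as np

# Assumed orthonormal basis of R^3: w is the unit normal, y2 and y3 span the hyperplane.
w  = np.array([0.6, 0.8, 0.0])
y2 = np.array([-0.8, 0.6, 0.0])
y3 = np.array([0.0, 0.0, 1.0])
w0 = 0.5
lam = (2.0, -1.0, 3.0)                       # made-up coefficients lambda_1, lambda_2, lambda_3

x = lam[0] * w + lam[1] * y2 + lam[2] * y3   # x expressed in that basis
g = w @ x + w0

print(g - w0, lam[0])   # both equal lambda_1 = 2.0 (up to float error)
```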

The geometric conclusion is therefore: $g(\mathbf{x})$ changes only when you move $\mathbf{x}$ in the direction of $\mathbf{w}$.

(1) Given a vector $\mathbf{x}$, if you consider $\mathbf{x} + \mathbf{u}$ for some $\mathbf{u} \perp \mathbf{w}$, then $g(\mathbf{x}) = g(\mathbf{x} + \mathbf{u})$. So if you move $\mathbf{x}$ "along" the direction of the hyperplane, $g$ does not change its value. In particular, given $\mathbf{x}', \mathbf{x}''$ on the hyperplane, if you move $\mathbf{x}$ by $\mathbf{x}' - \mathbf{x}''$, you get $g(\mathbf{x} + (\mathbf{x}' - \mathbf{x}'')) = g(\mathbf{x})$.

(2) If you move $\mathbf{x}$ further away from the hyperplane (along the line through $\mathbf{x}$ in the direction of $\mathbf{w}$), the value $g(\mathbf{x})$ will either increase or decrease, depending on which direction you move in. So if you move $\mathbf{x}$ by $\mu\mathbf{w}$, you get $g(\mathbf{x} + \mu\mathbf{w}) = \mu + \lambda_{1} + w_{0} \neq g(\mathbf{x}) = \lambda_{1} + w_{0}$, provided $\mu \neq 0$. (Cases (1) and (2) are checked numerically in the sketch after this list.)

(3) $g(\mathbf{x}) = 0$ means $\lambda_{1} = -w_{0}$.
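A small numpy sketch of (1) and (2), with assumed values for $\mathbf{w}$, $w_{0}$, $\mathbf{x}$ and $\mathbf{u}$ (chosen so that $\|\mathbf{w}\| = 1$ and $\mathbf{u} \perp \mathbf{w}$):

```python
import numpy as np

# Assumed example values: a unit normal w and a vector u orthogonal to w.
w = np.array([0.6, 0.8])
w0 = -1.0
g = lambda x: w @ x + w0

x = np.array([2.0, 1.0])
u = np.array([-0.8, 0.6])       # w @ u == 0, so u points "along" the hyperplane

print(g(x + 3.0 * u) - g(x))    # (1) ~0: moving parallel to the hyperplane leaves g unchanged
mu = 1.5
print(g(x + mu * w) - g(x))     # (2) ~mu = 1.5: moving along the normal changes g by mu
```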

In your image, you therefore have (see the sketch after this list):

(1) A point $\mathbf{x}$ on the plane has value $g(\mathbf{x}) = 0$.

(2) Each point $\mathbf{x}$ above the plane has value $g(\mathbf{x}) > 0$.

(3) Each point $\mathbf{x}$ below the plane has value $g(\mathbf{x}) < 0$.
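A tiny numpy illustration of these three cases, using an assumed hyperplane (the line $x_{2} = 1$ in $\mathbb{R}^{2}$, with the unit normal $\mathbf{w}$ pointing upward):

```python
import numpy as np

# Assumed example: the hyperplane x2 = 1 in R^2, with unit normal w pointing "up".
w = np.array([0.0, 1.0])
w0 = -1.0
g = lambda x: w @ x + w0

print(g(np.array([3.0, 1.0])))   # (1) on the plane    ->  0.0
print(g(np.array([3.0, 2.5])))   # (2) above the plane ->  1.5 > 0
print(g(np.array([3.0, 0.0])))   # (3) below the plane -> -1.0 < 0
```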

Answered by Graph4Me Consultant on January 6, 2021

It's a standard linear function; for a linear function $g$, all points $\mathbf{x}$ that satisfy $g(\mathbf{x}) = a$ form a hyperplane. In the case of a decision function, $g(\mathbf{x}) = 0$ defines a hyperplane called the decision boundary (the yellow rectangle in the figure); if $g(\mathbf{x}) < 0$, the point $\mathbf{x}$ is on one side of this hyperplane and should be classified as one class, while if $g(\mathbf{x}) > 0$, it is on the other side and should be classified as the other class.
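A minimal sketch of such a decision rule; the weights, offset, and class labels here are assumptions for illustration, and only the sign of $g(\mathbf{x})$ matters:

```python
import numpy as np

# Assumed weights, offset and labels; the decision depends only on the sign of g(x).
w = np.array([1.0, -1.0])
w0 = 0.5

def classify(x):
    return "class 1" if w @ x + w0 > 0 else "class 2"

print(classify(np.array([2.0, 0.0])))   # g(x) =  2.5 > 0 -> class 1
print(classify(np.array([0.0, 2.0])))   # g(x) = -1.5 < 0 -> class 2
```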

Answered by steam_engine on January 6, 2021
