Mathematics Asked on January 7, 2022
For simplicity, I'll focus on the bivariate case. Let $(X_1,X_2)$ be a random vector that follows a bivariate Bernoulli distribution, so each $X_i$ takes the value zero or one. The associated pmf can be written as
$$p(x_1,x_2)=p_{11}^{x_1x_2}p_{10}^{x_1(1-x_2)}p_{01}^{(1-x_1)x_2}p_{00}^{(1-x_1)(1-x_2)}.$$
Now consider a categorical random variable $Y$ that takes the four values $\{11,10,01,00\}$ with probabilities $\{p_{11},p_{10},p_{01},p_{00}\}$, respectively.
The associated pmf can be written as
$$p(y)=p_{11}^{[y=11]}p_{10}^{[y=10]}p_{01}^{[y=01]}p_{00}^{[y=00]},$$
where $[y=z]$ equals $1$ if $y=z$ and $0$ otherwise.
So, it looks like any bivariate Bernoulli random vector can be represented using a categorical random variable.
Conversely, the categorical distribution can itself be represented by a multivariate Bernoulli random vector $Z$, constructed as follows.
Let $Z=(Z_1,Z_2,Z_3,Z_4).$ Each $Z_i$ is a Bernoulli variable that takes the value zero or one. $Z$ differs from a general multivariate Bernoulli vector in that exactly one of the four variables takes the value one.
The pmf of this random vector can be written as
$$p(z_1,z_2,z_3,z_4)=p_{1000}^{z_1(1-z_2)(1-z_3)(1-z_4)}p_{0100}^{(1-z_1)z_2(1-z_3)(1-z_4)}p_{0010}^{(1-z_1)(1-z_2)z_3(1-z_4)}p_{0001}^{(1-z_1)(1-z_2)(1-z_3)z_4}.$$
We now have a multivariate Bernoulli random vector that represents the categorical variable $Y$ above.
My question is: what is the relationship between these two objects (the categorical random variable and the multivariate Bernoulli random vector) and their associated distributions?
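The two encodings described in the question can be sketched in a few lines of Python. This is only an illustration: the probability values and the helper names (`bivariate_pmf`, `one_hot`, `one_hot_pmf`) are assumptions made for the example and do not come from the question.

```python
from itertools import product

# Hypothetical cell probabilities, chosen only for illustration.
p = {(1, 1): 0.1, (1, 0): 0.2, (0, 1): 0.3, (0, 0): 0.4}

# Categories of Y in the order 11, 10, 01, 00.
categories = [(1, 1), (1, 0), (0, 1), (0, 0)]


def bivariate_pmf(x1, x2):
    """pmf of (X1, X2) in the product-of-powers form from the question."""
    return (p[1, 1] ** (x1 * x2)
            * p[1, 0] ** (x1 * (1 - x2))
            * p[0, 1] ** ((1 - x1) * x2)
            * p[0, 0] ** ((1 - x1) * (1 - x2)))


def one_hot(x1, x2):
    """Map an outcome of (X1, X2) to the one-hot vector (Z1, Z2, Z3, Z4)."""
    return tuple(int((x1, x2) == c) for c in categories)


def one_hot_pmf(z):
    """pmf of Z: mass only on vectors with exactly one entry equal to 1."""
    if sum(z) != 1:
        return 0.0  # outside the support of Z
    return p[categories[z.index(1)]]


# Each outcome of (X1, X2) carries the same mass as its one-hot image.
for x1, x2 in product((0, 1), repeat=2):
    assert abs(bivariate_pmf(x1, x2) - one_hot_pmf(one_hot(x1, x2))) < 1e-12
```

The map `one_hot` is a relabeling of the four outcomes, so the three descriptions ($X$, $Y$, and $Z$) carry the same probabilities on corresponding outcomes.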
Focusing on the $n=2$ case
Let me introduce the following probability mass function:
\begin{align*}
p(y_1, y_2) = \pi_1^{y_1}(1-\pi_1)^{1-y_1}\pi_2^{y_2}(1-\pi_2)^{1-y_2}\left(1 + \rho \frac{(y_1 - \pi_1)(y_2 - \pi_2)}{\sqrt{\pi_1(1-\pi_1)\pi_2(1-\pi_2)}}\right)
\end{align*}
which is known as Bahadur's model. You can indeed verify that
\begin{align*}
\sum_{(y_1, y_2) \in \{0, 1\}^2} p(y_1, y_2) &= 1 \\
\text{Corr}(Y_1, Y_2) &= \rho
\end{align*}
There is a bijection between $(p_{11}, p_{10}, p_{01}, p_{00})$ and $(\pi_1, \pi_2, \rho)$ through the relations
\begin{align*}
p_{11} &= \pi_1\pi_2\left(1 + \rho\frac{(1-\pi_1)(1-\pi_2)}{\sqrt{\pi_1(1-\pi_1)\pi_2(1-\pi_2)}}\right) \\
p_{10} &= \pi_1(1-\pi_2)\left(1 - \rho\frac{(1-\pi_1)\pi_2}{\sqrt{\pi_1(1-\pi_1)\pi_2(1-\pi_2)}}\right) \\
p_{01} &= (1-\pi_1)\pi_2\left(1 - \rho\frac{\pi_1(1-\pi_2)}{\sqrt{\pi_1(1-\pi_1)\pi_2(1-\pi_2)}}\right) \\
p_{00} &= (1-\pi_1)(1-\pi_2)\left(1 + \rho\frac{\pi_1\pi_2}{\sqrt{\pi_1(1-\pi_1)\pi_2(1-\pi_2)}}\right)
\end{align*}
so Bahadur's model is just a parametrization of the bivariate binary model. Now let $\rho = -1$ and $\pi_1 = 1 - \pi_2 = \pi$. Then $\sqrt{\pi_1(1-\pi_1)\pi_2(1-\pi_2)} = \pi(1-\pi)$, and the relations give
\begin{align*}
p_{11} &= 0 \\
p_{10} &= \pi \\
p_{01} &= 1-\pi \\
p_{00} &= 0
\end{align*}
So the two-category categorical model is just a special case of Bahadur's model with correlation $\rho = -1$. This makes sense: a categorical random variable is basically a multivariate binary vector with strongly negative correlations among its entries, which force exactly one category to be selected. We use this to generalize the result.
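As a numerical sanity check of the $n=2$ relations, here is a minimal Python sketch. The function names `bahadur_pmf` and `cell_probabilities` and the parameter values are hypothetical, chosen only for this illustration, and the code assumes $0 < \pi_1, \pi_2 < 1$.

```python
import math
from itertools import product


def bahadur_pmf(y1, y2, pi1, pi2, rho):
    """Bivariate Bahadur pmf; assumes 0 < pi1 < 1 and 0 < pi2 < 1."""
    scale = math.sqrt(pi1 * (1 - pi1) * pi2 * (1 - pi2))
    base = (pi1 ** y1 * (1 - pi1) ** (1 - y1)
            * pi2 ** y2 * (1 - pi2) ** (1 - y2))
    return base * (1 + rho * (y1 - pi1) * (y2 - pi2) / scale)


def cell_probabilities(pi1, pi2, rho):
    """Return {(y1, y2): p_{y1 y2}} for all four outcomes."""
    return {y: bahadur_pmf(*y, pi1, pi2, rho)
            for y in product((0, 1), repeat=2)}


def correlation(cells):
    """Pearson correlation of (Y1, Y2) computed from the cell probabilities."""
    m1 = sum(pr for (y1, _), pr in cells.items() if y1 == 1)
    m2 = sum(pr for (_, y2), pr in cells.items() if y2 == 1)
    cov = cells[1, 1] - m1 * m2
    return cov / math.sqrt(m1 * (1 - m1) * m2 * (1 - m2))


# Generic case: the cells sum to one and the correlation is recovered.
generic = cell_probabilities(0.4, 0.6, 0.2)
print(sum(generic.values()), correlation(generic))

# Degenerate case from the answer: rho = -1 and pi1 = 1 - pi2 = pi.
pi = 0.3
print(cell_probabilities(pi, 1 - pi, -1.0))
```

In the degenerate case the mass should concentrate on the outcomes $(1,0)$ and $(0,1)$, up to floating-point rounding.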
Generalizing the result
Bahadur's model can be expanded to multivariate binary random variables $p(y_1, \cdots, y_n)$ with the representation
\begin{align*}
p(y_1, \cdots, y_n) = \left[\prod_{i=1}^n\pi_i^{y_i}(1-\pi_i)^{1-y_i}\right]\left(1 + \sum_{k=2}^{n}\rho_k\text{Sym}_k(\mathbf{r}_n)\right)
\end{align*}
where
\begin{align*}
r_i &= \frac{y_i - \pi_i}{\sqrt{\pi_i(1-\pi_i)}} \\
\mathbf{r}_n &= (r_1, \cdots, r_n) \\
\text{Sym}_k(\mathbf{r}_n) &= \sum_{i_1<\cdots<i_k}r_{i_1}\cdots r_{i_k}
\end{align*}
I'm not entirely sure what choice of the parameters can lead to a genuine categorical random variable (will think about this and post if I have a positive result), but this is a starting place.
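The general representation can be evaluated directly, which may help when experimenting with parameter choices. The sketch below is an illustration only: the function name `bahadur_pmf_n`, the dictionary layout for the $\rho_k$, and the parameter values are assumptions, not part of the answer. Note that not every choice of $(\pi_i, \rho_k)$ yields non-negative values, so the output is a valid pmf only for admissible parameters.

```python
import math
from itertools import combinations, product


def bahadur_pmf_n(y, pi, rho):
    """Evaluate the n-variate Bahadur-type expression at a binary vector y.

    y   : sequence of 0/1 values of length n
    pi  : marginal probabilities, each assumed to lie strictly in (0, 1)
    rho : dict mapping order k (2..n) to rho_k; missing orders count as 0
    """
    n = len(y)
    # Independent-Bernoulli part: product of pi_i^{y_i} (1 - pi_i)^{1 - y_i}.
    base = math.prod(p if yi else 1 - p for yi, p in zip(y, pi))
    # Standardized residuals r_i.
    r = [(yi - p) / math.sqrt(p * (1 - p)) for yi, p in zip(y, pi)]
    correction = 1.0
    for k in range(2, n + 1):
        # Elementary symmetric polynomial of degree k in r_1, ..., r_n.
        sym_k = sum(math.prod(r[i] for i in idx)
                    for idx in combinations(range(n), k))
        correction += rho.get(k, 0.0) * sym_k
    return base * correction


# Hypothetical parameters for n = 3.
pi = (0.3, 0.5, 0.6)
rho = {2: 0.1, 3: 0.05}
values = {y: bahadur_pmf_n(y, pi, rho) for y in product((0, 1), repeat=3)}
print(sum(values.values()), min(values.values()))
```

The total is $1$ for any parameters, because each $r_i$ has mean zero under the independent part. Checking the minimum value is a quick way to see whether a given parameter choice is admissible.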
Answered by Tom Chen on January 7, 2022