# Relationship between multivariate Bernoulli random vector and categorical random variable

Mathematics Asked on January 7, 2022

For simplicity, I'll focus on the bivariate case. Let $$(X_1,X_2)$$ be a random vector that follows a bivariate Bernoulli distribution, so each $$X_i$$ takes the value zero or one. The associated pmf can be written as
$$p(x_1,x_2)=p_{11}^{x_1x_2}p_{10}^{x_1(1-x_2)}p_{01}^{(1-x_1)x_2}p_{00}^{(1-x_1)(1-x_2)}.$$

Now, consider a categorical random variable $$Y$$ that takes the four values $$\{11,10,01,00\}$$ with probabilities $$\{p_{11},p_{10},p_{01},p_{00}\}.$$

The associated pmf can be written as

$$p(y)=p_{11}^{[y=11]}p_{10}^{[y=10]}p_{01}^{[y=01]}p_{00}^{[y=00]},$$
where $$[y=z]=1$$ if $$y=z$$ and $$[y=z]=0$$ otherwise.

So, it looks like any bivariate Bernoulli random vector can be represented using a categorical random variable.
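As a quick numerical check of this correspondence, here is a minimal Python sketch. The cell probabilities in `cell_probs` are hypothetical values chosen only for illustration, and the function names are mine. It evaluates both pmf formulas on all four outcomes and compares them.

```python
from itertools import product

# Hypothetical cell probabilities p_{x1 x2}; any nonnegative values summing to 1 would do.
cell_probs = {(1, 1): 0.1, (1, 0): 0.2, (0, 1): 0.3, (0, 0): 0.4}


def bernoulli_pmf(x1, x2, p=cell_probs):
    """Bivariate Bernoulli pmf in the product-of-powers form."""
    return (p[1, 1] ** (x1 * x2)
            * p[1, 0] ** (x1 * (1 - x2))
            * p[0, 1] ** ((1 - x1) * x2)
            * p[0, 0] ** ((1 - x1) * (1 - x2)))


def categorical_pmf(y, p=cell_probs):
    """Categorical pmf on the labels '11', '10', '01', '00' via Iverson brackets."""
    result = 1.0
    for (a, b), prob in p.items():
        result *= prob ** int(y == f"{a}{b}")  # exponent is [y = ab]
    return result


# Largest disagreement between the two formulas over the four outcomes.
max_gap = max(
    abs(bernoulli_pmf(x1, x2) - categorical_pmf(f"{x1}{x2}"))
    for x1, x2 in product((0, 1), repeat=2)
)
print(max_gap)
```

The map $$(x_1,x_2)\mapsto$$ the label `"x1x2"` is just a relabeling of outcomes, which is why the two pmfs agree cell by cell.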

Conversely, the categorical distribution can itself be represented by a multivariate Bernoulli random vector $$Z$$, constructed as follows.

Let $$Z=(Z_1,Z_2,Z_3,Z_4).$$ Each $$Z_i$$ is a Bernoulli variable that takes the value zero or one. $$Z$$ differs from the general multivariate Bernoulli in that exactly one of the four variables takes the value one.

The pmf of this random vector can be written, for $$(z_1,z_2,z_3,z_4)$$ with exactly one entry equal to one, as

$$p(z_1,z_2,z_3,z_4)=p_{1000}^{z_1(1-z_2)(1-z_3)(1-z_4)}p_{0100}^{(1-z_1)z_2(1-z_3)(1-z_4)}p_{0010}^{(1-z_1)(1-z_2)z_3(1-z_4)}p_{0001}^{(1-z_1)(1-z_2)(1-z_3)z_4}.$$

Now, we have a multivariate Bernoulli random vector that represents the categorical variable above.

My question is: what is the relationship between these two random variables/vectors and their associated distributions?
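The construction of $$Z$$ is a one-hot encoding of the categorical variable. A minimal sketch, with hypothetical probabilities and helper names of my own, and with the explicit assumption that vectors which are not one-hot get probability zero (the product formula by itself would evaluate to 1 there, since every exponent vanishes):

```python
from math import prod

# Hypothetical category probabilities, keyed by the labels used above.
category_probs = {"11": 0.1, "10": 0.2, "01": 0.3, "00": 0.4}
labels = list(category_probs)  # position i of Z corresponds to labels[i]


def one_hot(label):
    """Encode a category label as a vector with a single one."""
    return tuple(int(label == other) for other in labels)


def one_hot_pmf(z):
    """Product-form pmf of Z, restricted to one-hot vectors."""
    if sum(z) != 1:
        return 0.0  # assumption: no mass outside the one-hot support
    result = 1.0
    for i, label in enumerate(labels):
        # Exponent z_i * prod_{j != i} (1 - z_j), as in the formula for p(z_1,...,z_4).
        exponent = z[i] * prod(1 - z[j] for j in range(len(z)) if j != i)
        result *= category_probs[label] ** exponent
    return result


encoded = {label: one_hot(label) for label in labels}
recovered = {label: one_hot_pmf(z) for label, z in encoded.items()}
print(encoded)
print(recovered)
```

Each category probability is recovered from its one-hot vector, so $$Z$$ carries the same distribution as the categorical variable.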

Focusing on the $$n=2$$ case

Let me introduce the following probability mass function:
\begin{align*} p(y_1, y_2) = \pi_1^{y_1}(1-\pi_1)^{1-y_1}\pi_2^{y_2}(1-\pi_2)^{1-y_2}\left(1 + \rho \frac{(y_1 - \pi_1)(y_2 - \pi_2)}{\sqrt{\pi_1(1-\pi_1)\pi_2(1-\pi_2)}}\right) \end{align*}
which is known as Bahadur's model. You can indeed verify that
\begin{align*} \sum_{(y_1, y_2) \in \{0, 1\}^2} p(y_1, y_2) &= 1 \\ \text{Corr}(Y_1, Y_2) &= \rho \end{align*}

For $$0 < \pi_1, \pi_2 < 1$$, there is a bijection between $$(p_{11}, p_{10}, p_{01}, p_{00})$$ and $$(\pi_1, \pi_2, \rho)$$ through the relations
\begin{align*} p_{11} &= \pi_1\pi_2\left(1 + \rho\frac{(1-\pi_1)(1-\pi_2)}{\sqrt{\pi_1(1-\pi_1)\pi_2(1-\pi_2)}}\right) \\ p_{10} &= \pi_1(1-\pi_2)\left(1 - \rho\frac{(1-\pi_1)\pi_2}{\sqrt{\pi_1(1-\pi_1)\pi_2(1-\pi_2)}}\right) \\ p_{01} &= (1-\pi_1)\pi_2\left(1 - \rho\frac{\pi_1(1-\pi_2)}{\sqrt{\pi_1(1-\pi_1)\pi_2(1-\pi_2)}}\right) \\ p_{00} &= (1-\pi_1)(1-\pi_2)\left(1 + \rho\frac{\pi_1\pi_2}{\sqrt{\pi_1(1-\pi_1)\pi_2(1-\pi_2)}}\right) \end{align*}
so Bahadur's model is just a parametrization of the bivariate binary model.

Now let $$\rho = -1$$ and $$\pi_1 = 1 - \pi_2 = \pi$$. The square root then simplifies to $$\pi(1-\pi)$$, and this gives
\begin{align*} p_{11} &= 0 \\ p_{10} &= \pi \\ p_{01} &= 1-\pi \\ p_{00} &= 0 \end{align*}

So, the two-category categorical model is just a special case of Bahadur's model when the correlation is $$\rho = -1$$. This makes sense: a categorical random variable is basically a multivariate binary vector with strongly negative correlations among the entries, which force exactly one category to be selected. We use this to generalize the result.
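The claims in this section are easy to check numerically. The sketch below is only an illustration: the parameter values and function names are my own choices, not part of Bahadur's model. It tabulates the four cell probabilities, confirms that they sum to one with correlation $$\rho$$, and then evaluates the degenerate case $$\rho=-1$$, $$\pi_1 = 1-\pi_2$$.

```python
from itertools import product
from math import sqrt


def bahadur_pmf(y1, y2, pi1, pi2, rho):
    """Bivariate Bahadur pmf: independent Bernoulli product times a correlation term."""
    indep = pi1**y1 * (1 - pi1)**(1 - y1) * pi2**y2 * (1 - pi2)**(1 - y2)
    scale = sqrt(pi1 * (1 - pi1) * pi2 * (1 - pi2))
    return indep * (1 + rho * (y1 - pi1) * (y2 - pi2) / scale)


def summarize(pi1, pi2, rho):
    """Return the cell table, its total mass, and the implied correlation."""
    cells = {(y1, y2): bahadur_pmf(y1, y2, pi1, pi2, rho)
             for y1, y2 in product((0, 1), repeat=2)}
    total = sum(cells.values())
    mean1 = sum(y1 * p for (y1, _), p in cells.items())
    mean2 = sum(y2 * p for (_, y2), p in cells.items())
    cov = sum((y1 - mean1) * (y2 - mean2) * p for (y1, y2), p in cells.items())
    corr = cov / sqrt(mean1 * (1 - mean1) * mean2 * (1 - mean2))
    return cells, total, corr


# Illustrative (hypothetical) parameters for a generic case.
cells, total, corr = summarize(0.3, 0.6, 0.2)
print(total, corr)

# Degenerate case: rho = -1 with pi1 = 1 - pi2 = 0.3.
degenerate, _, degenerate_corr = summarize(0.3, 0.7, -1.0)
print(degenerate)
```

In the degenerate case all of the mass sits on the outcomes $$(1,0)$$ and $$(0,1)$$, up to floating-point rounding, which is exactly a two-category categorical variable written in one-hot form.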

Generalizing the result

Bahadur's model can be extended to multivariate binary random variables with pmf $$p(y_1, \cdots, y_n)$$ through the representation
\begin{align*} p(y_1, \cdots, y_n) = \left[\prod_{i=1}^n\pi_i^{y_i}(1-\pi_i)^{1-y_i}\right]\left(1 + \sum_{k=2}^{n}\rho_k\text{Sym}_k(\mathbf{r}_n)\right) \end{align*}
where
\begin{align*} r_i &= \frac{y_i - \pi_i}{\sqrt{\pi_i(1-\pi_i)}} \\ \mathbf{r}_n &= (r_1, \cdots, r_n) \\ \text{Sym}_k(\mathbf{r}_n) &= \sum_{i_1 < \cdots < i_k} r_{i_1}\cdots r_{i_k} \end{align*}

I'm not entirely sure what choice of the parameters can lead to a genuine categorical random variable (will think about this and post if I have a positive result), but this is a starting place.
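To experiment with parameter choices, here is a minimal sketch of the general form. It assumes that $$\text{Sym}_k$$ is the elementary symmetric polynomial of degree $$k$$ in $$r_1,\cdots,r_n$$; the values of $$\pi_i$$ and $$\rho_k$$ are hypothetical, and not every choice of $$\rho_k$$ gives nonnegative probabilities.

```python
from itertools import combinations, product
from math import prod, sqrt


def sym(k, r):
    """Elementary symmetric polynomial of degree k in the entries of r."""
    return sum(prod(r[i] for i in idx) for idx in combinations(range(len(r)), k))


def bahadur_pmf_n(y, pis, rhos):
    """General Bahadur-type pmf; rhos maps each order k (2..n) to rho_k."""
    indep = prod(p**yi * (1 - p)**(1 - yi) for yi, p in zip(y, pis))
    r = [(yi - p) / sqrt(p * (1 - p)) for yi, p in zip(y, pis)]
    correction = 1 + sum(rho_k * sym(k, r) for k, rho_k in rhos.items())
    return indep * correction


# Hypothetical parameters for n = 3.
pis = (0.3, 0.5, 0.6)
rhos = {2: 0.05, 3: 0.02}

table = {y: bahadur_pmf_n(y, pis, rhos) for y in product((0, 1), repeat=3)}
total = sum(table.values())     # should be 1 up to rounding
smallest = min(table.values())  # must be >= 0 for a valid pmf
print(total, smallest)
```

For $$n=2$$ only the $$k=2$$ term remains and $$\text{Sym}_2(\mathbf{r}_2)=r_1r_2$$, so the function reduces to the bivariate model above. Checking `smallest` is a quick way to see whether a trial set of parameters still defines a valid distribution.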

Answered by Tom Chen on January 7, 2022
