TransWikia.com

Dirichlet distribution: Normalization of alpha values

Cross Validated Asked by user60674 on November 20, 2021

I’m a programmer currently trying to apply the Latent Dirichlet Allocation algorithm by Blei et al. to a text mining problem. I am using a library called gensim for this, which takes, among other parameters, a vector of alpha values: the same $\alpha$ values that parameterize the Dirichlet distribution.

So, if we specify $k$, we have a vector of $k$ $\alpha$ values. For $k = 3$ we may choose our $\alpha$ values like this: https://en.wikipedia.org/wiki/Dirichlet_distribution#mediaviewer/File:Dirichlet_distributions.png.

If I understood it correctly, a “flat”/neutral Dirichlet distribution can be achieved by choosing $\alpha_k = 1$. First question: Is that correct?
I’ve read this in chapter 2.2.1 (“The Dirichlet Distribution”) of “Pattern Recognition and Machine Learning” by Christopher M. Bishop, if that’s of any interest.

The problem is that the library I use takes the alpha values only in normalized form, which means that all $\alpha$ values have to sum to 1. Therefore my second question is: How do I normalize $\alpha$ values? I’ve found a normalization constant on https://en.wikipedia.org/wiki/Dirichlet_distribution#Probability_density_function, but it seems to apply to the probability density function only; otherwise it doesn’t quite work out (the $\alpha$ values don’t sum to 1).

I am asking the second question because if, for example, $\alpha = \{1, 1, 1\}$ generates a “flat” Dirichlet distribution and $\alpha = \{5, 5, 5\}$ does not, I can’t see how to normalize these values without the former having the same normalized values as the latter (e.g. $\alpha_{\text{normalized}} = \{\frac{1}{3}, \frac{1}{3}, \frac{1}{3}\}$), which would mean that they’d generate the same distribution.

Again, I’m not a mathematician, and I’m pretty sure this probably seems like a pretty stupid question to you. Don’t be too harsh, please 🙂

I’d appreciate any help!

Best Regards & thanks for your time,
MG

Edit: Unfortunately I don’t seem to be allowed to either comment or accept answers since I posted this question without an account, so in response to…

  • @tristan: Yeah, that’s what I thought. Something like a concentration parameter would be necessary.
  • @whuber: Well, it turns out I just misunderstood the library’s documentation. The library allows normalized parameters, but also accepts non-normalized values (e.g. {1, 1, 1}).

So, sorry for the confusion, guys. And thanks for your answers!

2 Answers

As others noted in the comments, it wouldn't make much sense to require normalized parameters for a Dirichlet distribution. Notice that for $\alpha = (1/3, 1/3, 1/3)$, $\alpha = (1, 1, 1)$, or $\alpha = (100, 100, 100)$, the result of $\alpha' = \alpha / \sum_i \alpha_i$ would be the same in each case, i.e. $\alpha' = (1/3, 1/3, 1/3)$.
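To make the point concrete, here is a minimal sketch in plain Python. The helper name `normalize` is made up for illustration and is not part of gensim or any other library. It divides each vector by its sum using exact fractions and shows that all three parameter vectors collapse to the same result.

```python
from fractions import Fraction


def normalize(alpha):
    """Divide each entry by the total so that the entries sum to 1."""
    total = sum(alpha)
    return [Fraction(a) / total for a in alpha]


# Three different symmetric parameter vectors from the paragraph above.
candidates = [
    [Fraction(1, 3)] * 3,
    [1, 1, 1],
    [100, 100, 100],
]

normalized = [normalize(alpha) for alpha in candidates]

for alpha, result in zip(candidates, normalized):
    print([str(a) for a in alpha], "->", [str(r) for r in result])
```

Since the normalized vectors are identical, the information that distinguishes the three distributions (the overall magnitude of the parameters) is lost.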

You can check the thread “What exactly is the alpha in the Dirichlet distribution?” to learn more about the parameters of the Dirichlet distribution. In short: if all the $\alpha_i$ are equal, the distribution is symmetric, but it is uniform only for $\alpha_1 = \alpha_2 = \dots = \alpha_k = 1$. For high values of the $\alpha_i$ the distribution is more concentrated around its mean, while for low values (below 1) the mass is pushed toward the extremes, so those are very different distributions. You can find some examples below; see also the linked thread.

[Image: examples of Dirichlet distributions for different parameter values]
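The effect can also be checked numerically. The following is a rough sketch using only the Python standard library; the function names are invented for this example. It relies on the standard construction of a Dirichlet sample: draw independent Gamma($\alpha_i$, 1) variables and divide by their sum. As a crude measure of how "extreme" the samples are, it averages the largest component of each sample.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalizing independent Gamma(alpha_i, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def mean_largest_component(alpha, n_samples=2000, seed=0):
    """Average of the largest coordinate over many samples (near 1/3 means concentrated)."""
    rng = random.Random(seed)
    total = sum(max(sample_dirichlet(alpha, rng)) for _ in range(n_samples))
    return total / n_samples


# Symmetric parameters with k = 3: small, flat, and large values.
results = {a: mean_largest_component([a, a, a]) for a in (0.1, 1.0, 100.0)}

for a, value in results.items():
    print(f"alpha_i = {a}: mean largest component ~ {value:.3f}")
```

With small $\alpha_i$ the largest component is typically close to 1 (samples sit near a corner of the simplex), while with large $\alpha_i$ it stays close to $1/3$ (samples sit near the center).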

Answered by Tim on November 20, 2021

In answer to your first question, yes: if $\alpha_i = 1$ for all $i$ you get the uniform distribution. For the second question, I suspect you need a separate concentration parameter (call it $r$ for the sake of argument), so that you would specify $r = 3$ together with the normalized values $a_1 = a_2 = a_3 = 1/3$, which gives $\alpha_i = r a_i = 1$.
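As a small illustration of this idea, the sketch below converts between the two parameterizations. The names `to_alpha` and `from_alpha` are hypothetical helpers written for this example, not functions of gensim, and whether a given library accepts input in this form is an assumption you would need to check against its documentation.

```python
from fractions import Fraction


def to_alpha(concentration, base):
    """Combine a concentration r and normalized proportions a_i into alpha_i = r * a_i."""
    if sum(base) != 1:
        raise ValueError("base proportions must sum to 1")
    return [concentration * b for b in base]


def from_alpha(alpha):
    """Split alpha into (r, a) with r = sum(alpha) and a_i = alpha_i / r."""
    concentration = sum(alpha)
    return concentration, [Fraction(a) / concentration for a in alpha]


base = [Fraction(1, 3)] * 3

flat = to_alpha(3, base)     # corresponds to alpha = (1, 1, 1)
peaked = to_alpha(15, base)  # corresponds to alpha = (5, 5, 5)

print([str(a) for a in flat], [str(a) for a in peaked])
print(from_alpha([5, 5, 5])[0])
```

The normalized part is the same for $\{1, 1, 1\}$ and $\{5, 5, 5\}$; only $r$ (3 versus 15) tells them apart, which is why normalized values alone cannot specify the distribution.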

Answered by tristan on November 20, 2021
