# Are Neural Nets a Special Case Of Graphical Models?

Cross Validated Asked on January 3, 2022

Are Deep Neural Nets Graphical Models?

In the talk, here at NIPS, they say that:

GANs and VAEs are Graphical Models, just with a particular CPD and cost function. They are bipartite complete graphs.

How can that be explained? I thought that a graphical model needs probabilities built into it, with variables that have dependence relationships. Neural nets contain all sorts of other things, such as ReLU nodes; i.e., there are no probabilistic relationships, just a series of nonlinearities along with priors such as regularization or convnet structure.

This seems to be a very different view from the one explained in "What's the relation between hierarchical models, neural networks, graphical models, bayesian networks?".

If you focus on the generative part, GANs and VAEs are actually mathematically the same object (1), i.e. Gaussian latent variable models, where $z$ is a latent Gaussian random variable pointing to an observed $x$, so the graph is simply $z \to x$.

The difference is that VAEs are prescribed models that output a random variable $x$ with an explicit probability density, while GANs are likelihood-free implicit models (2) that directly specify a (deterministic) procedure for generating data.
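As a minimal sketch of the prescribed/implicit distinction, assuming a toy one-layer "network" and a fixed Gaussian observation noise (the names `net`, `SIGMA`, `gan_generate`, `vae_sample`, and `vae_log_density` are hypothetical and chosen only for illustration), both models draw the same latent variable, but only the VAE-style model lets you evaluate a density for $x$ given $z$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy "network": one tanh layer mapping a 2-d latent to a 3-d output.
W = rng.normal(size=(3, 2))
b = rng.normal(size=3)

def net(z):
    return np.tanh(W @ z + b)

def sample_latent():
    # Shared prior for both models: z ~ N(0, I).
    return rng.normal(size=2)

def gan_generate(z):
    # Implicit model: x is a deterministic function of z;
    # no density p(x | z) is ever written down.
    return net(z)

SIGMA = 0.1  # assumed fixed observation noise for the VAE-style decoder

def vae_sample(z):
    # Prescribed model: the network outputs the mean of
    # p(x | z) = N(net(z), SIGMA^2 I), and x is drawn from it.
    return net(z) + SIGMA * rng.normal(size=3)

def vae_log_density(x, z):
    # log N(x; net(z), SIGMA^2 I), available in closed form.
    diff = x - net(z)
    d = x.shape[0]
    return -0.5 * (diff @ diff) / SIGMA**2 - 0.5 * d * np.log(2 * np.pi * SIGMA**2)

z = sample_latent()
x_gan = gan_generate(z)   # same z always gives the same x
x_vae = vae_sample(z)     # same z gives a different x on each call
```

Calling `vae_log_density(x_vae, z)` returns a finite log-likelihood, whereas the GAN-style generator offers no analogous function; in both cases the graph is still just $z \to x$.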

Concretely, the VAE's graphical model is implemented as the decoder (generative) network, while the GAN's graphical model is implemented as the generator network. The GAN's discriminator network does not appear in the graphical model (just as the VAE's encoder, i.e. its inference/recognition network, does not) because it is merely an auxiliary object created to approximate the Jensen-Shannon divergence or other f-divergences (1).

(Image from Lilian Weng)
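To see why the discriminator is only an auxiliary device, it may help to write out the standard minimax GAN objective. This is a sketch under the assumption that the original minimax formulation is used; the symbols $p_{\text{data}}$ (data distribution), $p_g$ (distribution of $G(z)$), $D$, and $G$ are notation introduced here, not taken from the answer:

```latex
% Value function of the minimax game
V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big]
        + \mathbb{E}_{z \sim p(z)}\big[\log\big(1 - D(G(z))\big)\big]

% For a fixed generator, the optimal discriminator is
D^{*}(x) = \frac{p_{\text{data}}(x)}{p_{\text{data}}(x) + p_g(x)}

% Substituting D^{*} back leaves a quantity that depends only on the two distributions
V(D^{*}, G) = 2\,\mathrm{JSD}\big(p_{\text{data}} \,\|\, p_g\big) - \log 4
```

Once $D$ is optimized away, what remains is a divergence between $p_{\text{data}}$ and $p_g$, so the discriminator contributes to the training criterion but not to the generative graph $z \to x$.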

# References

(1): I've linked the relevant timestamp of the video recording of a tutorial (slides here) by Shakir Mohamed and Danilo Rezende from DeepMind at UAI 2017; Ferenc Huszar also explains the equivalence on Reddit. The VAE's graphical model is also explained in Stanford's "CS236 Deep Generative Models" notes.

(2): The distinction between prescribed and implicit models is described in greater detail in "Learning in Implicit Generative Models" by Mohamed and Lakshminarayanan (2016).

# Extra: Another perspective with augmented graphical models

"On Unifying Deep Generative Models" (Hu et al., 2017) illustrates augmented graphical models that encompass both the GAN generator and discriminator, along with an analogous model for the VAE in which we assume a perfect discriminator.

Arrows with solid lines denote generative process; arrows with dashed lines denote inference; hollow arrows denote deterministic transformation leading to implicit distributions; and blue arrows denote adversarial objectives.

Answered by Christabella Irwanto on January 3, 2022

You can view a deep neural network as a graphical model, but its CPDs are deterministic rather than probabilistic. For example, suppose the input to a neuron is $\vec{x}$ and its output is $y$. The CPD for this neuron is then $p(y \mid \vec{x}) = 1$ and $p(\hat{y} \mid \vec{x}) = 0$ for $\hat{y} \neq y$. See Section 10.2.3 of the Deep Learning book for more details.
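A minimal sketch of such a degenerate CPD for a single ReLU neuron follows. The weights, bias, and input are hypothetical values chosen for illustration, and the indicator function stands in for the point mass (for a continuous $y$ this would formally be a Dirac delta):

```python
import numpy as np

def relu_neuron(x, w, b):
    # Deterministic neuron: y = max(0, w . x + b).
    return max(0.0, float(np.dot(w, x) + b))

def neuron_cpd(y_candidate, x, w, b):
    # Degenerate CPD p(y | x): all mass on the neuron's actual output,
    # zero mass on every other candidate value.
    return 1.0 if np.isclose(y_candidate, relu_neuron(x, w, b)) else 0.0

# Hypothetical weights and input, chosen only for illustration.
w = np.array([0.5, -1.0])
b = 0.25
x = np.array([2.0, 0.5])

y = relu_neuron(x, w, b)                 # 0.5*2.0 - 1.0*0.5 + 0.25 = 0.75
p_true = neuron_cpd(y, x, w, b)          # 1.0
p_other = neuron_cpd(y + 1.0, x, w, b)   # 0.0
```

Here the parent-child dependence of the graph is preserved, but the conditional distribution carries no randomness.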

Answered by Hossein on January 3, 2022
