# Quantifying the universal approximation theorem

Cross Validated Asked by ofow on January 3, 2022

Let $$m\geq 1$$ be an integer and $$F\in \mathbb{R}[x_1, \dots, x_m]$$ be a polynomial. I want to approximate $$F$$ on the unit hypercube $$[0, 1]^m$$ by a (possibly multilayer) feedforward neural network. The activation function is $$\mathrm{tanh}$$ for all the connections.

Let $$\varepsilon>0$$ be a real number. If I want the approximation to deviate from $$F$$ by less than $$\varepsilon$$ in the $$L^2$$ norm on $$[0, 1]^m$$, what is the smallest possible number of non-zero weights?
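To make the two quantities in the question concrete, here is a minimal NumPy sketch that measures them for one particular network. Everything in it is an assumption for illustration: the example polynomial `target`, the helper names, the choice to count biases as weights, and above all the fitting procedure (random hidden layer, least-squares output layer), which gives some upper bound on the number of weights needed and says nothing about the minimum. The $$L^2$$ error is estimated by Monte Carlo; since the hypercube has volume 1, the mean squared error over uniform samples estimates the squared $$L^2$$ norm.

```python
import numpy as np

def target(x):
    # Hypothetical example polynomial F(x1, x2) = x1*x2 + x1^2 (not from the question).
    return x[:, 0] * x[:, 1] + x[:, 0] ** 2

def fit_tanh_network(x, y, hidden, rng):
    # Assumption: random hidden weights, output layer solved by least squares.
    # This is a cheap stand-in for training, not a minimal-weight construction.
    m = x.shape[1]
    w = rng.normal(size=(m, hidden))   # input-to-hidden weights
    b = rng.normal(size=hidden)        # hidden biases
    h = np.tanh(x @ w + b)
    design = np.column_stack([h, np.ones(len(x))])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return w, b, coef[:-1], coef[-1]   # last coefficient is the output bias

def predict(params, x):
    w, b, v, c = params
    return np.tanh(x @ w + b) @ v + c

def l2_error(params, f, m, rng, samples=20000):
    # Monte Carlo estimate of the L^2 distance on [0, 1]^m.
    x = rng.uniform(0.0, 1.0, size=(samples, m))
    return float(np.sqrt(np.mean((predict(params, x) - f(x)) ** 2)))

def count_nonzero_weights(params):
    # Assumption: biases are counted together with the weights.
    return int(sum(np.count_nonzero(p) for p in params))

rng = np.random.default_rng(0)
m = 2
x_train = rng.uniform(0.0, 1.0, size=(5000, m))
results = {}
for hidden in (1, 4, 16):
    params = fit_tanh_network(x_train, target(x_train), hidden, rng)
    results[hidden] = (count_nonzero_weights(params),
                       l2_error(params, target, m, rng))
    print(hidden, *results[hidden])
```

Each printed row gives the hidden width, the number of non-zero parameters, and the estimated $$L^2$$ error. The question asks for the smallest parameter count that pushes that error below $$\varepsilon$$, over all architectures and weight choices.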

It is kind of stupid to approximate a function that is known to be polynomial by a neural network, but I just wanted to get more quantitative insight into the universal approximation theorem (and polynomials seem to be the most accessible class of functions).

