Quantifying the universal approximation theorem

Cross Validated Asked by ofow on January 3, 2022

Let $m \geq 1$ be an integer and $F \in \mathbb{R}[x_1, \dots, x_m]$ be a polynomial. I want to approximate $F$ on the unit hypercube $[0, 1]^m$ by a (possibly multilayer) feedforward neural network. The activation function is $\mathrm{tanh}$ for all the connections.

Let $\varepsilon > 0$ be a real number. If I want the approximation to deviate from $F$ by less than $\varepsilon$ in the $L^2$ norm on $[0, 1]^m$, what is the smallest possible number of non-zero weights the network needs?

It is kind of stupid to approximate a function that is known to be polynomial by a neural network, but I just want to get more quantitative insight into the universal approximation theorem (and polynomials seem to be the most accessible class of functions).
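
To make the quantity in question concrete, here is a rough numerical sketch (not an answer to the minimality question). It fits a one-hidden-layer $\mathrm{tanh}$ network to a polynomial, estimates the $L^2$ error on $[0, 1]^m$ by Monte Carlo, and counts the non-zero weights. Everything here is an assumption made for illustration: the polynomial `example_poly`, the function names, the choice to draw the hidden weights at random and fit only the output layer by least squares, and the linear (non-$\mathrm{tanh}$) output unit. The weight count it reports is only an empirical upper bound for one particular $F$, not the smallest possible number.

```python
import numpy as np


def example_poly(X):
    # Hypothetical test polynomial F(x1, x2) = x1^2 * x2 - 0.5 * x2 + 1.
    return X[:, 0] ** 2 * X[:, 1] - 0.5 * X[:, 1] + 1.0


def fit_tanh_network(F, m, width, n_train=4000, seed=0):
    """Fit x -> tanh(x W + b) c + c0 with random (W, b) and least-squares (c, c0)."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(0.0, 1.0, size=(n_train, m))
    W = rng.normal(0.0, 1.0, size=(m, width))  # hidden weights, fixed after sampling
    b = rng.normal(0.0, 1.0, size=width)       # hidden biases, fixed after sampling
    H = np.tanh(X @ W + b)
    A = np.hstack([H, np.ones((n_train, 1))])  # last column carries the output bias
    coef, *_ = np.linalg.lstsq(A, F(X), rcond=None)
    return W, b, coef


def predict(params, X):
    W, b, coef = params
    return np.tanh(X @ W + b) @ coef[:-1] + coef[-1]


def l2_error(F, params, m, n_test=20000, seed=1):
    # Monte Carlo estimate of the L^2([0, 1]^m) distance between F and the network.
    rng = np.random.default_rng(seed)
    X = rng.uniform(0.0, 1.0, size=(n_test, m))
    return float(np.sqrt(np.mean((F(X) - predict(params, X)) ** 2)))


def count_nonzero_weights(params):
    # Counts hidden weights, hidden biases, output weights and the output bias.
    return sum(int(np.count_nonzero(p)) for p in params)


if __name__ == "__main__":
    for width in (1, 2, 5, 10, 20):
        params = fit_tanh_network(example_poly, m=2, width=width)
        err = l2_error(example_poly, params, m=2)
        print(width, count_nonzero_weights(params), err)
```

Running the loop for growing `width` gives an empirical error-versus-weight-count curve for this one polynomial; the question is how the best achievable curve behaves as a function of $\varepsilon$, $m$ and the degree of $F$.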

Tags: approximation, machine learning, neural networks, optimization, polynomial
