
Basis expansion for regression using a neural network?

Data Science Asked by Adarsh Gupta on September 25, 2021

I am trying to approximate a nonlinear function using a neural network. There are 3-4 input units. The network is struggling to generalize the function outside the vicinity of the training data set.
I asked someone, and he suggested that basis expansion might help. Can someone please provide a reference for this? I am not able to find any. He also suggested “basis expansion using the kernel method”.

One Answer

Basis expansion can be viewed from a linear algebra perspective. As a trivial example, 3D space has three basis vectors, namely $i$, $j$ and $k$: any vector in 3D space can be constructed as a linear combination of these three. A basis has a number of properties, such as whether its vectors are orthogonal and, importantly, that they are linearly independent. By adding extra features that are linearly independent of the existing ones, you increase the dimensionality of the feature space.
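
As a concrete sketch of what basis expansion means in a regression setting (the data here are made up for illustration, and NumPy and scikit-learn are assumed to be available), a plain linear model fit on polynomial basis features can capture a nonlinear target:

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

# Toy nonlinear target sampled on a 1D input
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X).ravel() + 0.1 * rng.standard_normal(200)

# Expand the input into a polynomial basis: [1, x, x^2, ..., x^5]
basis = PolynomialFeatures(degree=5)
X_expanded = basis.fit_transform(X)

# A linear model in the expanded space is a nonlinear model in x
model = LinearRegression().fit(X_expanded, y)
print(model.score(X_expanded, y))  # R^2 on the training data
```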

In machine learning, there are situations where comparing each raw feature with the output directly does not lead to a good decision boundary, while combinations of features do. This means that if you map the input space to a space with more features, each derived from the current ones, you may find a space in which the data can be separated (or, for regression, fitted) more easily. Finding a feature space with such properties can be done using feature-extraction methods; kernel methods are one family of them.
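
A minimal sketch of this effect (toy XOR-like data, made up for illustration, NumPy only): the classes are not linearly separable in the raw features $x_1$ and $x_2$, but the derived product feature $x_1 x_2$ separates them on its own:

```python
import numpy as np

# XOR-like data: the label is the sign of x1 * x2
X = np.array([[1, 1], [1, -1], [-1, 1], [-1, -1]], dtype=float)
y = np.array([1, -1, -1, 1])

# No single line in (x1, x2) separates the classes, but the
# derived feature x1 * x2 separates them perfectly:
x1x2 = X[:, 0] * X[:, 1]
print(np.sign(x1x2) == y)  # [ True  True  True  True]
```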

The kernel trick is a mathematical approach for avoiding the cost of transforming the data from the current feature space into the appropriate one. Suppose you have one million rows of raw data whose feature space is $R^{3}$, and you find that a feature space with good generalisation lies in $R^{95}$. Explicitly transforming the data into the desired feature space is time-consuming, if it is feasible at all. Instead of transforming the data directly, the kernel trick calculates inner products of the transformed data without ever constructing the transformed representation.
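
For example, for the degree-2 polynomial kernel, $k(x, z) = (x^{T} z)^{2}$ equals the inner product of explicit feature maps containing all pairwise products $x_i x_j$, so the kernel evaluates an $R^{9}$ inner product without ever leaving $R^{3}$. A small check of this identity (illustrative vectors, NumPy only):

```python
import numpy as np

rng = np.random.default_rng(1)
x, z = rng.standard_normal(3), rng.standard_normal(3)

# Explicit degree-2 feature map: all pairwise products x_i * x_j (R^3 -> R^9)
def phi(v):
    return np.outer(v, v).ravel()

explicit = phi(x) @ phi(z)  # inner product in the transformed space
kernel = (x @ z) ** 2       # kernel trick: computed entirely in R^3
print(np.isclose(explicit, kernel))  # True
```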

Answered by Media on September 25, 2021
