# What is wrong with my approach on a custom way of creating Gabor-filter convolution kernels?

Cross Validated Asked by G.S. Luimstra on January 12, 2021

Disclosure: I am not a prominent mathematician (current bachelor student) like others on this website and my approach has been mostly pragmatic. Please do tell me if I can improve the formulation of my question and/or should provide more background information.

For the past few weeks I have been investigating in what I feel like is a smart way of improving on the way the first convolution layer of a neural network is created.
In the ‘vanilla’ way, the convolution kernels are simply initialized randomly and the kernel ‘weights’ are being tweaked respectively through back-propagation.

In a fairly recent blog post by Distill Pub, in which they investigate the shape and behaviour of early convolution kernels of various popular neural networks, it is found that rougly 44% of the initial kernels are very similar to Gabor filters.
My idea is to constrain the ‘hypothesis space’ of the first convolution layer in such a network to just be Gabor filters where the neural network may tweak the parameters of the Gabor filter rather than the raw kernel ‘weights’.

The formula I use for (real valued) Gabor filters is as follows:

where $$x’ = xcos(theta) + ysin(theta)$$ and $$y’ = -xsin(theta) + ycos(theta)$$

I have spent quite a lot of time on implementing this idea, however I cannot get the gradient descent to work properly. I suspect there is something fundamentally wrong with my approach, which is why I am turning to the experts.

My current experiment setup Currently, I have an input image that I convolve with a predefined (hardcoded) kernel, with theta set to $$pi/4$$ (the other variables may be considered constant) to produce an output $$x$$. This makes it so that I can see what kernel would yield output $$x$$.
Next, I initialize the Gabor filter to be updated with a theta of $$- pi / 4$$, which I expect to be corrected to the actual value of $$pi/4$$. At the start of the training loop, when convolved with the image the convolution does not yield input $$x$$.
Through back propagation the $$theta$$ parameter should now be tweaked in order to obtain output $$x$$. I am currently using a flavor of RMSE, but this should not be important.

Results
In actuality, this is not what happens at all. The $$theta$$ parameter is being updated, albeit in a sensible direction (as measured by the loss value), at some point is jumps back to having a
terrible loss value and this cycle seems to continue indefinitely. I have tried implementing Momentum without any better results.

My main suspicion are the cycling components of the partial derivative of the Gabor function, which I have let Wolfram Alpha compute for me here. Furthermore, could it be that the Gabor function does not have a nice reachable minimum with respect to $$theta$$.

Now to my overarching question. Where does this break down? Does the partial derivative just not posses the right properties for it to find a proper global minimum? Did I simply make a mistake in the code, since this should theoretically be possible? How do I proceed to debug my approach?

Note: I have omitted any code here, as I am more interested in the theoretical feasibility of my approach rather than an implementation.

## Related Questions

### How to interpret “quantile residuals”

1  Asked on December 20, 2021

### Is the regressor (sometimes called “independent” variable) actually independent of the response from a probabilistic perspective?

1  Asked on December 20, 2021 by 24n8

### Laplace approximation in high-dimensions

2  Asked on December 20, 2021 by dionysis-m

### Extremely basic question: how are data assumed to be generated in machine learning?

0  Asked on December 20, 2021 by frass

### Is it appropriated to use an ‘Invariant’ variable in multivariate test?

0  Asked on December 20, 2021 by terauser

### Equation for weighted average with normalization(?)

0  Asked on December 20, 2021

### guessing a number between 1 and 100

1  Asked on December 20, 2021 by dynamic89

### Normalizing posterior distribution

1  Asked on December 20, 2021

### chi squared test in python libraries

1  Asked on December 20, 2021 by eurohacker

### Structural break test for non-stationary time series

0  Asked on December 20, 2021

### How to test for multicollinearity among dummy explanatory variables?

1  Asked on December 18, 2021 by kellyyang

### A routine to choose eps and minPts for DBSCAN

3  Asked on December 18, 2021 by mehraban

### Mixture of Gaussians on Log of Data

2  Asked on December 18, 2021 by zhubarb

### Profile likelihood

1  Asked on December 18, 2021 by denby47

### Johansen cointegration testing: rejecting at 10% vs. 1% level

2  Asked on December 18, 2021

### Retrieving time series from a smoothed periodogram

1  Asked on December 18, 2021 by bayesisbaye

### Which likelihood function is used in linear regression?

3  Asked on December 18, 2021 by floyd

### Distributions of Quadratic form of a normal random variable

1  Asked on December 18, 2021 by xorion-1997

### Euclidean distance from zero

0  Asked on December 18, 2021

### Inconsistent results from partial Mantel test on (non)distance matrices

0  Asked on December 18, 2021 by ian-lane