TransWikia.com

Should I delete or average repeating training inputs from a Gaussian Process?

Cross Validated Asked by MvHaren on December 31, 2020

I'm facing a problem with my GP regression, where I have (noisy) observations at repeated training inputs x.

For example, I see observations f(x) = [1.1 1.2 3.0 2.9 4.3 4.4 4.9 5.0] for x = [1 1 2 2 3 3 4 4].
However, in my case I have 8 different training locations, each with 13 noisy observations, making a total of 104 observations.
I am unsure what to do with these duplicate training inputs/observations.

I see some posts about merging data points, since the kernel matrix might otherwise become singular. Indeed, the rank of my 104×104 kernel matrix is only 8, but once a noise term (optimized via the marginal likelihood) is added to the diagonal of the kernel, the matrix can be inverted.
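The rank observation can be reproduced with a minimal NumPy sketch. The squared-exponential kernel, its hyperparameters, the integer input locations and the value of `noise_var` are all assumptions chosen for illustration; none of them come from the post.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel between two 1-D input arrays (assumed kernel)."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return signal_var * np.exp(-0.5 * sq_dist / length_scale**2)

# 8 distinct training locations, each repeated 13 times -> 104 inputs.
x_unique = np.arange(1.0, 9.0)
x = np.repeat(x_unique, 13)

# Duplicate inputs produce duplicate rows, so the rank equals the number
# of distinct locations.
K = rbf_kernel(x, x)
rank_K = np.linalg.matrix_rank(K)

# Adding a noise variance to the diagonal shifts every eigenvalue up by
# that amount, which makes the matrix full rank and invertible.
noise_var = 0.1  # hypothetical value; in practice fitted via marginal likelihood
K_noisy = K + noise_var * np.eye(len(x))
rank_K_noisy = np.linalg.matrix_rank(K_noisy)
min_eig = np.linalg.eigvalsh(K_noisy).min()

print(rank_K, rank_K_noisy, min_eig)
```

The smallest eigenvalue of the noisy matrix is bounded below by `noise_var`, which is why the inversion goes through even though the noise-free kernel matrix is rank deficient.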

Furthermore, when I compare the following two methods:

  1. Use all 104 observations as input to the GP,
  2. Take the mean at each distinct training location, reducing the number of inputs to the GP to 8,

I see that method 1 actually gives better performance. Could this be a coincidence, or does this make sense?
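The comparison between the two methods can be sketched as follows. This is only an illustration under assumptions that are not in the post: a zero-mean GP with a squared-exponential kernel, fixed hyperparameters, synthetic data drawn around `sin(x)`, and a made-up `noise_var`. Under those assumptions, averaging the 13 readings at a location and dividing the noise variance by 13 gives the same posterior as using all 104 observations, so a performance gap in practice would more plausibly come from how the noise level is set or re-estimated after averaging.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel between two 1-D input arrays (assumed kernel)."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return signal_var * np.exp(-0.5 * sq_dist / length_scale**2)

def gp_predict(x_train, y_train, noise_diag, x_test):
    """Zero-mean GP posterior mean and variance with per-point noise variances."""
    K = rbf_kernel(x_train, x_train) + np.diag(noise_diag)
    K_s = rbf_kernel(x_test, x_train)
    mean = K_s @ np.linalg.solve(K, y_train)
    cov = rbf_kernel(x_test, x_test) - K_s @ np.linalg.solve(K, K_s.T)
    return mean, np.diag(cov)

rng = np.random.default_rng(0)
n_loc, n_rep, noise_var = 8, 13, 0.25  # noise_var is a made-up value
x_unique = np.arange(1.0, n_loc + 1)
x_all = np.repeat(x_unique, n_rep)
# Synthetic observations, for illustration only.
y_all = np.sin(x_all) + rng.normal(0.0, np.sqrt(noise_var), x_all.size)
x_test = np.linspace(0.5, 8.5, 50)

# Method 1: all 104 observations, each with the full noise variance.
mean_all, var_all = gp_predict(
    x_all, y_all, np.full(x_all.size, noise_var), x_test
)

# Method 2: 8 averaged observations; the mean of 13 independent readings
# has noise variance noise_var / 13.
y_mean = y_all.reshape(n_loc, n_rep).mean(axis=1)
mean_avg, var_avg = gp_predict(
    x_unique, y_mean, np.full(n_loc, noise_var / n_rep), x_test
)

max_mean_gap = np.max(np.abs(mean_all - mean_avg))
max_var_gap = np.max(np.abs(var_all - var_avg))
print(max_mean_gap, max_var_gap)
```

Both gaps come out at the level of floating-point error. If method 2 instead keeps the original noise variance, or re-fits it from only 8 averaged points, the two predictions no longer agree.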

Thanks

One Answer

It makes perfect sense to use the "repeated training examples", as they encode information about the noise in our readings.

What you observed is no coincidence; the occurrence of repeated $x$ instances allows us to capture the noise variability $\sigma_n$ more easily. We get a very good initial estimate of how much regularisation we should consider. It also gives us, as modellers, direct insight into how much we should trust our data readings. Regarding this last point, it is worth checking that we do not have corrupted data. Admittedly, repeated $x$ instances do not inform us about how different points $x$ covary (to estimate something like our length scale $l$) or about the magnitude of that covariance (to estimate something like $\sigma_f$); the off-diagonal shape of the covariance is not directly informed. Still, the usefulness of these readings should not be downplayed, as the noise variance $\sigma_n^2$ directly affects both our fitting procedure and the associated intervals.
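As a rough sketch of how replicates give an initial estimate of the noise level, one option is to pool the sample variances computed at each location. The data below are synthetic, `true_noise_var` is a made-up value, and the pooling assumes the noise is the same at every location (homoscedastic); none of this is taken from the question.

```python
import numpy as np

rng = np.random.default_rng(1)
n_loc, n_rep = 8, 13
true_noise_var = 0.25  # made-up value used to generate the synthetic data
x_unique = np.arange(1.0, n_loc + 1)

# Synthetic readings: rows are locations, columns are repeated observations.
y = np.sin(x_unique)[:, None] + rng.normal(
    0.0, np.sqrt(true_noise_var), (n_loc, n_rep)
)

# Unbiased sample variance at each location, using only the spread of the
# repeated readings around their own mean.
per_loc_var = y.var(axis=1, ddof=1)

# Pooled estimate of the noise variance; with equal group sizes the pooled
# variance is the plain average of the per-location variances.
pooled_noise_var = per_loc_var.mean()
dof = n_loc * (n_rep - 1)  # degrees of freedom behind the pooled estimate

print(pooled_noise_var, dof)
```

Such an estimate could serve as a starting value for the noise term before marginal-likelihood optimisation, and a location whose own variance is far from the pooled value is a candidate for the corrupted-data check mentioned above.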

Correct answer by usεr11852 on December 31, 2020
