TransWikia.com

Replacing mean by median over batch-size to lessen the impact of outliers

Data Science Asked on January 11, 2021

Suppose a neural network is being trained on a regression task and the data contains a significant number of outliers. Given that the error needs to be RMS and not MAE, could it be better (as in less sensitive to the outliers) to replace the mean over the batch in the weight update with a median over the batch?

For a large enough batch size, this should lessen the contribution of the outliers to each update. It does not seem to be common though, at least to my knowledge. What are the shortcomings of this approach?
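To make the idea concrete, here is a minimal sketch, assuming a one-parameter linear model `y = w * x` with a per-sample squared error. The names `per_sample_grads` and `batch_step`, the toy data, and the learning rate are all made up for illustration; with more than one weight, the median would presumably be taken coordinate-wise.

```python
from statistics import fmean, median

def per_sample_grads(w, xs, ys):
    """Gradient of the squared error (w*x - y)**2 w.r.t. w, one value per sample."""
    return [2.0 * (w * x - y) * x for x, y in zip(xs, ys)]

def batch_step(w, xs, ys, lr, aggregate):
    """One update of w, combining per-sample gradients with `aggregate`."""
    return w - lr * aggregate(per_sample_grads(w, xs, ys))

# Toy batch (hypothetical): y = 2*x, except the last target, which is an outlier.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 6.0, 8.0, 100.0]

w0 = 2.0  # start at the slope that fits the clean samples
w_mean = batch_step(w0, xs, ys, lr=0.01, aggregate=fmean)     # usual update
w_median = batch_step(w0, xs, ys, lr=0.01, aggregate=median)  # proposed update

print(w_mean, w_median)
```

In this toy batch the mean update is pulled away from `w = 2` by the single outlier, while the median update leaves `w` unchanged.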

One Answer

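One shortcoming is that the median is often more computationally expensive to calculate than the mean. The median can be calculated with a variation of quickselect (for example, one that picks pivots with median-of-medians), which has linear worst-case running time but needs comparisons and several passes over the data. Calculating the mean only requires the sum and count of the numbers, which is a single pass.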
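As a rough illustration of the difference in work, here is a sketch of both computations. Note the assumption: this is plain randomized quickselect, which is linear on average, not the worst-case-linear variation mentioned above. The function names are hypothetical.

```python
import random

def quickselect(values, k):
    """Return the k-th smallest element (0-based) of values."""
    items = list(values)
    while True:
        pivot = random.choice(items)
        lows = [v for v in items if v < pivot]
        pivots = [v for v in items if v == pivot]
        if k < len(lows):
            items = lows                      # answer is below the pivot
        elif k < len(lows) + len(pivots):
            return pivot                      # answer is the pivot itself
        else:
            k -= len(lows) + len(pivots)      # answer is above the pivot
            items = [v for v in items if v > pivot]

def median_quickselect(values):
    """Median via selection; averages the two middle values for even length."""
    n = len(values)
    mid = n // 2
    if n % 2 == 1:
        return quickselect(values, mid)
    return 0.5 * (quickselect(values, mid - 1) + quickselect(values, mid))

def mean(values):
    """Mean needs only a running sum and a count."""
    return sum(values) / len(values)
```

The mean is a single accumulation, which also maps well onto parallel hardware, whereas the selection loop repeatedly partitions the data around a pivot.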

Answered by Brian Spiering on January 11, 2021
