# Is it possible to detect overfitting automatically/programmatically after model creation?

Cross Validated Asked by Ayberk Yavuz on December 9, 2020

The definition of overfitting is “the production of an analysis that corresponds too closely or exactly to a particular set of data, and may therefore fail to fit additional data or predict future observations reliably” (that is, the model performs well on the training data but poorly on the test data).

But is there a way to define overfitting programmatically? For example: if a classification model’s accuracy/F1 score is between 90% and 99% on the training data but 80% or less on the test data, the model overfits. Or, if a regression model’s RMSE is 0.7 or less on the training data (the target variable ranges from 0 to 1000) but 5.0 or more on the test data, the model overfits.
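To make the question concrete, here is a minimal sketch of the two example rules in Python. The function names and default thresholds are hypothetical: they are copied from the example numbers above and are not an established criterion. The scores are assumed to have been computed already on the training and test sets.

```python
def overfits_classification(train_score, test_score,
                            train_range=(0.90, 0.99), test_max=0.80):
    """Example rule: score (accuracy or F1, as a fraction in [0, 1]) is
    inside train_range on training data and at most test_max on test data.
    The thresholds are illustrative assumptions, not a standard."""
    low, high = train_range
    return low <= train_score <= high and test_score <= test_max


def overfits_regression(train_rmse, test_rmse,
                        train_max=0.7, test_min=5.0):
    """Example rule: RMSE is at most train_max on training data and at
    least test_min on test data. The thresholds are illustrative and
    depend on the scale of the target (0 to 1000 in the example)."""
    return train_rmse <= train_max and test_rmse >= test_min


if __name__ == "__main__":
    # Hypothetical scores, only to show how the rules would be called.
    print(overfits_classification(train_score=0.97, test_score=0.78))
    print(overfits_regression(train_rmse=0.5, test_rmse=6.2))
```

Whether fixed cut-offs like these are meaningful for an arbitrary model and dataset is exactly what is being asked.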
