Is my 57% sports betting accuracy correct?

Question

I have been creating sports betting algorithms for many years using Microsoft Access. I am now transitioning to the ML world and trying to get a grasp on how to measure the success of my algorithms. I have exported my algorithms as CSV files dating back to the 2013-14 NBA season and imported them into Python via pandas.
The purpose of importing these CSV files is to determine the future accuracy of these algorithms using ML. Here are the algorithm records based on the Microsoft Access query:

A System: 471-344 (58%) +92.60

B System: 317-239 (57%) +54.10

C System: 347-262 (57%) +58.80

I have a total of 8,814 records in my database; however, the above systems are based on situational stats, e.g., Team A fits an algorithm if they have a better Field Goal %, Played Last Game Home/Away, More Points Per Game, etc.

Here is some of the code that I wrote using Jupyter to determine the accuracy:
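from sklearn.feature_selection import RFE
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.svm import LinearSVC

# X (features) and y (win/loss labels) come from the imported CSV (not shown)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Fit a linear SVM on the training split and predict the held-out split
clf = LinearSVC(C=1.0, penalty="l2", dual=False)
clf.fit(X_train, y_train)
pred_clf = clf.predict(X_test)

# 10-fold cross-validation over the full data set
scores = cross_val_score(clf, X, y, cv=10)

# Recursive feature elimination down to 10 features
rfe_selector = RFE(clf, n_features_to_select=10)
rfe_selector = rfe_selector.fit(X, y)
rfe_values = rfe_selector.get_support()

train = accuracy_score(y_train, clf.predict(X_train))
test = accuracy_score(y_test, pred_clf)

print("Train Accuracy:", train)
print("Test Accuracy:", test)
print(classification_report(y_test, pred_clf, zero_division=1))
print(confusion_matrix(y_test, pred_clf))
# Mean cross-validation accuracy, plus/minus two standard deviations
print("Accuracy: %0.2f (+/- %0.2f)" % (scores.mean(), scores.std() * 2))
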
Here are the results from the code above by system:
A System:

Train Accuracy: 0.6211656441717791
Test Accuracy: 0.5153374233128835
F1 Score: 0.52
CONFUSION MATRIX: [[16 50] [29 68]]
Accuracy: 0.55 (+/- 0.10)

B System:

Train Accuracy: 0.6306306306306306
Test Accuracy: 0.5178571428571429
F1 Score: 0.52
CONFUSION MATRIX: [[49 23] [31 9]]
Accuracy: 0.55 (+/- 0.08)

C System:

Train Accuracy: 0.675564681724846
Test Accuracy: 0.5409836065573771
F1 Score: 0.54
CONFUSION MATRIX: [[15 29] [27 51]]
Accuracy: 0.57 (+/- 0.16)

In order to have a profitable system, the accuracy only needs to be 52.5%. If I base my systems off of the Test Accuracy, only the C System is profitable. However, all are profitable if based on Accuracy (mean & standard deviation).
My question is, can I rely on my Accuracy (mean & standard deviation) for future games even though my Testing Accuracy is lower than 52.5%?
If not, any suggestions are greatly appreciated on how I can gauge the future results on these systems.

Neil Slater · Answer

My question is, can I rely on my Accuracy (mean & standard deviation) for future games even though my Testing Accuracy is lower than 52.5%?

If by Accuracy you mean training accuracy, then absolutely you should not trust those values. For almost all machine learning algorithms there is a problem with overfitting to training data, which will result in reporting over-estimates for metrics against the training data. This is why you should always have a test data set of unseen data, because the performance on the training data is not what you truly care about - it is the performance on new unseen data, or how the model generalises, that matters when you want to use it to predict new values.
The test data set gives you a measure of how well your model will generalise, because it simulates the performance of the model against unseen data.
However, the test data set is not perfect for all uses. Random chance will cause your test measurements to vary. If you use your test results to select a model from a list of models based on performance, then:

The model you selected does have the highest likelihood of being the best one, as you want. It is not guaranteed to be the best though.

The test result is likely to be an over-estimate of general performance. That is because you cannot separate the random fluctuations in test results from real performance improvements; you only see the combination of the two.

The more tests you run, the more likely you will have an inflated view of the performance of your best model.
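To get a feel for how much random chance alone can move a test result, here is a minimal sketch that puts an approximate 95% interval around a measured accuracy. It uses the normal approximation to the binomial, which is an assumption of this example rather than something stated above, and the helper name `accuracy_interval` is made up for illustration. The counts are taken from the System A confusion matrix in the question.

```python
import math

def accuracy_interval(correct, total, z=1.96):
    """Approximate 95% interval for an accuracy measured on `total` test games.

    Uses the normal approximation to the binomial distribution, which is a
    simplifying assumption that is reasonable for samples of this size.
    """
    p = correct / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p - half_width, p + half_width

# System A test set from the question: 16 + 68 correct out of 163 games.
low, high = accuracy_interval(16 + 68, 163)
print(f"Test accuracy plausibly lies between {low:.3f} and {high:.3f}")
```

With only 163 test games the interval runs from roughly 0.44 to 0.59, so this single test result cannot tell you whether the system sits above or below the 52.5% break-even point.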
The usual fix for this is to use cross validation. Cross validation uses yet another data set to help you with the first step of choosing the best model. After you have chosen your best performing model using cross validation, you can use a test set that you have kept in reserve to measure its performance. Because you have not used that test set to select a model, it will give you an unbiased measure of performance. You should bear in mind that this measure still comes with implied error bars (and with other caveats, such as any inherent sampling bias).
When predicting future results from past data, you also need to be concerned about population drift and non-stationary problems. This is a common issue in any data set that includes complex behaviour that can evolve over time. It is very likely to affect results from sports teams, where many conditions affecting performance evolve over the same timescales that you are trying to predict. In practice this means you will want to feed in new data and re-train your models constantly, and despite this your models will tend to lag behind reality. It is unlikely you will achieve even the test result accuracy in the long term.
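The select-with-cross-validation, then measure-once-on-a-reserved-test-set workflow might look something like the sketch below. The data here is synthetic and the list of candidate `C` values is arbitrary; both are assumptions made only so that the example runs on its own, and you would substitute your real features, labels and candidate models.

```python
import numpy as np
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.svm import LinearSVC

# Synthetic stand-in data (an assumption; replace with the real features/labels).
rng = np.random.default_rng(0)
X = rng.normal(size=(600, 5))
y = (X[:, 0] + rng.normal(scale=2.0, size=600) > 0).astype(int)

# Hold the test set back before any model selection happens.
X_dev, X_test, y_dev, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Compare candidate settings using cross-validation on the development data only.
candidates = [0.01, 0.1, 1.0, 10.0]
cv_means = {
    C: cross_val_score(LinearSVC(C=C, dual=False), X_dev, y_dev, cv=10).mean()
    for C in candidates
}
best_C = max(cv_means, key=cv_means.get)

# Refit the chosen model on all development data, then score once on the test set.
final_model = LinearSVC(C=best_C, dual=False).fit(X_dev, y_dev)
test_accuracy = final_model.score(X_test, y_test)
print("Chosen C:", best_C, "held-out accuracy:", round(test_accuracy, 3))
```

The important design choice is that `X_test` and `y_test` are never seen by `cross_val_score`, so the final number is not inflated by the selection step.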
You can add one more thing to your testing routines to help assess the impact of non-stationarity: when reserving data for the test or cross-validation sets, don't do it at random; instead, reserve all the latest results (e.g. the last four weeks of data) for test only. You are likely to see a reduction in the metrics when you do this for a problem domain like sport, but that should give you a more realistic assessment of the model you are building.
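A minimal sketch of such a date-based split, assuming the records carry a game date. The column names (`game_date`, `fg_pct_diff`, `ppg_diff`, `won`) and the synthetic data are hypothetical and exist only to make the example self-contained.

```python
import numpy as np
import pandas as pd
from sklearn.svm import LinearSVC

# Synthetic stand-in data; the column names here are hypothetical.
rng = np.random.default_rng(1)
n = 500
games = pd.DataFrame({
    "game_date": pd.date_range("2019-10-01", periods=n, freq="D"),
    "fg_pct_diff": rng.normal(size=n),
    "ppg_diff": rng.normal(size=n),
})
games["won"] = (
    games["fg_pct_diff"] + rng.normal(scale=2.0, size=n) > 0
).astype(int)

# Reserve the most recent four weeks for testing instead of a random sample.
games = games.sort_values("game_date")
cutoff = games["game_date"].max() - pd.Timedelta(weeks=4)
train = games[games["game_date"] <= cutoff]
test = games[games["game_date"] > cutoff]

# Train only on the past, evaluate only on the most recent games.
features = ["fg_pct_diff", "ppg_diff"]
model = LinearSVC(C=1.0, dual=False).fit(train[features], train["won"])
print("Accuracy on latest four weeks:", model.score(test[features], test["won"]))
```

Because every training game predates every test game, the score reflects how the model does on games that come after everything it was trained on. Four weeks of games is a small sample, so expect wide error bars on it.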
