Comparing multi-class vs. binary classifiers in predicting a single class

Question

I've read most of the similar questions, but I haven't found an answer to mine.
Let's say we have n samples with four different labels/classes, namely A, B, C, and D. We train two classifiers:

First classifier: we train a multi-class classifier to assign each sample to one of the four classes. Let's say the accuracy of this model is x%.
Second classifier: now let's say all we care about is whether a sample is A or not A, so we train a binary classifier that labels samples as either A or non-A. Let's say the accuracy of this model is y%.

My question is: can we compare x and y as a way to measure how well the classifiers perform on class A? In other words, does high performance for a multi-class classifier mean that it also recognizes each individual class with high performance?
The real-world example of this is that I've read papers that trained multi-class classifiers on a dataset that contains four different types of text. They achieved pretty high performance. But all I care about is for a model to be able to correctly classify one specific type of text. I trained a binary classifier that achieves a lower accuracy. Does this show that my model is working poorly on that type of text and the multi-class classifier is doing better? Or shouldn't I compare these two?

Erwan · Accepted Answer

In general we can't compare the performance of a multiclass classifier with the performance of a binary classifier, since the former expresses how good the classifier is at classifying any instance of any class. If there are $n_A$ samples labelled A, only a fraction $n_A/n$ of the samples behind the multiclass classifier's global accuracy are about A. In particular, a multiclass classifier tends to favor the largest classes, so if class A is a small proportion of the data then the global performance will not reflect how good it is at classifying A. For example, it might reach 90% accuracy simply because class B is 90% of the data, which proves nothing about class A. By contrast, the performance of the binary classifier is by definition solely about the A vs. non-A distinction.
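To make the 90% example concrete, here is a minimal sketch with made-up labels (the class counts and the "always predict B" classifier are assumptions chosen for illustration, not data from the question):

```python
# Hypothetical dataset: class B makes up 90% of the samples.
labels = ["A"] * 4 + ["B"] * 90 + ["C"] * 3 + ["D"] * 3

# A degenerate multiclass "classifier" that always predicts the majority class.
predictions = ["B"] * len(labels)

# Global multiclass accuracy: fraction of samples predicted correctly.
accuracy = sum(t == p for t, p in zip(labels, predictions)) / len(labels)

# Recall on class A: fraction of true A samples that were predicted as A.
a_pairs = [(t, p) for t, p in zip(labels, predictions) if t == "A"]
recall_a = sum(t == p for t, p in a_pairs) / len(a_pairs)

print(f"global accuracy: {accuracy:.2f}")  # 0.90
print(f"recall on A:     {recall_a:.2f}")  # 0.00
```

The global accuracy looks good even though not a single A sample is recognized.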
However, if one has access to the detailed evaluation of the multiclass classifier, typically the confusion matrix, then it becomes possible to calculate the performance of the classifier for a single class, say class A. By summing the B, C, and D rows together and the B, C, and D columns together in the confusion matrix, one obtains exactly a binary (A vs. non-A) confusion matrix, and from that one can calculate a performance measure that can be compared against another binary classifier. But in this setting the multiclass classifier is at a disadvantage for the reason mentioned above: it also has to deal with the other classes, and this can cause it to "sacrifice" a class, whereas the binary classifier doesn't have this issue.
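The merging step can be sketched as follows. This is only an illustration: the confusion matrix values are invented, the helper name `collapse_to_binary` is hypothetical, and the layout (rows are true classes, columns are predicted classes, in the order A, B, C, D) is an assumption that must match however the published matrix was built.

```python
def collapse_to_binary(cm, k):
    """Collapse a multiclass confusion matrix into a 2x2 one for class k.

    Assumes cm[i][j] counts samples of true class i predicted as class j.
    Returns [[TP, FN], [FP, TN]] with class k as the positive class.
    """
    total = sum(sum(row) for row in cm)
    tp = cm[k][k]                          # true k, predicted k
    fn = sum(cm[k]) - tp                   # true k, predicted something else
    fp = sum(row[k] for row in cm) - tp    # true other, predicted k
    tn = total - tp - fn - fp              # everything not involving k
    return [[tp, fn], [fp, tn]]


# Invented 4-class confusion matrix, class order A, B, C, D.
cm = [
    [5, 3, 1, 1],    # true A
    [2, 80, 4, 4],   # true B
    [1, 3, 40, 6],   # true C
    [0, 2, 5, 43],   # true D
]

(tp, fn), (fp, tn) = collapse_to_binary(cm, 0)  # index 0 is class A

total = tp + fn + fp + tn
multiclass_accuracy = sum(cm[i][i] for i in range(len(cm))) / total
binary_accuracy = (tp + tn) / total
precision_a = tp / (tp + fp)
recall_a = tp / (tp + fn)

print(f"multiclass accuracy: {multiclass_accuracy:.3f}")  # 0.840
print(f"A vs. non-A accuracy: {binary_accuracy:.3f}")     # 0.960
print(f"precision on A: {precision_a:.3f}")               # 0.625
print(f"recall on A: {recall_a:.3f}")                     # 0.500
```

Note that in this made-up case the A vs. non-A accuracy is high mostly because non-A samples dominate, so precision and recall (or F1) on A are likely a fairer basis for comparison with a binary classifier than accuracy alone.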
