TransWikia.com

ROC Curve for data sets with large negative bias

Cross Validated Asked by Malek on December 27, 2020

For context, I’ve read a forum thread on a similar issue, and the conclusion there was that precision-recall curves are better suited to data sets with a large negative bias (far more negatives than positives). I’ve made precision-recall graphs and gotten good results from them, but recently one of my superiors asked me to come back to the ROC curves and try to get a reasonable-looking plot. So I’m trying to find a way to convey the essential information of the ROC curve while ignoring the large negative bias of our data. For reference, here is what our ROC curves looked like initially, without any data manipulation or undersampling:

Here you can see that the ROC curves are getting "squished" to the left due to the high TN count for most of our classes, which keeps the false positive rate, FP / (FP + TN), close to zero.
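To illustrate the squishing with numbers, here is a minimal sketch. The counts are made up for illustration and are not taken from our data; `false_positive_rate` is just a hypothetical helper name.

```python
def false_positive_rate(fp, tn):
    """FPR = FP / (FP + TN), the x-axis of a ROC curve."""
    return fp / (fp + tn)

# Hypothetical counts: the same 50 false positives in both cases,
# but a very different number of true negatives.
fpr_moderate = false_positive_rate(fp=50, tn=950)      # 1,000 negatives in total
fpr_imbalanced = false_positive_rate(fp=50, tn=99950)  # 100,000 negatives in total

print(f"moderate imbalance: FPR = {fpr_moderate:.4f}")
print(f"heavy imbalance:    FPR = {fpr_imbalanced:.4f}")
```

With the same number of false positives, the heavily imbalanced case lands a hundred times closer to the left edge of the plot, which is the effect visible in the curves above.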

I read in this paper that ROC graphs "depict relative tradeoffs between benefits (true positives) and costs (false positives)"; I understand this concept. What I don’t understand (and I must be missing something here) is why you couldn’t simply plot TP vs. FP if the end goal is just to see the trade-off between those two metrics. I made some plots of TP vs. FP for comparison’s sake, and you can see that the results look much more like a typical ROC curve plot. Is there something wrong with continuing with this idea?
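For concreteness, here is a minimal sketch of how the two kinds of plot relate, using made-up scores and labels rather than our real data. It assumes a simple "predict positive when score >= threshold" rule; the helper name `confusion_counts` is hypothetical. Since TPR = TP / P and FPR = FP / N, with P and N the fixed numbers of positives and negatives, the raw-count points are the ROC points with each axis multiplied by a constant.

```python
def confusion_counts(scores, labels, threshold):
    """Count true and false positives when predicting 1 for score >= threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    return tp, fp

# Made-up classifier scores and ground-truth labels (3 positives, 7 negatives).
scores = [0.95, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

n_pos = sum(labels)
n_neg = len(labels) - n_pos

# Sweep thresholds from strict to lenient, recording both raw counts and rates.
curve = []
for t in sorted(set(scores), reverse=True):
    tp, fp = confusion_counts(scores, labels, t)
    curve.append({
        "threshold": t,
        "tp": tp, "fp": fp,                        # axes of the TP vs. FP plot
        "tpr": tp / n_pos, "fpr": fp / n_neg,      # axes of the ROC plot
    })

for point in curve:
    print(point)
```

In this sketch the TP vs. FP plot has the same shape as the ROC plot; only the axis scales differ, and those scales depend on the class counts of each data set.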
