ROC Curve for data sets with large negative bias

Question

For context, I've read a forum thread on a similar issue, and the conclusion there seemed to be that precision-recall curves are better suited to data sets with a large negative bias (many more negatives than positives). I've made precision-recall graphs and gotten good results from them, but recently one of my superiors asked me to go back to the ROC curves and find a way to get a reasonable-looking plot. So I'm trying to convey the essential information of the ROC curve while ignoring the large negative bias of our data. For reference, here is what our ROC curves looked like initially, without any data manipulation or undersampling:

I read in this paper that ROC graphs "depict relative tradeoffs between benefits (true positives) and costs (false positives)"; I understand this concept. What I don't understand (and I must be missing something here) is why you couldn't simply plot TP vs. FP if the end goal is just to see the trade-off between those two metrics. I made some plots of TP vs. FP for comparison's sake, and the results look much more like a typical ROC curve plot. Is there something wrong with continuing with this idea?
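To make the comparison concrete, here is a minimal sketch of how the two plots relate. It uses synthetic scores, not our data; the class sizes, the score distributions, and the helper name `roc_counts` are all assumptions made up for illustration. Since TPR = TP / P and FPR = FP / N, where P and N are the fixed numbers of positives and negatives, the raw-count plot has the same points as the ROC curve with each axis multiplied by a constant.

```python
import numpy as np


def roc_counts(y_true, scores):
    """Cumulative (FP, TP) counts as the decision threshold is lowered.

    Hypothetical helper for illustration; it does not handle tied scores
    specially, which is fine for the continuous scores used below.
    """
    y_true = np.asarray(y_true)
    # Sort examples from highest to lowest score.
    order = np.argsort(-np.asarray(scores), kind="stable")
    sorted_labels = y_true[order]
    # Start at (0, 0): threshold above every score, nothing predicted positive.
    tp = np.concatenate(([0], np.cumsum(sorted_labels == 1)))
    fp = np.concatenate(([0], np.cumsum(sorted_labels == 0)))
    return fp, tp


# Assumed, synthetic imbalanced data: 50 positives vs. 5000 negatives.
rng = np.random.default_rng(0)
n_pos, n_neg = 50, 5000
y_true = np.concatenate((np.ones(n_pos, dtype=int), np.zeros(n_neg, dtype=int)))
scores = np.concatenate((rng.normal(1.0, 1.0, n_pos), rng.normal(0.0, 1.0, n_neg)))

fp, tp = roc_counts(y_true, scores)

# The ROC axes are the raw counts divided by the (constant) class sizes.
fpr = fp / n_neg
tpr = tp / n_pos
```

Under these assumptions, plotting `tp` against `fp` over the full range of both axes gives the same shape as plotting `tpr` against `fpr`; only the tick labels differ. The two pictures would look different only if the FP axis were cropped or drawn on a different scale.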
