
Suggestions for identifying the most "important" image labels

Cross Validated Asked by nlapidot on January 28, 2021

I have a table with images and their assigned object (i.e. specific sub-parts) and image (i.e. image as a whole) labels. Each image may have multiple object labels but only 2 whole-image labels. I also have confidence scores for each of these labels. I want to try to use this information to see if I can identify the most important labels across all of my images. One thought I’ve had is to run LDA on the text portions, treating each image as a "document" and then using the labels as the associated text for that document. I’m wondering if anyone has any other suggestions for creative ways to try to extract importance using the abovementioned information. I don’t have a strict definition of "importance," so I’m open to different interpretations.

One Answer

It is up to you what you consider important. You could just count the categories and treat the most common ones as the most "important", though this would probably be a rather meaningless statistic. Moreover, importance depends on context and is subjective. If you were building a food classification algorithm for restaurants, any food in the images would be important. If you were classifying animals, you would ignore the food (at least as long as it doesn't attract the animals).
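For reference, the naive counting baseline could look something like the sketch below. The `rows` table, its `(image_id, label, confidence)` layout, and the function names are all made up for illustration; the confidence-weighted variant is just one possible way to use the scores, not something the question prescribes.

```python
from collections import Counter

# Hypothetical data: one row per (image_id, label, confidence).
rows = [
    ("img1", "dog", 0.9),
    ("img1", "grass", 0.6),
    ("img2", "dog", 0.8),
    ("img2", "pizza", 0.7),
    ("img3", "pizza", 0.95),
    ("img3", "table", 0.5),
]

def label_counts(rows):
    """Count the number of distinct images in which each label appears."""
    seen = {(image_id, label) for image_id, label, _ in rows}
    return Counter(label for _, label in seen)

def confidence_weighted_counts(rows):
    """Sum the confidence scores per label instead of counting occurrences."""
    totals = Counter()
    for _, label, confidence in rows:
        totals[label] += confidence
    return totals

counts = label_counts(rows)
weighted = confidence_weighted_counts(rows)

# Labels ranked from most to least frequent.
print(counts.most_common())
print(weighted.most_common())
```

Both rankings only tell you what was tagged often (or confidently), which is why they say little about importance on their own.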

If you want to find the parts of the pictures that people would consider most important, then you would need a dataset that is tagged for this purpose, e.g. by showing the examples to people while using eye tracking, though again, the same problems apply. For example, eye-tracking data collected from kids could give different results than eye-tracking data collected from elderly people, etc.

Yes, I understand that. Ultimately, what I hope to gain out of this is an understanding of which labels are most descriptive of an image.

If an image shows a dog, a giraffe, and a parrot, which one is the most important? Also keep in mind that if you used some automated approach (e.g. something like TF-IDF), then the result would depend heavily on how the dataset was tagged and on the quality of the tagging. To use the example one more time, if your dataset was tagged for food items, then it doesn't have to be the case that the food items are the most important thing in the picture. If you had a photo of Marilyn Monroe eating french fries, then most people would consider Marilyn Monroe more important than the french fries, but for a food-tagged dataset and a food-classifying algorithm, the french fries would be the key part of the picture.
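As a rough illustration of the TF-IDF idea, here is a minimal sketch that treats each image as a "document" and uses the label confidences as term weights. The `images` dictionary, the label names, and the choice to normalize confidences within an image are assumptions made for the example, not part of the original question.

```python
import math
from collections import defaultdict

# Hypothetical data: image_id -> {label: confidence}.
images = {
    "img1": {"dog": 0.9, "grass": 0.6},
    "img2": {"dog": 0.8, "pizza": 0.7},
    "img3": {"dog": 0.85, "giraffe": 0.9},
}

def tfidf_scores(images):
    """Score each label within each image by a confidence-weighted TF-IDF.

    TF  = label confidence divided by the sum of confidences in the image.
    IDF = log(number of images / number of images containing the label).
    Assumes every image has at least one label with positive confidence.
    """
    n_images = len(images)
    doc_freq = defaultdict(int)
    for labels in images.values():
        for label in labels:
            doc_freq[label] += 1

    scores = {}
    for image_id, labels in images.items():
        total = sum(labels.values())
        scores[image_id] = {
            label: (confidence / total) * math.log(n_images / doc_freq[label])
            for label, confidence in labels.items()
        }
    return scores

scores = tfidf_scores(images)

# Highest-scoring label per image.
top = {image_id: max(s, key=s.get) for image_id, s in scores.items()}
print(top)
```

In this toy data "dog" appears in every image, so its IDF is zero and it never comes out on top, even in the image where it has the highest confidence. Whether that matches your notion of "descriptive" depends entirely on how the dataset was tagged.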

I'm not saying that it's not doable. There are a number of approaches you could take (one is mentioned above), but the results might not end up being what you expect. You need to come up with a definition of what you consider important, because this would guide your choice of algorithm for finding such features and enable you to validate the quality of the results.

Answered by Tim on January 28, 2021
