TransWikia.com

Detecting address labels using Tensorflow Object Detection API

Data Science Asked by Danny Morris on March 16, 2021

I am experimenting with the Tensorflow Object Detection API on a Windows 7 machine. I am trying to detect US address labels (and similar blocks of text) as they appear on a piece of mail or an envelope. I am not trying to detect individual words or lines, but rather the full rectangular block of text. My address labels are typically isolated on the letter or envelope and they are surrounded by whitespace. For example:

[Image: example address label]

I followed the tutorial to train a custom object detector. I used the pre-trained SSD Inception V2 COCO model along with 50 images of letters and envelopes containing address labels, annotated with LabelImg. To annotate my objects (address labels), I drew the bounding box around the entire label with about 5-15 pixels of padding. After 200 training steps, I reached a loss of 2.5 and stopped training.

Using the same tutorial, I obtained the trained inference graph. Finally, I adapted this tutorial for doing inference. I tested two images, one containing a dog and the other containing an address label. The dog was detected, but nothing was detected in the image with the address label.
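One thing worth ruling out when "nothing was detected": the inference tutorial's drawing helper only renders boxes whose score is above a threshold (0.5 by default in `visualize_boxes_and_labels_on_image_array`), so a lightly trained model may be producing low-score boxes that are never drawn. The sketch below is a hedged illustration of inspecting the raw scores instead. It assumes an `output_dict` with the usual `detection_scores`, `detection_classes` and `detection_boxes` entries; the helper name `top_detections` and all the numbers are made up for illustration.

```python
def top_detections(output_dict, min_score=0.0, limit=5):
    """Return up to `limit` (score, class_id, box) tuples with score >= min_score,
    sorted from highest to lowest score."""
    rows = zip(output_dict["detection_scores"],
               output_dict["detection_classes"],
               output_dict["detection_boxes"])
    kept = [(float(s), int(c), tuple(b)) for s, c, b in rows if s >= min_score]
    kept.sort(key=lambda row: row[0], reverse=True)
    return kept[:limit]


# Hypothetical raw output for one image; boxes are (ymin, xmin, ymax, xmax),
# normalized to the range 0..1.
example = {
    "detection_scores": [0.31, 0.12, 0.04],
    "detection_classes": [1, 1, 1],
    "detection_boxes": [(0.40, 0.30, 0.60, 0.75),
                        (0.10, 0.10, 0.20, 0.30),
                        (0.00, 0.00, 1.00, 1.00)],
}

print(len(top_detections(example, min_score=0.5)))  # nothing passes the 0.5 cut
for score, class_id, box in top_detections(example, min_score=0.1):
    print(score, class_id, box)
```

If the best raw scores sit well below 0.5, the model is finding candidates but is not confident in them, which points toward more data or more training rather than a broken pipeline.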

My questions are:

  1. Is it reasonable to expect to detect the full block of text given it is surrounded by whitespace and not solid edges?
  2. Am I annotating properly? I left about 5-15 pixels of padding between the address text and the bounding box when I annotated with LabelImg.
  3. Do I have enough images for one-class detection in which the address labels vary little from letter to letter?
  4. Should I let the training go longer?
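On the annotation question, the boxes themselves can at least be sanity-checked mechanically. The following is a minimal sketch, assuming LabelImg saved the annotations in its default Pascal VOC XML format. The function name `box_stats`, the file name, the class name `address_label` and the coordinates are all hypothetical.

```python
import xml.etree.ElementTree as ET


def box_stats(voc_xml):
    """For each annotated object, report (class name, box lies inside image,
    fraction of the image area covered by the box)."""
    root = ET.fromstring(voc_xml)
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    stats = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            int(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        inside = 0 <= xmin < xmax <= width and 0 <= ymin < ymax <= height
        area_fraction = (xmax - xmin) * (ymax - ymin) / (width * height)
        stats.append((obj.findtext("name"), inside, round(area_fraction, 3)))
    return stats


# Made-up annotation for a 1000x500 pixel scan of an envelope.
SAMPLE = """<annotation>
  <filename>envelope_01.jpg</filename>
  <size><width>1000</width><height>500</height><depth>3</depth></size>
  <object>
    <name>address_label</name>
    <bndbox><xmin>400</xmin><ymin>200</ymin><xmax>800</xmax><ymax>350</ymax></bndbox>
  </object>
</annotation>"""

print(box_stats(SAMPLE))  # [('address_label', True, 0.12)]
```

Running this over all 50 files would show whether any box falls outside its image or whether the class name is spelled inconsistently, both of which can quietly break training.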

One Answer

  1. It is difficult to say whether the algorithm will detect the full block of text. This is a difficult problem because the object to detect has no well-defined structure. I might be wrong here!

  2. You are labeling correctly.

  3. You have very few images to learn from. More images are always better for learning tasks.

  4. Don't rely on the loss value that appears in the terminal; instead, check the total loss in TensorBoard (I expect it will be higher).

    How long should you train? The example config file from the TensorFlow API says that on the pets dataset (which has only 7349 images) the model needs to be trained for 200000 steps to get satisfactory results. If this problem is learnable, training for more steps should help.
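To put the step counts in perspective, it can help to convert them into approximate passes over the training set. This is a rough sketch only: the batch size is not stated in the question, so the value of 24 below is an assumption, and `passes_over_data` is a hypothetical helper.

```python
def passes_over_data(num_steps, batch_size, num_images):
    """Approximate number of times each training image is seen."""
    return num_steps * batch_size / num_images


BATCH_SIZE = 24  # assumption; use the batch_size from your own pipeline config

# 200 steps on the 50 annotated envelopes from the question
print(round(passes_over_data(200, BATCH_SIZE, 50)))        # 96

# 200000 steps on the 7349-image pets dataset mentioned above
print(round(passes_over_data(200000, BATCH_SIZE, 7349)))   # 653
```

Under that assumed batch size, the pets example sees each image several times more often than the 200-step run did, which is consistent with the advice to train longer while watching the total loss in TensorBoard.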

Answered by Sagar Shelke on March 16, 2021
