Using Information from the rest of a Sequence to Predict the Label for any one Item

Question

I have a dictionary of variable-length sequences:

    [(file_name[-10:], len(tag_is_header_list)) for file_name, tag_is_header_list in HEADER_PATTERN_DICT.items()]

    [('37bd1.html', 25), ('0bcce.html', 40), ('90364.html', 28), ('8f9c7.html', 24), ('d12d4.html', 73), ('46837.html', 37), ('adb92.html', 53), ('0a1e7.html', 69), ('da077.html', 43), ('9366a.html', 21), ('6ae4d.html', 37), ('f62ee.html', 19), ('73aee.html', 33), ('e090a.html', 35), ('8b093.html', 44)]

Each sequence contains a label for every item saying whether or not that item is a subject heading:

    HEADER_PATTERN_DICT[sorted([(file_name, len(tag_is_header_list)) for file_name, tag_is_header_list in HEADER_PATTERN_DICT.items()], key=lambda x: x[1])[0][0]]

    [(None, True), ('

Dave Babbitt · Answer

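The fastest way to train a model to predict each item's label is to use a Conditional Random Field (CRF), as in this example. A CRF scores the whole label sequence jointly, so the prediction for any one item can draw on the items around it. h/t @erwin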
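As a rough illustration of that approach, here is a minimal sketch of turning each sequence into per-item feature dicts that look at neighbouring items, and then fitting a CRF on them. It assumes each item is a (tag, is_header) pair, which is only a guess from the variable name `tag_is_header_list`; the toy data, the feature names, and the helpers `item_features`, `to_features`, `to_labels`, and `train_crf` are all hypothetical. The training step assumes the third-party `sklearn-crfsuite` package is installed, and its hyperparameters are placeholders rather than tuned values.

```python
from typing import Dict, List, Optional, Sequence, Tuple, Union

Item = Tuple[Optional[str], bool]  # assumed shape: (tag, is_header)
Features = Dict[str, Union[str, float]]


def item_features(tags: Sequence[Optional[str]], i: int) -> Features:
    """Features for item i that also describe its neighbours and position."""

    def name(j: int) -> str:
        # Out-of-range neighbours and missing tags get placeholder strings.
        if j < 0:
            return "BOS"
        if j >= len(tags):
            return "EOS"
        return tags[j] or "NONE"

    return {
        "tag": name(i),
        "prev_tag": name(i - 1),
        "next_tag": name(i + 1),
        # Relative position in the sequence, in [0, 1).
        "rel_pos": i / len(tags),
        # How many items in the whole sequence share this item's tag.
        "same_tag_count": float(sum(t == tags[i] for t in tags)),
    }


def to_features(seq: Sequence[Item]) -> List[Features]:
    tags = [tag for tag, _ in seq]
    return [item_features(tags, i) for i in range(len(tags))]


def to_labels(seq: Sequence[Item]) -> List[str]:
    return ["header" if is_header else "other" for _, is_header in seq]


def train_crf(sequences: Sequence[Sequence[Item]]):
    """Fit a linear-chain CRF; requires `pip install sklearn-crfsuite`."""
    import sklearn_crfsuite  # third-party, imported lazily on purpose

    crf = sklearn_crfsuite.CRF(
        algorithm="lbfgs",
        c1=0.1,  # L1 regularisation (placeholder value)
        c2=0.1,  # L2 regularisation (placeholder value)
        max_iterations=100,
        all_possible_transitions=True,
    )
    crf.fit([to_features(s) for s in sequences],
            [to_labels(s) for s in sequences])
    return crf


# Made-up toy sequence, not taken from HEADER_PATTERN_DICT.
toy = [(None, True), ("p", False), ("h3", True), ("p", False)]
```

With real data, something like `crf = train_crf(list(HEADER_PATTERN_DICT.values()))` followed by `crf.predict([to_features(new_sequence)])` should return one list of "header"/"other" labels per input sequence, provided the dictionary values really have the assumed (tag, is_header) shape.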
