
What are the rules when extracting SVO triples from preprocessed text?

Data Science Asked on January 23, 2021

If you have preprocessed text that is already tagged, what are the rules for extracting Subject-Verb-Object (SVO) triples of the form (word, word, word)? Can you give an example sentence and extract all of its triples? Do you just need to find all combinations without repetition from a set of N words?

One Answer

The goal of Subject-Verb-Object (SVO) extraction is to produce a single triple for a sentence, rather than every combination of its words.

The sentence:

A rare black squirrel has become a regular visitor to a suburban garden.

results in the following SVO:

(squirrel, become, visitor)

"Triplet Extraction From Sentences" by Rusu et al. outlines how to do that. First, you'll need a parse tree (Stanford Parser and OpenNLP are the most common parsers). The three elements can then be extracted as follows, quoting the paper:

The subject will be found by performing breadth first search and selecting the first descendent of NP that is a noun.

… the predicate of the sentence, a search will be performed in the VP subtree. The deepest verb descendent of the verb phrase will give the second element of the triplet.

… we look for objects. These can be found in three different subtrees, all siblings of the VP subtree containing the predicate. The subtrees are: PP (prepositional phrase), NP and ADJP (adjective phrase). In NP and PP we search for the first noun, while in ADJP we find the first adjective.
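The three quoted rules can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's implementation: the parse tree is written by hand in Penn Treebank style (a real parser may bracket the sentence differently), the helper names (first_leaf_bfs, deepest_verb, extract_svo) are made up for this example, "siblings" is read as the nodes next to the deepest verb inside its VP, the same breadth-first search is reused for the object, and no lemmatization is done.

```python
from collections import deque

# Hand-written Penn Treebank-style parse of the example sentence (assumption:
# a real parser may produce a different bracketing). A node is
# (label, child, ...); a leaf is (POS tag, word).
TREE = (
    "S",
    ("NP", ("DT", "A"), ("JJ", "rare"), ("JJ", "black"), ("NN", "squirrel")),
    ("VP",
        ("VBZ", "has"),
        ("VP",
            ("VBN", "become"),
            ("NP",
                ("NP", ("DT", "a"), ("JJ", "regular"), ("NN", "visitor")),
                ("PP", ("TO", "to"),
                    ("NP", ("DT", "a"), ("JJ", "suburban"), ("NN", "garden")))))),
    (".", "."),
)


def is_leaf(node):
    return len(node) == 2 and isinstance(node[1], str)


def children(node):
    return [] if is_leaf(node) else list(node[1:])


def first_leaf_bfs(root, tag_prefix):
    """Breadth-first search for the first word whose POS tag starts with tag_prefix."""
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if is_leaf(node):
            if node[0].startswith(tag_prefix):
                return node[1]
        else:
            queue.extend(children(node))
    return None


def deepest_verb(vp):
    """Return (verb leaf, its parent node) for the deepest VB* leaf under vp."""
    best_leaf, best_parent, best_depth = None, None, -1
    stack = [(vp, None, 0)]
    while stack:
        node, parent, depth = stack.pop()
        if is_leaf(node):
            if node[0].startswith("VB") and depth > best_depth:
                best_leaf, best_parent, best_depth = node, parent, depth
        else:
            # Push right-to-left so the leftmost child is visited first.
            stack.extend((c, node, depth + 1) for c in reversed(children(node)))
    return best_leaf, best_parent


def extract_svo(sentence):
    """Apply the three rules to an S node; return None if any element is missing."""
    subj_np = next((c for c in children(sentence) if c[0] == "NP"), None)
    vp = next((c for c in children(sentence) if c[0] == "VP"), None)
    if subj_np is None or vp is None:
        return None
    subject = first_leaf_bfs(subj_np, "NN")   # rule 1: first noun under NP
    verb, parent = deepest_verb(vp)           # rule 2: deepest verb under VP
    if subject is None or verb is None:
        return None
    for sibling in children(parent):          # rule 3: NP / PP / ADJP siblings
        obj = None
        if sibling[0] in ("NP", "PP"):
            obj = first_leaf_bfs(sibling, "NN")
        elif sibling[0] == "ADJP":
            obj = first_leaf_bfs(sibling, "JJ")
        if obj is not None:
            return (subject, verb[1], obj)
    return None


print(extract_svo(TREE))  # ('squirrel', 'become', 'visitor')
```

In practice the tree would come from a parser rather than being typed in, and the verb would usually be lemmatized; here "become" happens to be both the participle and the base form.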

Answered by Brian Spiering on January 23, 2021
