
Passing Dependency/Constituency trees to a Neural Machine Translator

Data Science Asked on November 20, 2021

I am working on a project on Neural Machine Translation in the English-Irish domain. I am not an expert and have researched entirely on my own for a technology exhibition, so apologies if my question is simple.

I am trying to parse my entire English corpus into constituency trees. The output of the Stanford Parser for a sentence looks something like this:

(ROOT (S (NP (VBG cohabiting) (NNS partners)) (VP (MD can) (VP (VB make) (NP (NP (NNS wills)) (SBAR (WHNP (WDT that)) (S (VP (VBP favour) (NP (DT each) (JJ other)))))))) (. .)))

Of course, when dealing with simple sequences, the model sees each word itself; there are no symbols like NP or NNS as in a constituency tree.
Right now, I’m working with PyTorch and Fairseq to build my models and have a working seq2seq model. But can I simply pass my English input in the form shown above and expect the model to train, or do I need to build a new model from scratch that can handle tree structures? I’ve tried hard to research this, reading papers and books and playing around with tools, but since I’m not taking a class on this and it isn’t well documented, I’m finding it hard to work out on my own.
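To make the two input formats concrete, this is roughly the difference (a sketch using NLTK's Tree purely for illustration, with the parse string from above):

# Plain word sequence vs. linearized parse, using NLTK just to illustrate.
from nltk.tree import Tree

parse = ("(ROOT (S (NP (VBG cohabiting) (NNS partners)) (VP (MD can) "
         "(VP (VB make) (NP (NP (NNS wills)) (SBAR (WHNP (WDT that)) "
         "(S (VP (VBP favour) (NP (DT each) (JJ other)))))))) (. .)))")

tree = Tree.fromstring(parse)

# Plain sequence, as in an ordinary seq2seq setup:
print(tree.leaves())
# ['cohabiting', 'partners', 'can', 'make', 'wills', 'that',
#  'favour', 'each', 'other', '.']

# Linearized parse: brackets and labels become ordinary tokens,
# so the "sentence" is just a much longer token sequence.
linearized = parse.replace("(", " ( ").replace(")", " ) ").split()
print(linearized[:8])
# ['(', 'ROOT', '(', 'S', '(', 'NP', '(', 'VBG']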

Any help would be greatly appreciated.

One Answer

First of all, I would like to discourage you from using structured input in NMT. In most cases, the best thing you can do is apply subword segmentation to the input and output and learn a plain sequence-to-sequence mapping.
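For example, a minimal subword setup could look like the following (a sketch using SentencePiece; the file names and vocabulary size are placeholders, not taken from your project):

# Minimal subword segmentation sketch with SentencePiece.
# "corpus.en" and the vocabulary size are placeholders.
import sentencepiece as spm

# Train a BPE model on the English side of the corpus (one sentence per line).
spm.SentencePieceTrainer.train(
    input="corpus.en",
    model_prefix="bpe_en",
    vocab_size=8000,
    model_type="bpe",
)

sp = spm.SentencePieceProcessor(model_file="bpe_en.model")
print(sp.encode("cohabiting partners can make wills", out_type=str))
# e.g. ['▁co', 'hab', 'iting', '▁partners', '▁can', '▁make', '▁wills']

You would do the same on the Irish side and then train Fairseq on the segmented files.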

You can certainly pass the parsed input in the format you showed above; the worst outcome you can expect is that the model learns to ignore the mark-up. Some modest improvements from using syntax in this way are shown in this WMT19 paper from Edinburgh.

People have been more successful using graph convolutional networks to process the structured input, first applied to this, as far as I know, in 2017. In that case, you would need to add a custom encoder to Fairseq and define a model that uses it. There are several PyTorch implementations of graph convolutional networks that you could reuse in Fairseq.
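A rough sketch of such an encoder is below. It assumes your dataset code supplies a row-normalized adjacency matrix (with self-loops) per sentence, derived from the parse tree; the names, dimensions, and encoder-output format are illustrative and may need adapting to your Fairseq version:

# Sketch of a graph-convolutional encoder for Fairseq. The adjacency
# matrices are assumed to be batched in alongside the tokens; this is
# an illustration, not a drop-in implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F
from fairseq.models import FairseqEncoder

class GCNLayer(nn.Module):
    """One graph convolution: H' = ReLU(A H W)."""
    def __init__(self, dim):
        super().__init__()
        self.linear = nn.Linear(dim, dim)

    def forward(self, h, adj):
        # h: (batch, seq_len, dim); adj: (batch, seq_len, seq_len),
        # assumed row-normalized with self-loops added upstream.
        return F.relu(torch.bmm(adj, self.linear(h)))

class GCNEncoder(FairseqEncoder):
    def __init__(self, dictionary, embed_dim=512, num_layers=2):
        super().__init__(dictionary)
        self.embed = nn.Embedding(
            len(dictionary), embed_dim, padding_idx=dictionary.pad()
        )
        self.layers = nn.ModuleList(
            GCNLayer(embed_dim) for _ in range(num_layers)
        )

    def forward(self, src_tokens, src_lengths, adj):
        x = self.embed(src_tokens)
        for layer in self.layers:
            x = layer(x, adj)
        return {
            # (seq_len, batch, dim), as recent Fairseq versions expect
            "encoder_out": [x.transpose(0, 1)],
            "encoder_padding_mask": [src_tokens.eq(self.dictionary.pad())],
        }

You would still need to register a model that pairs this encoder with a decoder, implement reorder_encoder_out for beam search, and write dataset code that collates the adjacency matrices; those parts are omitted here.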

Note, however, that once you change the model architecture, you need to be careful about the learning rate schedule, which is otherwise tuned for the standard Transformer.

Answered by Jindřich on November 20, 2021
