
Twitter POS and NER: What is state-of-the-art?

Data Science Asked by SanMelkote on August 2, 2020

What is the current state of the art for POS tagging and named entity recognition on Twitter data? Are industrial-strength libraries like spaCy and Spark NLP accurate on such texts? How do Flair and Stanford's CoreNLP compare in accuracy?

One Answer

SOTA is changing so rapidly in NLP that even data science professionals struggle to keep up with it. There are two main sources that I check regularly for insights on SOTA:

  • NLP-progress by Sebastian Ruder. It tracks results across a large number of NLP subfields, NER and POS tagging included.

  • Papers with Code has a section on NLP. It's a great website for ML in general.

I know these links do not tackle Twitter specifically; however, I don't think that domain is qualitatively different from others. IMO, of course.


About your other question:

Are industrial-strength libraries like spaCy and Spark NLP accurate on such texts? How do Flair and Stanford's CoreNLP compare in accuracy?

It's mostly a matter of personal preference and the needs of the project at hand. There's no right or wrong tool. Personally, I found the Stanford tools to be the best, both for the quality of their predictions and for the number of models available from a single pipeline. But, as I said, that's very subjective.
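Since the choice is subjective, one way to ground it is to hand-label a small sample of your own tweets and score each tool's NER output against it. Below is a minimal sketch of entity-level precision, recall and F1 over BIO tags, in plain Python. The function names (`bio_to_spans`, `entity_f1`) and the example tags are made up for illustration and are not part of any of the libraries mentioned; getting the predicted tags out of each tool is left out.

```python
from typing import List, Set, Tuple

# (start index, end index exclusive, entity type)
Span = Tuple[int, int, str]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from one BIO tag sequence.

    Assumption: an I- tag that does not continue an open span of the
    same type is treated leniently as the start of a new span.
    """
    spans: Set[Span] = set()
    start, label = None, None
    # The trailing "O" is a sentinel that closes a span ending at the last token.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and label == tag[2:]
        if continues:
            continue
        if label is not None:
            spans.add((start, i, label))
            start, label = None, None
        if tag.startswith("B-") or tag.startswith("I-"):
            start, label = i, tag[2:]
    return spans


def entity_f1(gold: List[str], pred: List[str]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1), counting only exact span and type matches."""
    gold_spans = bio_to_spans(gold)
    pred_spans = bio_to_spans(pred)
    true_pos = len(gold_spans & pred_spans)
    precision = true_pos / len(pred_spans) if pred_spans else 0.0
    recall = true_pos / len(gold_spans) if gold_spans else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical hand-labelled tweet (gold) versus one tool's output (pred).
    gold = ["B-PER", "I-PER", "O", "B-LOC"]
    pred = ["B-PER", "I-PER", "O", "O"]
    print(entity_f1(gold, pred))
```

This scores a single tag sequence; for a whole sample you would sum the span counts over all tweets before computing the ratios. Running the same gold sample through each candidate tool gives numbers that are comparable for your own data, which leaderboard results may not be.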

Answered by Leevo on August 2, 2020
