How large should the corpus be to optimally retrain the GPT-2 model?

Artificial Intelligence Asked by Andreas Toresäter on September 20, 2020

I just started working with the GPT-2 models and want to retrain one on a fairly narrow topic, so I am having trouble finding enough training material.

How large should the corpus be to optimally retrain the GPT-2 model? And what is the bare minimum size? Should it simply be as large as possible, or can a larger corpus at some point make the model worse?

I am also not certain how many steps I should let the retraining run. I have been using 6000 steps when testing, and not much seems to happen after that: the loss only moved from 0.2 to 0.18 over the last 1000 steps.
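
One common alternative to a fixed step count is early stopping on a held-out validation loss: stop once the loss has not improved for a few evaluations in a row. A minimal sketch of that idea in plain Python, independent of any GPT-2 library. The function name `early_stopping_point`, the `patience` and `min_delta` parameters, and the loss values are all hypothetical and only for illustration; they are not measurements from my runs.

```python
def early_stopping_point(val_losses, patience=3, min_delta=0.0):
    """Scan validation losses (one per evaluation) and report where to stop.

    Returns (stop_index, best_index). stop_index is None if the loss was
    still improving often enough that training would not have been stopped.
    """
    best_loss = float("inf")
    best_index = None
    for i, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            # New best: remember it and reset the patience counter.
            best_loss = loss
            best_index = i
        elif best_index is not None and i - best_index >= patience:
            # No improvement for `patience` evaluations in a row.
            return i, best_index
    return None, best_index


# Hypothetical validation losses, e.g. one evaluation every 1000 steps.
example_losses = [0.50, 0.35, 0.28, 0.25, 0.26, 0.27, 0.29]
stop_index, best_index = early_stopping_point(example_losses, patience=3)
print(stop_index, best_index)
```

With these made-up numbers the best checkpoint is the fourth evaluation (index 3), and training would stop three evaluations later. The training loss alone (such as the 0.2 to 0.18 change above) would not show this, since it can keep falling while the validation loss rises.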

Tags: deep learning, gpt, natural language processing, tensorflow, training
