
When is it time to switch from simple models to deep neural networks in text classification problems?

Artificial Intelligence Asked on August 24, 2021

I worked on an out-of-domain detection task (framed as a binary classification problem) and tried logistic regression (LR), Naive Bayes (NB), and BERT, but the deep neural network did not perform better than LR and NB. For LR I used only bag-of-words (BOW) features, and it beat the 12-layer BERT.
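For concreteness, a baseline along those lines might look as follows. This is a minimal sketch assuming scikit-learn; the toy texts and labels are hypothetical stand-ins for the actual out-of-domain data, which the question does not include:

    # Bag-of-words + logistic regression baseline (sketch).
    # Label 0 = in-domain, 1 = out-of-domain (hypothetical toy data).
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    texts = ["book a flight to paris", "play some jazz",       # in-domain
             "what is the meaning of life", "tell me a joke"]  # out-of-domain
    labels = [0, 0, 1, 1]

    clf = make_pipeline(CountVectorizer(), LogisticRegression(max_iter=1000))
    clf.fit(texts, labels)
    print(clf.predict(["find a hotel in rome"]))  # expected: in-domain (0)

A pipeline like this trains in seconds, which is what makes the baseline-first approach so cheap to try before committing to a model like BERT.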

In a lecture, Andrew Ng suggests "Build First System Quickly, Then Iterate", but it turns out that sometimes we never need to iterate toward a deep neural network: much of the time, traditional shallow models are competitive enough and far simpler to train.

As this tweet (and its replies) indicates, along with various papers [1, 2, 3, 4, etc.], traditional SVMs, LR, and Naive Bayes can beat RNNs and other complicated neural networks.

Then my two questions are:

  1. When should we switch to complicated neural networks such as RNNs, CNNs, or Transformers? How can we tell when to do so from the data set, or from error analysis of the simple models' results?

  2. The experiments above may simply reflect a test set that is too easy. Is it possible, and if so how, to design a test set on which the traditional models fail?

One Answer

[Figure omitted. Source: https://blog.easysol.net/building-ai-applications/]

When the data is too big, too complex, and nonlinear, it is time to try deep learning. It is always worth adding a few layers to see whether they eliminate bias without leading to high variance.
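One concrete way to run that check is to grow the network one layer at a time and watch the gap between training and validation scores. A sketch assuming scikit-learn, with synthetic X and y standing in for real features and labels:

    # Add layers step by step. A low training score everywhere suggests
    # high bias; a growing train/validation gap suggests high variance.
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier

    X, y = make_classification(n_samples=2000, n_features=50, random_state=0)
    X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.25,
                                              random_state=0)

    for hidden in [(16,), (64,), (64, 64), (64, 64, 64)]:
        mlp = MLPClassifier(hidden_layer_sizes=hidden, max_iter=500,
                            random_state=0)
        mlp.fit(X_tr, y_tr)
        print(hidden, "train:", round(mlp.score(X_tr, y_tr), 3),
              "val:", round(mlp.score(X_va, y_va), 3))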

Deep learning models can be tuned (hyperparameters) and regularized (parameters), and that work usually pays off.
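That tuning can be automated. A sketch, again assuming scikit-learn, searching a small grid over the L2 penalty (alpha) and the layer sizes:

    # Grid search over regularization strength and model capacity.
    from sklearn.datasets import make_classification
    from sklearn.model_selection import GridSearchCV
    from sklearn.neural_network import MLPClassifier

    X, y = make_classification(n_samples=1000, n_features=50, random_state=0)
    grid = GridSearchCV(
        MLPClassifier(max_iter=500, random_state=0),
        param_grid={"alpha": [1e-4, 1e-3, 1e-2, 1e-1],
                    "hidden_layer_sizes": [(64,), (64, 64)]},
        cv=3,
    )
    grid.fit(X, y)
    print(grid.best_params_, round(grid.best_score_, 3))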

Answered by Lerner Zhang on August 24, 2021
