
How does AdaBoost calculate the error for each weak learner during training?

Data Science Asked by heresthebuzz on May 3, 2021

I am studying the AdaBoost classification algorithm because I would like to implement it from scratch. I understand how it works, but I am not able to understand where some of the steps take place.

I will describe the AdaBoost training steps as I understand them (sorry for any incorrect formalism):

  • Initialize a weak learner $k$
  • Assign each sample in the dataset an equal weight $w_i = \frac{1}{N}$
  • Fit $k$ to the dataset
  • Calculate the weighted error $e = \sum_{i=1}^{N} e_i w_i$, where $e_i = 1$ if sample $i$ is misclassified and $e_i = 0$ otherwise
  • Calculate the importance $\alpha$ of $k$, i.e. $\alpha = \frac{1}{2}\log\left(\frac{1-e}{e}\right)$
  • Recalculate the weights of the correctly classified samples: $w_{t+1} = w_t e^{-\alpha}$ (their weight decreases)
  • Recalculate the weights of the incorrectly classified samples: $w_{t+1} = w_t e^{\alpha}$ (their weight increases)
  • Normalize the new sample weights: $w_{\text{normalized}} = \frac{w}{\sum_{i=1}^{N} w_i}$
  • For every subsequent learner, draw samples by weighted random choice (with replacement) until the new dataset has the same size as the original, then repeat the process (see the sketch after this list)
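
Here is a minimal sketch of those steps, assuming Python with NumPy and a scikit-learn decision stump as the weak learner (both are my assumptions; the question does not fix a language or learner). It passes the weights to the learner via `sample_weight` instead of the weighted-resampling variant in the last step; both are common ways to implement the reweighting.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_fit(X, y, n_rounds=50):
    """Minimal AdaBoost sketch for labels y in {-1, +1}."""
    n = len(y)
    w = np.full(n, 1.0 / n)               # equal initial weights, w_i = 1/N
    learners, alphas = [], []

    for _ in range(n_rounds):
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, y, sample_weight=w)  # fit on the (weighted) training set
        pred = stump.predict(X)           # predict that same training set

        e = np.sum(w[pred != y])          # weighted error: weights of the mistakes
        e = np.clip(e, 1e-10, 1 - 1e-10)  # guard against log(0) below

        alpha = 0.5 * np.log((1 - e) / e) # importance of this learner

        # One line covers both weight updates:
        # y * pred = +1 (correct)   -> w * e^{-alpha}
        # y * pred = -1 (incorrect) -> w * e^{+alpha}
        w = w * np.exp(-alpha * y * pred)
        w = w / w.sum()                   # normalize

        learners.append(stump)
        alphas.append(alpha)

    return learners, alphas

def adaboost_predict(X, learners, alphas):
    # Importance-weighted vote of all weak learners, thresholded at zero.
    votes = sum(a * clf.predict(X) for clf, a in zip(learners, alphas))
    return np.sign(votes)
```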

My question is: how is the error obtained? Regarding the implementation, should I first fit the weak learner to the dataset and then get the error by predicting on that same dataset? That doesn't seem correct.

I've tried reading different sources about this, and even a great explanation from the StatQuest channel wasn't able to make it clear.

Thanks!

One Answer

When I started learning about ensemble methods, this YouTube video helped me greatly.

The idea is that you fit the weak learner on the training data, predict on that same training data, and compute the weighted error from those predictions. So yes, you fit and predict on the same dataset; the error is the sum of the weights of the misclassified samples. Only after that do you move on to the next weak learner.
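
As a minimal illustration of just that step (hypothetical toy data; a scikit-learn decision stump is assumed as the weak learner):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data: four 1-D samples with labels in {-1, +1}.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([-1, -1, 1, -1])
w = np.full(4, 0.25)              # uniform initial sample weights

stump = DecisionTreeClassifier(max_depth=1)
stump.fit(X, y, sample_weight=w)  # fit on the training data...
pred = stump.predict(X)           # ...and predict that same training data
e = np.sum(w[pred != y])          # weighted error, e.g. 0.25 if one sample is missed
```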

There are some more videos on the same channel that give great explanations and help build intuition.

Answered by Carlos Mougan on May 3, 2021
