I've read some documentation on how AdaBoost works, but I still have some questions about it.
I've also read that, apart from weighting weak classifiers, AdaBoost picks the best features from the data and uses them in the testing phase to perform classification efficiently.
How does AdaBoost pick the best features from the data?
Correct me if my understanding of AdaBoost is wrong!
In some cases the weak classifiers in AdaBoost correspond (almost) directly to individual features. In other words, a classifier built on a single feature can perform slightly better than random, so it can serve as a weak classifier. AdaBoost selects the best set of weak classifiers given the training data, so when the weak classifiers correspond to features, the selected classifiers give you an indication of the most useful features.
Decision stumps (one-level decision trees that threshold a single feature) are a classic example of weak classifiers that correspond to features.
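For concreteness, here is a minimal sketch of this idea using scikit-learn (my choice of library, not something from the question): boost decision stumps with AdaBoost, then inspect which features the stumps split on.

```python
# Minimal sketch: AdaBoost over decision stumps as a feature ranking.
# Each stump splits on exactly one feature, so the ensemble's
# feature_importances_ show which features the boosting rounds relied on.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

# Toy data: 10 features, only 3 of which are actually informative.
X, y = make_classification(n_samples=500, n_features=10,
                           n_informative=3, n_redundant=0, random_state=0)

# Weak learner = decision stump (depth-1 tree thresholding one feature).
# Note: the parameter is `estimator` in scikit-learn >= 1.2;
# older versions call it `base_estimator`.
stump = DecisionTreeClassifier(max_depth=1)
clf = AdaBoostClassifier(estimator=stump, n_estimators=50, random_state=0)
clf.fit(X, y)

# Features with high importance are the ones the stumps kept splitting on.
for i, imp in enumerate(clf.feature_importances_):
    print(f"feature {i}: importance {imp:.3f}")
```

Because each stump uses exactly one feature, a feature's importance here roughly reflects the combined weight of the boosting rounds that chose it, so the ranking doubles as the "feature selection" the question asks about.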