I'm currently studying Machine Learning at university with the help of the Foundations of Machine Learning (Second Edition) textbook. I've learned about PAC Learning, Rademacher Complexity, Growth Functions, and VC-Dimension. Throughout the textbook and course, we seem to be spending a lot of time finding bounds.
I'm somewhat confused as to why we're doing this.
Neither the textbook nor my professor has been particularly helpful in answering my questions.
Thanks.
A decision boundary is the most basic device for recognizing a pattern (in the simplest view, it splits a space into two sets). Building on this, a generalization bound is defined as follows:
> In the theory of statistical machine learning, a generalization bound – or, more precisely, a generalization error bound – is a statement about the predictive performance of a learning algorithm or class of algorithms.
So the whole point of deriving these bounds is to quantify the predictive performance of a learning algorithm: they guarantee how far the error on unseen data can stray from the error observed on the training sample.
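To make this concrete, here is the general shape such bounds take (a sketch in standard notation, not any one specific theorem from the book): with probability at least $1-\delta$ over the draw of an i.i.d. sample $S$ of size $m$, every hypothesis $h \in H$ satisfies

$$R(h) \;\le\; \widehat{R}_S(h) \;+\; \mathrm{complexity}(H, m) \;+\; \sqrt{\frac{\log(1/\delta)}{2m}},$$

where $R(h)$ is the true (generalization) error, $\widehat{R}_S(h)$ is the empirical error on the sample, and the complexity term is measured via Rademacher complexity, the growth function, or the VC-dimension, depending on the chapter. The value of such a statement is that the right-hand side depends only on the data and the hypothesis class, so it certifies performance on unseen data without access to the underlying distribution.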
If your question is really about the time spent on this – in other words, why do we need to consider every possible hypothesis? – then:
This may seem like a trivial question, since the answer is simply that the learning algorithm can search the entire hypothesis space for its optimal solution. While this answer is correct, a more formal one is needed in light of the generalization inequalities being studied.
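To see why every hypothesis must be considered, note that the learner may output any $h \in H$, so the bound has to hold uniformly over $H$. For a finite hypothesis class this follows from Hoeffding's inequality combined with a union bound (the standard derivation, stated here as a sketch for losses in $[0,1]$):

$$\Pr\Bigl[\exists\, h \in H : \bigl|R(h) - \widehat{R}_S(h)\bigr| > \epsilon\Bigr] \;\le\; 2\,|H|\,e^{-2m\epsilon^2}.$$

Rademacher complexity, the growth function, and the VC-dimension are precisely the tools that replace the $|H|$ factor when $H$ is infinite, which is why the course spends so much time on them.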