Summary: Support vector machines (SVMs) are among the most successful algorithms on small and medium-sized data sets, but their training and predictions become computationally infeasible on large-scale data sets. The author considers a spatially defined data chunking method for large-scale learning problems, leading to so-called localized SVMs, and carries out an in-depth mathematical analysis with theoretical guarantees, which in particular include classification rates. The statistical analysis relies on a new and simple partitioning-based technique and takes into account well-known margin conditions that describe the behavior of the data-generating distribution. It turns out that the resulting rates outperform the known rates of several other learning algorithms under suitable sets of assumptions. From a practical point of view, the author shows that a common training and validation procedure achieves the theoretical rates adaptively, that is, without knowing the margin parameters in advance.

Contents
Introduction to Statistical Learning Theory
Histogram Rule: Oracle Inequality and Learning Rates
Localized SVMs: Oracle Inequalities and Learning Rates

Target Groups
Researchers, students, and practitioners in the fields of mathematics and computer science who focus on machine learning or statistical learning theory

The Author
Ingrid Karin Blaschzyk is a postdoctoral researcher in the Department of Mathematics at the University of Stuttgart, Germany.
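The spatial chunking idea behind localized SVMs admits a compact illustration: partition the input space, train one SVM per cell, and route each prediction to the SVM of the cell the point falls into. The following is a minimal sketch, assuming a k-means partition and scikit-learn's SVC; the class name LocalizedSVM and the parameter n_cells are hypothetical, and the sketch does not reproduce the book's partitioning scheme or its theoretical analysis.

    # Minimal sketch of a spatially localized SVM (illustrative only,
    # not the author's construction). Assumes numeric numpy arrays.
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.svm import SVC

    class LocalizedSVM:
        def __init__(self, n_cells=10, C=1.0, gamma="scale"):
            self.partition = KMeans(n_clusters=n_cells, n_init=10)
            self.C, self.gamma = C, gamma
            self.models = {}

        def fit(self, X, y):
            X, y = np.asarray(X), np.asarray(y)
            # Chunk the data spatially: each k-means cell gets its own SVM,
            # so no single SVM ever trains on the full large-scale data set.
            cells = self.partition.fit_predict(X)
            for cell in np.unique(cells):
                mask = cells == cell
                if len(np.unique(y[mask])) < 2:
                    # Degenerate cell with one class: store the constant label.
                    self.models[cell] = int(y[mask][0])
                    continue
                model = SVC(C=self.C, gamma=self.gamma, kernel="rbf")
                self.models[cell] = model.fit(X[mask], y[mask])
            return self

        def predict(self, X):
            X = np.asarray(X)
            # Route each point to the local SVM of the cell it falls into.
            cells = self.partition.predict(X)
            preds = np.empty(len(X), dtype=int)
            for cell in np.unique(cells):
                mask = cells == cell
                model = self.models[cell]
                preds[mask] = model if isinstance(model, int) else model.predict(X[mask])
            return preds

Because each local SVM only sees the points of its own cell, training cost scales with the cell sizes rather than with the full sample, which is the computational motivation the summary describes; the cell count and the SVM hyperparameters would, in the spirit of the book, be chosen by a training and validation procedure.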