Polynomial Cascades for Data Mining - Classification and Regression for large, high dimensional datasets

Polynomial Cascades were originally developed by Greg Grudic et.al. for high-dimensional regression problems. An extension for classification, Polynomial MPMC Cascades, were developed by Sander Bohte, Greg Grudic and myself. On this page I list some of the features of the regression and the classification method.

Features (Regression)

  • Fast - can handle large, high-dimensional problems.
  • Accurate models - minimizes mean-squared error
  • Can fit non-linear functions without the use of kernels.
  • No parameter tuning required - can be run by non-experts.
  • Interpretable model - the learned hypothesis is one high-degree, multi-variate polynomial. The hypothesis is not an ensemble of classifiers.

Features (Classification)

  • Fast - Runtime for using the features only is O(L x N x d) with L being the number of levels (data dependant, but usually <400), N being the number of training examples and d being the dimensionality of the problem. The method scales to very large problems with millions of examples (read: millions of rows in terms of SQL Databases).
  • No parameter tuning required - You can just run the algorithm and it will give you a good model. This means no cross validation is needed to determine a suitable kernel and kernel parameters.
  • Can classify non linearly separable classification problems without fine tuning of parameters
  • Competitive Performance on benchmarks with the Minimax Probability Machine for Classification (MPMC) and Support Vector Machines (SVM)
  • Incremental building of the model - You can stop the learning at any point in time and have a usable model. This can be useful if you have time constraints and a classifier must be learned in real time.
  • Bounded misclassification error - The PMC generates a worst case bound on the misclassification error for future, unseen data.
  • The underlying assumptions are minimal - No specific distributions are assumed.
  • Interpretable model - the learned hypothesis is one high-degree, multi-variate polynomial. The hypothesis is not an ensemble of classifiers.
  • Feature Selection - One feature per level is used, chosen by a theoretically well founded metric.
  • Kernels - the algorithm can be easily extended by kernels.

Future Work - Possible extensions

  • Handling multiple classes - currently the PMC only handles binary classification problems.
  • Handling missing values by not accounting for them during the computation of mean and covariance.

Code

You can download Matlab code for the classification cascade from here: Download Polynomial MPMC Cascades sourcecode.

Papers

Back to the top   [Sitemap]

This page is Copyright © Markus Breitenbach 2010. All rights reserved. Any opinions expressed here are my own and might not reflect my employers opinion.
[This page: http://markus-breitenbach.com/machine_learning_cascades_ensemble.php was last modified: July 08 2006 15:41:43.]   [Home].   Email me
-

classification regression data mining machine learning scalable ensemble interpretable