Simply Salford Blog

Choosing Your Own Preferred MARS Model

Posted by Dan Steinberg on Wed, Aug 20, 2014 @ 09:46 AM

When MARS develops a model it actually develops many and presents you with the one that it judges best based on a self-testing procedure.  But the so-called MARS optimal model may not be satisfactory from your perspective.  It might be too small (include too few variables), too large (include too many variables), too complex (include too many splines, basis functions, or breaks in variables), or otherwise not to your liking based on your domain knowledge. So what can you do to override the MARS process?

Read More

Topics: data mining, Variable Importance, MARS, data science, predictive modeling, predictive model, data analysis, Dan Steinberg, statistics, machine learning

How Do I Avoid Overfitting My MARS Model?

Posted by Dan Steinberg on Fri, Mar 21, 2014 @ 11:05 AM

Overfitting is an issue for most machine learning tools. The learners are very flexible and can thus adapt to the noise in the data as well as to the signal. A classic technique to avoid overfitting is to ensure that we have both learn and validate (or test) data, and then to monitor the learning process, comparing the goodness of fit or performance on learn and validate data as a function of the amount of training.
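To make the learn-versus-validate comparison concrete, here is a minimal sketch in Python. It does not use MARS at all: as a stand-in learner it fits a piecewise-constant model with more and more bins, and the simulated data, the function names (fit_bins, mse, simulate) and the list of sizes are all assumptions made for illustration. It only shows the monitoring idea: track both errors as flexibility grows and look at where the validate error is lowest.

```python
import random

def fit_bins(xs, ys, n_bins):
    """Fit a piecewise-constant model on [0, 1): one mean per equal-width bin."""
    overall = sum(ys) / len(ys)
    sums, counts = [0.0] * n_bins, [0] * n_bins
    for x, y in zip(xs, ys):
        b = min(int(x * n_bins), n_bins - 1)
        sums[b] += y
        counts[b] += 1
    # Bins with no learn records fall back to the overall learn-sample mean.
    return [s / c if c else overall for s, c in zip(sums, counts)]

def mse(means, xs, ys):
    """Mean squared error of the binned model on the given sample."""
    n_bins = len(means)
    errs = [(y - means[min(int(x * n_bins), n_bins - 1)]) ** 2
            for x, y in zip(xs, ys)]
    return sum(errs) / len(errs)

def simulate(n, rng):
    """Hypothetical data: a smooth curve (signal) plus Gaussian noise."""
    xs = [rng.random() for _ in range(n)]
    ys = [4 * (x - 0.5) ** 2 + rng.gauss(0, 0.3) for x in xs]
    return xs, ys

rng = random.Random(0)
learn_x, learn_y = simulate(200, rng)
valid_x, valid_y = simulate(200, rng)

# More bins means a more flexible model (the "amount of training" here).
sizes = [1, 2, 4, 8, 16, 32, 64]
curve = []
for n_bins in sizes:
    means = fit_bins(learn_x, learn_y, n_bins)
    curve.append((n_bins,
                  mse(means, learn_x, learn_y),
                  mse(means, valid_x, valid_y)))

best_size = min(curve, key=lambda row: row[2])[0]
for n_bins, learn_err, valid_err in curve:
    print(n_bins, round(learn_err, 4), round(valid_err, 4))
print("size with lowest validate error:", best_size)
```

Because each bin count here refines the previous one, the learn error can only go down as the model grows, while the validate error is free to turn back up. A widening gap between the two columns is the usual sign that the model has started fitting noise.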

Read More

Topics: MARS, overfitting

Getting Results From 'Out-Of-Bag' Cross-Validation In MARS

Posted by Dan Steinberg on Tue, Jun 25, 2013 @ 09:09 AM

In this post we continue the discussion of saving OOB (out-of-bag) predictions when testing via cross-validation with MARS. The principles for MARS are the same as they are for CART, and the organization of the saved file follows the same high-level logic. However, as the details are a little different, we thought it would be worthwhile to exhibit the OOB results, and how we get them, in the context of MARS as well. Recall that when using K-fold cross-validation we actually develop K different models, each tested on a different test sample (CVBIN), and that the final model and results are reported for an overall model built on all the data, where nothing has been held back for test. The topic of discussion is how to obtain the equivalent of test sample predictions so that we can manipulate and further analyze the test sample residuals (for regressions).
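The K-fold logic described above can be sketched in a few lines of Python. This is an illustration only, not the SPM or MARS implementation and not the layout of the file that MARS saves: a simple straight-line least squares fit stands in for the learner, and the names fit_line and oob_predictions, the fold assignment scheme and the toy data are all assumptions. Each record is assigned a CVBIN, each of the K models is built without one bin, and every record is predicted only by the model that never saw it.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (stand-in for the real learner)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return mean_y - slope * mean_x, slope

def oob_predictions(xs, ys, k, seed=0):
    """Return (cvbin, oob): fold labels 1..k and held-out predictions."""
    rng = random.Random(seed)
    cvbin = [i % k + 1 for i in range(len(xs))]  # balanced fold labels
    rng.shuffle(cvbin)                           # random assignment to records
    oob = [None] * len(xs)
    for fold in range(1, k + 1):
        # Build this fold's model on every record NOT in the fold.
        train = [i for i, label in enumerate(cvbin) if label != fold]
        intercept, slope = fit_line([xs[i] for i in train],
                                    [ys[i] for i in train])
        # Score only the held-out records with that model.
        for i, label in enumerate(cvbin):
            if label == fold:
                oob[i] = intercept + slope * xs[i]
    return cvbin, oob

# Toy, noise-free data so the result is easy to check by eye.
xs = [float(i) for i in range(20)]
ys = [2.0 + 0.5 * x for x in xs]

cvbin, oob = oob_predictions(xs, ys, k=5)
residuals = [y - p for y, p in zip(ys, oob)]
for x, label, y, p in list(zip(xs, cvbin, ys, oob))[:5]:
    print(x, label, y, round(p, 6))
```

The residuals list plays the role of the test sample residuals discussed above: every entry comes from a model that did not train on that record, even though the final reported model would be built on all the data.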

Read More

Topics: OOB, Nonlinear Regression, MARS, Cross-Validation

Reuse MARS Regression Spline Basis Functions in a New Dataset

Posted by Dan Steinberg on Wed, Apr 17, 2013 @ 07:29 AM

MARS (Multivariate Adaptive Regression Splines), introduced by Stanford University data mining guru Professor Jerome H. Friedman in 1988, is one of the landmarks in the evolution of regression methods. For the first time, analysts could leverage a search mechanism intended to automatically discover nonlinearity and interactions in the context of classical regression. The MARS procedure involves a forward stepwise model-building stage followed by a backwards elimination of unneeded predictors to arrive at surprisingly high-performance models, all automatically. At the heart of the MARS algorithm is the search for "knots," or breaks in the range of a predictor, which allow a regression model containing that predictor to have different slopes in each region. Breaking predictors into regions permits nonlinearity, and when interactions are constructed from regions of predictors, remarkable discoveries are enabled.
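As a small illustration of what a knot does, the sketch below evaluates a pair of mirrored "hinge" basis functions, max(0, x - knot) and max(0, knot - x), and combines them into a prediction with a different slope on each side of the knot. The function names, the knot locations and the coefficients are made up for the example and do not come from any fitted MARS model. Applying already-chosen basis functions to new records amounts to exactly this kind of evaluation.

```python
def hinge_pair(x, knot):
    """Return the two mirrored hinge values (right side, left side) at x."""
    return max(0.0, x - knot), max(0.0, knot - x)

def predict(x, intercept, knot, slope_right, slope_left):
    """Piecewise-linear prediction built from one knot on one predictor."""
    right, left = hinge_pair(x, knot)
    return intercept + slope_right * right + slope_left * left

def interaction(x1, knot1, x2, knot2):
    """An interaction term: the product of one hinge from each predictor."""
    return max(0.0, x1 - knot1) * max(0.0, x2 - knot2)

# Hypothetical coefficients: slope 0.5 above the knot at 2.0, and the
# fitted value drops by 1.0 per unit as x moves below the knot.
for x in [0.0, 1.0, 2.0, 3.0, 4.0]:
    print(x, predict(x, intercept=1.0, knot=2.0, slope_right=0.5, slope_left=-1.0))

# The interaction is nonzero only where both predictors exceed their knots.
print(interaction(3.0, 2.0, 5.0, 4.0))  # both above their knots
print(interaction(1.0, 2.0, 5.0, 4.0))  # x1 below its knot, so the term is 0
```

At the knot itself both hinges are zero, so the prediction equals the intercept and the two linear pieces join without a jump.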

Read More

Topics: Nonlinear Regression, basis functions, Regression Splines, MARS

Discussion Questions from "The Evolution of Regression"

Posted by Heather Hinman on Thu, Apr 4, 2013 @ 11:30 AM

During the course of Salford Systems' 4-part webinar series "The Evolution of Regression," some very good questions from the audience have made their way to presenter Dr. Dan Steinberg, CEO and Founder. Here are a few responses that we thought would benefit everyone who is interested in regression, nonlinear regression, regularized regression, decision tree ensembles and post-processing techniques.

Read More

Topics: TreeNet, Random Forests, GPS, MARS, Regression, Webinar, SPM 7

Hands-On Webinar "The Evolution of Regression Modeling"

Posted by Heather Hinman on Tue, Mar 12, 2013 @ 04:08 AM

Part 2 - Hands-On Component is this Friday! [March 15, 2013, 10 am PST (San Diego, CA)]

Overcoming Linear Regression Limitations

Regression is one of the most popular modeling methods, but the classical approach has significant problems. This webinar series addresses these problems. Are you working with larger datasets? Is your data challenging? Does your data include missing values, nonlinear relationships, local patterns and interactions? This webinar series is for you! We will cover improvements to conventional and logistic regression, and will include a discussion of classical, regularized, and nonlinear regression, as well as modern ensemble and data mining approaches. This series will be of value to any classically trained statistician or modeler.

Read More

Topics: GPS, MARS, Regression, Webinar, SPM 7

Video: The Evolution of Regression [Part 1]

Posted by Heather Hinman on Tue, Mar 5, 2013 @ 04:00 AM

Whether you were able to attend Part 1 of the webinar series "The Evolution of Regression Modeling: From Classical Linear Regression to Modern Ensembles" or not, the on-demand recording is now available. Watch the video and download the slides.

Read More

Topics: GPS, MARS, Regression, Webinar, SPM 7

The Evolution of Regression: An Upcoming Webinar Series

Posted by Heather Hinman on Tue, Feb 5, 2013 @ 05:35 AM

Regression is one of the most popular modeling methods, but the classical approach has significant problems. This webinar series addresses these problems. Are you working with larger datasets? Is your data challenging? Does your data include missing values, nonlinear relationships, local patterns and interactions? This webinar series is for you! We will cover improvements to conventional and logistic regression, and will include a discussion of classical, regularized, and nonlinear regression, as well as modern ensemble and data mining approaches. This series will be of value to any classically trained statistician or modeler.

Read More

Topics: GPS, MARS, Regression, Webinar

Handling Missing Values in MARS

Posted by Dan Steinberg on Wed, Dec 19, 2012 @ 02:37 PM

Question: How does MARS (Multivariate Adaptive Regression Splines) deal with missing values?

Read More

Topics: MARS, missing values

Case Study: Making Predictions with MARS

Posted by Heather Hinman on Tue, Oct 16, 2012 @ 03:41 PM

At the 2012 Salford Analytics and Data Mining Conference, Maria Lupetini from Qualcomm gave an easy-to-understand overview of how they are able to make predictions using Salford Systems' MARS (Multivariate Adaptive Regression Splines). A portion of the recorded presentation is below. Enjoy!

Read More

Topics: MARS, prediction