# Nonlinear Regression: Modern Approaches and Applications

## What is Nonlinear Regression?

Nonlinear regression is a form of regression analysis in which observational data are modeled by a function that is a nonlinear combination of the model parameters and depends on one or more independent variables. Traditionally, advanced modelers worked with nonlinear functions such as exponential, logarithmic, trigonometric, power, and Gaussian functions, and Lorenz curves. Some of these, such as the exponential and logarithmic functions, could be transformed into linear form, after which standard linear regression was performed. This classical approach has significant problems, however, especially when the modeler is working with larger datasets or when the data include missing values, nonlinear relationships, local patterns, or interactions.
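The classical transform-then-fit workflow can be sketched in a few lines of plain NumPy. The data and parameter values below are illustrative assumptions for the demo, not taken from the paper:

```python
import numpy as np

# Classical approach: an exponential model y = a * exp(b * x) becomes linear
# in its parameters after a log transform:  ln y = ln a + b * x
# (synthetic data; a = 2.0 and b = 0.3 are assumed values for illustration)
rng = np.random.default_rng(42)
x = np.linspace(1.0, 10.0, 50)
y = 2.0 * np.exp(0.3 * x) * np.exp(rng.normal(0.0, 0.05, 50))  # multiplicative noise

# Ordinary linear regression on (x, ln y) recovers the parameters.
slope, intercept = np.polyfit(x, np.log(y), 1)
a_hat, b_hat = np.exp(intercept), slope
```

The transform only works when the noise is multiplicative and the functional form is known in advance, which is precisely the limitation that motivates the adaptive methods discussed below.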

This paper is part of a series covering improvements to conventional and logistic regression, including classical, regularized, and nonlinear regression as well as modern ensemble and data mining approaches. We begin with Multivariate Adaptive Regression Splines (MARS).

**Nonlinear Regression Techniques:**

- Logistic Regression
- Regularized Regression: GPS Generalized Path Seeker
- Nonlinear Regression: MARS Regression Splines
- Nonlinear Ensemble Approaches: TreeNet Gradient Boosting; Random Forests; Gradient Boosting incorporating RF
- Ensemble Post-Processing: ISLE; RuleLearner

This whitepaper will focus on MARS nonlinear regression and offer case study examples.

## What is "MARS" Nonlinear Regression?

MARS, an acronym for Multivariate Adaptive Regression Splines, is a multivariate non-parametric regression procedure introduced in 1991 by the world-renowned Stanford statistician and physicist Jerome Friedman (Friedman, 1991). Salford Systems' MARS nonlinear regression, based on the original code, has been substantially enhanced with new features and capabilities in exclusive collaboration with Friedman.

## How does MARS nonlinear regression differ from linear regression?

Linear regression models typically fit straight lines to data. MARS approaches model construction more flexibly, allowing for bends, thresholds, and other departures from straight-line methods. MARS builds its model by piecing together a series of straight lines with each allowed its own slope. This permits MARS to trace out any pattern detected in the data.
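The straight-line pieces MARS joins together are built from mirrored hinge functions, max(0, x - t) and max(0, t - x), paired at a knot t. A minimal NumPy sketch (our illustration, not Salford's code) showing that an intercept plus a hinge pair reproduces a bent line exactly:

```python
import numpy as np

def hinge(x, knot, direction=+1):
    """MARS-style basis function: max(0, x - knot) or max(0, knot - x)."""
    return np.maximum(0.0, direction * (x - knot))

# A "bent" target: flat below x = 2, slope 3 above it.
x = np.linspace(0.0, 5.0, 101)
y = 3.0 * np.maximum(0.0, x - 2.0)

# Least squares on an intercept plus the mirrored hinge pair at the knot.
X = np.column_stack([np.ones_like(x), hinge(x, 2.0, +1), hinge(x, 2.0, -1)])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
pred = X @ coef  # recovers the kink exactly: coefficients (0, 3, 0)
```

Because each hinge is zero on one side of its knot, every segment of the fitted curve gets its own slope, which is how MARS traces arbitrary bends and thresholds.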

## Overview of the MARS Nonlinear Regression Methodology

The MARS nonlinear regression procedure builds flexible nonlinear regression models by fitting separate splines (or basis functions) to distinct intervals of the predictor variables. Both the variables to use and the end points of the intervals for each variable (referred to as knots) are found via a brute-force, exhaustive search procedure that uses very fast update algorithms and efficient program code. Variables, knots, and interactions are optimized simultaneously by evaluating a "loss of fit" (LOF) criterion: at each step, MARS chooses the addition that most improves the LOF. In addition to searching variables one by one, MARS also searches for interactions between variables, allowing any degree of interaction to be considered.
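As a toy illustration of the exhaustive search, the sketch below scores every (variable, knot) pair and keeps the best. Two simplifying assumptions on our part: plain residual sum of squares stands in for the LOF criterion, and candidate knots are restricted to observed data values.

```python
import numpy as np

def best_knot(X, y):
    """Brute-force search over every (variable, knot) pair, scored by
    residual sum of squares as a stand-in LOF criterion."""
    best = (np.inf, None, None)  # (lof, variable index, knot)
    for j in range(X.shape[1]):
        for knot in np.unique(X[:, j])[1:-1]:  # interior candidate knots
            B = np.column_stack([
                np.ones(len(y)),
                np.maximum(0.0, X[:, j] - knot),  # hinge above the knot
                np.maximum(0.0, knot - X[:, j]),  # mirrored hinge below
            ])
            coef, *_ = np.linalg.lstsq(B, y, rcond=None)
            lof = float(np.sum((y - B @ coef) ** 2))
            if lof < best[0]:
                best = (lof, j, float(knot))
    return best

# Noise-free signal hinging on variable 1 at knot 5; the search should find it.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 10.0, size=(200, 3))
y = 2.0 * np.maximum(0.0, X[:, 1] - 5.0)
lof, var, knot = best_knot(X, y)
```

The production algorithm avoids refitting from scratch for each candidate by using fast update formulas; the brute-force loop above only conveys what is being searched.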

The "optimal" MARS nonlinear regression model is selected in a two-phase process. In the first phase, a model is grown by adding basis functions (new main effects, knots, or interactions) until an overly large model is found. In the second phase, basis functions are deleted in order of least contribution to the model until an optimal balance of bias and variance is found. By allowing for any arbitrary shape for the response function as well as for interactions, and by using the two-phase model selection method, MARS is capable of reliably tracking very complex data structures that often hide in high-dimensional data.
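The backward phase can be caricatured as greedy deletion scored by a generalized cross-validation (GCV) style criterion. The specific penalty formula below is a common textbook choice, not necessarily Salford's:

```python
import numpy as np

def gcv(rss, n, n_basis, penalty=3.0):
    """GCV-style criterion: residual SS inflated by an effective model size."""
    c = n_basis + penalty * (n_basis - 1) / 2.0
    return rss / (n * (1.0 - c / n) ** 2)

def backward_prune(B, y, penalty=3.0):
    """Greedily delete columns of basis matrix B (column 0 = intercept,
    always kept) as long as deletion improves the GCV score."""
    def score(cols):
        coef, *_ = np.linalg.lstsq(B[:, cols], y, rcond=None)
        rss = float(np.sum((y - B[:, cols] @ coef) ** 2))
        return gcv(rss, len(y), len(cols), penalty)

    keep = list(range(B.shape[1]))
    best = score(keep)
    improved = True
    while improved and len(keep) > 1:
        improved = False
        for c in keep[1:]:
            trial = [k for k in keep if k != c]
            s = score(trial)
            if s < best:
                best, keep, improved = s, trial, True
                break
    return keep, best

# Overly large basis: the true hinge plus superfluous terms to be pruned away.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 200)
y = 2.0 * np.maximum(0.0, x - 5.0) + rng.normal(0.0, 0.1, 200)
B = np.column_stack([
    np.ones_like(x),             # intercept
    np.maximum(0.0, x - 5.0),    # the true hinge
    np.maximum(0.0, 5.0 - x),    # its mirror (true coefficient ~0)
    np.maximum(0.0, x - 1.0),    # superfluous
    np.maximum(0.0, x - 8.0),    # superfluous
])
keep, final_score = backward_prune(B, y)
```

Deleting a basis function barely changes the residual sum of squares when its contribution is noise, while the penalty term always shrinks, so the pruned model trades a little bias for a larger reduction in variance.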

## Core Capabilities

**MARS core capabilities include:**

- **Automatic variable search.** Large numbers of variables are examined using efficient algorithms, and all promising variables are identified.
- **Automatic variable transformation.** Every variable selected for entry into the model is repeatedly checked for non-linear response. Highly non-linear functions can be traced with precision via essentially piecewise regression.
- **Automatic limited interaction searches.** MARS repeatedly searches through the interactions allowed by the analyst. Unlike recursive partitioning schemes, MARS models may be constrained to forbid interactions of certain types, thus allowing some variables to enter only as main effects, while allowing other variables to enter as interactions, but only with a specified subset of other variables.
- **Variable nesting.** Certain variables are deemed to be meaningful (possibly non-missing) in the model only if particular conditions are met (e.g., X has a meaningful non-missing value only if categorical variable Y has a value in some range).
- **Built-in testing regimens.** The analyst can choose to reserve a random subset of data for test, or use v-fold cross-validation to tune the final model selection parameters.
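The v-fold idea in the last capability is generic. A plain sketch of the loop (not the MARS internals), here used to confirm that a hinge basis outperforms a straight line on bent data:

```python
import numpy as np

def cv_mse(fit, predict, x, y, v=10, seed=0):
    """Average held-out MSE over v folds for any fit/predict pair."""
    folds = np.array_split(np.random.default_rng(seed).permutation(len(y)), v)
    errs = []
    for k in range(v):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(v) if j != k])
        model = fit(x[train], y[train])
        errs.append(float(np.mean((y[test] - predict(model, x[test])) ** 2)))
    return float(np.mean(errs))

# Bent data: a straight line is biased, the hinge basis is not.
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 10.0, 300)
y = np.maximum(0.0, x - 5.0) + rng.normal(0.0, 0.2, 300)

def basis_fit(basis):
    return lambda x_, y_: np.linalg.lstsq(basis(x_), y_, rcond=None)[0]

def basis_predict(basis):
    return lambda coef, x_: basis(x_) @ coef

def linear(x_):
    return np.column_stack([np.ones_like(x_), x_])

def hinged(x_):
    return np.column_stack(
        [np.ones_like(x_), np.maximum(0.0, x_ - 5.0), np.maximum(0.0, 5.0 - x_)])

mse_linear = cv_mse(basis_fit(linear), basis_predict(linear), x, y)
mse_hinged = cv_mse(basis_fit(hinged), basis_predict(hinged), x, y)
```

Scoring candidate models on held-out folds rather than training fit is what lets the final selection parameters be tuned without overfitting.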

## Applications

This new, flexible regression modeling tool is applicable to a wide variety of data analyses, particularly those in which variables may need transformation and interaction effects are likely to be relevant. The nonlinear regression software can assist a data analyst in rapidly searching through many plausible models and quickly identifying important interactions, insights that can lead to significant model improvements. Further, because the regression modeling software can be run with intelligent default settings, for the first time analysts at all levels can easily access MARS innovations.

## Graphical User Interface

Salford Systems' MARS has an easy-to-use, intuitive graphical user interface (GUI). As shown below, the interface allows the user to control the variables and functional forms to be entered into the model and the interactions to be considered or forbidden, while allowing the MARS nonlinear regression algorithm to optimize those parts of the model the analyst chooses to leave free.

Once the model is selected, the user can easily remove or add terms, instantly see the impact of changes on model fit, review diagnostics that assist in model selection, save the model and apply the model to new data for prediction. Other MARS GUI features include an optional batch/command-line mode, spreadsheet-style browsing of the input data set, and summary text reports. The enhanced MARS text report includes extensions to the "classic" output (e.g., addition of residual sums of squares, loglikelihood, and other useful diagnostics), making the results easier to comprehend and assisting the analyst in refining the model in subsequent runs. In addition, the MARS interface provides all essential data management facilities for:

- New variable creation and deletion,
- Sorting, merging, and concatenating of data sets,
- Deletion of cases,
- Random, stratified and exact count sampling, and
- Filtering of cases into training and/or test and hold-out samples.

**Visualization of Results**

In addition to summary text reports, MARS results are also displayed in the Results dialog box. The GUI output includes ANOVA decomposition, variable importance, and final model tables as well as graphical plots. MARS automates both the selection of variables and the non-parametric transformation of variables to achieve the best model fit. Variable transformation is accomplished implicitly through the piecewise regression function used by MARS to trace arbitrary non-linear functions. MARS communicates this non-parametric transformation graphically, displaying the predicted response as a function of either one or two variables. MARS automatically produces 2-D plots for main effects (response variable as a function of each predictor) and 3-D surface plots for interactions, with options to spin and rotate. For higher-order interactions, the user can choose slices of the function for display of 2-D and 3-D subspaces. Examples of main effects and interaction plots are shown below.

**References**

Friedman, J. H. (1991), "Multivariate Adaptive Regression Splines" (with discussion), Annals of Statistics, 19(1), 1-141.

Steinberg, D., and Colla, P. L. (1995), CART: Tree-Structured Nonparametric Data Analysis, San Diego, CA: Salford Systems.

Steinberg, D., Colla, P. L., and Martin, K. (1999), MARS User Guide, San Diego, CA: Salford Systems.