Dr. Grant Humphries, from the Zoology department at the University of Otago, New Zealand, has spent the last three years studying how a bird species, the Sooty Shearwater, can help predict upcoming El Niño events. After much research, he has found a way to do so using data mining.
When MARS develops a model, it actually develops many and presents you with the one it judges best based on a self-testing procedure. But the so-called MARS optimal model may not be satisfactory from your perspective. It might be too small (too few variables), too large (too many variables), too complex (too many splines, basis functions, or breaks in variables), or otherwise not to your liking based on your domain knowledge. So what can you do to override the MARS process?
Many analysts highly value the ability to rank predictors in a database. It comes down to knowing what matters and what does not. Especially when working with a large number of variables, being able to focus on a relatively small number helps decision makers communicate their findings with confidence. In the SPM software suite, every one of Salford Systems’ data mining engines offers a plausible ranking of the available predictors, but Random Forests offers a unique twist on this concept.
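The excerpt above does not say what the twist is, so the following is only an illustration of what ranking predictors can look like in practice. It sketches permutation importance, a ranking method commonly associated with random forests: shuffle one predictor at a time and measure how much the model's error grows. This is not SPM's implementation. The model (`predict`), the data, and the predictor names (`x0`, `x1`, `noise`) are all hypothetical, and a fixed formula stands in for a trained forest to keep the sketch short.

```python
import random


def mean_squared_error(predict, rows, targets):
    """Average squared difference between predictions and targets."""
    return sum((predict(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(predict, rows, targets, names, seed=0):
    """Rank predictors by how much shuffling each one increases the error."""
    rng = random.Random(seed)
    baseline = mean_squared_error(predict, rows, targets)
    scores = {}
    for j, name in enumerate(names):
        # Shuffle column j only, leaving every other predictor untouched.
        column = [r[j] for r in rows]
        rng.shuffle(column)
        shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
        scores[name] = mean_squared_error(predict, shuffled, targets) - baseline
    # Largest error increase first: that predictor matters most to the model.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical data: three predictors, of which the model uses only two.
data_rng = random.Random(42)
rows = [[data_rng.uniform(0, 1) for _ in range(3)] for _ in range(200)]


def predict(row):
    # Stand-in for a fitted model (an assumption, not a real forest).
    return 3.0 * row[0] + 0.5 * row[1]


targets = [predict(r) for r in rows]
ranking = permutation_importance(predict, rows, targets, ["x0", "x1", "noise"])

for name, score in ranking:
    print(f"{name}: {score:.4f}")
```

Because the stand-in model ignores the third predictor, shuffling it leaves the error unchanged and it lands at the bottom of the ranking, while the heavily weighted `x0` lands at the top. With a real forest, the same idea is typically applied to held-out or out-of-bag records.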