Simply Salford Blog

A Different Kind of Data Mining Automation

Posted by Heather Hinman on Thu, Mar 6, 2014 @ 06:18 AM

When you hear 'data mining automation,' what do you usually think of? In general, the automation world has us thinking about the following process: you upload a dataset, select a multitude of modeling engines, then out comes the best model, and voilà, you're done. There are many positives in this scenario: you save time, and it's generally reliable regarding the 'best model.' However, what about automating model building EXPERIMENTS? You ask: what do you mean by experiments?

The Salford philosophy of modeling automation is to assist the modeler as far as possible by anticipating the routine stages of modeling experimentation, allowing the modeler to make good decisions rapidly and to avoid the mistakes that come from failing to run useful tests and diagnostics. The goal is NOT to replace the analyst, but to help the analyst do a better and faster job of predictive model development.

A few examples of Salford automation include: 

  1. BATTERY TARGET is one of the special automation features available on the Advanced Battery Tab in the SPM software. This feature can be used for testing multivariate relationships among predictors.  The Battery will take one variable at a time, using it as the target variable, and build a model using the remaining variables as the predictors. (Available in Pro, Pro Ex, Ultra)
  2. BATTERY SHAVING helps to identify subsets of informative predictors within large datasets containing noisy and possibly correlated variables. With this automation, you may accomplish significant model reduction with minimal (if any) sacrifice to model accuracy. (Available in Pro Ex, Ultra)
  3. BATTERY PRIORS is typically used in fraud detection applications and hotspot analysis. For example, the analyst may be interested in identifying different sets of rules associated with fraud. BATTERY PRIORS tries a range of prior probabilities supplied by the analyst to build a sequence of models. Based on the sequence, one can identify small segments with potentially 100% fraud, larger segments with a lower probability of fraud, and everything in between. (Available in Pro Ex, Ultra)
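
To make the idea behind BATTERY TARGET concrete, here is a minimal Python sketch of the loop it describes: each variable takes a turn as the target and is modeled from all the remaining variables. This is not SPM code or its API. The function names (`solve`, `fit_r_squared`, `target_battery`) are hypothetical, the data is synthetic, and ordinary least squares with in-sample R-squared stands in for whatever modeling engine and performance measure you would actually use.

```python
import random
from typing import Dict, List

Table = Dict[str, List[float]]  # column name -> column values


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_r_squared(rows: List[List[float]], y: List[float]) -> float:
    """Fit ordinary least squares (with intercept) and return in-sample R-squared."""
    design = [[1.0] + row for row in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(coef * v for coef, v in zip(beta, d)) for d in design]
    mean_y = sum(y) / len(y)
    sse = sum((t - f) ** 2 for t, f in zip(y, fitted))
    sst = sum((t - mean_y) ** 2 for t in y)
    return 1.0 - sse / sst


def target_battery(data: Table) -> Dict[str, float]:
    """Use each column in turn as the target and model it from all the others."""
    n_rows = len(next(iter(data.values())))
    scores = {}
    for target in data:
        predictors = [name for name in data if name != target]
        rows = [[data[p][i] for p in predictors] for i in range(n_rows)]
        scores[target] = fit_r_squared(rows, data[target])
    return scores


# Synthetic data: c depends on a and b, while noise is unrelated to everything.
rng = random.Random(0)
a = [rng.gauss(0, 1) for _ in range(200)]
b = [rng.gauss(0, 1) for _ in range(200)]
c = [x + y + 0.1 * rng.gauss(0, 1) for x, y in zip(a, b)]
noise = [rng.gauss(0, 1) for _ in range(200)]

scores = target_battery({"a": a, "b": b, "c": c, "noise": noise})
for name, r2 in scores.items():
    print(f"{name}: R^2 = {r2:.3f}")
```

In this toy setup, variables that are well explained by the others (a, b and c) should score high, while the unrelated noise column should score near zero, which is the kind of multivariate relationship the battery is meant to surface.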

Salford Systems' enhanced technology accelerates the model building process by automating substantial portions of the model exploration, discovery and refinement process. While the analyst is always in full control, we anticipate the next best steps to take and offer to run these for the analyst. The parameters, settings and options that give the best results are clearly highlighted according to various criteria, such as:

  • Classification accuracy
  • Area under ROC curve (sensitivity vs. specificity, precision vs. recall)
  • Lift in the top percentiles, and
  • R-squared for regression
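
For readers who want to see what these criteria measure, the following Python sketch computes each one from scratch. It is only an illustration under simple assumptions (binary 0/1 labels, higher score means more likely positive); the function names are hypothetical and do not reflect how SPM computes or reports these figures.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of cases whose predicted class matches the actual class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve: the chance that a random positive case
    outscores a random negative case (ties count as half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def lift_at(y_true: Sequence[int], scores: Sequence[float], fraction: float) -> float:
    """Response rate in the top-scoring fraction divided by the overall rate."""
    ranked = [t for _, t in sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)]
    k = max(1, round(fraction * len(ranked)))
    top_rate = sum(ranked[:k]) / k
    base_rate = sum(ranked) / len(ranked)
    return top_rate / base_rate


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Share of the target's variance explained by the predictions."""
    mean_y = sum(y_true) / len(y_true)
    sse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    sst = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - sse / sst


# Made-up example: ten cases, three of them positive.
labels = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
predicted = [1 if s > 0.55 else 0 for s in scores]

print("accuracy:", accuracy(labels, predicted))
print("ROC AUC:", roc_auc(labels, scores))
print("lift in top 20%:", lift_at(labels, scores, 0.2))
print("R-squared:", r_squared([1, 2, 3, 4], [1.1, 1.9, 3.2, 3.8]))
```

Each measure rewards something different, which is why a battery of models can have a different 'best' member depending on the criterion you pick.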

This technology is perfect for the analytics professional who sees the value in automation at every stage of the model building process. Have you tried some of the 'Batteries' included in the SPM software suite? Whether you have or have not, I highly recommend that you watch an awesome training video series that Salford Systems' Senior Scientist Mikhail Golovnya recorded just for you (those who see value in this automation).




Topics: video, Battery
