Simply Salford Blog

A Different Kind of Data Mining Automation

Posted by Heather Hinman on Thu, Mar 6, 2014 @ 03:18 AM

When you hear "data mining automation," what do you usually think of? In general, the automation world has us thinking about the following process: you upload a dataset, select a multitude of modeling engines, then out comes the best model, and voilà, you're done. There are many positives in this scenario: you save time, and the result is generally reliable as the 'best model.' However, what about automating model-building EXPERIMENTS? You ask: what do you mean by experiments?

Topics: video, Battery

How To Win A Data Mining Challenge: DataLab USA Knows [customer success]

Posted by Heather Hinman on Tue, Nov 12, 2013 @ 10:22 AM

Congratulations to DataLab USA! The database marketing solutions company took first place in the 2013 Direct Marketing Association (DMA) modeling challenge with their successful use of the SPM Ultra software.

Topics: video, Battery, marketing, direct marketing, data mining challenge

Probabilities in CART Trees (Yes/No Response Models)

Posted by Dan Steinberg on Tue, Oct 15, 2013 @ 12:43 PM

Probabilities in CART trees are quite straightforward and are displayed for every node in the CART navigator. Below we show a simple example from the KDD Cup '98 data predicting response to a direct mail marketing campaign.
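As a rough illustration of why node probabilities in a yes/no response tree are easy to read, here is a minimal sketch in Python. It assumes the probability shown for a node is simply the share of responders among the records that fall in that node; this is not the CART implementation. The function name and the node counts are hypothetical and are not taken from the KDD Cup '98 data.

```python
def node_response_probability(responders: int, non_responders: int) -> float:
    """Share of responders among all records in a node (assumed definition)."""
    total = responders + non_responders
    if total == 0:
        raise ValueError("node contains no records")
    return responders / total


# Hypothetical (responders, non-responders) counts for a root node and its
# two children; the children's counts add up to the root's counts.
nodes = {
    "root": (500, 9500),
    "left_child": (350, 3650),
    "right_child": (150, 5850),
}

probabilities = {
    name: node_response_probability(*counts) for name, counts in nodes.items()
}

for name, p in probabilities.items():
    print(f"{name}: {p:.4f}")
```

In this made-up split, the left child concentrates responders at a higher rate than the root, which is the kind of lift a response model looks for.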

Topics: Battery, CART, classification

Supercharging Predictive Modeling: A Beginners' Perspective

Posted by Heather Hinman on Fri, Apr 26, 2013 @ 06:05 AM

Salford Systems recently attended the Predictive Analytics World (PAW) conference in San Francisco as a sponsor. Manning the exhibit booth was yours truly, and I was fortunate to meet many analysts, predictive modelers, and data scientists of all experience levels. Even though this is always an entertaining break from everyday office life, my favorite part of the conference was being able to participate in a workshop offered by Dean Abbott, President of Abbott Analytics, entitled "Supercharging Prediction: Hands-On With Ensembles Models."

Topics: Battery, TreeNet, Random Forests, stochastic gradient boosting, beginner

6 Tips for Optimizing TreeNet Gradient Boosting Models

Posted by Dan Steinberg on Tue, Jan 15, 2013 @ 08:22 AM

While TreeNet (Stochastic Gradient Boosting) can work phenomenally well out of the box, it almost always pays to tune your control parameters. Devoting time to optimizing a TreeNet model can noticeably improve its out-of-sample performance.

Topics: Battery, TreeNet

CART on Windows Memory Matters (Part 1)

Posted by Dan Steinberg on Tue, May 15, 2012 @ 04:24 AM

This note is actually relevant to all Salford data mining engines. CART, and SPM generally, can use quite a bit of memory for storing the trees grown and all the relevant model statistics and graphs. This is because CART gives you full access to every size of tree grown, with performance stats available for each. In a session today I ran about 50 individual CART trees (I wrote a small command script for this) and about 1,000 additional CART trees using the BATTERY mechanism. Every tree grown, and every subtree, was available for me to examine instantaneously. When I ran another model and wanted to drill down into the details, I received a message that the "system was running low on resources" and I was advised to close some open windows. Naturally, I was not eager to start closing windows one by one, so what to do?

Topics: Battery, SPM, CART