Simply Salford Blog

Advantages of Running SPM on Win7 vs. WinXP

Posted by Dan Steinberg on Mon, Mar 25, 2013 @ 05:23 AM

I have a feeling I was not the only one still hanging on to my Windows XP machine, even though the new Windows 7 OS was widely released way back towards the end of 2009. After a sour experience with Windows Vista, as well as a disastrous upgrade to some .NET software, I was reluctant to take any chances modifying a system that had been working well for me for years. So I doggedly stuck with XP throughout 2010 and 2011. However, about a year ago I decided to take the plunge and added a new 64-bit Windows 7 VM on my MacBook, and I am happy to report that I am very glad I finally made the change. The point of this post is simply to recommend the upgrade to Windows 7 for all Windows-based users of Salford Systems' software still running XP. You will not regret it.

The first and foremost advantage I have seen is access to far more RAM within the 64-bit working environment. This allows SPM more workspace for the GUI and will make the operation of even older 32-bit SPMs smoother. If you need to work with data files larger than can comfortably fit within the older 1 GB limits (due to Microsoft limitations) you will have far more flexibility under Windows 7. Further, you can now upgrade your license to push to the limits supported by your hardware. At Salford Systems our best in-house server is equipped with 256 GB RAM and we regularly leverage all of it when working with larger data files. My MacBook now has 16 GB RAM, of which I allocate 8 GB to Windows 7, and the ability to work with substantial data sets has been a joy.

Windows 7 is also a fair bit more responsive as a working environment for anyone with a newer multi-core machine. The higher versions of SPM 7.0 can also leverage multiple cores for faster run times but you will need Windows 7 to take advantage of this. Good luck with your upgrade!

If you've also compared performance across different operating systems, let us know your experiences and takeaways in the comments!
Read More

Topics: SPM, Windows, VM, Mac

Supercharging Prediction: Hands-On Data Mining Workshop [Sponsored]

Posted by Heather Hinman on Fri, Mar 1, 2013 @ 05:00 AM

Attending Predictive Analytics World (PAW) in San Francisco? Don't miss this workshop "Supercharging Prediction: Hands-On with Ensemble Models" sponsored by Salford Systems.

Read More

Topics: SPM, workshop

SPM Handling of Categorical Variables - Stop the Confusion!

Posted by Dan Steinberg on Wed, Jan 23, 2013 @ 08:01 AM

When a variable is declared categorical in SPM, each value in the dataset is treated as a separate category. This can cause confusion if some of the values of a numeric categorical variable are very close together. Consider, for example, a variable Y with values 0, .9996, .9998, .9999, and 1.0. If declared categorical, and the default display precision of three decimal places is used, the target frequency table might look something like this:
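The rounding effect itself can be illustrated outside SPM. The following is a minimal Python sketch, not SPM output and not the table from the original post: it assumes the third value is .9998 and uses ordinary fixed-point formatting to stand in for a three-decimal display. The helper name display_label is hypothetical.

```python
from collections import defaultdict

# The five distinct values of Y from the example (third value assumed to be .9998).
values = [0.0, 0.9996, 0.9998, 0.9999, 1.0]


def display_label(value, decimals=3):
    """Format a value the way a fixed-precision display would show it."""
    return f"{value:.{decimals}f}"


# Group the underlying values by the label a three-decimal display would print.
by_label = defaultdict(list)
for v in values:
    by_label[display_label(v)].append(v)

for label, members in by_label.items():
    print(label, "<-", members)
```

Five separate categories collapse to two printed labels at three decimals, so several rows of a frequency table would appear to share the same value. Calling display_label(v, 4) instead keeps all five labels distinct.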

Read More

Topics: SPM, Categorical Variables

How to recover your session logs in SPM

Posted by Dan Steinberg on Wed, Dec 12, 2012 @ 09:35 AM

How Did I Get That Model? (click for the slides)

Read More

Topics: SPM

A Quick Tip on Faster Processing in SPM

Posted by Dan Steinberg on Mon, Oct 8, 2012 @ 11:06 AM

For those comfortable working with scripts, we have a tip on getting your results faster, with less overhead and little to no screen clutter:

From the menus select File/Submit Command File

Read More

Topics: SPM, command line, GUI, speed

Is my dataset too big to build a model in SPM?

Posted by Mikhail Golovnya on Thu, Jul 12, 2012 @ 09:55 AM

How much memory is needed in order to complete a model building run in SPM?

Read More

Topics: SPM

CART on Windows Memory Matters (Part 1)

Posted by Dan Steinberg on Tue, May 15, 2012 @ 04:24 AM

This note is actually relevant to all Salford data mining engines. CART, and SPM generally, can use quite a bit of memory for storing the trees grown and all the relevant model statistics and graphs. This is because CART gives you full access to every size of tree grown, with performance stats available for each. In a session today I ran about 50 individual CART trees (I wrote a small command script for this) and about 1,000 additional CART trees using the BATTERY mechanism. Every tree grown, and every subtree, was available for me to examine instantaneously. When I ran another model and wanted to drill down into the details I received a message that the "system was running low on resources" and I was advised to close some open windows. Naturally, I was not eager to start closing windows one-by-one, so what to do?

Read More

Topics: Battery, SPM, CART

Prediction Generation in the Salford Predictive Modeler

Posted by Dan Steinberg on Wed, May 2, 2012 @ 10:00 AM

Once you have built an SPM model (CART, MARS, TreeNet, RandomForests) and
have saved the grove (.GRV) file you are in a position to make predictions
for any other data set containing relevant predictors. Thus, if you trained
your model on file A using variables X1, X2,...,X50, for example, you can now
generate predictions for file B, provided that file B contains at least some of the same
variables (and preferably all of the variables actually used in the model).
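Before scoring, it can help to check how many of the model's predictors are actually present in file B. The following is a minimal Python sketch of such a check. It is not part of SPM and does not read .GRV files; the function name predictor_coverage and the sample column names are hypothetical.

```python
def predictor_coverage(model_vars, file_columns):
    """Split the model's predictors into those present in and missing from a file.

    Names are compared case-insensitively, which is an assumption of this sketch.
    """
    available = {name.upper() for name in file_columns}
    present = [v for v in model_vars if v.upper() in available]
    missing = [v for v in model_vars if v.upper() not in available]
    return present, missing


# Hypothetical example: the model used X1..X5, and file B lacks X4.
model_vars = ["X1", "X2", "X3", "X4", "X5"]
file_b_columns = ["ID", "x1", "X2", "X3", "X5", "REGION"]

present, missing = predictor_coverage(model_vars, file_b_columns)
print("present:", present)
print("missing:", missing)
```

An empty missing list corresponds to the preferred case described above, where file B contains all of the variables used in the model.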

This process of prediction generation is called SCORING in our software and
most models are built specifically so that they can be put into production to
generate predictions. The process can also be used for SIMULATION. In this case
you prepare a data set which will also contain the columns X1, X2, ...,X50 but
the values appearing may not necessarily be real data. Instead the file could contain
hypothesized or imagined values, or forecasted values, as in the case when you
want to make predictions for certain possible future scenarios.
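One way to prepare such a simulation file is to start from a baseline row and vary a few columns over hypothesized values. The following is a minimal Python sketch under that assumption. The column names, baseline values, and the helper build_scenarios are all hypothetical, and the resulting CSV text would still need to be saved and scored in SPM separately.

```python
import csv
import io
import itertools


def build_scenarios(baseline, overrides):
    """Return one row per combination of the hypothesized values in overrides."""
    names = list(overrides)
    rows = []
    for combo in itertools.product(*(overrides[n] for n in names)):
        row = dict(baseline)           # start from the baseline values
        row.update(zip(names, combo))  # substitute this scenario's values
        rows.append(row)
    return rows


# Hypothetical baseline and the values to try for two of the predictors.
baseline = {"X1": 10.0, "X2": 0.5, "X3": 3}
overrides = {"X1": [8.0, 10.0, 12.0], "X2": [0.4, 0.6]}
scenarios = build_scenarios(baseline, overrides)

# Write the scenarios as CSV text with the same columns as the baseline.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(baseline))
writer.writeheader()
writer.writerows(scenarios)
print(buffer.getvalue())
```

Each row keeps the same columns as the training data, so only the values differ from real records.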

Read More

Topics: SPM, prediction