Let's get right to it! You're a beginner, and you want to know what is needed to start data mining and become an experienced data scientist overnight. We get it - this is the world we live in - quick and dirty. So here we go, take notes!
Familiarize yourself with CART decision tree technology in this beginner's tutorial using a telecommunications example dataset from the 1990s. By the end of this tutorial you should feel comfortable using CART on your own with sample or real-world data.
This note is a quick tip on getting around the “import data” function of SPM via the File…Open menu. Suppose you start with this and see something like the following:
The explosion of interest in predictive analytics, and in particular sophisticated predictive analytics, has led to renewed interest in how to plan for a predictive analytics project. The topic is hardly new, but anyone involved in planning for or conducting a serious analysis project needs to be cognizant of the fundamentals. Here we draw on insights and ideas from two classic discussions: the 1996 paper by Fayyad, Piatetsky-Shapiro, and Smyth, “From Data Mining to Knowledge Discovery in Databases,” and the 1999 “Cross-Industry Standard Process (CRISP) for Data Mining” developed by analytical practitioners at Daimler-Chrysler, ISL (later acquired by SPSS and now part of IBM), and NCR (Teradata). While these first-generation discussions clearly date from an earlier technological era, the principal ideas are still relevant and are being rediscovered and translated into modern terminology daily. Here we offer a summary of the essentials viewed from our perspective of 20 years of experience in the world of data mining and predictive analytics.
Salford Systems recently attended the Predictive Analytics World (PAW) conference in San Francisco as a sponsor. Manning the exhibit booth was yours truly, and I was fortunate to meet many analysts, predictive modelers, and data scientists of all experience levels. Although the conference is always an entertaining break from everyday office life, my favorite part was participating in a workshop offered by Dean Abbott, President of Abbott Analytics, entitled "Supercharging Prediction: Hands-On With Ensembles Models."
This week's blog article features a great entry-level tutorial video on how to build predictive models with TreeNet stochastic gradient boosting. Check it out, enjoy, and share your comments!