I’ve grown up reading Tom Clancy and probably most of you have at least seen Red October, so this book caught my eye when browsing used books for a recent trip. It’s a fairly human look at what’s involved in sailing on a Trident missile submarine…
Data mining is a great way to help your company make decisions and predict future values. The Data Mining Algorithms built into SQL Server Analysis Services gives you this power.
The Adventure works data base comes with views that are already set up to perform data mining. Let’s take a look at one of these views. Open the Target Mal view by right clicking on it in SSMS and select Design.
The design view shows you the query used to create the view. The dim Customer Table and another view are used to create the Target Mail view. From Dim Customer we get a lot of the customer information and the other view has the bike purchase information. This gives you a simple view of who has bought a bike and the attributes of those customers.
In the query you can see a Case Statement that just gives a one for bike buyers and a zero for non bike buyers.
Now that you understand the data you can go into BIDS and perform some data mining.
In BIDS start an Analysis Services project, and then create a Data Source and a Data Source View containing the Target Mail View.
The first step is to create a mining structure. You can think of the mining structure as the blue print for the data mining models that are going to be created on the mining structures.
1. Right Click on the Mining Structures folder in the Solution Explorer and select New Mining Structure. Next->
Now you are on the screen asking what data you want to use to train your mining model. Here you will place a check next to the columns you want to use in determining which members will most likely be bike buyers based on your existing customers. You also need to select a Key and a Predict column. The Key will be the Customer Key and should already be checked. The Predict columns will be the BikeBuyer because that is what we are trying to predict.
6. Place a check next to Bike Buyer under Predict.
7 .Place a check under input next to the following columns: Age, Commute Distance, Gender, House OwnerFlag, Mairtal Status, NumberCarsOwned, NumberChildrenHome, Region, TotalChildren, and Yearly Income. Next->
8. Click the detect button. This is setting the data types for the mining model. A description of each of these types can be found here. http://msdn.microsoft.com/en-us/library/ms174572.aspx
You have now created the Mining Structure and a Mining model using the Decision Tree Algorithm. Now it is time to process and deploy the model. Click on the Mining Models Tab and view the Model. Notice the Bike Buyer is set to Predict Only and the Customer Key is the Key. The rest of the columns are set to input.
11. Click on the Mining Model Viewer, you will receive a popup asking to process and deploy, click yes. You might receive other popup warning about the number of rows, click yes.
12. Click Run in the process screen.
13. Click Close on the process screen once the process is complete.
14. Click Close the process screen too.
Now you will be in the Mining Model Viewer and be able to see the Decision Tree model Notice how age is in the first level of the tree. Now browse the model and you can see what traits of the customers are most likely to buy a bike. Change the background to 1 to see the most likely bike buyers. Slide the level over to the right to see all levels.
Click on a level and look at the properties on the right and see the probability of them being a bike buyer.
Congratulations you have created you first Data Mining Structure and Model.