
Data Mining Introduction Part 5: the Neural Network Algorithm


In earlier articles I explained several other Microsoft Data Mining algorithms.

There is also an introduction to this series if you are interested.

Using these algorithms, we examined a view in SQL Server and predicted the probability that customers would buy a bike from the fictitious company Adventureworks. In this new chapter we will talk about the Neural Network algorithm, which is my favorite.

As the name says, the Neural Network is an algorithm based on the way we think the brain works. Let's start by comparing the human being with the Microsoft Neural Network using a simple example: the baby example.

When babies come into the world, they experiment with their environment. They eat dirt, flies, and paper, and they learn from these experiences. They receive the dirt as input and, if they like it, it becomes part of their menu. Using that input, the neural network in their brain creates connections, and babies learn which food is best for them and which food can be rejected.

The Microsoft Neural Network is organized in a similar way.

It has three layers: the input layer, the hidden layer, and the output layer.
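To make the three layers concrete, here is a minimal Python sketch of one pass of data through a tiny network. It is only an illustration: the weights, biases, and input values are made up, and the function names are hypothetical. The choice of tanh for the hidden layer and a sigmoid for the output layer follows my reading of the Microsoft technical reference, so treat it as an assumption.

```python
import math

def sigmoid(x):
    """Squash any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights, biases, activation):
    """One fully connected layer: each neuron sums its weighted inputs,
    adds its bias, and applies the activation function."""
    return [
        activation(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]

def forward(inputs, hidden_w, hidden_b, output_w, output_b):
    """Input layer -> hidden layer -> output layer."""
    hidden = layer(inputs, hidden_w, hidden_b, math.tanh)
    output = layer(hidden, output_w, output_b, sigmoid)
    return hidden, output

# Made-up numbers, for illustration only.
inputs = [1.0, 0.0, 1.0]                         # three input neurons
hidden_w = [[0.5, -0.3, 0.8], [-0.6, 0.1, 0.4]]  # two hidden neurons
hidden_b = [0.0, 0.1]
output_w = [[1.2, -0.7]]                         # one output neuron
output_b = [-0.2]

hidden, output = forward(inputs, hidden_w, hidden_b, output_w, output_b)
print("hidden layer:", hidden)
print("output layer:", output)
```

The single output value is always between 0 and 1, so it can be read as a probability, such as the probability of buying a bike.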

The Input Layer

If we think about the baby, the input would be the dirt. The baby eats the dirt, tastes it, and decides whether he likes it. In Microsoft Data Mining the input is a view with the past experience of customers who did or did not buy a bike. From that input, the Neural Network draws inferences and makes its predictions. In general, the more data it has, the more precise the prediction is.
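One way to picture how a row of the view becomes input for a network is one-hot encoding, where every possible state of an attribute gets its own input position. The sketch below assumes a hypothetical set of attributes and states; the real ones come from the columns of the view, and this is not necessarily how the Microsoft algorithm encodes them internally.

```python
# Hypothetical attribute states, for illustration only.
AGE_BANDS = ["under 40", "40-45", "over 45"]
GENDERS = ["F", "M"]
REGIONS = ["Europe", "North America", "Pacific"]

def one_hot(value, states):
    """Return a list with 1.0 at the position of `value` and 0.0 elsewhere."""
    if value not in states:
        raise ValueError(f"unknown state: {value!r}")
    return [1.0 if value == state else 0.0 for state in states]

def encode_customer(customer):
    """Concatenate the one-hot encoding of each attribute into one input vector."""
    return (
        one_hot(customer["age_band"], AGE_BANDS)
        + one_hot(customer["gender"], GENDERS)
        + one_hot(customer["region"], REGIONS)
    )

customer = {"age_band": "40-45", "gender": "F", "region": "Pacific"}
input_vector = encode_customer(customer)
print(input_vector)
# prints [0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0]
```

Each position in the vector plays the role of one input neuron: it is switched on when the customer has that state and off otherwise.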

The Hidden Layer

In the baby example, the brain creates different connections and sends electricity through different paths. When a baby eats dirt, the brain sends a bad electrical sensation and the baby learns that the dirt does not taste good (for some babies).

In our example, the Microsoft algorithm tests different combinations of possibilities. For instance, it analyzes whether people from 30 to 45 years old have a high probability of buying a bike. If the result is positive, it keeps that result and continues comparing the other attributes of the customer (gender, salary, cars, etc.).

The Output Layer

The output is the result of the experience: whether the baby likes the dirt or not. He tastes the food with his mouth and determines what is best for himself.

Neural Networks can be applied to OCR, speech recognition, image analysis, and other artificial intelligence tasks. In this case we are going to use neural networks for our Data Mining example.

In the Microsoft Neural Network, the system tests the different combinations of states and finds the option that best suits the needs. The output is the result of the different tests made by the algorithm.
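The idea of trying, measuring the error, and adjusting can be sketched with a single neuron trained by plain gradient descent. This is a deliberately simplified stand-in, not the training method the Microsoft algorithm uses, and the four training rows are invented toy data (a customer buys only when aged 40-45 and not owning 4 cars).

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def predict(weights, bias, row):
    """Probability of buying a bike according to the current weights."""
    return sigmoid(sum(w * x for w, x in zip(weights, row)) + bias)

def log_loss(weights, bias, rows, labels):
    """Average error of the predictions; lower is better."""
    eps = 1e-12  # avoids log(0)
    total = 0.0
    for row, y in zip(rows, labels):
        p = predict(weights, bias, row)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(rows)

def train(rows, labels, epochs=500, rate=0.5):
    """Nudge the weights after every row, in the direction that reduces the error."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for row, y in zip(rows, labels):
            error = predict(weights, bias, row) - y
            weights = [w - rate * error * x for w, x in zip(weights, row)]
            bias -= rate * error
    return weights, bias

# Invented toy data: [is_age_40_45, has_4_cars] -> bought a bike (1) or not (0)
rows = [[1, 0], [1, 1], [0, 0], [0, 1]]
labels = [1, 0, 0, 0]

loss_before = log_loss([0.0, 0.0], 0.0, rows, labels)
weights, bias = train(rows, labels)
loss_after = log_loss(weights, bias, rows, labels)

print("loss before training:", loss_before)
print("loss after training: ", loss_after)
print("P(buy | 40-45, fewer than 4 cars):", predict(weights, bias, [1, 0]))
```

After training, the error is lower than at the start and the neuron gives a high probability only to the kind of customer that bought a bike in the toy data.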

Getting started

In parts 2 and 3 of this series I explained how to create the other algorithms based on a simple view with the customer information. Based on that information, we created a Data Mining model and added the different algorithms.

We are going to continue using the model of earlier chapters and add the new Neural Network Algorithm. Follow these steps.

  1. Open the Adventureworks project used in earlier chapters and double-click Targeted Mailing.

  2. In that project we already added the views and the inputs for the Data Mining project. Now we are going to add the Neural Network algorithm. In the Mining Models tab, press the Create a related mining model icon.

  3. Type any name in the Model Name textbox (this article uses My neural network) and choose Microsoft Neural Network as the algorithm name.

  4. If everything is OK, the new algorithm is added to the mining structure.

  5. In the Mining Models tab, click the Process the mining structure icon.

  6. In the processing window that opens, press the Run button.

  7. Once the process is done, close the window.

  8. To see the model, go to the Mining Model Viewer tab and select My neural network.

  9. You will find that customers older than 88 would not buy a bike (Favors 0), presumably because they are too old to ride one. The same applies to people aged 74-79 or 79-88. On the other hand, people from the Pacific region are likely to buy a bike, so they are potential customers (Favors 1). If the customer has 4 cars, he may not buy a bike.

Customers with 3 children may not want to buy a bike, while customers between 40 and 45 years old may want to buy one.

In the earlier chapters we asked the model for the probability that a particular prospective customer would buy a bike: a single woman, 40-45 years old, with a high school education, a commute distance of 5-10 miles, who owns her house and has 3 cars and 3 children.

  1. Finally, to test the model we are going to apply the same steps used in earlier chapters. If you did not read them, refer to steps 20 to 26 of the article about Naïve Bayes: http://www.sqlservercentral.com/articles/Data+Mining/97948/

  2. Press the Select Model button to select the Neural Network model.

  3. Choose the My neural network model.

  4. Using the Singleton option, specify the customer characteristics (age, gender, marital status, etc.) and use the PredictHistogram function to obtain the probability of buying a bike.

  5. Verify the results.

For a single female customer aged 40-45 with the other characteristics listed above, the predicted probability of buying a bike is about 41% (0.4085014051).
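Behind the Singleton option, the designer sends the model a DMX singleton prediction query. The Python sketch below only builds the text of such a query, so you can see its shape. The helper names are hypothetical, and the column names, the Bike Buyer target, and the attribute values (including the age of 42) are assumptions for illustration. The real names depend on the columns of your mining structure.

```python
def quote_name(name):
    """Wrap a DMX identifier in brackets, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"

def quote_value(value):
    """Numbers are written as-is; text is wrapped in single quotes."""
    if isinstance(value, (int, float)):
        return str(value)
    return "'" + str(value).replace("'", "''") + "'"

def build_singleton_query(model, target, attributes):
    """Build the text of a DMX singleton query that asks `model`
    for the histogram of `target` given one customer's attributes."""
    columns = ",\n        ".join(
        f"{quote_value(value)} AS {quote_name(name)}"
        for name, value in attributes.items()
    )
    return (
        f"SELECT PredictHistogram({quote_name(target)})\n"
        f"FROM {quote_name(model)}\n"
        "NATURAL PREDICTION JOIN\n"
        f"(SELECT {columns}) AS t"
    )

# Assumed column names and values, for illustration only.
customer = {
    "Age": 42,
    "Commute Distance": "5-10 Miles",
    "Education": "High School",
    "Gender": "F",
    "Marital Status": "S",
    "House Owner Flag": "1",
    "Number Cars Owned": 3,
    "Total Children": 3,
}

query = build_singleton_query("My neural network", "Bike Buyer", customer)
print(query)
```

The printed text can be compared with the query that the designer generates when you switch the prediction builder to its query view.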

Conclusion

In this chapter we used a new algorithm named Neural Network. The neural network is one of the most exciting algorithms, and it can be used to make predictions when the relationships in the data are complex.

Even though the algorithm is complex, using it with Microsoft Data Mining is very simple. In the next chapter we will talk about

References and images

http://msdn.microsoft.com/en-us/library/ms174806%28v=sql.110%29.aspx

http://en.wikipedia.org/wiki/Neural_network

http://pijamasurf.com/2010/05/comer-tierra-aumenta-la-inteligencia-te-pone-de-buenas/

http://msdn.microsoft.com/en-us/library/ms174572.aspx
