Microsoft Neural Network Algorithm (SSAS)

  • I've just been asked about the Microsoft Neural Network Algorithm in SSAS. I read BOL (http://msdn2.microsoft.com/en-us/library/ms174941.aspx), but was in over my head very quickly.

    Anyone have any helpful hints, links to share on this? A simple example to start me off with this would be welcome.

    __________________________________________________________________________________
    SQL Server 2016 Columnstore Index Enhancements - System Views for Disk-Based Tables[/url]
    Persisting SQL Server Index-Usage Statistics with MERGE[/url]
    Turbocharge Your Database Maintenance With Service Broker: Part 2[/url]

  • First you need some data to play around with. The famous Iris data can be found at

    http://archive.ics.uci.edu/ml/datasets/Iris

    Then you need to put this (or whatever data you want to play with) into a SQL table. This table needs to have a Primary Key column - the NN algorithm will ask you for it.

    Once it is in a table, go to Visual Studio and create a new "Analysis Services MultiDimensional..." type project.

    Step 1

    =====

    In the Solution Explorer, create a new Data Source by right-clicking on "Data Sources". A Wizard will begin. Select "Create based on new connection", and click the "New" button.

    A window will open; set the Provider to "Native OLE DB\SQL Server Native Client 11.0" (mine says 11.0, yours may differ?)

    Enter the name of your server (localhost may work if its on the same machine). I noticed that the drop-down didn't work for me, so I just typed my server name into the box.

    I use Windows Authentication, so that's what I chose.

    Once there is a server name, you will be able to enter a database name below ("Connect to a database"; "Select or enter database name"). Again, you can just type in the name if the drop-down doesn't work.

    Then click on OK. You will return to the Wizard.

    Click Next. You will now have to enter authentication information. I use the "specific user name and password" option because it worked for me; I don't know what the others do and I can't advise you on which one is right for you. (But I will say this - the "current user" option didn't work for me even though I was using the current user's creds...).

    In the final box, name this new data source and Finish.

    Step 2

    =====

    In the Solution Explorer, right-click "Data Source Views" and create a new view. Another Wizard will begin. Select your data source (actually you could have done Step 1 from the "New Data Source" button...) and click Next.

    You may get a "Name Matching" window. I don't know what this is, I just click Next.

    Then select your table. Make sure it appears in the right-hand column (there is an arrow between the columns to Add/Remove).

    Click Next, name your view and Finish.

    Step 3

    =====

    In the Solution Explorer, right click on Mining Structures to create a new mining structure. A Wizard will open; in the first screen choose "from existing data".

    In the second screen you will select "Microsoft Neural Network" from the drop-down and click Next.

    Select your Data Source View and click Next.

    In the Specify Table Types screen there is a Case check-box next to the name of your table. It is checked by default and you should leave it checked. I don't know why or what it is for, but I read it somewhere. Click Next.

    The next screen is where you start to define your model. You will see the columns of your table, you have to specify which ones are for Input and which ones are for Prediction. A column could be for Input and Prediction.

    Remember I said you had to have a primary key column? This is where you specify it as your Key column.

    You don't need to use all your data columns, one input and one output is sufficient.

    Click Next.

    Now you have to define your columns' "content and data types". The data type is self-explanatory, the options appear in drop-downs. Let me talk about the content instead.

    a. Continuous - treat the data as real numbers. Every continuous input variable will have one associated node in the input layer.

    b. Cyclical - I don't know what this is

    c. Discrete - The input values should be treated as "different things" rather than numbers. For example, if you input data is the name of a building material ("Wood", "Metal", "Tin") or consists of people's names, you would choose Discrete. The engine will create one input node for each option (one for "Wood", one for "Metal"...), and when input comes in it will send a 1 to that input's node and 0 to the others ("Wood" => 1, "Metal" => 0, "Tin" => 0).

    d. Discretized - the input values are continuous, but you want to put them into "classes" before processing. Suppose your input data are examination marks on a scale of 0-100, you may not want to treat every possible score as an input. Instead you may want to treat 0-25 as one group, and 26-50 as another, etc. This option does that for you.

    e. Ordered - I don't know what this is.

    Click Next. On the next screen, click Next. Then name your structure and Finish the wizard.

    Step 4

    =====

    The DMM should have opened to the left of the Solution Explorer, if not just double-click on it from the Sol.Ex.

    You will see multiple tabs. Click on the second one (Mining Models).

    You will see two columns, one with the columns and one with their types. If you right-click on the second column you will find "Set Algorithm Parameters". Don't do anything with this now, when you look for it later this is where you will find it.

    On any column you will be able to open its Properties window from Right-click; Properties. The only use I have had for this (so far) is that for Discretized-type columns, you can indicate how the clustering should be done. Go to the Data Type section in the Properties window, if Content is set to Discretized then you will see two more entries for the Bucket Count and the Method.

    I don't know what the Bucket Count is for. You have three options for Method - Automatic, Equal Areas and Clustering.

    Clustering does classification based on adjacent entries' frequencies. If you drew a histogram and saw a bunch of peaks, each peak would be a cluster.

    Equal Areas does something else. I don't know what it does, but I think it forces the classes to be of the same size. Either that or it divides the x-axis for the histogram into equal intervals. The MSDN documentation will tell you.

    Ok, then click on the "Mining Model Viewer" tab. It will ask you about deploying the project, click yes, it may ask you more questions just click yes/ run/ continue until it does what its supposed to.

    AT THIS POINT you may run into security-related trouble. Basically (sometimes) Analysis Services does not have permission to access your data. I found a bunch of solutions online, and it looks like different things work for different people, and if I understood it I would tell you what to do. But since I don't, I will just tell you what I did.

    I didn't make any changes on the DB side i.e. I didn't add any new Roles or modify permissions on the DB or tables.

    I didn't make any changes to the Login account for the AS services running in services.msc.

    All I did was set the authentication in Step 1 to "specific user and password", and that worked for me. If it doesn't work for you, you will have to do some googling 🙁

    ...

    Ok, suppose it deploys and runs correctly. The "Mining Model Viewer" tab will now show you a bunch of probabilities. By modifying the Inputs in the top table, you can actually get conditional probabilities.

    You can stare at this for a while, it may or may not tell you anything, depending on your data and on how comfortable you are with looking at probability distributions.

    Instead, click on the "Mining Model Prediction" tab.

    This tab has a button bar (you will see the usual Save button). Click on "Singleton query".

    You will see the rows from your Model connected to rows for your Singleton query input.

    What this allows you to do is to feed some input data into your model and see what it predicts. Enter some input values into the query input.

    In the box below, click on the first Source field. A drop-down will show Prediction Function, select this.

    Under "Criteria/ Argument", type in the name of the column whose value you want to predict. This name is from your Mining Model, which may be different from the SQL table's column name. Put it in [] to prevent stupid errors.

    In the tab's button bar, to the left of the Save button, there is a drop-down button. Select "Result" to run the query.

    Hopefully you will see a prediction. If you get an error instead, go back to that drop-down and select "Query". It is easier to spot errors from the Query view rather than the Design view.

    That's all I can do for you now! Good luck.

Viewing 2 posts - 1 through 1 (of 1 total)

You must be logged in to reply to this topic. Login to reply