Data Mining Decision Tree

  • First, what I am using: SQL 2K with a standard Analysis Services installation. I have several (about 15) input columns and 4 predictable columns, all with continuous data, in a decision tree algorithm. I use different data to train and test the models. Now the problem: using any of the three methods to discretize the predictable columns, I get a maximum of five buckets, one of which is "Missing". I would like to increase the number of buckets. I have attempted creating categories and listing them as discrete data, but the models then only model existence. The option "Ignore nulls" is missing from the drop-down list. Can anyone give me advice on either modelling the actual discrete category number (as opposed to modelling the existence) or increasing the number of categories created by the discretize() function? Thanks all. PS: How do you format these mails?

  • Not sure if this will help, but if you search BOL for 'Mining Parameters', somewhere in the returned list (mine is ranked 9th) is the topic Microsoft Decision Trees. Scroll down and look through the Complexity_Penalty and Minimum_leaf_Cases parameters; these should help in creating more categories.

    Steve.

  • Hi. Thanks for the response. I have tried this. Even at complexity levels below 0.1 (highly likely to cause a split) the tree is very short. I am currently using categorized prediction columns with continuous input columns, which does produce better results, but again only at Complexity_Penalty >= 0.1.
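
The categorized-prediction-column approach mentioned above can be sketched outside Analysis Services: bucket the continuous predictable column yourself, with as many buckets as you like, and load the resulting labels as a discrete column. This is only a minimal sketch in Python, assuming equal-frequency buckets; the function names, labels, and sample values are hypothetical and are not part of Analysis Services or its discretization methods.

```python
from bisect import bisect_right


def equal_frequency_edges(values, n_buckets):
    """Return up to n_buckets - 1 cut points that split the non-missing
    values into buckets of roughly equal size (hypothetical helper)."""
    data = sorted(v for v in values if v is not None)
    if n_buckets < 2 or not data:
        return []
    edges = [data[(i * len(data)) // n_buckets] for i in range(1, n_buckets)]
    # Repeated values can produce duplicate cut points; keep each one once.
    return sorted(set(edges))


def bucket_label(value, edges):
    """Map a value to a bucket label; missing values get their own label."""
    if value is None:
        return "Missing"
    # bisect_right counts how many cut points are <= value.
    return "B%02d" % bisect_right(edges, value)


# Made-up sample data standing in for one continuous predictable column.
values = [3.1, 7.4, None, 1.2, 9.9, 5.5, 4.8, 2.0, 8.3, 6.6, 0.5]
edges = equal_frequency_edges(values, 5)
labels = [bucket_label(v, edges) for v in values]

for value, label in zip(values, labels):
    print(value, label)
```

The bucket count is a plain argument here, so it is not capped at five. Whether the decision tree then splits on those categories still depends on the Complexity_Penalty and Minimum_leaf_Cases settings discussed above.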
