• Grant Fritchey (12/17/2015)


    Thanks for all the help guys. I'm thinking that I'm going to back off and rethink my blog post. I can't get my head around what I need to do to properly prepare the data in order to run it through these algorithms. For the short term I think I'll concentrate on learning the basics of the language instead of immediately trying to apply it.

    Thanks again for the assist. Sorry I'm letting you down.

    Just food for thought.

    Why not start with a data-adaptive approach: begin with the data, with little consideration of the model at first, and then, once you find useful predictors, fit the model to the data you have? Assume your boss hands you a data set. Find out whether you can use what you have first, then move on to fine-tuning that data second and modeling third.

    I say this because it's what I would do in Python. I would start with the data first, then load it into Python second. At a very high level, regardless of the algorithms, this gets me to a solid foundation for generating the data and streamlining it into the application.

    From there, I would explore the data in something like a Python notebook (R has similar tools) in real time in my web browser and try to model the data from there. The end result will hopefully be a solid foundation to start processing the data and outputting results that can be used to answer questions like, "How likely is a salesperson to be related to a region, Mr. AngryDBA?"

    I only say that because I think you can visualize the flow of what needs to happen. It's the algorithm that is throwing you off from moving forward. Screw the algorithm, just output some data, load it into R and start exploring what you have. I think once you start doing that, you will start to find ways to fit the model to the data and complete the objective you want to accomplish here.
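
    To make the "data first, model later" flow concrete, here is a rough sketch in Python using only the standard library. Everything in it is made up for illustration: the column names (salesperson, region, amount), the sample rows, and the helper names load_rows, crosstab and share_by_region. It just loads some exported rows and counts how sales split across regions for each salesperson, which is the kind of first look I'd take before picking any algorithm.

```python
import csv
import io
from collections import Counter, defaultdict

# Hypothetical export: one row per sale. Columns and values are invented.
RAW = """salesperson,region,amount
Alice,North,120.0
Alice,North,80.0
Alice,South,40.0
Bob,South,200.0
Bob,South,50.0
Cara,North,75.0
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, converting amount to float."""
    rows = list(csv.DictReader(io.StringIO(text)))
    for row in rows:
        row["amount"] = float(row["amount"])
    return rows


def crosstab(rows, row_key, col_key):
    """Count how often each (row_key, col_key) pair occurs."""
    table = defaultdict(Counter)
    for row in rows:
        table[row[row_key]][row[col_key]] += 1
    return table


def share_by_region(table):
    """Turn counts into each salesperson's fraction of sales per region."""
    shares = {}
    for person, counts in table.items():
        total = sum(counts.values())
        shares[person] = {region: n / total for region, n in counts.items()}
    return shares


rows = load_rows(RAW)
table = crosstab(rows, "salesperson", "region")
shares = share_by_region(table)

# Print one line per salesperson so the split is easy to eyeball.
for person in sorted(shares):
    print(person, shares[person])
```

    In a real run you would swap the inline text for a file exported from the database. If a salesperson's sales pile up in one region, that is a hint the two columns are related and worth carrying into the modeling step.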