Forecasting with SQL

  • Allen Nugent (2/9/2010)


    Nice work, Mark.

    I'll second that - a very nice article indeed.

    Of course, there are weird and wonderful techniques for fitting data to nonlinear functions, but it might not be practical to do it with SQL.

    Well, if you can do it say in Fortran+Dystal+LAPACK+FSPAK it's probably not too hard in SQL. What sort of package do people generally use for this sort of thing today? I'm way out of date (forgotten nearly all of my stats and most of my curve fitting) so I wouldn't try to do it myself. But if I asked an expert for predictions I would definitely want to know what methods he used, what assumptions he made, and maybe look at the code to check it didn't include the most common idiocies (and SQL is a language I can read, because it was designed for readability rather than to enable maximum obscurity like some pouplar languages).

    Tom

  • Excellent article. I have a lot of forecast and actual time-series data and will check out the methods this weekend.

  • Mark,

    Very informative and useful article,

    Do you have any suggestion on how to incorporate a weighting element into the SQL. i.e. if i wanted to give more weight to last years data over the year before etc.

  • Hi Grasshopper,

    I do not have any examples of weighted averages, but you may considering looking at exponential smoothing which weights, more heavily, recent periods.

    I do suggest making sure that you prepare the data before forecasting on it. By using a moving median, you can eliminate a lot of the outliers.

    Also, it's worth looking at the forecasting datamining algorithm in SSAS, which is a great tool.

    Good luck,

    Mark

  • Here http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=93911 is examples of both moving average, weighted average and weighted moving average.


    N 56°04'39.16"
    E 12°55'05.25"

  • Hi Mark,

    Nice article on forecasting using T-SQL! I know it's been a while since you wrote this article but wanted to know if you think this technique would work with a scenario I am faced with on a project.

    My scenario is a table of risks with a value for severity, probability and a criticality index value. I would like to predict or forecast the number of risks most likely on the fly so not stored in a table for a charting solution. Basically, I would chart the actual values and then based upon a choosen date display the forecast values to the right and the actual values to the left of the choosen date.

    As of now I am rolling up the number of risks by a category or risk type for a monthly rollup which works fine. I'd like to take this data and then do a forecast for future months like you do in this article for say 6 to 12 months ahead from a choosen (parameter) now date.

    So in your article you use products and in my scenario I have risks. Do you think I could use this technique to do the forecasts? The forecasts would only have to project an integer or count value based on the historical number of risks in previous months of a project.

    Thanks in advance!

    Chad E. Downey, CDMP - Certified Data Management Professional
    Consultant - SQL Server, SSIS, SSRS, SSAS

  • Yes. That should work fine for you to give a directional insight into your data. Products, customers or risk types are simply discrete values by which to partition the forecast to make it more useful. Any type of discrete data can be plugged in.

    Good luck and have fun exploring forecasting!

    Mark

  • Its very useful for the beginner. If the same sample, if available using SSAS would be great...

  • Mark Lovely work and making my life much easier for a painful problem of how to cast time series data forward.

    On addition i would to understand is how to make ll the NULL values come through as blank so I can automatically process them into a charting web service.

    There seems to be multiple places as I have tried to zero them out but i would prefer blanks so the charting does not plot them at all.

    I will go hunting for more tidbits you have uploaded to broaden my knowledge.

    Thanks a lot

  • mark.wojciechowicz@gmail.com - Monday, February 8, 2010 8:33 PM

    Comments posted to this topic are about the item Forecasting with SQL

    Hi Mark, I believe there is an error in your code. When looping and using y = a+bx to forecast yi you are using x(i-1) then adding 1 (i.e. your parenthesis are off) instead of using xi:

    MAX(A) + (MAX(B) * MAX(Forecastkey) + 1), -- Trendline
       (MAX(A) + (MAX(B) * MAX(Forecastkey) + 1))*
       (SELECT ...

    Should be:
    MAX(A) + (MAX(B) * (MAX(Forecastkey) + 1)), -- Trendline
       (MAX(A) + (MAX(B) * (MAX(Forecastkey) + 1)))*
       (SELECT ...

    Outside of this, thanks for the very useful code! Feel free to contact me direct if you'd like to discuss.

    James

  • james.waugh28 - Thursday, March 30, 2017 3:51 PM

    mark.wojciechowicz@gmail.com - Monday, February 8, 2010 8:33 PM

    Comments posted to this topic are about the item Forecasting with SQL

    Hi Mark, I believe there is an error in your code. When looping and using y = a+bx to forecast yi you are using x(i-1) then adding 1 (i.e. your parenthesis are off) instead of using xi:

    MAX(A) + (MAX(B) * MAX(Forecastkey) + 1), -- Trendline
       (MAX(A) + (MAX(B) * MAX(Forecastkey) + 1))*
       (SELECT ...

    Should be:
    MAX(A) + (MAX(B) * (MAX(Forecastkey) + 1)), -- Trendline
       (MAX(A) + (MAX(B) * (MAX(Forecastkey) + 1)))*
       (SELECT ...

    Outside of this, thanks for the very useful code! Feel free to contact me direct if you'd like to discuss.

    James

    James,
    Thanks for reviewing this and finding the bug.  I have submitted a correction to the article and attachment.
    m

  • We are starting to investigate SQL + R.  Would that be a good approach to this problem?  Is it still too soon for this technology combination to have caught hold in the SQL community?

  • rchantler,
    R would be a brilliant approach and probably much easier to implement.  If you google forecasting + R or linear regression + R you'll come up with a lot of links.  Here's one from the top of the search http://a-little-book-of-r-for-time-series.readthedocs.io/en/latest/src/timeseries.html  
    Glancing through it, they support many different forecast models, including this one.  Ideally, you would test with a number of models and select the one that performed the best for that particular dataset.  In our company, we use different models for every product, because their behavior is different.  
    Give R a try and write an article!

  • rchantler - Friday, May 19, 2017 8:18 AM

    We are starting to investigate SQL + R.  Would that be a good approach to this problem?  Is it still too soon for this technology combination to have caught hold in the SQL community?

    Depends on what you're trying to do. R + SQL is a good in my opinion, but in the sense R lives outside of SQL versus on the server to allow for separation of church and state, especially if you're analyzing larger datasets with R. 

    When it comes to SSAS over R, then R may be the better option to actually validate your data. This article for example is not actually validating the forecast with statistics and common approaches why the specific variables were chosen over the others. It's using the SSAS features to easily do a linear regession over a set of data to show you how easily it can be done. But, if you can't explain the results, then it's very dangerous to attempt. R with SQL on the other hand makes it a bit easier to understand what is going on and provides greater access to validate not only the end result, but your variables one-by-one with their own analysis. It's just a matter if you want to go through the process of creating R scripts versus a more drag-and-drop approach.

  • I did the first step using my data that only had four columns year, month, period(Fkey), revenue.
    I completed the first step, but second step seems a bit confusing.  Is there any data available  that was used to design this query?

Viewing 15 posts - 16 through 30 (of 34 total)

You must be logged in to reply to this topic. Login to reply