SSIS using Stored Procedures to do incremental loads into DW

  • Is it insane to have SSIS to use a Stored Procedures to do incremental loads into DW, if the frequency is like every 5 minutes?

    Here is the bacground. We use SSRS to report off of our production server via stored procs. We are now thinking of creating a replicated server to which we will point the report into. However, we also have a need to give the users access to report data (stored procs), so they can link them with other data sets as they need to using MS Query in excel. Problem is that users won't be able to link data sets if they are based off of stored procs. Have already tried encapsulating SPs inside of Functions but that also does not seem to work.

    What my boss wants:

    - Forget the replicated server.

    - Create a DW which will hold the report data sets in a table per report, to be populated by the existing report stored procedures via SSIS which will be updated every 5 minutes. Some of our SPs are highly complex and have aggregated result sets.

    - Have all the reports point to their corresponding tables in the DW

    - Have the users access these tables so they can link the dataset as they need to.

    JC

  • It is not insane, not sure why to include SSIS when storedprocs alone can do the ETL process.

    On the other hand, I would refer to the database as a Reporting Database 'cause it doesn't look like a proper Data Warehouse e.g. "one table per report".

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
  • jtc900502 (9/14/2012)


    Is it insane to have SSIS to use a Stored Procedures to do incremental loads into DW, if the frequency is like every 5 minutes?

    In a simple answer, no. Unusual, yes, but not insane. I don't usually keep my warehouses up to date this much though, I'll usually replicate at that point and deal with it that way.

    However, a warehouse is not what I'm understanding your intent to be here.

    Here is the bacground. We use SSRS to report off of our production server via stored procs. We are now thinking of creating a replicated server to which we will point the report into.

    So far, so good. Real time reporting off your OLTP system can hose you up a bit if you have a lot of volume. Transactionally replicating the database to another server can offload a lot of lock blocking and other concerns.

    However, we also have a need to give the users access to report data (stored procs), so they can link them with other data sets as they need to using MS Query in excel. Problem is that users won't be able to link data sets if they are based off of stored procs. Have already tried encapsulating SPs inside of Functions but that also does not seem to work.

    Here's where you lost me. Giving them access to the procs just requires them having execute permissions on the proc. What are they doing now with the stored procs in the original database that they can't do in the replicated database? You mention they can't link data sets if they are from stored procs... how are they linking data sets *now*?

    Also, don't stick Sprocs inside CREATE FUNCTIONs in SQL Server... it doesn't end well. You're already finding some of the problems. Just... ow, yeah, Functions in Procs, not Procs in functions.

    What my boss wants:

    - Forget the replicated server.

    - Create a DW which will hold the report data sets in a table per report, to be populated by the existing report stored procedures via SSIS which will be updated every 5 minutes. Some of our SPs are highly complex and have aggregated result sets.

    - Have all the reports point to their corresponding tables in the DW

    - Have the users access these tables so they can link the dataset as they need to.

    JC

    Continuously running those procs on your oltp system may end up with more headaches than it's worth, however, I don't know your system to have a solid idea of the resultant impact. But, that's definately not a warehouse, it's a report result storage. And every 5 minutes you're going to truncate and rebuild those solutions, and if they're taking a decent amount of time to run your business is only going to get what, 2-3 minutes every 5 to actually use the data? That seems less effective.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • Thanks for the reply Craig.

    Currently the users are not accessing the SPs. They have tried to pull data into excel (MS Query) using the physical tables but had a hard time using the same joins used in the report or could not replicate the complex calculations, so they thought if they have access to the report data set it would make their lives easier and all they have to do is join them to other report data sets.

    I could just let them use the SPs in excel (MS Query) but then they wont be able to join it with any other data set. I think that might be a limitation of Excel or MS Query.

  • jtc900502 (9/14/2012)


    Thanks for the reply Craig.

    Currently the users are not accessing the SPs. They have tried to pull data into excel (MS Query) using the physical tables but had a hard time using the same joins used in the report or could not replicate the complex calculations, so they thought if they have access to the report data set it would make their lives easier and all they have to do is join them to other report data sets.

    Out of curiousity, why not let them export the report(s) to excel directly and then do whatever they needed to with them, instead of fighting to create tables repeatedly that you'll be constantly changing on them mid-usage?

    I could just let them use the SPs in excel (MS Query) but then they wont be able to join it with any other data set. I think that might be a limitation of Excel or MS Query.

    I think I'm starting to see your difficulty. You're trying to allow non-designers design level access to the results so they can ad-hoc themselves into infinity.

    Um, ow.

    I think you might be approaching this the wrong way, but I'm not sure if I've even understood the problem in its complete difficulty. However, it sounds like your users are trying to do some serious BI work. My first question for that is why do they need real time data for BI work, that's pattern analysis that happens over weeks or months, not the last 15 minutes. If they're constantly just trying to pull up a particular pattern of recent data for business needs, well, build the reports for 'em and then look to another solution for the BI work.

    My guess is you've got a twizzler of a problem where you've just got too many strands of different issues all wrapped up together. Split 'em out. Some users will have immediate data needs. Look those over, figure out THAT best solution, which will probably be you building a few more reports and some tweaks to get them 'just right'.

    Next, look to SSAS for your BI analysis. Once a cube is built some rudimentary training can get the marketing and business guys to be able to manipulate and look into their pattern data out of real-time with more historical stuff. A little practice at that and they can describe to you the cube they actually WANTED instead of the one they got originally.

    Mind you, I'm guessing and inferring a LOT here as to the problems and what you're hoping for the end result beyond what you've described to me. There might be value in simply backing up the database and leaving it available to everyone to restore to a local install of SQL Express on their own machines and letting 'em loose on it. I would not however, under any circumstances, allow direct table access to non-developers in a real-time upkeep environment. You're begging for a disaster and a few thousand migraines.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • PaulB-TheOneAndOnly (9/14/2012)


    It is not insane, not sure why to include SSIS when storedprocs alone can do the ETL process.

    PAUL FOR PRESIDENT!!!! 🙂

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • jtc900502 (9/14/2012)


    Is it insane to have SSIS to use a Stored Procedures to do incremental loads into DW, if the frequency is like every 5 minutes?

    Here is the bacground. We use SSRS to report off of our production server via stored procs. We are now thinking of creating a replicated server to which we will point the report into. However, we also have a need to give the users access to report data (stored procs), so they can link them with other data sets as they need to using MS Query in excel. Problem is that users won't be able to link data sets if they are based off of stored procs. Have already tried encapsulating SPs inside of Functions but that also does not seem to work.

    What my boss wants:

    - Forget the replicated server.

    - Create a DW which will hold the report data sets in a table per report, to be populated by the existing report stored procedures via SSIS which will be updated every 5 minutes. Some of our SPs are highly complex and have aggregated result sets.

    - Have all the reports point to their corresponding tables in the DW

    - Have the users access these tables so they can link the dataset as they need to.

    JC

    If you need updates every 5 minutes, SQL Agent needs to be involved whether it is executing sp's or SSIS pkgs or raw SQL. It's not insane to use SSIS, especially if you are seeking to add logging, auditing, etc. SSIS could be superior for mass data moves, but you'd likely have to pull out the stop watch to be sure for your specific needs.

    One table per report is a simple way to hurry up and provide for your users, but is not at all a data warehouse, as was mentioned, and is more of a garage where things are just brought in and stacked up to be forgotten. Space is cheap, time is precious, nothing wrong with a garage. It will simply get filled up with items that are very similar, maybe even identical as time goes by.

    Where I would get my butt fried is in hitting production every 5 minutes to regenerate a pile of static reports. Don't kid yourself - the hit to production is not going to decrease over time, it will only increase. More importantly, repeatedly refreshing the static does not create a dynamic environment.

    I agree with Craig that the more likely benefit is in actual BI, that refreshing data so often is painful and possibly pointless. Create a real data warehouse and from that, generate as many cubes as topics/grains of interest.

    What struck me is that if a user needs a refresh of static info every 5 minutes then either they are seeking a dashboard (how do my numbers look RIGHT NOW = dynamic) or they are attempting to trend in a painful fashion or they are unclear on their real needs.

    I got lost in what it is that the users actually need vs the various technical ways in which to serve up data, and I would also not be keen on users determining linkages, but, I do not know your users nor their level of sophistication. Sounds like a maintenance nightmare whereas a cube has linkages built right in, users simply select fields of interest.

  • Is it insane to have SSIS to use a Stored Procedures to do incremental loads into DW, if the frequency is like every 5 minutes?

    It's not insane to use stored procs, although I use them in conjunction with SSIS. The crazy part is the five minute update. Data warehouses have a place in operational reports, but five minutes is really too frequent. As someone has already pointed out, that sounds more like a task for a dashboard. Warehouses are used primarily to discern trends, both long and short. Ultimately you have to do what your boss wants, of course, but you might what to understand what the intent is.

  • I had a "near real time" dashboard that was grabbing data every ten seconds. Now that was insane! 😀

    I'd agree with the earlier posts about just doing the work inside T-SQL; grab SSIS if you need it, but the chances are it's just going to be an added complication without adding any value.

    Also, from a design perspective, try to isolate "tactical reporting" (immediate to short term) from "strategic reporting". This means you may decide to go with a layered approach and produce a relational data store for the tactical reporting, with a data mart for the strategic reporting. Doing that can save you from a lot of heartache in the future!

  • jtc900502 (9/14/2012)


    Is it insane to have SSIS to use a Stored Procedures to do incremental loads into DW, if the frequency is like every 5 minutes?

    Here is the bacground. We use SSRS to report off of our production server via stored procs. We are now thinking of creating a replicated server to which we will point the report into. However, we also have a need to give the users access to report data (stored procs), so they can link them with other data sets as they need to using MS Query in excel. Problem is that users won't be able to link data sets if they are based off of stored procs. Have already tried encapsulating SPs inside of Functions but that also does not seem to work.

    What my boss wants:

    - Forget the replicated server.

    - Create a DW which will hold the report data sets in a table per report, to be populated by the existing report stored procedures via SSIS which will be updated every 5 minutes. Some of our SPs are highly complex and have aggregated result sets.

    - Have all the reports point to their corresponding tables in the DW

    - Have the users access these tables so they can link the dataset as they need to.

    JC

    Hi,

    Very recently, I was in the same situation. The approach which I am planning or might take is to build a data warehouse reporting complex reports which are used by the Sr.Exec's. most of them are summarized reports on monthly basis running to year. There are reports which are run on 30 mins basis. For this requirement I'm suggesting to be run on the backup copies of the live system for now but intending to build a cube later say 6 months to address this or an ODS which is run for certain period of time for realtime reporting.

    I didn't understand why there is a need to use SSIS to use SP's for incremental load. I can understand SP's on their can be used to perform incremental data load to storage area which can be used of real-time reporting.

    Now coming to the point how users can access the data. My preference is using SSRS subscription to publish the reports to a location or email and render them in Excel or which ever format.

    Does this help? Do keep us posted on what steps you are taking.

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply