A Reporting System Architecture

  • did you find the root cause of why the production server takes more than 5 minutes while the reporting server takes less than 1? Was it caused by CPU contention, or did "read uncommitted" solve a locking issue? If it's the latter, then the performance issue could be solved simply by turning on "read uncommitted" at the transaction level.
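    just to illustrate the idea (a minimal sketch; the table name and date filter below are made up):

        -- Set read-uncommitted for the session so report reads neither take
        -- nor wait on shared locks held against the OLTP tables.
        SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

        SELECT OrderID, OrderDate, TotalAmount
        FROM   dbo.Orders                                -- hypothetical table
        WHERE  OrderDate >= DATEADD(DAY, -30, GETDATE());

        -- per-query equivalent: SELECT ... FROM dbo.Orders WITH (NOLOCK)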

    sometimes a report just takes too long because of the joins; as the tables grow, the cost of those joins gets worse and worse, and at that point you have to think about denormalization and a BI/OLAP solution.

    just a thought.

    jiulu

  • the reasons for our production reporting issues were MANY:

    1) lack of proper indexes

    2) poorly written sprocs

    3) way too many joins

    4) too much logic in the report sprocs

    5) poorly designed DB

    6) blocking

    7) IO subsystem not configured correctly

    8) Huge use of user-defined functions in the reports, forcing non-set-based queries

    9) Reports were using dynamic SQL to create the correct sorting and grouping options

    10) etc etc etc

    not only have we moved the reports into their own little world to shield the OLTP system from their potential damage, we have also greatly optimized all the reports and we have way more indexes in the Reporting DBs.

    Furthermore, our ETL Jobs pre-calculate many of the aggregated values and the Reporting DB is denormalized to a great extent.
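    As a rough illustration of the pre-aggregation idea (the table and column names here are just for the example, not our actual schema), an ETL step does something along these lines:

        -- ETL step: pre-calculate daily totals into a denormalized reporting
        -- table so report sprocs read one narrow table instead of joining
        -- the OLTP schema at run time.
        DECLARE @LastLoadDate DATETIME;
        SET @LastLoadDate = '20080101';   -- in practice, read from the load-tracking table

        INSERT INTO Reporting.dbo.DailySalesSummary (DealerID, SaleDate, DealCount, TotalAmount)
        SELECT s.DealerID, s.SaleDate, COUNT(*), SUM(s.Amount)
        FROM   Staging.dbo.Sales AS s
        WHERE  s.LastUpdated > @LastLoadDate          -- only rows changed since the last run
        GROUP BY s.DealerID, s.SaleDate;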

    it's a complete redesign; literally apples to oranges.

    GAJ

    Gregory A Jackson MBA, CSM

  • How do you synchronise Reporting1 and Reporting2? If Reporting1 is toggled as the 'loading' DB, and therefore Reporting2 is the 'live' DB, then at some point (after the load is finished?) Reporting2 will need the transactions just loaded into Reporting1; it isn't clear (to me anyway :)) just how and when that occurs.

  • I had started to write a somewhat similar article just last week. Although my concerns and constraints are somewhat different, and my final solution is completely different, I enjoyed your article. Here's how my situation differs from yours:

    * Ad hoc reporting must not cause locks for OLTP transactions.

    * Live data for reporting. There can be no delay between transactions and reporting.

    * No improvement in report performance necessary, no budget for new hardware.

    So the main issue we are correcting is that Office (SQL Query), Access, and Crystal all connect to SQL Server with the isolation level at "read committed" rather than the much more OLTP-friendly "read uncommitted." As a result, it takes a table lock to return a table of data...

    So our solution has been to create another database that is a schema copy of the OLTP database, where every table and view from the OLTP database is represented in the "reporting" database by a view over the original object. Refreshing the schema has been automated, so any table or view change on the OLTP database (which are frequent, as our live system is still under constant development) is picked up in the reporting database on a scheduled basis (2x per day).
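    Roughly speaking, each generated view is just a pass-through over the original object (the database, table, and column names below are placeholders; the WITH (NOLOCK) hint is one way to get the read-uncommitted behaviour on the reporting side):

        -- In the "Reporting" database: one view per OLTP object, so ad hoc tools
        -- (Office, Access, Crystal) connect here instead of to the OLTP database.
        USE Reporting;
        GO
        CREATE VIEW dbo.Orders
        AS
        SELECT OrderID, CustomerID, OrderDate, TotalAmount
        FROM   OLTP.dbo.Orders WITH (NOLOCK);   -- read uncommitted: no shared locks taken
        GO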

    We've seen a dramatic improvement in performance because the reporting system no longer generates locks, and we can continue to allow ad hoc queries from the system users, keeping our business nimble and freeing the development team from a never-ending list of new data requests. This is admittedly more of a brute-force solution than a highly engineered data warehouse would provide, but I think it offers many of the advantages of more expensive solutions, without the cost of massive software re-engineering or severely limiting access to the data.

  • Rpt1 and Rpt2 are not aware of each other.

    They are never synched with each other. Each of them is individually and separately synched with the Prod OLTP system.

    basically we have a table that tracks the last updated date for each DB. Then our synch scripts run and catch each DB up with the changes since it was last updated....
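    the idea is something along these lines (just a rough sketch with made-up object names, not the actual scripts, and a real version also handles updates to existing rows, not just inserts):

        -- Tracking table: one row per reporting DB, holding its high-water mark.
        -- CREATE TABLE dbo.SyncStatus (DbName SYSNAME PRIMARY KEY, LastSyncDate DATETIME NOT NULL);

        DECLARE @LastSync DATETIME;
        SELECT @LastSync = LastSyncDate FROM dbo.SyncStatus WHERE DbName = 'Reporting1';

        -- Catch this reporting DB up with everything changed on the OLTP side since then.
        INSERT INTO Reporting1.dbo.Deals (DealID, DealerID, Amount, LastUpdated)
        SELECT DealID, DealerID, Amount, LastUpdated
        FROM   OLTP.dbo.Deals
        WHERE  LastUpdated > @LastSync;

        UPDATE dbo.SyncStatus SET LastSyncDate = GETDATE() WHERE DbName = 'Reporting1';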

    hope this makes a little more sense.

    GAJ

    Gregory A Jackson MBA, CSM

  • Here's a question: how does correctly partitioning your tables compare? For example, suppose there were a partitioning schedule (say once a week for intensive tables, once a month for others, and eventually per-year partitions). It's my understanding (although I have yet to put it into practice) that these partitions can be placed in different FileGroups, which can then sit on different disk resources. So it could potentially keep reporting apps using the same production database, but there wouldn't be as much contention because reports would be using different FileGroups than the live OLTP data. I've read that Jobs can be created to manage this as well. Does anyone have experience with this?
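    From what I've read, the DDL looks roughly like this (the boundary dates and filegroup names are just placeholders):

        -- Partition function: monthly boundary points
        CREATE PARTITION FUNCTION pfMonthly (DATETIME)
        AS RANGE RIGHT FOR VALUES ('20080101', '20080201', '20080301');

        -- Partition scheme: each range maps to its own filegroup (which can sit on separate disks)
        CREATE PARTITION SCHEME psMonthly
        AS PARTITION pfMonthly TO (FG_Archive, FG_200801, FG_200802, FG_200803);

        -- Table created on the scheme, partitioned by its date column
        CREATE TABLE dbo.SalesHistory
        (
            SaleID   BIGINT   NOT NULL,
            SaleDate DATETIME NOT NULL,
            Amount   MONEY    NOT NULL
        ) ON psMonthly (SaleDate);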

    Also, I've used Crystal (7 - 11.5) in the past. If I never have to use it again, that will be great! I redesigned our entire healthcare reporting to use DevExpress reporting, where the users could basically design their own reports (you're always going to have to make sure you have proper indexes regardless of whether a user builds the report or you do). Anyway, they would lay out the report in real time using the grid for sorting, grouping, and filtering, and then we'd store that "schema" as XML in the database. Really nice design, fast, and we never had to design 150+ reports for each client ever again. I think there were 5 Crystal reports left after we re-wrote everything (and it only took 2 months to rewrite the reports).

  • Can you [tymberwyld] provide more details on DevExpress? Is this from devexpress.com? Which product?

  • good points....

    Partitioning is a really great tool, but in my case (and likely in most cases) it doesn't solve all the problems associated with using your transactional system for reporting needs.

    When using a transactional system for reporting needs, you really can't optimize for both OLTP and OLAP.

    In a Reporting system, you'll likely want more denormalized schemas, more indexes, different fill factor settings, etc etc etc.

    Certainly partitioning "could" help alleviate some of the disk IO contention, but I don't believe it would really get you what you want in most scenarios.

    GAJ

    Gregory A Jackson MBA, CSM

  • GregoryAJackson (1/21/2008)


    Don,

    I don't understand your post regarding the negatives to SSIS and regarding blocking....

    (I think I might disagree with you but I want to make sure before I reply in that regard...)

    Please expand here....

    Thanks

    GAJ

    I second the request for more detail on the SSIS concerns.

    We are also building a model, and several of the same concepts apply. We have gone the replication route because we currently have a clearly defined set of reports that use a significantly smaller set of data than the OLTP system, and this implementation (even though I disagreed with it) is focusing on supplying only the data required for these reports.

    So, our articles are vastly smaller than the whole OLTP databases. I'd be interested to know the impact (additional space requirement) your "Mirror and Snapshot" design added.

    Also, as implied by "databases", we have several source databases - some of them new-world, where we are able to require Created and LastUpdate datetime columns, and other legacy databases where we don't have that remit. This has led to our current 'KeyLookup CDC' design (we want to get SQL 2008 and look into using the CDC functionality, but that will be down the line).

    Another advantage of our replicated databases is that they can be indexed for the data access that will happen there, so we can index to improve our ETL process.

    SSIS (via several packages, run in parallel via an 'orchestration package') is our current design - this is still a work in progress, 3-4 months away from live.

  • The performance of SSIS has not been an issue for us in any way. In fact, the ability of SSIS to modify the batch sizes and the ability to use bulk copy within SSIS has helped immensely. We call SSIS from the command line and the performance has been fine.

    SSIS does have some bugs in it still but most of them have been fixed in SP2.

    The mirror and snapshot space requirements also are not that much of an issue. We'd have the additional space consumed using any of the other methods of transferring the data anyway (Replication, log shipping, etc). The snapshot of the mirror does not "Double" the space requirement as I thought it would. If you research the snapshot option a little you'll see what I mean.
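    For anyone researching it: a database snapshot is built on sparse files, so it starts nearly empty and only stores pages that change in the source afterwards. A minimal example (the names below are placeholders, not our actual files):

        -- A database snapshot only stores pages that change in the source after
        -- it is created (sparse files), so it does not double the space up front.
        CREATE DATABASE ReportingMirror_Snap
        ON ( NAME = ReportingMirror_Data,                       -- logical name of the mirror's data file
             FILENAME = 'E:\Snapshots\ReportingMirror_Snap.ss' )
        AS SNAPSHOT OF ReportingMirror;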

    In all, we are happy with our design\approach.

    We might modify some of our ETL SSIS packages to optimize the way they are implemented, but other than that, we have no current plans to change our approach.

    hope this helps,

    GAJ

    Gregory A Jackson MBA, CSM

  • Thanks for the article Gregory - just the kind of thing I subscribe to Server Central to read.

    You've also generated a fair number of discussion topics, I see. I was curious about you having to use 2 reporting databases due to the load generated by your ETL queries. Just from a personal perspective, these kinds of queries are the ones you just leave once they work, which means they're the ones that are pulling redundant fields out of the DB and are prone to more than a few SELECT *'s etc etc. Just for reference, I've had a lot of joy (I say joy, I mean mind-numbing boredom, of course ;)) using cursors for ETL work - especially when denormalising; I've obtained performance improvements in the 1000s of percent doing this.

    I realise you've done some tuning work, but what I was most curious about was whether you've tried timing a stored procedure against an SSIS package doing the same job??? Also, how many users are using the system?

  • our transactional system is used by thousands of Auto Dealers all across the country and in Canada.

    the reporting system is mostly used by tech support (not a huge number of users)

    I have not actually compared the timing of a sproc outside of SSIS vs the same sproc within SSIS, but I don't really think there would be a huge difference.

    the logic in our ETL is so complex that trying to do the ETL with simple SQL would have been way too difficult.

    Performance is currently not giving us issues.

    GAJ

    Gregory A Jackson MBA, CSM

  • Gregory,

    I'm curious... What size (in gigabytes or terabytes) are your OLTP and Reporting databases?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • not huge

    currently the OLTP side is ~50GB and the Reporting DB is probably ~60GB or so.

    growing pretty quickly, though - this is a very new system.

    Our next phase in this project will be to regularly purge the OLTP data so that it never has more than about 30 days of data in it.
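    One way such a purge might be sketched (the table and column names are made up, not our actual schema; batched deletes keep the log and lock footprint manageable):

        -- Purge OLTP rows older than ~30 days in small batches.
        DECLARE @BatchSize INT;
        SET @BatchSize = 5000;

        WHILE 1 = 1
        BEGIN
            DELETE TOP (@BatchSize)
            FROM   dbo.Deals
            WHERE  CreatedDate < DATEADD(DAY, -30, GETDATE());

            IF @@ROWCOUNT < @BatchSize BREAK;   -- nothing (or not much) left to purge
        END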

    gaj

    Gregory A Jackson MBA, CSM

  • Since your budget does not include another server, I am curious how much another server would cost you.

    If the staging area was a separate server, then could it be a less expensive server in the $2,000 to $3,000 range?

    Also, one of the SQL Server licenses could be Standard with minimal CALs, so wouldn't that get you into an easy to justify range?
