Need to improve performance of multi million record table with calculated columns

  • Hello Friends,

    I was working on stored procedure to get the snapshot of data from single AS400 table to my local SQL server 2005 by openquery, which is scheduled to run on every day.There are around 15mil records which are getting refreshed every day(ALL).I'm selecting around 10 columns from AS400 table which im dumping into my SQL server table .To put so many records it is taking around 3.5 hours

    In addition to this I'm also adding up 15 new calculated columns to the same SQL Server table which are used in multiple reports developed. I'm adding indexes to these columns. But to reflect these calculated columns , sp takes another 4 hours for 15 mil records. the total cost is 7.5 hrs every day which I cannot afford to.

    is there any simple technique to improve performance of this sp?Is there any way the processing of calculated columns is faster? is there any effective way to row processing of 15mill records?

    any suggestions and help highly appreciated.

    Cheers,

    Mayur

  • Hi ,

    Frustrating as it is,you really need to give better details as to exactly what is slow.

    Try this link[/url] and then this link[/url], and see if that helps



    Clear Sky SQL
    My Blog[/url]

  • mayurbvb (7/9/2009)


    In addition to this I'm also adding up 15 new calculated columns to the same SQL Server table which are used in multiple reports developed. I'm adding indexes to these columns. But to reflect these calculated columns , sp takes another 4 hours for 15 mil records. the total cost is 7.5 hrs every day which I cannot afford to.

    This is the bit that intrigues me!

    No doubt once you have read Dave Ballantyne's links and posted more details, all my questions will be answered, but I am fascinated by these 'calculated columns'.

    If you mean SQL Server 'computed columns', that raises one set of questions relating to persistence and indexing.

    If you mean new columns with values based on the result of a computation, that is entirely different.

    If I had to bet on it, I would guess that you are using ALTER TABLE to add columns one by one, populating the calculated values using UPDATE, and then creating (unspecified) indexes.

    It's a horrible idea, but about the only way I can see for that process to take over four hours. I stand ready to be surprised, however 😉

    Paul

  • Hi there,

    sorry for late reply..

    It was a bit frustrating ....but I have achieved significant improvement in performance...I have reduced the time to 3.15 hours (I’m yet to believe this :-)) ...

    I was always knowing where the process was taking much time....first getting records from AS400 (16 Mil), adding calculated columns to this and then while creating indexes on this table.

    The method which I incorporated are:-

    Created a staging table which is partitioned (created 14 partitions):- I couldn’t achieve much here getting records from AS400. (need to look indexes at AS400).Portioning table made lot of difference

    Removed the calculated columns funda here and just used simple update statement to the columns (earlier calculated columns):- i have created staging table with null values initially for these columns.

    Created Indexes on the staging table....improved performance here but not as much as expected.

    Renamed this staging table and its indexes with corresponding production one.

    Voila ...and its done....but to be frank im still not satisfied (to be specific my manager!)..

    If any more ideas for the things mentioned above steps where it is consuming much more time?

    Thanks in advance....

    Cheers

    Mayur

  • We could use the DDL for the table (including indexes), some sample data (readily consummable), and your update queries.

    Did you say that partitioning the table did not help the load from the AS400?

  • Lynn Pettis (7/23/2009)


    We could use the DDL for the table (including indexes), some sample data (readily consummable), and your update queries.

    Absolutely. Without this it will be impossible to help you.

  • Hi,

    1. Are you using any function with query then remove it and use it with query.

    2. if are your using more join then please remove by changes in table structure.

    3. use #tmptable avoid use of table variable.

    4. Dont use the same table two or more time in single query

    5. user nolock option with table.

    6. Index must be there on the table.

    Thanks & Regards,

    Bhavin Mer

Viewing 7 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply