Huge size of Reporting tables

  • Jeff Moden wrote:

    S_Kumar_S wrote:

    We have dedicated Sr. People for optimization and they have been doing it for more than a decade for some good clients. So we trust their understanding for indexes and execution plans.  Will see if it will be possible to put the indexes and schema here.

    If that's true, then why are you asking here? 😉

    That's an excellent point actually.

    Columnstore is also something definitely worth investigating.  But presumably your Sr. people have already thought of that as well.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • Yes, the point is excellent, and to me it seems there is no question.

    If you have a look at the topic starter:

    tables have 1 month worth of data

    there will no doubt that they have transactional tables clustered by auto-incremental ID, which are "quasi-partitioned" by months.

    I bet those tables are well de-normalized - for reporting purposes, of course. Which only adds to the trouble.

    Such a structure is certainly, let's say, less than optimal, and no indexing can help tuning for better performance.

    Columnstore is a popular choice for databases lacking proper structure, but here it will be multiple columnstores - at least one per months. So, querying it won't be much faster.

    So, to me there is no doubt - those Sr.People might be holding a bunch of MS certificates, but they've got no clue how to get things going.

    _____________
    Code for TallyGenerator

  • Since we have NO previous knowledge of your tables and how they're used, we need specifics to make any reasonable judgements.  It's rather frustrating when we ask such things to be told, "our current code, table design and hardware are perfect, it's just running too slow."  Hmm, doesn't it seem that something must not be perfect if your response time is that slow?!

    At a minimum, I don't feel comfortable without looking at missing index stats and index usage stats.  I can provide a query for you to run that will give us those stats.  I, and others here, can then agree that indexes are great as is or suggest changes.  However, if you're not going to make any changes anyway, you're right to tell us that now, and please do so.  I don't want to waste time designing changes you won't consider anyway.

    Also, after I see those stats, if I believe your existing indexes are not perfect -- in particular, if the clustered index(es) seem wrong to me -- then I can comfortably drop out once I know you're not going to make changes to them.  Because, as I see it, with the wrong clus key, you will forever have performance issues and relegated to constantly tweaking non-clustered indexes.  And personally, I have no interest in volunteering for that, it's just not for me.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • ScottPletcher wrote:

    I still have my doubts on the clus index keys.  But you could be right.

    By the way, are most of the tables clustered on an IDENTITY column?  If so, you almost certainly don't have the best clus index keys on your tables.

    No, they are not on Identity columns. They are like AccountId,TransactionId etc...which are int or bigint columns...

    Online Trainer For SQL DBA and Developer @RedBushTechnologies with 18 yrs exp.

  • ScottPletcher wrote:

    Jeff Moden wrote:

    S_Kumar_S wrote:

    We have dedicated Sr. People for optimization and they have been doing it for more than a decade for some good clients. So we trust their understanding for indexes and execution plans.  Will see if it will be possible to put the indexes and schema here.

    If that's true, then why are you asking here? 😉

    That's an excellent point actually.

    Columnstore is also something definitely worth investigating.  But presumably your Sr. people have already thought of that as well.

    🙂

    They are good at analyzing execution plans, code review and identifying proper indexes and that's it. They can't make any comments on architecture side, hardware side, configuration side or anything which falls beyond their scope.

    Online Trainer For SQL DBA and Developer @RedBushTechnologies with 18 yrs exp.

  • S_Kumar_S wrote:

    ScottPletcher wrote:

    Jeff Moden wrote:

    S_Kumar_S wrote:

    We have dedicated Sr. People for optimization and they have been doing it for more than a decade for some good clients. So we trust their understanding for indexes and execution plans.  Will see if it will be possible to put the indexes and schema here.

    If that's true, then why are you asking here? 😉

    That's an excellent point actually.

    Columnstore is also something definitely worth investigating.  But presumably your Sr. people have already thought of that as well.

    🙂

    They are good at analyzing execution plans, code review and identifying proper indexes and that's it. They can't make any comments on architecture side, hardware side, configuration side or anything which falls beyond their scope.

    I certainly can't comment on your architecture and hardware either.  Good luck with this.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • ScottPletcher wrote:

    Since we have NO previous knowledge of your tables and how they're used, we need specifics to make any reasonable judgements.  It's rather frustrating when we ask such things to be told, "our current code, table design and hardware are perfect, it's just running too slow."  Hmm, doesn't it seem that something must not be perfect if your response time is that slow?!

    At a minimum, I don't feel comfortable without looking at missing index stats and index usage stats.  I can provide a query for you to run that will give us those stats.  I, and others here, can then agree that indexes are great as is or suggest changes.  However, if you're not going to make any changes anyway, you're right to tell us that now, and please do so.  I don't want to waste time designing changes you won't consider anyway.

    Also, after I see those stats, if I believe your existing indexes are not perfect -- in particular, if the clustered index(es) seem wrong to me -- then I can comfortably drop out once I know you're not going to make changes to them.  Because, as I see it, with the wrong clus key, you will forever have performance issues and relegated to constantly tweaking non-clustered indexes.  And personally, I have no interest in volunteering for that, it's just not for me.

    Hi Scott

    I agree , I may not have given all needed stats to take a good judgmental call. I'll try during weekend to put the table structure and index and possibly execution plan as well.  For choice of clustered index columns, I am confident that they are correct. They are int or bigint columns and are very frequently used in many queries.

    I'll try to put the missing index stats also if I find any. I have the query to get them.

    Online Trainer For SQL DBA and Developer @RedBushTechnologies with 18 yrs exp.

  • S_Kumar_S wrote:

    For choice of clustered index columns, I am confident that they are correct. They are int or bigint columns and are very frequently used in many queries.

    on an average day, out of, say, 100 queries how many of them use those CI columns in range selection? Like

    WHERE AccountId between @AccountIdMin and @AccountIdMax ?

    _____________
    Code for TallyGenerator

  • Sergiy wrote:

    S_Kumar_S wrote:

    For choice of clustered index columns, I am confident that they are correct. They are int or bigint columns and are very frequently used in many queries.

    on an average day, out of, say, 100 queries how many of them use those CI columns in range selection? Like

    WHERE AccountId between @AccountIdMin and @AccountIdMax ?

    S_Kumar_S wrote:

    ScottPletcher wrote:

    Since we have NO previous knowledge of your tables and how they're used, we need specifics to make any reasonable judgements.  It's rather frustrating when we ask such things to be told, "our current code, table design and hardware are perfect, it's just running too slow."  Hmm, doesn't it seem that something must not be perfect if your response time is that slow?!

    At a minimum, I don't feel comfortable without looking at missing index stats and index usage stats.  I can provide a query for you to run that will give us those stats.  I, and others here, can then agree that indexes are great as is or suggest changes.  However, if you're not going to make any changes anyway, you're right to tell us that now, and please do so.  I don't want to waste time designing changes you won't consider anyway.

    Also, after I see those stats, if I believe your existing indexes are not perfect -- in particular, if the clustered index(es) seem wrong to me -- then I can comfortably drop out once I know you're not going to make changes to them.  Because, as I see it, with the wrong clus key, you will forever have performance issues and relegated to constantly tweaking non-clustered indexes.  And personally, I have no interest in volunteering for that, it's just not for me.

    Hi Scott

    I agree , I may not have given all needed stats to take a good judgmental call. I'll try during weekend to put the table structure and index and possibly execution plan as well.  For choice of clustered index columns, I am confident that they are correct. They are int or bigint columns and are very frequently used in many queries.

    I'll try to put the missing index stats also if I find any. I have the query to get them.

    Single column int/bigint?  Then I'm sure you don't have the best clus indexes on your tables.  Again, good luck with this.  I know the "default rule" for some people is to slap an IDENTITY column on the table and cluster on it, but that's a horrible practice, the most damaging myth in table design.  The clus index is the single most important factor in table performance, thus it should never be defaulted but instead always carefully chosen.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • Just for your information, Thomas Grosher works with reporting databases that dwarf even the sizes reported on this thread.  He recently did a presentation for the "Cloud Driven Data" Meetup group at  the following link.

    https://www.meetup.com/cloud-data-driven/events/275858439/

    If you scroll down on the meeting notice to the YouTube link, you can get to the recording.  The direct link is ...

    https://www.youtube.com/watch?v=4kCwm853N4Q

    Now... fair warning, Thomas had some serious internet problems during the presentation.  DON'T let that put you off!  The information that Thomas presents is golden.  The cool part about recordings is that you can skip the fairly long internet "outages" and it's absolutely worth it.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 10 posts - 16 through 24 (of 24 total)

You must be logged in to reply to this topic. Login to reply