Columnstore Index testing- advice on different table sizes

  • I am going to be testing creating a clustered columnstore index on a very large table to see if it improves query speed. Version 2016. I have two databases with the same tables in them, one with 190,000,000 records and the other with 181.000.000 records. My question is simple: is it a valid test if the number of records in the two tables is different by 9 million records? If it is significant enough I will replace the smaller one with an updated data set. But don't want to do that if I don't have to. 
    Also, any tips on creating the index are very welcome... if there are any "gotchas" you have experienced etc. 
    Thanks, Diana

  • Well, the easiest thing to do to account for the differences, is to also test performance with just the normal indexes in place on both machines, and then you'll have a baseline of the differences between the two for capacity measures in terms of things like response time and query run time.   If you then eliminate the normal indexes and create the columnstore ones, you can test again on both machines, and then use the first test as your baseline...   Might be a fair amount more work, but I don't see this kind of problem to necessarily be as simple as expecting the times on the smaller data set to be exactly 181/190 of the times on the larger one.   The potential for a large number of other factors to come into play is huge.

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • Fortunately both databases are on the same server so that keeps other variables out of the mix. The only variable should be the difference in number of records. I like your idea though. Thanks!

  • dbodell - Thursday, June 14, 2018 12:17 PM

    Fortunately both databases are on the same server so that keeps other variables out of the mix. The only variable should be the difference in number of records. I like your idea though. Thanks!

    Well, it minimizes them, but doesn't entirely eliminate them.  The two databases may be on different hard drives, or on the same drive, and both of those have their own set of potential complicating factors.   Mind you, unless one of these databases has a serious fragmentation problem, most of those differences would be "speed of the disk drive related", and possibly not measurable if both DBs are on the same drive.

    Glad I could provide assistance...  Post back with your results, as someone else may have a future interest in them.

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply