Index - Whether or not my table should have one

Question

etl2016

Default port

Points: 1459
July 28, 2018 at 3:45 pm

#354787

Hi,
I have a typical DWH scenario: a Source --> Destination ETL.
During this ETL, some intermediate tables are created and dropped: Source --> Stage 1 --> Stage 2 ... Stage N --> Destination.
Each of these staging tables is dedicated to this particular ETL and is not used by any other process or user. Since every index carries its own maintenance overhead, having one does not necessarily guarantee better performance.
1) In terms of row count, depth, or width, is there a rule of thumb for when we should think about indexing a table (in my scenario, the staging tables)?
2) If not from the volume standpoint in (1), is there anything to watch for in the execution plan that signals that a particular area can be tuned (such as RID lookups as candidates for a covering index, or hash/merge/nested loops joins that could be converted to a more efficient mechanism)?
Thank you.

Joe Torre SSChampion Points: 10248 · Answer 1

If indexing enhances the performance of your intermediate staging tables, then it's a good idea. Without a lot more information, no one can give you meaningful advice. Try testing the load process with indexes vs. without.
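One way to run the with/without comparison suggested here is sketched below in T-SQL. Every object name (dbo.SourceOrders, dbo.Stage1_Heap, dbo.Stage1_Clustered) and every column is a hypothetical stand-in, not something from the original post, and the sketch assumes dbo.SourceOrders already exists. Treat it as a starting point for your own test, not a benchmark.

```sql
-- Hypothetical names throughout; dbo.SourceOrders is assumed to exist.
SET STATISTICS TIME ON;
SET STATISTICS IO ON;

-- Variant A: load into a heap (no indexes at all).
CREATE TABLE dbo.Stage1_Heap
(
    OrderID    int            NOT NULL,
    CustomerID int            NOT NULL,
    Amount     decimal(18, 2) NOT NULL
);

INSERT INTO dbo.Stage1_Heap (OrderID, CustomerID, Amount)
SELECT OrderID, CustomerID, Amount
FROM dbo.SourceOrders;

-- Variant B: the same load into a table that already has a clustered index.
CREATE TABLE dbo.Stage1_Clustered
(
    OrderID    int            NOT NULL,
    CustomerID int            NOT NULL,
    Amount     decimal(18, 2) NOT NULL
);

CREATE CLUSTERED INDEX CX_Stage1_Clustered
    ON dbo.Stage1_Clustered (OrderID);

INSERT INTO dbo.Stage1_Clustered (OrderID, CustomerID, Amount)
SELECT OrderID, CustomerID, Amount
FROM dbo.SourceOrders;

-- Run the same downstream query against both variants and compare.
SELECT CustomerID, SUM(Amount) AS TotalAmount
FROM dbo.Stage1_Heap
WHERE OrderID BETWEEN 1000 AND 2000
GROUP BY CustomerID;

SELECT CustomerID, SUM(Amount) AS TotalAmount
FROM dbo.Stage1_Clustered
WHERE OrderID BETWEEN 1000 AND 2000
GROUP BY CustomerID;

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;

DROP TABLE dbo.Stage1_Heap;
DROP TABLE dbo.Stage1_Clustered;
```

Compare the elapsed time and logical reads reported in the Messages output for both the load and the downstream query. An index that slows the load can still pay for itself if the later stages read the table many times.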

Grant Fritchey SSC Guru Points: 398690 · Answer 2

No arguments with Joe. You have to make the determination yourself through testing what, where, and how to index any given table within your own system. Staging or not, the right indexes in the right place help performance.

As to looking at the execution plans, there isn't a single set of "do this/don't do this" that will work in all situations. Yes, an RID lookup tells you that you don't have a clustered index, you do have nonclustered indexes, and that you are having to go to the heap to retrieve the data that is not a part of that nonclustered index. Is that a problem? Well, is it for one row or one million rows? Will adding a clustered index slow down the insert or update that you're doing as part of the data load? Can you do the data load and then add the index in support of queries against the loaded data? After adding the cluster do you still need the nonclustered index? Will making the nonclustered index covering work as well or better than creating a clustered index? Can we make the index unique? Is the query doing an aggregation? What if we added a columnstore index? Clustered columnstore or nonclustered?

See what I mean? It's all far too dependent on the situation to give you a 10-point checklist that is guaranteed to cover 90% of issues.
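Two of the options raised in the questions above, building the index after the load and widening a nonclustered index so that it covers the query, might look like the following. This is a sketch only: dbo.Stage2 and its columns are hypothetical, the table is assumed to be a heap that has already been loaded, and the two options are alternatives to test against each other, not steps to run together blindly.

```sql
-- Hypothetical table: dbo.Stage2 (OrderID, CustomerID, Amount), already loaded as a heap.

-- Option 1: load first, then build the clustered index to support later reads.
-- Declaring it UNIQUE is only valid if OrderID really is unique in the data.
CREATE UNIQUE CLUSTERED INDEX CX_Stage2_OrderID
    ON dbo.Stage2 (OrderID);

-- Option 2: leave the table as a heap and make the nonclustered index covering,
-- so the query below never has to go back to the heap (no RID lookup).
CREATE NONCLUSTERED INDEX IX_Stage2_CustomerID
    ON dbo.Stage2 (CustomerID)
    INCLUDE (Amount);

-- The query the covering index is meant to serve.
SELECT CustomerID, SUM(Amount) AS TotalAmount
FROM dbo.Stage2
WHERE CustomerID = 42
GROUP BY CustomerID;
```

Which option wins, if either, depends on the row counts and on how often the table is read after the load, which is the point being made above.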

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

ScottPletcher SSC Guru Points: 100947 · Answer 3

Non-clustered index(es) are almost always a waste of resources on a staging table.

A clustered index is much more likely to be useful, although it's obviously not guaranteed to be. If you join the staging table a lot, and the join column(s) are the same as the clustering key of the final table that the staging table will be loaded into, then definitely cluster the staging table that way.

Otherwise, as everyone has noted, it's case by case, but give special scrutiny to any nonclustered index(es): there has to be a big, known advantage to bother with a nonclustered index on a staging table.
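As an illustration of clustering the staging table to match the final table, here is a minimal sketch. The names dbo.FactOrders and dbo.StageN and the key (OrderDate, OrderID) are hypothetical, and the final table is assumed to exist already with that clustering key.

```sql
-- Assumed to exist already (hypothetical):
--   dbo.FactOrders, clustered on (OrderDate, OrderID).

CREATE TABLE dbo.StageN
(
    OrderDate  date           NOT NULL,
    OrderID    int            NOT NULL,
    CustomerID int            NOT NULL,
    Amount     decimal(18, 2) NOT NULL
);

-- Cluster the staging table on the same columns, in the same order,
-- as the clustering key of the final table.
CREATE CLUSTERED INDEX CX_StageN
    ON dbo.StageN (OrderDate, OrderID);

-- Final load: insert only the rows that are not already in the destination.
INSERT INTO dbo.FactOrders (OrderDate, OrderID, CustomerID, Amount)
SELECT s.OrderDate, s.OrderID, s.CustomerID, s.Amount
FROM dbo.StageN AS s
WHERE NOT EXISTS
(
    SELECT 1
    FROM dbo.FactOrders AS f
    WHERE f.OrderDate = s.OrderDate
      AND f.OrderID   = s.OrderID
);
```

With both tables ordered on the same key, the optimizer may be able to join them without an extra sort, and the rows arrive in the destination's key order. Check the actual plan to see whether that happens on your data.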

SQL DBA, SQL Server MVP (07, 08, 09)

Gail Shaw SSC Guru Points: 1004504 · Answer 4

If the staging tables are a decent size (over a million rows), consider creating them with a clustered columnstore index.
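A minimal sketch of a staging table created with a clustered columnstore index follows. The names dbo.Stage2 and dbo.Stage3 and the columns are hypothetical, and the sketch assumes a SQL Server version and edition that supports clustered columnstore indexes.

```sql
-- Hypothetical staging table; dbo.Stage2 is assumed to exist with matching columns.
CREATE TABLE dbo.Stage3
(
    OrderID    int            NOT NULL,
    CustomerID int            NOT NULL,
    Amount     decimal(18, 2) NOT NULL
);

-- A clustered columnstore index takes no key columns: it stores the whole table.
CREATE CLUSTERED COLUMNSTORE INDEX CCI_Stage3
    ON dbo.Stage3;

-- Load in one large batch rather than row by row.
INSERT INTO dbo.Stage3 WITH (TABLOCK) (OrderID, CustomerID, Amount)
SELECT OrderID, CustomerID, Amount
FROM dbo.Stage2;

-- Scans and aggregations are the kind of query this index is meant to help.
SELECT CustomerID, SUM(Amount) AS TotalAmount
FROM dbo.Stage3
GROUP BY CustomerID;
```

Columnstore tends to suit large, scan-heavy staging steps. For small tables or single-row lookups, test it against a rowstore clustered index before committing to it.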

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
