RE: Clustered PK perf - – SQLServerCentral

SSC Guru

Points: 100950

June 19, 2018 at 4:14 pm

#1994558

Chris Harshman - Tuesday, June 19, 2018 3:16 PM
ranitb - Tuesday, June 19, 2018 8:27 AM
...
Background - There's a SP which builds a Fact table (say CT) with 30 odd columns, of which 12 are Dim Keys. Total records ~10M.
Table CT is drop and create. If the SP has to be divided into few logical steps (say 10 steps), in the 3rd or 4th step there's a ADD Constraint (PK+clustered). The 1st and 2nd steps, insert records into CT. The 5th and subsequent steps, they update various columns (measures) in these records by reading and computing data from other tables/facts. These joins are done on all the Dim Keys all together at once.
...Is this PK really required on all the DIM keys? I understand that since CT table is joined with other tables using all the DIM Keys together, so they are all being included in the PK.
...
I'm on - SS 2008 R2 (SP1) - Standard Edition
Regards,
RB
There are a number of things odd about this setup.
- SS 2008 R2 Service Pack 3 has been out for quite a while, why are you still on SP1?
- It seems counter-intuitive that you would regularly query the fact table with ALL the dimension keys. Some such as a date dimension would always be used but I could easily see others only being queried sometimes. Index for how reports and users queries will be run against this table not for how you load them.
- I agree with Scott that adding the clustered index before the data is loaded would probably be better overall than loading the data, then essentially rebuilding the table by adding the clustered index after.

The joins would always use all the dimension keys. Having the tables identically clustered allows for a merge join possibility. Merge joins are extraordinarily efficient when they're applicable.

SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.