Temp Table's Primary Key Default

  • Is a primary key on a temp table implemented as a clustered index by default, or as a unique index by default?

  • rchantler (9/25/2015)


    Is a primary key on a temp table implemented as a clustered index by default, or as a unique index by default?

    Both. As on a normal table, if a clustered index does not already exist, assigning a column as the PK will force a unique clustered index to be formed.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thanks Jeff,

    The reason for this question is a perplexing performance problem. I am going to take some time to try to put it into compact, reproducible form but it boils down to what looks like 2 functionally identical definitions causing a 100 times slowdown in an update statement. With no other differences in the code def.1 is 100 times faster than def.2

    Definition 1:

    CREATE TABLE #hidvals(hid INT NOT NULL,

    extel2code CHAR(2),

    start SMALLDATETIME,

    finish SMALLDATETIME,

    value1 FLOAT,

    value2 FLOAT)

    CREATE CLUSTERED INDEX IDX_hidvals ON #hidvals(hid)

    Definition 2:

    CREATE TABLE #hidvals(hid INT NOT NULL,

    extel2code CHAR(2),

    start SMALLDATETIME,

    finish SMALLDATETIME,

    value1 FLOAT,

    value2 FLOAT,

    PRIMARY KEY (hid)) -- same as PRIMARY KEY CLUSTERED

    Here's the UPDATE statement.

    update H

    set value1 = X1.[1MPM], value2 = X2.[1MPM]

    from #hidvals H

    left join HD01.prod.USPROD_Calculations X1

    on H.hid = X1.hid and H.start = X1.adate

    left join HD01.prod.USPROD_Calculations X2

    on H.hid = X2.hid and H.finish = X2.adate

  • For each temporary tables currently allocated, the following script will return the table name, index type(s) (heap, clustered, non-clustered), reserved mb, and row counts. Using this, confirm the allocation of temp table for both code samples are equivalent.

    Also take a look at the execution plan to confirm it looks equivalent for both.

    use tempdb;

    select o.type_desc

    , substring(o.name,1,charindex('__',o.name)+1)+right(o.name,12) table_shortname

    , si.name index_name

    , case si.index_id when 0 then 'HEAP' when 1 then 'CLUSTERED' else 'NONCLUSTERED' end index_type

    , row_count

    , ((reserved_page_count * 8024) / 1024 / 1024)reserved_mb

    , create_date

    from sys.dm_db_partition_stats ps

    left join sys.objects o on ps.object_id = o.object_id

    left join sys.indexes si on si.object_id = o.object_id and si.index_id = ps.index_id

    where is_ms_shipped = 0

    order by reserved_page_count desc;

    One more thing to do when comparing runtime of two variations of scripts, you need to flush the page buffer cache between each execution to insure they both start on equal footing.

    DBCC DROPCLEANBUFFERS;

    DBCC FREEPROCCACHE;

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Thanks Eric,

    Thanks for the script, I'll see what the results it gives.

    The execution plans are totally different. That's the point really. Why should the execution plans be vastly different if the two versions of the temp table are functionally equivalent?

  • rchantler (9/28/2015)


    Thanks Eric,

    Thanks for the script, I'll see what the results it gives.

    The execution plans are totally different. That's the point really. Why should the execution plans be vastly different if the two versions of the temp table are functionally equivalent?

    You said "I'll see what results it gives", but have you confirmed both versions of temp table have same index type, number of rows, and allocated storage?

    If both temp tables are equivalent, then reply back with attachment of both execution plans.

    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • rchantler (9/28/2015)


    CREATE CLUSTERED INDEX IDX_hidvals ON #hidvals(hid)

    That is not declared as UNIQUE, so not quite the same thing as a PRIMARY KEY's Clustered Index (SQL will add a tie-break value which will make the key size wider).

    Also, Primary Key will not allow NULLs which might effect how the optimiser treats the index - although you have declared [hid] as NOT NULL so maybe that is as broad as it is long ...

  • Thanks again Eric,

    Yes, they have identical indexes; CLUSTERED with the same # of rows.

    I am out of the office today; will get the very different execution plans to you when I get back.

    Why do I get the feeling I am missing something obvious?

  • Thanks Kristen,

    Yes, not precisely the same but in my mind (perhaps where the problem lies) functionally the same as the hid field is always unique, even if not declared so. This table version (with potentially non-unique rows) is the fast one, which also runs counter to my expectation.

  • Ostensibly, the Temp Table has only the required data in it and every row will be used/updated in some fashion. Try NOT having any indexes on the Temp Table and see what happens.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • rchantler (9/29/2015)


    Thanks Kristen,

    Yes, not precisely the same but in my mind (perhaps where the problem lies) functionally the same as the hid field is always unique, even if not declared so. This table version (with potentially non-unique rows) is the fast one, which also runs counter to my expectation.

    I understand you as saying that sample #1, with the non-unique clustered key, performs better than sample #2, which has a primary key. They should both be clustered, since primary keys are clustered by default. The mystery, however, is why sample #2 is not the better performer, because generally speaking a primary or unique constraint on an index will help tip SQL Server's execution plan optimizer.

    One thing I notice is that in sample #1 you are creating the clustered index in a separate operation. Is this index added before or after the table is inserted? If the table is indexed after insertion(s), then index fragmentation is minimal.

    In contrast, sample #2 creates it's index inline. This means that the index pages are allocated as insertion(s) take place, so there is more potential for fragmentation.

    Using the script below, compare the allocated size in MB for both temp tables, which indicates how organized the internal page structure is. This could play a role in table statistics and ultimately the chosen plan.

    I have modified my script to add a handful of additional columns indicating the primary and unique properties of each index and also a column listing what columns are in each index.

    use tempdb;

    select o.type_desc

    , substring(o.name,1,charindex('__',o.name)+1)+right(o.name,12) table_shortname

    , si.name index_name

    , case si.index_id when 0 then 'HEAP' when 1 then 'CLUSTERED' else 'NONCLUSTERED' end index_type

    , (

    select c.name + ', '

    from sys.index_columns ic

    left join sys.columns c on c.object_id = ic.object_id and c.column_id = ic.column_id

    where ic.object_id = si.object_id and ic.index_id = si.index_id

    order by ic.key_ordinal

    for xml path('')

    ) index_cols

    , si.is_unique

    , si.is_primary_key

    , si.is_unique_constraint

    , si.fill_factor

    , si.is_padded

    , row_count

    , ((reserved_page_count * 8024) / 1024 / 1024) reserved_mb

    , create_date

    from sys.dm_db_partition_stats ps

    left join sys.objects o on ps.object_id = o.object_id

    left join sys.indexes si on si.object_id = o.object_id and si.index_id = ps.index_id

    where is_ms_shipped = 0

    -- order by reserved_page_count desc;

    order by table_shortname;

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Thanks Jeff,

    If I leave the temp table as a heap it performs slightly better than with the clustered index, i.e. est. subtree cost 15.59 versus 17.65 on actual runs. That is instructive, thanks.

    Still, the mystery is why the primary key definition leads to a different execution plan with an ext. subtree cost of 2,548.

    I tried another version where I created a primary key unclustered and then added the clustered index. This muddies the waters further, as it has the very slow execution plan. It seems as if the existence of the primary key sets the optimizer down the wrong track!

    One further point (clue?). If i switch the query to do joins instead of left joins the performance difference evaporates.

  • Thanks Eric,

    I collected the stats with your expanded script. The 2 indexes are both reported as CLUSTERED with hid as the only column, 4318 rows and 0 reserved_mb.

    As expected, the primary key version is also reported as a primary key and unique.

    I should also mention I'm running on 2008 R2.

  • Hi again Eric,

    Re the fragmentation - I've just tested this; it makes no difference performance-wise in the clustered index example whether the data is inserted before or after the index is created.

  • I've attached the 2 execution plans.

Viewing 15 posts - 1 through 15 (of 30 total)

You must be logged in to reply to this topic. Login to reply