Nested Loop Joins on Temp Tables

  • I have a #Temp table with between 50,000 and 150,000 rows in it and a permanent table with a couple of milion rows. Each table has a clustered index on an Int column and I'm trying to join the two on those columns.

    I had expected pretty good performance on this, but it's turned out to be very slow. The execution plan shows that the optimizer has chosen a nested loop join which doesn't sound good to me. It's done this because it's using the default "guess" of there only being 1 row in the temp table. I've read that creating the index after inserting data into the #Temp table will also create statistics, so that's what I did, thinking that the optimizer could use the statistics to pick a better join option.

    First question: Will the optimizer use statistics on a #Temp table to improve join performance?

    Second question: Should I use a join hint instead?

    Thanks very much for your insight.

    -Ryan

  • Try the index. Temp tables do keep stats, and it can make a HUGE difference.

    If that doesn't do what you need, try a join hint. I've only ever had to do that once, but it was on a temp table, and it did help.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • I have not tried this, but I would try to update the statistics of the temp table right after the insert, so you can make sure it will use the correct join type, based on the count of rows.

    I understand it's trying to use 1 as the number of rows, probably because it just inserted, that would make sense.

    I'm no statistics Guru, but I think the statistics will be used, even on a temp table, it should not matter. I would try to issue a Update statistics Table_Name to make sure the join is more efficient.

    Try this, and if it does not work, post the plan you get with the stats update.

    Cheers,

    J-F

  • Thanks for these amazingly fast responses!

    @GSquared -

    What do you mean "Try the index?" I created the #Temp table, inserted the rows and then created the index. The idea was that the index create would also create statistics.

    @J-F Bergeron -

    I'll try updating statistics expressly rather than using an index create to do that for me.

    Thanks again!

    - RbH

  • Ryan, could you possibly post up the code and an execution plan so we can see what is going on?

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • Sorry, I don't mean to come off as secretive, but I'd probably get fired for posting code. That issue was raised by the very top of our management only a couple of weeks ago.

    If I can't solve it with the suggestions I've gotten so far, I'll scrub the code so that I can post it.

    Thanks a lot for your interest.

    -RbH

  • Ryan, I've come with a small test table, that makes me doubt over your statistics.

    IF EXISTS ( SELECT 1

    WHERE OBJECT_ID('tempdb..#temp') IS NOT NULL )

    DROP TABLE #temp

    --=== Generate a temp table to see the statistics update on a Temp Table.

    SELECT TOP 100000

    IDENTITY( INT,1,1 ) AS ID

    INTO #temp

    FROM master.sys.sysobjects o1

    CROSS JOIN master.sys.sysobjects o2

    SELECT *

    FROM #temp

    WHERE ID > 50000 -- Table scan, estimated 50000 (Correct estimate)

    CREATE CLUSTERED INDEX IX_#Temp_ID ON #temp (ID ASC)

    SELECT *

    FROM #temp

    WHERE ID > 50000 -- Clustered index seek (estimates 50000 (correct estimate))

    This small piece of code creates a temp table, feeds it with 100 000 sequential IDs, and selects from it. Everytime I select from it, even right after the creation, it "guesses" the stats of the temp table correctly, therefore giving me the right information over the counts.

    I did not test it with a join on a big table, but I suspect it would give the same behavior.

    Can you explain how you are creating the table?

    Cheers,

    J-F

  • @J-F Bergeron -

    This is how I'm creating the table:

    CREATE TABLE #Temp ( ID INT NOT NULL )

    INSERT INTO #Temp

    SELECT ID

    FROM PermanentTable WITH (NOLOCK)

    WHERE Date <= @Date

    ALTER TABLE #Temp ADD CONSTRAINT [PK_Temp] PRIMARY KEY CLUSTERED (ID)

    Then I join it to a large table - I originally said a couple of million rows - it's actually 150 million. The execution plan shows an estimated row count for that join of 1. The actuall row count was about 77,000.

  • Thanks everyone for the input. Here's what I found out:

    First I should disclose that I am not a DBA at my work, I am a developer. After some prodding, a got a DBA to tell me that statistics are disabled in tempdb. So, that explains it. I added the appropriate hints and performance is enormously better.

    I am always overwhelmed by the responsiveness of this community! Thanks again!

    - RbH

  • No problem, Ryan. We're not all DBA's. I'm a developer too.

    I'm kind of curious though. Why would anyone disable statistics on tempDB?

    I hope it's just a temporary condition.

    That can pretty much cripple the optimizer, as you just found out.

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • I don't know why statistics are disabled. I'm afraid it's not a temporary condition, though... unless I can convince them to rethink it! The one person that I spoke with was hesitant... that's why I added the hints. Maybe he'll come around. Or maybe there's more going on that I don't know about.

  • Good luck. If that situation doesn't change, then every time you use a temp table you are going to have to use hints. Not good practice It pretty much defeats the point of the optimizer. There are quite a few people here who would agree.

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • What I meant by "test the index" was just that. Try creating the index and see what that does.

    Obviously, it's not applicable, based on data you've since discovered.

    I'll definitely second the motion to ask why stats are disabled on tempdb. While you're at it, find out if stats are disabled on other databases by default on that server. And, if so, how often they are updated.

    What I suspect is that someone ran into a performance problem on a large, high-activity table, where disabling statistics on it greatly improved performance. That can happen. Then they went, "Wow! Look what that did! I'm gonna do that on ALL my databases!", not realizing the performance hit it would cause. Again, I've seen that happen.

    Might not be the case this time, but do question it.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • @Ryan:

    Maybe you could convince your "DBA" to start a thread here at SSC with a title like "What are the benefits/costs of disabled statistics on tempdb?"

    Side note: Usage of "DBA" instead of DBA is intentionally... 😉



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • ryan.hart (3/17/2010)


    I don't know why statistics are disabled. I'm afraid it's not a temporary condition, though... unless I can convince them to rethink it! The one person that I spoke with was hesitant... that's why I added the hints. Maybe he'll come around. Or maybe there's more going on that I don't know about.

    Wow and unbelievable! Does that "DBA" like pork chops? 😛

    I agree... there are a fair number of articles I've seen that recommend turning off statistics so an automatic statistics update doesn't take you by surprise during a busy time... but I've NEVER heard of anyone doing that to TempDB and would strongly recommend against it since it's not just user tables that are affected. You've already seen what happens when you do.

    If the "DBA" decides to stand his ground (no matter how bad an idea that may be), then you could create stats in TempDB for your tables just like you do your index. Still, if the other DB's have automatic stat updates turned off and the DBA hasn't been updating them on a reasonable basis, it still might not help.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply