Worktables & Hash Tables

  • Hi All

    After testing and researching - I want to make sure that my understanding is correct regarding Worktables & Hash Tables

    >Worktables & Hash Tables are 2 different things

    >Hash tables are only created when there is a Hash Join.

    >Worktables are created when immediate results need to be stored somewhere

    >Worktables & Hash Tables are both created in Memory first and spill to tempdb if necessary

    >A Merge join will not employ a worktable if both inputs are unique

    Please advise if I am on the right track here?

    Thanks

  • Basically correct. Although, it really only matters to the internals engine. For your purposes, both are dealt with the same way. According the to the Internals book, a hash table goes to what is called a work file, but that's treated in all ways the same as a work table. Further, when you have a hash join and you look at the I/O stats of the query, that hash is evidenced as something called a work table. So, I just treat the hash as a type of work table since the external evidence is exactly the same and MS labels it that way.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Thanks

    Basically correct. Although, it really only matters to the internals engine. For your purposes, both are dealt with the same way. According the to the Internals book, a hash table goes to what is called a work file, but that's treated in all ways the same as a work table. Further, when you have a hash join and you look at the I/O stats of the query, that hash is evidenced as something called a work table. So, I just treat the hash as a type of work table since the external evidence is exactly the same and MS labels it that way.

    wrt to the memory/tempdb situation

    Are work/hash tables first created in memory and spilled to tempdb if necessary?

    Thanks

  • SQLSACT (1/31/2013)


    Thanks

    Basically correct. Although, it really only matters to the internals engine. For your purposes, both are dealt with the same way. According the to the Internals book, a hash table goes to what is called a work file, but that's treated in all ways the same as a work table. Further, when you have a hash join and you look at the I/O stats of the query, that hash is evidenced as something called a work table. So, I just treat the hash as a type of work table since the external evidence is exactly the same and MS labels it that way.

    wrt to the memory/tempdb situation

    Are work/hash tables first created in memory and spilled to tempdb if necessary?

    Thanks

    Yes & no. All temporary tables (hash/work tables, temp tables, table variables) are written to disk as well as memory, every time. But the primary work is done in memory, if it can. Otherwise it puts the excess on the place it reserved on the disk.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Grant Fritchey (1/31/2013)


    SQLSACT (1/31/2013)


    Thanks

    Basically correct. Although, it really only matters to the internals engine. For your purposes, both are dealt with the same way. According the to the Internals book, a hash table goes to what is called a work file, but that's treated in all ways the same as a work table. Further, when you have a hash join and you look at the I/O stats of the query, that hash is evidenced as something called a work table. So, I just treat the hash as a type of work table since the external evidence is exactly the same and MS labels it that way.

    wrt to the memory/tempdb situation

    Are work/hash tables first created in memory and spilled to tempdb if necessary?

    Thanks

    Yes & no. All temporary tables (hash/work tables, temp tables, table variables) are written to disk as well as memory, every time. But the primary work is done in memory, if it can. Otherwise it puts the excess on the place it reserved on the disk.

    Thanks

    All temporary tables (hash/work tables, temp tables, table variables) are written to disk as well as memory, every time

    Which area of work will go to memory and what will go to disk?

    But the primary work is done in memory

    If memory cannot accomodate the primary work, is that spilled to disk?

    I think I need to do some more research on this topic, still very confused

  • Get a copy of SQL Server 2008 Internals by Kalen Delaney. Then read it 100 times. You'll still be confused (as I am) but less so.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • I've read your Execution Plans book and I've taken a lot from there.

    This particular area of worktables/hashtables is eluding me though

    Thanks for your help

  • Well what's the major issue. You're asking about very low level behavior. It's not immediately applicable to most tuning situations. I mean, there is a specific area called workspace memory used for temporary allocation of objects within a query that is pulled from the buffer pool. How is that useful to you if you have a query creating a work table? Fix the query, add the index, modify the structure... Do what's necessary to get by. Does it really matter exactly where the allocation goes?

    If you need to look at allocations, start with sys.dm_os_memory_clerks. You can see what is being allocated to the different memory management allocations. You can look at the buffer pool through sys.dm_os_buffer_descriptors. These can help you see where the allocation is going.

    Memory allocation for a query is shown in the actual execution plan. You can also look at sys.dm_exec_query_memory_grants for execution queries. To see specifically where information is going, take a look at sys.dm_db_session_space_usage. That's going to show you where the memory allocations are going.

    As to precisely when the memory/disk spill occurs, I'm not sure. It's largely a question of how much memory do you have. Once there's not enough, everything goes to disk. You can kind of figure out if you're using disk or not by looking at the allocations available through the DMOs I listed above.

    But I'm still unclear where this gets you, except a more complete understanding of why a hash, sort, or work table actually does cause things to slow down.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Thanks

    I understand the workspace memory area (Well, I hope I do). A query will request required memory and once the query has been granted that, if it needs to, it will request additional memory for processing a hash join for example, which might have to spill to disk it memory can't accomodate that additional memory. Correct?

    I'm not experiencing any particular issue, I'm just trying to understand the behaviour.

    Where I got thrown off a bit was here

    All temporary tables (hash/work tables, temp tables, table variables) are written to disk as well as memory, every time.

    Thanks

  • SQLSACT (1/31/2013)


    Thanks

    I understand the workspace memory area (Well, I hope I do). A query will request required memory and once the query has been granted that, if it needs to, it will request additional memory for processing a hash join for example, which might have to spill to disk it memory can't accomodate that additional memory. Correct?

    I'm not experiencing any particular issue, I'm just trying to understand the behaviour.

    Where I got thrown off a bit was here

    All temporary tables (hash/work tables, temp tables, table variables) are written to disk as well as memory, every time.

    Thanks

    Sorry, but they are. You can watch it occur in the allocations. None of the temporary objects are memory only. They all require some disk allocation. It's just that the majority of the storage is, usually, depending on the system, etc., in memory.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Thanks Grant

    Definately need to do some more Research and Testing

    Thanks

  • Hi Grant

    I know this post is getting old, I thought I'd just fill you in on my findings

    I watched a video (from SQL.Bits) called "Advanced SQL Server 2008 troubleshooting by Klaus Aschenbrenner"

    I noted something interesting, it seems that spill to disk happens when statistics are out of date.

    >> SQL requests a memory grant of 10kb. This is based on 20 rows

    >> SQL gets the memory grant but realizes that there's actually 200 rows instead of 20

    >> SQL can't request the grant to be increased

    >> Spill to tempb

    Also, I've done some testing on a table with 30mil rows. When doing a sort 5000 rows, I see the memory allocation happening but I don't see any tempdb allocations (I'm using sys.dm_db_session_space_usage and sys.dm_db_task_space_usage

    )

    When I do a sort of 10mil rows, I see the memory allocations as well as tempdb allocations as well as a sort warning in profiler.

    When using sys.dm_exec_memory_grants - If the ideal_memory is more that the granted memory. I think this is where the spill is happening. ??

    Thanks

  • Stats being out of date like that is only one possible cause of a spill, but yeah, that could be it. And yes, what you're seeing is SQL Server wants X for memory, X isn't available. So it does a spill to disk for the amount of memory that it has to in order to satisfy the processing needs of the query. That's pretty much it in a nutshell.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • What I'm struggling to see is the tempdb allocation when spill doesn't happen.

    Thanks

  • Why would you see a tempDB allocation when something's not spilling to TempDB?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 15 posts - 1 through 15 (of 23 total)

You must be logged in to reply to this topic. Login to reply