Memory-optimized table indexes in SQL 2014

  • Dr. Diana Dee (12/29/2013)


    Tom - What is the "hardware cache line length"? Do you know of a reference I can read about that?

    Thanks

    Diana

    I can't think of any references to point you at, because this is something that's been around for decades and anything modern written about it is probably inside chip specs or articles about research in hardware design.

    You can think of it as something the system hardware does for access to the main RAM storage, analogous to what SQL Server does for access to disc storage - and then the cache line length is analogous to SQL Server's page size, because that's the unit that gets pulled in: when the server wants a row, it pulls in the whole page. But don't take the analogy too seriously, because there are a lot of differences.

    Roughly speaking, the cache line length is the size of the chunks read from one hunk of storage into another hunk of storage that is closer to the processor that wants the data. I don't know how many cache layers typically exist in today's systems, or what size cache stores are or how long the lines in them are, but the basic mechanism hasn't changed for decades, so I'll describe it, although the numbers I'll use are certainly out of date.

    Typically a processor will ask for a word (say 16 bits) of store, and the processor's store interface will translate that into a request for a cache line (maybe 128 bits) to be held in a very fast cache on the same chip as the processor. If that line is already there, the processor's request can be satisfied immediately; if it isn't, the processor has to wait for the line to be fetched. The chip's store access mechanism may have another cache level, shared between all the processors on the chip, so if it has to get a new line it translates that request into a request for a line at the next level - say a line of 512 bits. If that line is already present on the chip, the first cache's request can be satisfied immediately; otherwise it has to wait until the second cache's line has been fetched from off the chip. That off-chip request is passed to an off-chip cache, to which this chip has very fast access, which holds lines of say 2048 bits; if this cache doesn't have the required data, it fires off a request to the system's main RAM store for the required 2048-bit line.
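The miss-then-fetch-a-whole-line behaviour described above can be sketched as a toy simulation. This is purely illustrative - it uses the 16-bit words and 128-bit lines from the text plus an invented 4-line direct-mapped cache, and models only one level, not any real chip's hierarchy:

```python
# Illustrative sketch (not real hardware): a single-level, direct-mapped cache
# where each miss pulls in an entire line, so later requests for neighbouring
# words hit. The word/line sizes are the toy numbers from the text above.

LINE_WORDS = 128 // 16   # 8 words (16 bits each) per 128-bit cache line
NUM_LINES = 4            # a tiny cache, to keep the example small

class ToyCache:
    def __init__(self):
        self.lines = {}              # slot index -> which line is resident
        self.hits = self.misses = 0

    def read_word(self, addr):
        """Request one word; fetch its whole line on a miss."""
        line_no = addr // LINE_WORDS   # which line holds this word
        slot = line_no % NUM_LINES     # direct-mapped: one slot per line
        if self.lines.get(slot) == line_no:
            self.hits += 1             # line already resident: fast path
        else:
            self.misses += 1           # stall: fetch the line from next level
            self.lines[slot] = line_no # evict whatever was in that slot

cache = ToyCache()
for addr in range(32):     # sequential scan of 32 words
    cache.read_word(addr)
# 32 words / 8 words per line = 4 line fetches; everything else hits
print(cache.misses, cache.hits)  # -> 4 28
```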
    Of course, the main store is shared between peripherals and all the system's processor chips - at least that's the usual case. There have been systems where the main store is actually a mechanism for moving and/or holding multiple copies of pages (of say 8192 bits) between the fast off-chip local cache stores, with early-level cache invalidation and broadcast update mechanisms for handling updates, but as far as I know Windows has never been ported to a system that does it that way; NUMA is yet another variant for building the bottom-level RAM. Incidentally, it used to be - and may still be - common practise to distinguish instruction fetch from data fetch and have separate caches for the two, because instruction fetch in well-designed software, written in a sensible language which provides a sane translation to machine code, is generally very well localised, while data fetch is less so.

    The reason hardware works like this is that the main RAM store is very slow - it's a long way from the processor, and signals on the surface of copper wires take a long time to travel between the processor and the store - so provided access is well localised, pulling big chunks into a closer/faster store gives a performance gain despite all the cache invalidation logic. Also, taking a bigger chunk at a time from the distant store cuts down the control overhead - if you get 1024 bits for one request instead of just 16 bits, it will be 64 times as fast provided you can transmit 1024 bits in parallel - so a while back there was a tendency to try to build main storage RAM modules so that they could transmit a full cache line in parallel; but even when the cache line is wider than the store interface it gives a gain, since there are fewer control cycles to be handled and the multiple interface words that make up a line can be transmitted in burst mode.

    The reason I wonder whether this could be somewhat in conflict with the storage-optimised tables is quite simple: if the storage optimisation is placing stuff effectively at random, instead of in some fixed order determined by an index chosen because it matches the most common access pattern, it will interfere with locality of access and reduce cache hit rates. I've seen software people who don't understand hardware fall into that trap before; I suspect that the MS people will have looked at the tradeoffs and done the right thing, but it would be nice to know.
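The locality concern can be illustrated with the same kind of toy direct-mapped cache: a sequential scan reuses every word of each fetched line, while random access over a working set much larger than the cache defeats the line fetches almost entirely. All the sizes here are arbitrary, chosen only to make the contrast visible:

```python
# Toy illustration of locality vs random placement under line-based caching.
# Sizes are invented for the example, not taken from any real system.

import random

LINE_WORDS, NUM_LINES = 8, 64    # 64-line direct-mapped toy cache

def hit_rate(addresses):
    """Fraction of word requests satisfied without fetching a new line."""
    lines, hits = {}, 0
    for addr in addresses:
        line_no = addr // LINE_WORDS
        slot = line_no % NUM_LINES
        if lines.get(slot) == line_no:
            hits += 1
        else:
            lines[slot] = line_no        # miss: fetch line, evict old one
    return hits / len(addresses)

random.seed(1)
n = 100_000                                    # working set >> cache size
seq = list(range(n))                           # index-ordered scan
rnd = [random.randrange(n) for _ in range(n)]  # "rows placed at random"
print(f"sequential: {hit_rate(seq):.2f}")  # ~0.88 (7 of every 8 words hit)
print(f"random:     {hit_rate(rnd):.2f}")  # close to zero
```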

    Tom

  • Tom – Thank you for taking the time to reply. I understand what you are saying about pulling rows from memory into the CPU cache.

    For more detailed information, try these:

    http://research.microsoft.com/pubs/193594/Hekaton%20-%20Sigmod2013%20final.pdf

    and the Day 2 keynote of Dr. DeWitt:

    http://www.sqlpass.org/summit/2013/PASStv.aspx

    Day 2 // KEYNOTE // David DeWitt: Hekaton (start at 26 minutes; listen for about an hour)

    Diana

  • Diana - Thanks for the references. I imagine they will provide a lot of extra information.

    Tom

  • All of these recent SQL 2014 questions are making me want to install 2014 somewhere NOW. 🙂

    Like many others, I doubt I will be using 2014 in production for a few years. Sigh...

  • Nice question on 2014. Learnt something today about the new version. Thank you Dr. Diana. 🙂

  • Thanks for the question.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • nice question.. Thanks.

  • Dana, Thomas, Revenant, Koen, pchirags -

    You are welcome!

    Diana

  • +1 nice memory effort !

    ... I was practically sure I read that one a week or two ago

