Caching

  • Comments posted to this topic are about the item Caching

  • I've had the Developers cache semi-static and static data several times rather than calling stored procedures or doing embedded T-SQL to get the data. For the things I've used it for (caching data for way-to-common calls), it has made a huge difference in performance.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • There are so many types of caching too. Most developers do at least one type without paying any attention to it: memoization. An example of this is to get a value outside of a loop that is used within it as opposed to within it. Jeff mentioned the caching of static (or at least relatively static) data which is another extremely common use.

    Caching should always be considered when developing applications. In this forum many of us might be considering maintaining consistency with the current state of the database i.e. what if it changes and we don't have the latest. Of course, getting the latest might be exactly the wrong thing to do. This is where the classic "it depends" applies: sometimes you want to ensure that the state of the data is the same for the whole of a particular context. That context might be the whole of the duration of a system is running (process), a service remaining active, an application domain's lifetime (AppDomain in .NET), an operation, a thread or, perhaps, a transaction.

    This is an example of why experience counts. Also, it is an example of why some people will always be better at software development than others. There will always be some developers who will never consider these types of questions. It may matter. Or it may not.

    Those of us who do consider these facets of software development have a responsibility to the projects we work on to highlight when and where we need considerations such as caching to be made.

    Gaz

    -- Stop your grinnin' and drop your linen...they're everywhere!!!

  • I've often used caching to be fair.

    A good example is database driven multi lingual translation of web pages (i.e. not in resx files) which can make sense when you want to allow translation of a site through an admin interface (but don't want to generate resource files on the fly). You can cache the translations by page to avoid chatty db hits all the time, invalidating cache when an edit is made.

  • I have used caching more for search pages, but not with more recent hardware. The old days of submitting a request and waiting 15 minutes are gone. We would save the search results to a caching table so you did not wait 15 more minutes to scroll to the next page of 20 lines of output.

  • It's funny how often this happens to me in real life. Working with a customer yesterday, their #1 most resource-consuming query was running every 5 minutes to fetch all new data from the last 60 minutes. Doh!

  • I was once tasked with improving the performance of a stored procedure that ran for fifteen minutes on a good day and more than an hour on occasion. It was 5,000 lines of legacy dog poop that still ran for more than 10 minutes after optimizing the top 10 statements.

    So I rearchitected the stored procedure to insert the resultset into a summary table keyed on Run_Date. Each time the procedure was called, it would first check if a resultset keyed on the current date was already in the summary table, and if so it would simply bypass the main body of T-SQL code and return summary records where Run_Date = @Run_Date. I even scheduled a job to kick off the procedure at 2am each morning, so the first user in the office didn't have to wait. After knocking the load time of the report down from 15 minutes to 2 seconds; they thought I was a database optimization genius.

    This technique works great for scenarios where the resultset is expected to be static for a given hourly, daily, etc. period of time and you need a quick fix without resorting to a complete re-write of the procedure code. Also, the resultsets contained in a summary table don't necessarily have to be used for temporary "caching"; they can be retained permanently and leveraged to facilitate monthly and yearly aggregate reporting.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • I had a similar scenario with a web server generating reports from a rapidly changing, real time database. The original system used a ridiculous number of queries that ran every minute. Of course this hammered the database server so I redesigned it to run a single query once every five minutes (no one noticed the latency :-)). I used a simple xml document as a cache and this allowed me to display the results using xslt. As a bonus I was able to extract more useful reports and make it pretty too.

  • I didn't consider summary data as caching, but if that is the case, then our systems would never function without it. Analyzing customer trends on TB's of raw data would take hours.

  • EricEyster (2/26/2014)


    I didn't consider summary data as caching, but if that is the case, then our systems would never function without it. Analyzing customer trends on TB's of raw data would take hours.

    I'm thinking that any time we persist data in a more accessible secondary location for re-use, wether it's pages in memory buffer cache, derived resultsets in a summary table or materialized view, in-memory tables (xVelocity), or memcached data in the middle tier, then it's all various forms of caching.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Have you considered a front-end query results cache as an additional performance improvement component? Specifically one that automatically invalidates any cache result when the data is modified and doesn't require any coding changes?

    Dave

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply