Logical reads and their size

  • I have a stored procedure that is slow, and the Database Team has told me it is processing 40 TB of logical pages in memory.

    Also, the trace showed that three more instances of this procedure were running, and all in all these four instances were processing 120 TB of logical pages.

    Our database is around 1 TB, the server has 196 GB of RAM, and 110 GB of that is allocated to SQL Server.

    When I asked how they arrived at 120 TB, I was told that one logical read reads an 8 KB page, so (number of logical reads * 8) / 1024 gives the logical read size in MB.

    I went through sys.dm_exec_requests, but was unable to find any details on the size of logical reads or how to calculate it.

    https://docs.microsoft.com/en-us/sql/relational-databases/system-dynamic-management-views/sys-dm-exec-requests-transact-sql?view=sql-server-2017

    I agree that the procedure needs to be fixed, but coming to terms with 120 TB being processed is very perplexing (how can a server with 196 GB of RAM process 120 TB of data?).

    I have seen a couple of posts on logical read size, but the replies are inconclusive.

    https://social.msdn.microsoft.com/Forums/sqlserver/en-US/95e50d64-8cb8-4e02-b855-bec642e79200/logical-reads?forum=sqldatabaseengine

    https://social.msdn.microsoft.com/Forums/sqlserver/en-US/4b912b12-a8da-4241-9db1-57f7ed7130d2/logical-read?forum=sqldatabaseengine

    So my questions are:

    Is this the right way to calculate the size of logical reads?

    Should the size of logical reads be used to measure performance, or is the number of logical reads provided by the DMV enough?

    As this occurred on the production server, I am unable to provide the SQL plan.

    Any information on this will be very helpful. Thank you!!!

  • Can we at least see the stored procedure code? It's hard to fix something without seeing it.

  • I'd be tempted to say the measurement includes spooling to tempdb and either sorting or joining.

    If you have to sort/join without decent indexes, that would cause a lot of that traffic.

    Try looking at the indexes and re-measuring afterwards.
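
    A minimal way to re-measure, assuming you can run the call in a test window (the procedure name below is only a hypothetical placeholder):

    SET STATISTICS IO ON;
    SET STATISTICS TIME ON;

    EXEC dbo.MySlowProc;   -- hypothetical placeholder for the procedure being tuned

    SET STATISTICS IO OFF;
    SET STATISTICS TIME OFF;

    -- The Messages tab then reports logical reads per table, so you can compare
    -- the counts before and after changing the indexes.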

    MVDBA

  • I use this query to find the details of the last time a query was run

    DECLARE @SqlIdentifier nvarchar(100) = 'uniqueBitOfTextInQuery';

    SELECT TOP(1) -- 5A6A34DC-C45A-4AC1-80E8-5A65B3BE9F3F
           d.name AS DatabaseName,
           dest.text AS SQL_Text,
           deqs.last_execution_time,
           deqs.creation_time,
           deqs.last_elapsed_time,
           deqs.last_logical_reads,
           deqs.last_logical_writes,
           deqs.last_physical_reads,
           deqs.last_ideal_grant_kb,
           deqs.last_rows,
           deqs.last_worker_time AS last_cpu_time,
           deqs.execution_count,
           deqs.total_worker_time AS total_cpu_time,
           deqs.max_worker_time AS max_cpu_time,
           deqs.total_elapsed_time,
           deqs.max_elapsed_time,
           deqs.total_logical_reads,
           deqs.max_logical_reads,
           deqs.total_physical_reads,
           deqs.max_physical_reads,
           deqp.query_plan,
           detqp.query_plan AS query_plan_text,
           decp.cacheobjtype,
           decp.objtype,
           decp.size_in_bytes
    FROM sys.dm_exec_query_stats deqs
    CROSS APPLY sys.dm_exec_sql_text(deqs.plan_handle) dest
    CROSS APPLY sys.dm_exec_text_query_plan(deqs.plan_handle, deqs.statement_start_offset, deqs.statement_end_offset) detqp
    OUTER APPLY sys.dm_exec_query_plan(deqs.plan_handle) deqp
    INNER JOIN sys.databases d
            ON dest.dbid = d.database_id
    LEFT JOIN sys.dm_exec_cached_plans decp
           ON decp.plan_handle = deqs.plan_handle
    WHERE dest.text LIKE '%' + @SqlIdentifier + '%'
      AND dest.text NOT LIKE '%5A6A34DC-C45A-4AC1-80E8-5A65B3BE9F3F%'
    ORDER BY deqs.last_execution_time DESC;
  • Jonathan AC Roberts wrote:

    I use this query to find the details of the last time a query was run

    Could you please explain why the GUID is there?

    No one shall expel us from the paradise that Cantor has created for us

  • The calculation is correct. To explain it further, here is the relevant documentation:

    Pages and Extents Architecture Guide

    "In SQL Server, the page size is 8-KB. This means SQL Server databases have 128 pages per megabyte."

    Reading Pages

    A logical read occurs every time the Database Engine requests a page from the buffer cache.

    So each logical read is an 8-KB page, and you just do the math from there. For example, 1000 reads:

    1000 * 8 = 8000 KB

    8000 KB / 1024 = 7.8125 MB

    which is the same as

    1000 / 128 = 7.8125 MB

    In terms of the part you are finding perplexing: pages can and will go in and out of memory as needed, and in a query you can read the same page more than once. It doesn't have anything to do with how much RAM the server has or how many pages a table has; it has to do with how the query was written.
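
    As a rough sketch of the same arithmetic in T-SQL (the read count is just the example value from above):

    DECLARE @logical_reads bigint = 1000;   -- example read count

    SELECT @logical_reads                AS logical_reads,
           @logical_reads * 8            AS size_KB,      -- 8 KB per page
           @logical_reads * 8 / 1024.0   AS size_MB,      -- 7.8125 MB
           @logical_reads / 128.0        AS size_MB_alt;  -- 128 pages per MB, same result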

    Sue

  • It must be for testing.

    No one shall expel us from the paradise that Cantor has created for us

  • Steve Collins wrote:

    Jonathan AC Roberts wrote:

    I use this query to find the details of the last time a query was run

    Could you explain please why the guid is there?

    I don't think it's needed now. It was originally there because I put the hard-coded SQL I was looking for directly in the query (now it's in @SqlIdentifier); it was just there to exclude the DMV query itself from the results.

  • Is this the right way to calculate the size of logical reads?

    Yes.

    Should the size of logical reads be used to measure performance, or is the number of logical reads provided by the DMV enough?

    Yes.

    As this occurred on the production server, I am unable to provide the SQL plan.

    Then there's not much we can do for you.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • ScottPletcher wrote:

    As this occurred on the production server, I am unable to provide the SQL plan.

    Then there's not much we can do for you.

    If the OP runs the query I provided on the production server it should provide the query plan.

  • Jonathan AC Roberts wrote:

    ScottPletcher wrote:

    As this occurred on the production server, I am unable to provide the SQL plan.

    Then there's not much we can do for you.

     

    If the OP runs the query I provided on the production server it should provide the query plan.

    But wouldn't that also involve giving us the query plan? If they absolutely can't give us a query plan (to protect their production info), I don't see how we can be a lot of help. Even the code itself would only be marginally helpful without a query plan.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • ScottPletcher wrote:

    Jonathan AC Roberts wrote:

    ScottPletcher wrote:

    As this occurred on the production server, I am unable to provide the SQL plan.

    Then there's not much we can do for you.

    If the OP runs the query I provided on the production server it should provide the query plan.

    But wouldn't that also involve giving us the query plan?  If they absolutely can't give us a query plan (to protect their production info), I don't see how we can be a lot of help.  Even the code itself would only be marginally helpful w/o a query plan.

    I didn't realise that it was to protect information; I just thought it was that he couldn't run the query for testing.

  • He can supply it; he just needs to use SQL Sentry Plan Explorer to anonymize it before posting here.

  • jignesh209 wrote:

    I agree that the procedure needs to be fixed, but coming to terms with 120 TB being processed is very perplexing (how can a server with 196 GB of RAM process 120 TB of data?).

    The query or set of queries in question isn't loading 120 TB into RAM... it's reading a LOT of the same data over and over and over again from RAM.

    This type of thing (a mistake, for sure) could come, for example, from bad criteria that causes many "Many-to-Many" joins. Another name for an unintended "Many-to-Many" join is "Accidental CROSS JOIN". It's the same data being read over and over, and it can produce a whole lot of internal data duplication in the execution plan that could also come out in the results. A lot of people make the mistake of just throwing a DISTINCT on the SELECT to prevent returning millions (and, sometimes, billions) of duplicate rows and think they've fixed the problem. It's actually a form of "Hidden RBAR" on steroids, and it's very detrimental to performance and resource usage. Memory is awfully fast, but even the "pipe" that leads to/from it has limits. If a process is consuming 120 TB of logical reads (which are reads from memory), then the code has a real problem.
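
    As a minimal sketch of that effect (the temp tables below are made up purely for illustration):

    -- Neither side is unique on the join column, so the join multiplies rows.
    CREATE TABLE #Orders   (OrderID int, CustomerID int);
    CREATE TABLE #Payments (PaymentID int, CustomerID int);

    INSERT #Orders   VALUES (1, 100), (2, 100), (3, 100);
    INSERT #Payments VALUES (10, 100), (11, 100), (12, 100);

    -- 3 rows joined to 3 rows on CustomerID alone returns 3 * 3 = 9 rows,
    -- and the same pages get read far more often than the row counts suggest.
    SELECT o.OrderID, p.PaymentID
    FROM #Orders o
    INNER JOIN #Payments p ON p.CustomerID = o.CustomerID;

    -- Adding DISTINCT hides the duplicates in the output, but the engine
    -- still performs all of the extra reads first.
    SELECT DISTINCT o.CustomerID
    FROM #Orders o
    INNER JOIN #Payments p ON p.CustomerID = o.CustomerID;

    DROP TABLE #Orders, #Payments;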

    If you want to see the definition of what a "logical read" is, look at the following link for SET STATISTICS IO and search the page for "logical reads", where it will tell you that it's the "Number of pages read from the data cache". Remember then that the "data cache" lives in RAM (memory), that a "page" in SQL Server contains 8,192 bytes (8 KB), and that the smallest physical unit that SQL Server can read is 1 page.

    https://docs.microsoft.com/en-us/sql/t-sql/statements/set-statistics-io-transact-sql?view=sql-server-2017

    The following link explains that, and it also explains that there are 128 pages in a megabyte.

    https://docs.microsoft.com/en-us/sql/relational-databases/pages-and-extents-architecture-guide?view=sql-server-2017

    Your DBA team is probably using a tool like Adam Machanic's sp_WhoIsActive or any number of other tools that make good use of the sys.dm_exec_requests DMV, which returns the number of logical reads for a given request.

    https://docs.microsoft.com/en-us/sql/relational-databases/system-dynamic-management-views/sys-dm-exec-requests-transact-sql?view=sql-server-2017
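
    For example, a bare-bones query along those lines (a sketch only, not necessarily how your DBA team's tool does it):

    SELECT r.session_id,
           r.status,
           r.logical_reads,
           r.logical_reads / 128.0          AS logical_read_MB,  -- 128 pages per MB
           r.logical_reads / 128.0 / 1024.0 AS logical_read_GB,
           t.text                           AS running_sql
    FROM sys.dm_exec_requests r
    CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) t
    WHERE r.session_id <> @@SPID            -- ignore this monitoring query itself
    ORDER BY r.logical_reads DESC;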

    Again, keep in mind that the same data (pages in RAM/memory) could be read over and over. While a lot of people don't think that's much of a problem because it's being done in high-speed RAM, it can be a strong indication that some code has a problem. If the code is reading 120 TB of data as logical reads then, unless you have a database that measures in the petabytes, you almost assuredly have a bit of code that's causing some serious duration and resource usage problems.

    The way your DBA team is probably coming up with the 120 TB number is that they're adding up all of the logical reads reported for the various parts of the query or queries being executed by a given SPID and dividing that number by 128 (see the previous references) to come up with the number of megabytes of logical IO being consumed.

    If they do it right with sys.dm_exec_requests, they should actually be able to tell you which parts of the code are causing the most problems according to the count of logical reads for each query in the code.
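
    Once the runs have completed, a per-statement breakdown can also be pulled from sys.dm_exec_query_stats. A rough sketch (the procedure name is a made-up placeholder, and this has to be run in the procedure's own database):

    SELECT SUBSTRING(st.text, (qs.statement_start_offset / 2) + 1,
               ((CASE qs.statement_end_offset
                     WHEN -1 THEN DATALENGTH(st.text)
                     ELSE qs.statement_end_offset
                 END - qs.statement_start_offset) / 2) + 1) AS statement_text,
           qs.execution_count,
           qs.total_logical_reads,
           qs.total_logical_reads / 128.0 AS total_logical_read_MB
    FROM sys.dm_exec_query_stats qs
    CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st
    WHERE st.objectid = OBJECT_ID('dbo.MySlowProc')   -- hypothetical procedure name
    ORDER BY qs.total_logical_reads DESC;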

    So the answers to the other two questions you asked...

    jignesh209 wrote:

    Is this the right way to calculate the size of logical reads?

    Should the size of logical reads be used to measure performance, or is the number of logical reads provided by the DMV enough?

    ... are ...

    Yes and Yes with only rare exceptions.

    P.S. Now that I read through the rest of the posts, I see that Sue and Scott replied pretty much the same and certainly to the same end. In the long run, I guess I'm just confirming that they and your DBA Team are correct. :blush:

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • frederico_fonseca wrote:

    He can supply it - just needs to use SQL Sentry Plan explorer to anonymize  it before posting here.

    The thing is... he doesn't actually need to supply it, because he didn't actually ask a question about how to make his code faster. He asked 3 main questions having to do with his DBA Team and their (correct) claims about logical reads and how they translate to 120 TB of memory IO.

    And, no... not just replying to you... it's for everyone.  You just had the last and one of the shortest posts about supplying an execution plan. 😀

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
