Query returns 122 million records in 9 hours

  • We have a very silly weekly process that does a full load of 5 years of data
    into a QlikView application (BI reporting tool).
    We return 305 fields from 2 tables (150 and 140 million records).

    Why we do it is a different story, and I tried my best to convince management to stop this insanity.
    So far, no luck on that front.
    So it is what it is.

    Question:
    Should I try to tune the query and indexes to make it at least a little faster?
    Or, with a 122-million-row result set, is that a waste of time?
    I see huge memory pressure during the query run. We only have 40 GB of memory in the UAT environment;
    in Production it's 80 GB. Right now it takes 9.5 hours to run this query.

  • RVO - Monday, September 11, 2017 11:27 AM

    Please post the DDL for both tables, as well as the details of any indexes present on them. If you've captured a query plan, it would also be helpful to post that. I don't think you'll be wasting your time trying to tune it, but it still may be a lengthy query, depending on what the cause of the problem is.

    Steve (aka sgmunson) πŸ™‚ πŸ™‚ πŸ™‚
    Rent Servers for Income (picks and shovels strategy)

  • DDL attached.
    SP code attached.
    VIEW source code that is called from SP attached.
    Skeleton is:

    SELECT
        305 COLUMNS
    FROM
        v_qv_Transformations  --VIEW source code attached
    WHERE
        (TRANS_DATE >= @StartDate)
        AND
        (
            (t_ROW_VERSION_CSI <= @trans_UB)
            AND
            (s_ROW_VERSION_CSI <= @split_UB)
        )

    Indexes DDL attached

  • RVO - Monday, September 11, 2017 11:27 AM

    I don't know that we need the query just yet. What you need to do is figure out whether it is the query itself or the INSERTING OF DATA that is crushing your process. Do a wait stats and file IO stall analysis while the process is running to see where the pain really is.
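
    For what it's worth, a minimal sketch of the wait-stats side of that check, using the standard DMV (take a snapshot before and after the run and compare the numbers; the excluded wait types are only illustrative):

    -- Snapshot cumulative waits; diff a "before" and "after" capture around the load
    SELECT TOP (20)
           wait_type,
           waiting_tasks_count,
           wait_time_ms,
           signal_wait_time_ms
    FROM sys.dm_os_wait_stats
    WHERE wait_type NOT LIKE 'SLEEP%'      -- trim obvious idle waits; filter is not exhaustive
      AND wait_type NOT LIKE 'BROKER%'
    ORDER BY wait_time_ms DESC;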

    BTW, are you getting minimally logged inserts into your target table(s)? If not that must be the first thing to make happen IMHO. Also, if you have indexes on the target table it will almost certainly be faster to drop them and recreate them after the inserts are complete.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • ".....BTW, are you getting minimally logged inserts into your target table(s)? If not that must be the first thing to make happen IMHO. Also, if you have indexes on the target table it will almost certainly be faster to drop them and recreate them after the inserts are complete.

    There is no INSERT anywhere on SQL Server.
    This is the process . . .

    Qlikview Application calls SP
    -----SP returns 122 million records
    ---------Qlikview loads all output into memory (they have 500 GB memory)

  • Going to need to see an actual execution plan for this. Not a fan of the timestamp data type, but I'm not sure whether that's really a problem or not. There are also some CASE statements in that view that might be better as permanent tables. I'm also thinking you might want a filtered index on CSI_TRANSACTION on the PRODUCT_ID field WHERE PRODUCT_ID = 321, and another filtered index on CSI_TRANSACTION on the SRC_SYSTEM_ID field WHERE SRC_SYSTEM_ID = 160.

    Need to know where the optimizer is returning large row estimates but ultimately delivering small row counts.
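
    Roughly what those filtered indexes would look like, as a sketch (index names are placeholders, the schema is assumed to be dbo, and any INCLUDE columns would depend on the query):

    CREATE NONCLUSTERED INDEX IX_CSI_TRANSACTION_PRODUCT_321   -- placeholder name
        ON dbo.CSI_TRANSACTION (PRODUCT_ID)
        WHERE PRODUCT_ID = 321;

    CREATE NONCLUSTERED INDEX IX_CSI_TRANSACTION_SRCSYS_160    -- placeholder name
        ON dbo.CSI_TRANSACTION (SRC_SYSTEM_ID)
        WHERE SRC_SYSTEM_ID = 160;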

    Steve (aka sgmunson) πŸ™‚ πŸ™‚ πŸ™‚
    Rent Servers for Income (picks and shovels strategy)

  • And I agree with Kevin on the minimally logged inserts, given that 120-million-row expectation. Also, if you have to insert that many rows, could you possibly lock that table if the insert is fast enough? I also agree with dropping and re-creating the indexes on the target table.

    Steve (aka sgmunson) πŸ™‚ πŸ™‚ πŸ™‚
    Rent Servers for Income (picks and shovels strategy)

  • RVO - Monday, September 11, 2017 1:16 PM

    ".....BTW, are you getting minimally logged inserts into your target table(s)? If not that must be the first thing to make happen IMHO. Also, if you have indexes on the target table it will almost certainly be faster to drop them and recreate them after the inserts are complete.

    There is no INSERT anywhere on SQL Server.
    This is the process . . .

    Qlikview Application calls SP
    -----SP returns 122 million records
    ---------Qlikview loads all output into memory (they have 500 GB memory)

    Ok, so no write issues. 

    Now we shift to sending 305 fields in 122 MILLION rows out of SQL Server to a consuming application. ASYNC_NETWORK_IO FOR REAL anyone?!? 😀 Do a wait stats analysis to check for that.

    sp_whoisactive can do a time delay and show you if this is an issue over a period of time. 

    You can look at the file IO stall DMV and see in aggregate if tempdb is getting hammered over time. You can also watch it during the run with code you can find online and with sp_whoisactive.
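
    A minimal sketch of that tempdb check against the file IO stall DMV (database_id 2 is tempdb; sample it during the run and compare the cumulative numbers):

    -- Cumulative IO stalls for the tempdb files
    SELECT DB_NAME(vfs.database_id) AS database_name,
           vfs.file_id,
           vfs.num_of_reads,
           vfs.io_stall_read_ms,
           vfs.num_of_writes,
           vfs.io_stall_write_ms
    FROM sys.dm_io_virtual_file_stats(2, NULL) AS vfs;   -- 2 = tempdb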

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Again guys...

    THERE IS NO INSERT

    Application calls this SELECT and loads the entire output into Memory.

    SQL Plan is attached, in both .SQLPLAN and .XML format.
    I had to change the file extension from .XML to .TXT.
    I also uploaded "Plan_Bottleneck.jpg". This is where all the bad things are happening:

    Key Lookup  on IX02_CSI_TRANSACTION (11% cost)
    SORT (33% cost)
    Index Seek on PK_CSI_TRANSACTION (23% cost)

    This is the plan I got from
    sys.dm_exec_query_plan
    after starting the SP execution.
    Not sure if it's accurate . . .
    (Can it change later? Or is it the final plan the SQL optimizer will use?)

    I noticed from sys.dm_exec_requests
    that I constantly get wait_type=CXPACKET and status=suspended.
    I also noticed that in the first few minutes PLE dropped to 21-42 seconds, then slowly started to climb.
    I attached the results from [sys.dm_os_waiting_tasks]:
    77 tasks waiting. 76 with WAIT_TYPE=CXPACKET. One with WAIT_TYPE=PAGEIOLATCH_SH.

    Now, 25 minutes after the start, PLE is at 352 seconds.
    Disk read latency is 7 ms; disk write latency is 3 ms.
    I also noticed the granted memory was high . . . or at least it looked like suspicious behavior.

    Results from [sys.dm_exec_query_memory_grants]:
    -----------------------------------------------------------------------------------
    session_id               108
    request_id               0
    scheduler_id             30
    dop                      4
    request_time             54:07.4
    grant_time               54:07.4
    requested_memory_kb      7669400
    granted_memory_kb        7669400
    required_memory_kb       30864
    used_memory_kb           6552504
    max_used_memory_kb       6552504
    query_cost               270603.3427
    timeout_sec              86400
    resource_semaphore_id    0
    queue_id                 NULL
    wait_order               NULL
    is_next_candidate        NULL
    wait_time_ms             NULL
    plan_handle              0x0500A1.....
    sql_handle               0x0300A10.....
    group_id                 2
    pool_id                  2
    is_small                 0
    ideal_memory_kb          556370256
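
    For reference, a sketch of the kind of DMV query that pulls the in-flight plan, current wait and memory grant for a running session (session_id 108 from the output above). Note that sys.dm_exec_query_plan returns the compiled plan the request is running with, so it shows no actual row counts until the run completes:

    SELECT r.session_id,
           r.status,
           r.wait_type,
           r.wait_time,
           mg.granted_memory_kb,
           mg.used_memory_kb,
           qp.query_plan
    FROM sys.dm_exec_requests AS r
    LEFT JOIN sys.dm_exec_query_memory_grants AS mg
           ON mg.session_id = r.session_id
    OUTER APPLY sys.dm_exec_query_plan(r.plan_handle) AS qp
    WHERE r.session_id = 108;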

  • Very unusual plan - so many rows from both sides, yet it's a nested loops join and then a key lookup too. I wonder how it would fare with a hash join? You could make that happen with OPTION (HASH JOIN).
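
    Against the skeleton posted earlier, the hint simply goes on the end of the statement (SELECT * is a stand-in for the 305-column list, and the variables are those used in the SP):

    SELECT  *   -- stand-in for the 305-column list
    FROM    v_qv_Transformations
    WHERE   (TRANS_DATE >= @StartDate)
            AND (t_ROW_VERSION_CSI <= @trans_UB)
            AND (s_ROW_VERSION_CSI <= @split_UB)
    OPTION  (HASH JOIN);   -- restricts every join in the query to a hash join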

    β€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • You could also try running it with OPTION(MAXDOP 1) and see if you get any difference.
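
    Same placement as the hash join hint above; multiple query hints can also be combined in a single OPTION clause if you want to test both together, e.g.:

    OPTION (HASH JOIN, MAXDOP 1);   -- hints are comma-separated within one OPTION clause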

  • So in addition to querying 120 million rows and 300+ columns, the underlying tables are buried within views. This is like the worst ETL scenario possible... but every corporation has a variation of this historical dump thing, and no one is allowed to question it, as if it were required by law or something. Given the number of columns returned, what you want in terms of indexing is, at a minimum, to ensure that the joins between the tables are covered by an index. Beyond that, no additional clever index is likely to make a difference.

    What I would do is try to turn this into an incremental load, so you at least won't be dumping the entire 5-year dataset every week. Try to identify a column, something like a date stamp or an incrementing ID, that can be used as the selective offset. Having an index on that column would also help.
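
    A rough sketch of that watermark idea, reusing the TRANS_DATE column already in the skeleton (the watermark table, its name, and the datetime type for TRANS_DATE are all assumptions here):

    -- Hypothetical watermark table remembering where the previous load stopped
    CREATE TABLE dbo.qv_LoadWatermark
    (
        load_name       varchar(50) NOT NULL PRIMARY KEY,
        last_trans_date datetime    NOT NULL
    );

    -- Incremental pull: only rows newer than the last successful load
    DECLARE @LastLoaded datetime;

    SELECT @LastLoaded = last_trans_date
    FROM   dbo.qv_LoadWatermark
    WHERE  load_name = 'qlikview_weekly';

    SELECT  *   -- stand-in for the 305-column list, as in the original skeleton
    FROM    v_qv_Transformations
    WHERE   TRANS_DATE > @LastLoaded;

    -- Afterwards, advance the watermark to the high-water mark just loaded
    UPDATE dbo.qv_LoadWatermark
    SET    last_trans_date = (SELECT MAX(TRANS_DATE) FROM v_qv_Transformations)
    WHERE  load_name = 'qlikview_weekly';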

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • This code is a pile of, well, I'll be nice and say it is suboptimal in the extreme. :hehe: 22 CONVERT_IMPLICITs, 100 other CONVERTs, a 3.4 MB query plan, a maximum estimated data size of 568 GIGABYTES!!!!, an estimated CPU cost of 23000!!!, and a SORT with an estimated IO cost of 88353!!. I could go on and on.

    The loops with huge numbers are easy to explain: there are 69 EstimateRows values in the plan, and the majority of them are around 1. The max I saw was 200111, which seems to be a tiny fraction of the size of the actual tables. PRIMARY OBJECTIVE: find and remove all causes of poor estimates. There are many potential causes of this.
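
    Stale statistics are only one of the many possible causes of poor estimates, but they are a cheap first check. A sketch against the one table name visible in the plan discussion, assuming a version that has sys.dm_db_stats_properties and a dbo schema (the same check would apply to the second table):

    -- When were the statistics last updated, and how many modifications since then?
    SELECT  s.name AS stats_name,
            sp.last_updated,
            sp.rows,
            sp.rows_sampled,
            sp.modification_counter
    FROM    sys.stats AS s
    CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
    WHERE   s.object_id = OBJECT_ID('dbo.CSI_TRANSACTION');

    -- If they are stale, a full-scan refresh is a reasonable first step
    UPDATE STATISTICS dbo.CSI_TRANSACTION WITH FULLSCAN;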

    The spool is likely a combination of low estimates and suboptimal indexing. That too is fixable.

    I'm not surprised at all that CXPACKET is the predominant wait type. My gut says trying MAXDOP 1 on this monster is quite silly, but it wouldn't be the strangest thing I have seen in SQL Server by a long shot if this beast happened to run better with that set. :w00t:

    I love the idea of making this an incremental data feed, although honestly that will likely be more trouble than finding and fixing the query performance issues, especially if QlikView can't be easily (or at all?) coaxed into accepting such a load mechanism.

    I would still like to know how much of the delay is due simply to the size of the data being transferred from SQL Server to QlikView.

    Whenever I see the same table hit over and over (which I see far too often at clients) it makes me cringe and immediately look for ways to combine those hits.

    NOTE: nothing about fixing the query or indexing is rocket science, nor insurmountable. It is simply a matter of identifying an issue, solving it, moving to the next issue; lather, rinse, repeat. The 40 GB of RAM could well be a SIGNIFICANT (and quite possibly insurmountable) obstacle, though. Massive data doesn't work with a paltry amount of RAM, even with SSDs in play. If that limitation can't be overcome, an iterative approach could actually be the most efficient way to cut the run time down.

    Also, depending on your knowledge and experience level, you may wish to engage a professional tuner for a day or three to help out.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service
