Query performance changes depending on the date value in the WHERE condition

  • Hi all

    I have an issue with a query in SQL Server 2012:

    SELECT TOP 1000 firmas.TipoPers, firmas.CodPers, GRUPO1.Limite, GRUPO1.Disponible
    FROM (
        SELECT TipoPers, CodPers, Num_Grupo
        FROM TABLA1
        WHERE FECDATOMES = CAST(CONVERT(varchar(8), '20160531') AS datetime)
    ) firmas
    LEFT JOIN (
        SELECT grupo, SUM(limite) AS Limite, SUM(disp) AS Disponible
        FROM TABLA2
        WHERE FECDATOMES = CAST(CONVERT(varchar(8), '20160531') AS datetime)
        GROUP BY grupo
    ) GRUPO1
        ON firmas.Num_Grupo = GRUPO1.grupo

    It is a production environment and I'm only allowed to see these stats:

    SQL Server parse and compile time:
       CPU time = 31 ms, elapsed time = 40 ms.

    (1000 rows affected)
    Table 'TABLA2'. Scan count 9, logical reads 2151, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'TABLA1'. Scan count 9, logical reads 186, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:
       CPU time = 249 ms, elapsed time = 31 ms.
    SQL Server parse and compile time:
       CPU time = 0 ms, elapsed time = 0 ms.

    It works fine, but when I run the query with a date value of 20160630 the stats get much worse:

    SQL Server parse and compile time:
       CPU time = 0 ms, elapsed time = 6 ms.

    (1000 rows affected)
    Table 'TABLA2'. Scan count 1000, logical reads 1916000, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'TABLA1'. Scan count 1, logical reads 32, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:
       CPU time = 321141 ms, elapsed time = 412712 ms.
    SQL Server parse and compile time:
       CPU time = 0 ms, elapsed time = 0 ms.
    SQL Server Execution Times:
       CPU time = 0 ms, elapsed time = 0 ms.

    I have to use TOP 1000 because in the second case the query never finishes. Without the TOP, the first case finishes in 10 seconds at most and returns 400,000 rows.

    Regards

  • It seems that you could have a data issue. The amount of data is not similar on both dates.

    The TOP 1000 is not helping because it needs to group the data before limiting the results.

    Be sure to have adequate indexes. Post your table and index definitions to get suggestions. If possible, post the actual execution plan. For information on how to do that, read this: http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
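    If you can't run the query with the actual plan enabled, you can usually still capture an estimated plan without executing anything. This needs SHOWPLAN permission on the database, so it may or may not be available to you:

    SET SHOWPLAN_XML ON;
    GO
    -- the next batch is compiled but NOT executed;
    -- the estimated plan is returned as an XML result
    SELECT ...; -- your query here
    GO
    SET SHOWPLAN_XML OFF;
    GO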

    By the way, as long as the FECDATOMES column is a datetime, a yyyymmdd string literal converts implicitly, so the CAST/CONVERT wrapper is redundant and this query will be interpreted the same way as yours:

    SELECT TOP 1000
        firmas.TipoPers,
        firmas.CodPers,
        GRUPO1.Limite,
        GRUPO1.Disponible
    FROM TABLA1 AS firmas
    LEFT JOIN (
        SELECT grupo,
            SUM(limite) AS Limite,
            SUM(disp) AS Disponible
        FROM TABLA2
        WHERE FECDATOMES = '20160531'
        GROUP BY grupo
    ) GRUPO1
        ON firmas.Num_Grupo = GRUPO1.grupo
    WHERE FECDATOMES = '20160531';

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Thanks for your answer.

    I checked the distribution of rows by the FECDATOMES field before opening this post, and it is very similar for all dates.

    About the TOP 1000: I used it just to get some result, because with the second date the query returns only 5,000 rows in an hour.

    The tables have a clustered PK and no other indexes. I'm wondering whether it could be a hardware issue; is that possible?

    I can't post the execution plan because I don't have the rights to run it. Sorry.

  • Luis Cazares (7/7/2016)


    ...The TOP 1000 is not helping because it needs to group the data before limiting the results....

    That was my first thought too, Luis, so I set up a test harness to check it out - and what I find in every case I've tried is that the TOP is applied before the subquery, hence limiting the number of rows aggregated. This holds even when useful indexes are dropped (I can't drop them all except the CI, as per the OP). I reckon it's vastly different rowcounts on those dates.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • ChrisM@Work (7/7/2016)


    Luis Cazares (7/7/2016)


    ...The TOP 1000 is not helping because it needs to group the data before limiting the results....

    That was my first thought too, Luis, so I set up a test harness to check it out - and what I find in every case I've tried is that the TOP is applied before the subquery, hence limiting the number of rows aggregated. This holds even when useful indexes are dropped (I can't drop them all except the CI, as per the OP). I reckon it's vastly different rowcounts on those dates.

    Can you share the test harness? I ran these queries on AdventureWorks and the TOP was the last operation applied to the query. Only when used without an aggregate function does it reduce the number of reads.

    SET STATISTICS IO ON;

    -- aggregate, no TOP
    SELECT OrderDate,
        SUM(SubTotal) SubTotal,
        SUM(TotalDue) TotalDue
    FROM Sales.SalesOrderHeader
    GROUP BY OrderDate;

    -- aggregate with TOP
    SELECT TOP 100 OrderDate,
        SUM(SubTotal) SubTotal,
        SUM(TotalDue) TotalDue
    FROM Sales.SalesOrderHeader
    GROUP BY OrderDate;

    -- TOP without an aggregate
    SELECT TOP 100 OrderDate,
        SubTotal,
        TotalDue
    FROM Sales.SalesOrderHeader;

    -- TOP over a join to an aggregated derived table
    SELECT TOP 5
        T.Name,
        T.[Group],
        S.SubTotal,
        S.TotalDue
    FROM Sales.SalesTerritory AS T
    LEFT JOIN (
        SELECT TerritoryID,
            SUM(SubTotal) AS SubTotal,
            SUM(TotalDue) AS TotalDue
        FROM Sales.SalesOrderHeader
        WHERE OrderDate = '20050901'
        GROUP BY TerritoryID
    ) S
        ON T.TerritoryID = S.TerritoryID;

    SET STATISTICS IO OFF;

  • This was using local data, but I could share actual plans with you privately?


  • After a bit more fiddling with this, I can get stable and predictable behaviour using an index tailored for my specific query. Adjusting it for the OP's environment yields this:

    CREATE INDEX ix_Test ON TABLA2 (grupo, FECDATOMES) INCLUDE (limite, disp)

    - which in every case I've tried now, using wildly different rowcounts in both tables, has worked, i.e. the aggregate query is filtered by the outer TOP-limited query.
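    For what it's worth, the same per-group aggregation can be written explicitly with OUTER APPLY, which pairs naturally with that index. A sketch only - the aliases are mine and it's untested against the OP's data:

    SELECT TOP 1000 f.TipoPers, f.CodPers, g.Limite, g.Disponible
    FROM TABLA1 AS f
    OUTER APPLY (
        -- aggregate only this row's group; with ix_Test in place
        -- this should be satisfied by a seek on (grupo, FECDATOMES)
        SELECT SUM(t2.limite) AS Limite, SUM(t2.disp) AS Disponible
        FROM TABLA2 AS t2
        WHERE t2.grupo = f.Num_Grupo
          AND t2.FECDATOMES = '20160531'
    ) AS g
    WHERE f.FECDATOMES = '20160531';

    Like the LEFT JOIN, OUTER APPLY keeps every row from TABLA1, returning NULL sums where a group has no rows in TABLA2.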


  • Nice Chris, I hope that this helps Eliseo.

    I think I understand what you're saying about the TOP filtering before the aggregate, even if the operator order in the execution plan suggests otherwise. However, the reads indicate that the TOP isn't being applied. I would appreciate it if you could share the plans with me.

  • Luis Cazares (7/7/2016)


    Nice Chris, I hope that this helps Eliseo.

    I think I understand what you're saying about the TOP filtering before the aggregate, even if the operator order in the execution plan suggests otherwise. However, the reads indicate that the TOP isn't being applied. I would appreciate it if you could share the plans with me.

    Better still, here's a test harness.

    -- Run up a table with a decent number of rows:
    --   31 distinct dates
    --   10 distinct types, 11 to 20
    --   911 rows per date/type combination
    IF OBJECT_ID('tempdb..#Trans') IS NOT NULL DROP TABLE #Trans;

    WITH _Tally AS (
        SELECT n = ROW_NUMBER() OVER(ORDER BY (SELECT NULL))
        FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) d (n),
             (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) e (n),
             (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) f (n)
    ),
    Dates AS (SELECT TOP(31) [date] = DATEADD(DAY,n-1,'20160101') FROM _Tally),
    _Types AS (SELECT TOP(10) TypeID = n+10 FROM _Tally),
    Iterations AS (SELECT TOP(911) n FROM _Tally)
    SELECT
        ID = IDENTITY(int,1,1),
        d.[Date],
        t.TypeID
    INTO #Trans
    FROM Dates d CROSS JOIN _Types t CROSS JOIN Iterations i
    ORDER BY d.[Date], t.TypeID;

    CREATE UNIQUE CLUSTERED INDEX ucx_ID ON #Trans (ID);
    CREATE INDEX ix_Helper ON #Trans (TypeID, [Date]);

    -- 100 rows, TypeID = 1 to 100
    IF OBJECT_ID('tempdb..#TypeID') IS NOT NULL DROP TABLE #TypeID;

    SELECT TOP 100 TypeID = ROW_NUMBER() OVER(ORDER BY (SELECT NULL))
    INTO #TypeID
    FROM #Trans;

    CREATE UNIQUE CLUSTERED INDEX ucx_TypeID ON #TypeID (TypeID);

    -----------------------------------------------------------------------
    -- Run the query: the LOOP join hint forces nested loops,
    -- so the aggregate is evaluated per outer row
    SELECT TOP(11) *
    FROM #TypeID t
    LEFT LOOP JOIN (
        SELECT TypeID, cnt = COUNT(*)
        FROM #trans
        WHERE [date] = '20160116'
        GROUP BY TypeID
    ) d ON d.TypeID = t.TypeID;
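    For contrast (my sketch, not part of the original harness), swapping the hint to HASH should show the opposite behaviour - the aggregate is computed for every group before the TOP gets a say, which is the pattern Luis described:

    SELECT TOP(11) *
    FROM #TypeID t
    LEFT HASH JOIN (
        SELECT TypeID, cnt = COUNT(*)
        FROM #trans
        WHERE [date] = '20160116'
        GROUP BY TypeID
    ) d ON d.TypeID = t.TypeID;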

