Declared Variable in WHERE clause weird behavior

  • I'm seeing a huge performance difference between two queries that are almost identical. The only SQL difference is in the WHERE clause:

    Query1: QueryID=@QID

    Query2: QueryID=ISNULL(@QID,A.QueryID)

    Full Query 1 takes 100x + longer to execute. @QID is never null in this case b/c is set at the beginning. I would expect a declared variable (int) would be quicker than a function wrapping that declared variable. The actual execution plan says Query2 is more intensive, but actual time to process tells a much different story. Can anyone shed light on this?

    Full Query1:

    DECLARE @days INT; SET @days=7;

    DECLARE @date1 DATETIME; SET @date1=getdate();

    DECLARE @date2 DATETIME; SET @date2=DATEADD(day,-@days,@date1);

    DECLARE @date3 DATETIME; SET @date3=DATEADD(day,-@days*2,@date1);

    DECLARE @QID INT; SET @QID = (SELECT id FROM Queries WHERE Query='abc');

    SELECT COUNT(*) as Visits, Query,

    (SELECT AVG(CAST(Pos AS DECIMAL)) FROM GooglePos WHERE QueryID=A.QueryID AND PageID=A.PageID AND date>=@date2) AS Pos,

    (SELECT AVG(CAST(Pos AS DECIMAL)) FROM GooglePos WHERE QueryID=A.QueryID AND PageID=A.PageID AND date>=@date3 AND date<@date2) AS Pos2

    FROM Visits A INNER JOIN Pages B ON A.PageID=B.id INNER JOIN Queries E ON E.id=A.QueryID

    WHERE date>=@date2 AND QueryID=@QID GROUP BY QueryID,Query,PageID;

    Full Query2:

    DECLARE @days INT; SET @days=7;

    DECLARE @date1 DATETIME; SET @date1=getdate();

    DECLARE @date2 DATETIME; SET @date2=DATEADD(day,-@days,@date1);

    DECLARE @date3 DATETIME; SET @date3=DATEADD(day,-@days*2,@date1);

    DECLARE @QID INT; SET @QID = (SELECT id FROM Queries WHERE Query='abc');

    SELECT COUNT(*) as Visits, Query,

    (SELECT AVG(CAST(Pos AS DECIMAL)) FROM GooglePos WHERE QueryID=A.QueryID AND PageID=A.PageID AND date>=@date2) AS Pos,

    (SELECT AVG(CAST(Pos AS DECIMAL)) FROM GooglePos WHERE QueryID=A.QueryID AND PageID=A.PageID AND date>=@date3 AND date<@date2) AS Pos2

    FROM Visits A INNER JOIN Pages B ON A.PageID=B.id INNER JOIN Queries E ON E.id=A.QueryID

    WHERE date>=@date2 AND QueryID=ISNULL(@QID,A.QueryID) GROUP BY QueryID,Query,PageID;

  • Could use the DDL for the tables, index definitions, sample data, expected results, and of course the actual execution plans for the two queries.

    Other than that, all you may get are shots in the dark.

  • duane+sql (2/15/2013)


    I'm seeing a huge performance difference between two queries that are almost identical. The only SQL difference is in the WHERE clause:

    Query1: QueryID=@QID

    Query2: QueryID=ISNULL(@QID,A.QueryID)

    Full Query 1 takes 100x + longer to execute. @QID is never null in this case b/c is set at the beginning. I would expect a declared variable (int) would be quicker than a function wrapping that declared variable. The actual execution plan says Query2 is more intensive, but actual time to process tells a much different story. Can anyone shed light on this?

    Full Query1:

    DECLARE @days INT; SET @days=7; DECLARE @date1 DATETIME; SET @date1=getdate(); DECLARE @date2 DATETIME; SET @date2=DATEADD(day,-@days,@date1); DECLARE @date3 DATETIME; SET @date3=DATEADD(day,-@days*2,@date1); DECLARE @QID INT; SET @QID = (SELECT id FROM Queries WHERE Query='abc');

    SELECT COUNT(*) as Visits, Query, (SELECT AVG(CAST(Pos AS DECIMAL)) FROM GooglePos WHERE QueryID=A.QueryID AND PageID=A.PageID AND date>=@date2) AS Pos, (SELECT AVG(CAST(Pos AS DECIMAL)) FROM GooglePos WHERE QueryID=A.QueryID AND PageID=A.PageID AND date>=@date3 AND date<@date2) AS Pos2 FROM Visits A INNER JOIN Pages B ON A.PageID=B.id INNER JOIN Queries E ON E.id=A.QueryID WHERE date>=@date2 AND QueryID=@QID GROUP BY QueryID,Query,PageID;

    Full Query2:

    DECLARE @days INT; SET @days=7; DECLARE @date1 DATETIME; SET @date1=getdate(); DECLARE @date2 DATETIME; SET @date2=DATEADD(day,-@days,@date1); DECLARE @date3 DATETIME; SET @date3=DATEADD(day,-@days*2,@date1); DECLARE @QID INT; SET @QID = (SELECT id FROM Queries WHERE Query='abc');

    SELECT COUNT(*) as Visits, Query, (SELECT AVG(CAST(Pos AS DECIMAL)) FROM GooglePos WHERE QueryID=A.QueryID AND PageID=A.PageID AND date>=@date2) AS Pos, (SELECT AVG(CAST(Pos AS DECIMAL)) FROM GooglePos WHERE QueryID=A.QueryID AND PageID=A.PageID AND date>=@date3 AND date<@date2) AS Pos2 FROM Visits A INNER JOIN Pages B ON A.PageID=B.id INNER JOIN Queries E ON E.id=A.QueryID WHERE date>=@date2 AND QueryID=ISNULL(@QID,A.VisitID) GROUP BY QueryID,Query,PageID;

    The second query isn't SARGable which means it's not capable of using an INDEX SEEK because you have a column inside of a function.

    Test and see. This makes a million rows of data with a clustered index on the only column. It takes scant seconds to run so don't let the big number scare you.

    --===== Create a test table and populate it on the fly.

    SELECT TOP 1000000

    QueryID = IDENTITY(INT,1,1)

    INTO #TestTable

    FROM sys.all_columns ac1

    CROSS JOIN sys.all_columns ac2

    ;

    --===== Add the expected index.

    ALTER TABLE #TestTable

    ADD PRIMARY KEY CLUSTERED (QueryID)

    ;

    Now, turn on the Actual Execution plan and let the following code rip. I've added another bit of code to show what else you don't want to do.

    --===== Before you run this section, turn on the Actual Execution Plan.

    -- Then run it and see the difference on the message tab and the AEP.

    DECLARE @QID INT;

    SELECT @QID = 999999;

    SET STATISTICS IO, TIME ON;

    SELECT QueryID FROM #TestTable WHERE QueryID=@QID;

    SELECT QueryID FROM #TestTable WHERE QueryID=ISNULL(@QID,QueryID);

    SELECT QueryID FROM #TestTable WHERE (@QID IS NULL OR QueryID = @QID);

    SET STATISTICS IO, TIME OFF;

    It sounds like you might be leaning toward a "catch all" query. Please see Gail Shaw's wonderful article on how to do such a thing correctly. Here's the link.

    http://sqlinthewild.co.za/index.php/2009/03/19/catch-all-queries/

    As a bit of a sidebar, never trust execution plan comparisons to tell you which code will be the fastest. Even Actual Execution Plans have a lot of estimated information in them that throw things like % of Batch way out of whack.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff, you are right. The second should take longer, but actually the 1st query takes MUCH longer in this case. I just don't understand why in this case.

    Here are stats from the actual queries:

    (1 row(s) affected)

    (199 row(s) affected)

    Table 'GooglePos'. Scan count 398, logical reads 58888876, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Pages'. Scan count 0, logical reads 2116, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Visits'. Scan count 1, logical reads 459642, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Queries'. Scan count 0, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    (1 row(s) affected)

    SQL Server Execution Times:

    CPU time = 88858 ms, elapsed time = 88961 ms.

    (199 row(s) affected)

    Table 'Pages'. Scan count 17, logical reads 31, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Visits'. Scan count 1, logical reads 93, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Queries'. Scan count 17, logical reads 933, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'GooglePos'. Scan count 2, logical reads 15, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    (1 row(s) affected)

    SQL Server Execution Times:

    CPU time = 127 ms, elapsed time = 177 ms.

  • Recap:

    Query 1 has in the WHERE clause: "QueryID=@QID". As we can see from the statistics Query 1 does a ton more logical reads for the subqueries contain in the full SQL. Why would Query 1 do these logical reads while Query 2 does not? Query 2 has in the WHERE clause: "QueryID=ISNULL(@QID,A.QueryID)". This is the only difference between Query 1 and Query 2.

    The GooglePos table is as follows:

    id int

    Date smalldatetime

    Pos int

    PageID int

    QueryID int

  • Please post the DDL for the table(s) including index definitions. Also, please post the actual execution plan for both of the queries. These can be saved as .sqlplan files and uploaded to ssc.

  • DDL w/ index

    USE [MaxF]

    GO

    /****** Object: Table [dbo].[Visits] Script Date: 02/22/2013 17:25:41 ******/

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    SET ANSI_PADDING ON

    GO

    CREATE TABLE [dbo].[Visits](

    [id] [bigint] NOT NULL,

    [Date] [smalldatetime] NULL,

    [SiID] [int] NULL,

    [PageID] [int] NULL,

    [QueryID] [int] NULL,

    [RefID] [int] NULL,

    [Referrer] [varchar](256) NULL,

    127.0.0.1 [varchar](15) NULL,

    [AffLinkID] [int] NULL,

    [Cchk] [bit] NULL,

    [Rchk] [bit] NULL,

    [CampID] [int] NULL,

    CONSTRAINT [PK_Visits] PRIMARY KEY CLUSTERED

    (

    [id] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    ) ON [PRIMARY]

    GO

    SET ANSI_PADDING OFF

    GO

    CREATE NONCLUSTERED INDEX [Date_nc] ON [dbo].[Visits]

    (

    [Date] ASC

    )

    INCLUDE ( [PageID],

    [QueryID],

    [RefID],

    [AffLinkID],

    [Cchk],

    [Rchk],

    [CampID],

    [Referrer],

    127.0.0.1) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    GO

    CREATE NONCLUSTERED INDEX [PageID_nc] ON [dbo].[Visits]

    (

    [PageID] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    GO

    CREATE NONCLUSTERED INDEX [QueryID_nc] ON [dbo].[Visits]

    (

    [QueryID] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    GO

    CREATE NONCLUSTERED INDEX [RefID_nc] ON [dbo].[Visits]

    (

    [RefID] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    GO

    CREATE NONCLUSTERED INDEX [SiID-QID-Date] ON [dbo].[Visits]

    (

    [SiID] ASC,

    [QueryID] ASC,

    [Date] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    GO

    /****** Object: Table [dbo].[Queries] Script Date: 02/22/2013 17:25:41 ******/

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    CREATE TABLE [dbo].[Queries](

    [id] [int] IDENTITY(1,1) NOT NULL,

    [Query] [nvarchar](100) NULL,

    CONSTRAINT [PK_Queries] PRIMARY KEY CLUSTERED

    (

    [id] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    ) ON [PRIMARY]

    GO

    CREATE NONCLUSTERED INDEX [Query_nc] ON [dbo].[Queries]

    (

    [Query] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    GO

    /****** Object: Table [dbo].[GooglePos] Script Date: 02/22/2013 17:25:41 ******/

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    CREATE TABLE [dbo].[GooglePos](

    [id] [int] IDENTITY(1,1) NOT NULL,

    [Date] [smalldatetime] NULL,

    [Pos] [int] NULL,

    [PageID] [int] NULL,

    [QueryID] [int] NULL,

    CONSTRAINT [PK_GooglePos] PRIMARY KEY CLUSTERED

    (

    [id] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    ) ON [PRIMARY]

    GO

    CREATE NONCLUSTERED INDEX [Date_nc] ON [dbo].[GooglePos]

    (

    [Date] ASC

    )

    INCLUDE ( [Pos],

    [PageID],

    [QueryID]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    GO

    CREATE NONCLUSTERED INDEX [PageID_nc] ON [dbo].[GooglePos]

    (

    [PageID] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    GO

    CREATE NONCLUSTERED INDEX [QueryID_nc] ON [dbo].[GooglePos]

    (

    [QueryID] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    GO

  • In the first execution plan on the Operator Index Seek QueryId_nc you can see that the Estimated Number of Rows is about 4 and Actual Number of Rows is 34Mio! SQL Server Optimizer has choosen an Index Seek followed by Lookup because it expected only 4 rows to be returned and that's the reason why you have a lot of logical reads.

    In the second case ISNULL operator disallowed the optimizer to choosing Index Seek and in this case this desicion was better.

    The reason why the optimizer had a bad estimation in first case is usage of local variable. When you use local variables the optimizerhas to generate the plan for an unknown value and the real value has been evaluated at the run-time and this is too late for the exeution plan. The plan has been already created.

    You can use OPTION (RECOMPILE) hint at the end of the first query to force generating the plan at the statement level which allows the optimizer a better estimation and finally to generate an optimal execution plan.

    ___________________________
    Do Not Optimize for Exceptions!

  • Thank you so much for the explanation!

  • Parameter sniffing, or lack thereof: http://sqlinthewild.co.za/index.php/2008/02/25/parameter-sniffing-pt-2/

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply