Cardinality estimator [CE] (SQL2014) behavior

  • Dears,

    This behavior :

    • A) Declare variable : integer, fill it up using query
    • B) Query : use previous integer to filter on (where clause)

    Using the old CE  : sub second performance

    Using the new CE : 1 hour! to complete the query (estimations are wrong)

    How to fix?

    using the old CE (forced via traceflag option)

    or replacing the variable in B) by the query in A)

    Is this expected?

    Thanks,

    Tim

  • Yes, sometimes the old CE works better than the new.  A third option is to rewrite the query.  Are you in a position to post it so that we can help you with that?

    John

  • Some more investigation... it seems the translation of the variable goes wrong?

    *************************************************************************************

    ** Engine 14.0.3164.1

    ** Auto statistics on

    ** Update with fullscan performed

    ** Compat level 140

     

    -----------------------------------

    -- CASE 1 : ORIGINAL QUERY / NEW CE

    -----------------------------------

    DECLARE @CustomerId INT

     

    SET @CustomerId = (SELECT DISTINCT CustomerId FROM Import.VehicleFile WHERE ImportId='BDE0ACCC-2946-4433-A8DD-74C9971C8340')

    PRINT @CustomerId

     

     

    SELECT

    MA.ExternalReference, MA.Id

    FROM

    [Master].[Address] MA

    LEFT JOIN

    [Master].[Contacts] MC

    ON

    MC.AddressId = MA.Id

    LEFT JOIN

    [Master].[VehicleFile] DRIVER_MVF

    ON

    DRIVER_MVF.DriverContactId = MC.Id

    AND

    DRIVER_MVF.CustomerId = @CustomerId

    LEFT JOIN

    [Master].[VehicleFile] FLEETCONTACT_MVF

    ON

    FLEETCONTACT_MVF.FleetContactId = MC.Id

    AND

    FLEETCONTACT_MVF.CustomerId = @CustomerId

    WHERE

    DRIVER_MVF.Id IS NOT NULL

    AND

    FLEETCONTACT_MVF.Id IS NOT NULL

     

    -> Takes hours to complete.

     

     

    ----------------------------------

    -- CASE 2 : NEW CE / REWRITE QUERY

    ----------------------------------

    The variable is removed and replaced by subquery

     

    SELECT

    MA.ExternalReference, MA.Id

    FROM

    [Master].[Address] MA

    LEFT JOIN

    [Master].[Contacts] MC

    ON

    MC.AddressId = MA.Id

    LEFT JOIN

    [Master].[VehicleFile] DRIVER_MVF

    ON

    DRIVER_MVF.DriverContactId = MC.Id

    AND

    DRIVER_MVF.CustomerId = SELECT DISTINCT CustomerId FROM Import.VehicleFile WHERE ImportId='BDE0ACCC-2946-4433-A8DD-74C9971C8340'

    LEFT JOIN

    [Master].[VehicleFile] FLEETCONTACT_MVF

    ON

    FLEETCONTACT_MVF.FleetContactId = MC.Id

    AND

    FLEETCONTACT_MVF.CustomerId = SELECT DISTINCT CustomerId FROM Import.VehicleFile WHERE ImportId='BDE0ACCC-2946-4433-A8DD-74C9971C8340'

    WHERE

    DRIVER_MVF.Id IS NOT NULL

    AND

    FLEETCONTACT_MVF.Id IS NOT NULL

     

    -> Subsecond performance : OK!

     

     

    -----------------------------------------

    -- CASE 3 : ORIGINAL QUERY / FORCE OLD CE

    -----------------------------------------

    DECLARE @CustomerId INT

     

    SET @CustomerId = (SELECT DISTINCT CustomerId FROM Import.VehicleFile WHERE ImportId='BDE0ACCC-2946-4433-A8DD-74C9971C8340')

    PRINT @CustomerId

     

     

    SELECT

    MA.ExternalReference, MA.Id

    FROM

    [Master].[Address] MA

    LEFT JOIN

    [Master].[Contacts] MC

    ON

    MC.AddressId = MA.Id

    LEFT JOIN

    [Master].[VehicleFile] DRIVER_MVF

    ON

    DRIVER_MVF.DriverContactId = MC.Id

    AND

    DRIVER_MVF.CustomerId = @CustomerId

    LEFT JOIN

    [Master].[VehicleFile] FLEETCONTACT_MVF

    ON

    FLEETCONTACT_MVF.FleetContactId = MC.Id

    AND

    FLEETCONTACT_MVF.CustomerId = @CustomerId

    WHERE

    DRIVER_MVF.Id IS NOT NULL

    AND

    FLEETCONTACT_MVF.Id IS NOT NULL

    OPTION(QUERYTRACEON 9481)

     

    -> Subsecond performance : OK!

     

  • Without seeing the execution plans, I'm guessing.  What I imagine is happening is that the query optimizer does not know the value of the variable at compile time, and so it has to guess.  It looks as if it guesses badly!  In the case of the subquery, I think the optimizer must be able to rely on some statistics that it doesn't have access to with the variable.

    If you're using SQL Server 2016 or above (I'm confused by the combination of the title of the topic and the forum it's posted in) then you might consider using the Query Store to force a particular plan.  I recommend you also consider rewriting your query to change your LEFT JOINs to INNER JOINs in places where the outer table is referenced in the WHERE clause (as I mentioned in my reply to your private message).  It may or may not make a difference to performance - but it'll certainly make the query simpler and easier to understand.

    John

     

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply