Queries are faster with Statistics Sampling rather than FULLSCAN?

  • I have a really odd issue that I don't understand. We have a 3rd party Excel add-in that talks to another 3rd party application (ERP system) that allows adhoc reporting in Excel for some of our "power users". Over the past couple of months since we've been using the add-in, I've had users report that it is taking a long time to pull all the data into the spreadsheets.

    Unfortunately, it appears that the way the Excel software works is by sending individual queries instead of building a set and returning that. So it will execute this one statement 2500 times. I can't change how the 3rd party software works. One of the queries that is executed looks like this:

    SELECT SUM(A.AMOUNTMST)

    FROM LEDGERTRANS A

    WHERE ((DATAAREAID IN (@P1) )

    AND (((((TRANSDATE>=@P2) AND (TRANSDATE<=@P3))

    AND (((ACCOUNTNUM=@P4) OR (ACCOUNTNUM=@P5)) OR (ACCOUNTNUM=@P6)))

    AND (DIMENSION=@P7)) AND (DIMENSION2_ LIKE @P8 ESCAPE '\' )))

    All 2500+ queries look similar to that, but the only thing that changes may be the number of items in the WHERE clause (there sometimes might only be one AccountNum and sometimes there may be 10+ AccountNum's).

    On that database, we update all STATISTICS with FULLSCAN daily. The LedgerTrans table in our environment has 11 million records.

    When the users report issues, the only thing I do is type "UPDATE STATISTICS LedgerTrans", which does a sampling. If I look at the details, it only sampled 137K rows, which is only ~1% of the rows.

    I managed to capture a SQL Profiler trace of the issue to try to compare execution plans, but I'm still confused. Before I do my "fix", the query formatted above had a duration of 2944 in SQL Profiler (which I assume is 3 seconds). After the fix, the duration is 4 for the same query. Attached are the two execution plans that I captured in SQL Profiler using the Showplan XML event.

    I don't understand how the simple Index Seek/Key Lookup execution plan is slower than the more complicated one with all of the scalars and the same Index Seek/Key Lookup.

    I don't think the system is under heavy load or blocking when the users report the slowness because it is 100% of the time (for the past 10+ times we've had this issue) resolved immediately by running that UPDATE STATISTICS command. I've use Remote Desktop Services shadowing and watched the add-in slowly progress through it's computations at a rate of 1+ minutes per 1 percent of activity, and in the middle of that worksheet refresh, I ran the UPDATE statistics command and it immediately started doing 1% of the total calculations every second.

  • Could you post actual execution plans (not estimated) and DDL (including indexes), please?

  • Statistics cannot be perfect, doesn't matter if you update it with fullscan or sample 1%. It should be better with more sample data, but it's not guaranteed. There are a lot of limitiations of the statistics, and sometimes the optimizer makes wrong assumtions based on this not perfect statistics, and this is what happens in your case.

    Deque (6/12/2013)


    I don't understand how the simple Index Seek/Key Lookup execution plan is slower than the more complicated one with all of the scalars and the same Index Seek/Key Lookup.

    The smaller picture doesn't mean that the plan is better. Learn to read execution plans if you want to understand what's really going on. The scalars cost nothing. Index Seek/Key Lookup are not the same - look at the details not the icons.


    Alex Suprun

  • I do agree with Alexander.

    Performance statistics are as good as the execution plan they help to select.

    In my experience, in most cases, the more accurate performance statistics are the better the chances of a really good execution plan to be chosen but, I have seen exceptions to that rule.

    I remember loading "fake" statistics to a particular table in order to ensure the right execution plan to be choosen.

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
  • HowardW (6/12/2013)


    Could you post actual execution plans (not estimated) and DDL (including indexes), please?

    I will have to wait until the issue happens again to get the actual execution plans. I was unaware that SQL Profiler only captures Estimated Execution Plans with the "SHOWPLAN XML" event. It looks as though the "SHOWPLAN XML Statistics" event does capture the Actual Execution Plans, so I'll give that a try when it happens again.

    As for the DDL:

    CREATE TABLE [dbo].[LEDGERTRANS](

    [ACCOUNTNUM] [nvarchar](20) NOT NULL,

    [TRANSDATE] [datetime] NOT NULL,

    [VOUCHER] [nvarchar](20) NOT NULL,

    [TXT] [nvarchar](60) NOT NULL,

    [AMOUNTMST] [numeric](28, 12) NOT NULL,

    [AMOUNTCUR] [numeric](28, 12) NOT NULL,

    [CURRENCYCODE] [nvarchar](3) NOT NULL,

    [TRANSTYPE] [int] NOT NULL,

    [DIMENSION] [nvarchar](25) NOT NULL,

    [DIMENSION2_] [nvarchar](25) NOT NULL,

    [DIMENSION3_] [nvarchar](25) NOT NULL,

    [QTY] [numeric](28, 12) NOT NULL,

    [DEL_PURCHLEDGERID] [bigint] NOT NULL,

    [DOCUMENTDATE] [datetime] NOT NULL,

    [JOURNALNUM] [nvarchar](10) NOT NULL,

    [JOURNALIZESEQNUM] [int] NOT NULL,

    [ALLOCATELEVEL] [int] NOT NULL,

    [POSTING] [int] NOT NULL,

    [CORRECT] [int] NOT NULL,

    [CREDITING] [int] NOT NULL,

    [DOCUMENTNUM] [nvarchar](20) NOT NULL,

    [PAYMREFERENCE] [nvarchar](20) NOT NULL,

    [PERIODCODE] [int] NOT NULL,

    [OPERATIONSTAX] [int] NOT NULL,

    [THIRDPARTYBANKACCOUNTID] [nvarchar](10) NOT NULL,

    [COMPANYBANKACCOUNTID] [nvarchar](10) NOT NULL,

    [PAYMMODE] [nvarchar](10) NOT NULL,

    [JOURNALIZENUM] [nvarchar](20) NOT NULL,

    [AMOUNTMSTSECOND] [numeric](28, 12) NOT NULL,

    [EUROTRIANGULATION] [int] NOT NULL,

    [FURTHERPOSTINGTYPE] [int] NOT NULL,

    [LEDGERPOSTINGJOURNALID] [nvarchar](10) NOT NULL,

    [TAXREFID] [int] NOT NULL,

    [DEL_LEDGERTRANSREPORTTYPE] [int] NOT NULL,

    [DEL_VOUCHERSEQUENCECODE] [nvarchar](10) NOT NULL,

    [DEL_LEDGERPOSTINGJOURNALREGI36] [nvarchar](10) NOT NULL,

    [CREATEDTRANSACTIONID] [bigint] NOT NULL,

    [DATAAREAID] [nvarchar](4) NOT NULL,

    [RECVERSION] [int] NOT NULL,

    [RECID] [bigint] NOT NULL,

    [CONSOLIDATEDCOMPANY] [nvarchar](4) NOT NULL,

    [REASONREFRECID] [bigint] NOT NULL,

    [MODIFIEDDATETIME] [datetime] NOT NULL,

    [MODIFIEDBY] [nvarchar](5) NOT NULL,

    [CREATEDDATETIME] [datetime] NOT NULL,

    [DEL_CREATEDTIME] [int] NOT NULL,

    [CREATEDBY] [nvarchar](5) NOT NULL,

    CONSTRAINT [I_225RECID] PRIMARY KEY CLUSTERED

    (

    [DATAAREAID] ASC,

    [RECID] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    ) ON [PRIMARY]

    GO

    CREATE NONCLUSTERED INDEX [I_225ACDATE] ON [dbo].[LEDGERTRANS]

    (

    [DATAAREAID] ASC,

    [ACCOUNTNUM] ASC,

    [TRANSDATE] ASC,

    [VOUCHER] ASC,

    [AMOUNTMST] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    GO

    CREATE NONCLUSTERED INDEX [I_225DATEIDX] ON [dbo].[LEDGERTRANS]

    (

    [DATAAREAID] ASC,

    [TRANSDATE] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    GO

    CREATE NONCLUSTERED INDEX [I_225FURTHERPOSTINGTYPE] ON [dbo].[LEDGERTRANS]

    (

    [DATAAREAID] ASC,

    [FURTHERPOSTINGTYPE] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    GO

    CREATE NONCLUSTERED INDEX [I_225JOURNALIZEIDX] ON [dbo].[LEDGERTRANS]

    (

    [DATAAREAID] ASC,

    [JOURNALIZESEQNUM] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    GO

    CREATE NONCLUSTERED INDEX [I_225POSTINGJOURNAIDXL] ON [dbo].[LEDGERTRANS]

    (

    [DATAAREAID] ASC,

    [LEDGERPOSTINGJOURNALID] ASC,

    [VOUCHER] ASC,

    [TRANSDATE] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    GO

    CREATE NONCLUSTERED INDEX [I_225TRANSACTIONLOGIDX] ON [dbo].[LEDGERTRANS]

    (

    [DATAAREAID] ASC,

    [CREATEDTRANSACTIONID] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    GO

    CREATE NONCLUSTERED INDEX [I_225VOUCHERDATEIDX] ON [dbo].[LEDGERTRANS]

    (

    [DATAAREAID] ASC,

    [VOUCHER] ASC,

    [TRANSDATE] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    GO

  • Alexander Suprun (6/12/2013)


    The smaller picture doesn't mean that the plan is better. Learn to read execution plans if you want to understand what's really going on. The scalars cost nothing. Index Seek/Key Lookup are not the same - look at the details not the icons.

    I mistakenly assumed that the seek was happening on the same index, but as you already know, they're not.

    I'll see if I can do some reading and figure this out. I'll also see if I can get the actual execution plan when it happens again, because I assume that would be the correct execution plan to troubleshoot.

  • Probably parameter sniffing of some kind and the FULLSCAN is invalidating the poorly performing plan and forcing a recompile (invalidating the existing plan without updating statistics may well have performed the same trick).

    Without some actual statistics to see where the estimates are out and without being able to modify the queries, you could try to make the optimiser's life simpler (and the plan more stable) by making a covering index with the combined predicate that it's flipping between e.g.:

    CREATE NONCLUSTERED INDEX [I_XXXXXXXXXXXXXX] ON [dbo].[LEDGERTRANS]

    (

    [DATAAREAID] ASC,

    [ACCOUNTNUM] ASC,

    [TRANSDATE] ASC

    )

    INCLUDE (DIMENSION,DIMENSION2_,AMOUNTMST)

    WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    GO

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply