Variables, Sargability, and Indexes

  • I have an SSIS package that pulls data from a datamart table down to a Dev environment for our reporting team. The table itself contains 640,143,067 rows and will continue to grow. We don't want to pull the entire table down (it took almost a week the last time we did this), just the current month's data. There is an INT column (ReportYearMonth) that I'm using to check for the most recent month's data. The structure of the values are YYYYMM, so 201502 for February 2015, 201501 for January 2015, etc.

    There are 7 indexes on the table. 4 of them reference ReportYearMonth as the first column in the index. I figured my query would grab one of those indexes and use it, but when I ran the package last night, it took over 8 hrs for one month's records to move across the environments. I know part of that is my PC (I ran it in BIDS and the servers are in an out-of-state DC), but I also know there has to be a faster way to do this. Especially when running a TOP 1000 version of the query in SSMS (including actual execution plan) tells me that it's doing a freaking table scan.

    So it's not using the indexes and I'm thinking "Could this be because variables are non-sargable?" If so, how do I get around this?

    I'm trying to avoid adding another index or redoing the 7 indexes on the table, because another team designed this and has reasons (I hope) for the indexes they have. Also, too many indexes = Bad Things In Database Land.

    DECLARE @YrMo INT,

    @YrMoStrg CHAR(6) = (SELECT CONVERT(CHAR(4),YEAR(DATEADD(m,-1,GETDATE())))

    + CASE WHEN MONTH(DATEADD(m,-1,GETDATE())) < 10 THEN '0'

    + CONVERT(CHAR(1),MONTH(DATEADD(m,-1,GETDATE())),101)

    ELSE CONVERT(CHAR(2),MONTH(DATEADD(m,-1,GETDATE())),101) END );

    SELECT @YrMo = CONVERT(INT,@YrMoStrg);

    SELECT TOP 1000 col1, col2, col3, col4, col5, ... colN

    FROM dbo.MyDatamartTable

    WHERE ReportYearMonth = @YrMo;

    Thoughts?

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • SARGable means 'can be used as a search argument', meaning it can be used for an index seek. For a predicate to be SARGable, it has to be:

    <Column> <Operator> <Expression> where expression can be constant, parameter, variable or function.

    Now, whether the indexes are useful, more efficient than a table scan and usable is entirely another matter and isn't answerable without seeing the indexes and preferably execution plan.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Brandie,

    A side note not related to the problem. What if you use the following? @YrMoStrg CHAR(6) = CONVERT(VARCHAR(6), DATEADD(m,-1,GETDATE()), 112)

  • Scrubbed table structure and execution plan attached. I have not included sample data yet because it will take a while for me to scrub it. Let me know if this is needed.

    USE [ReportDB]

    GO

    IF EXISTS (SELECT * FROM sys.objects

    WHERE object_id = OBJECT_ID(N'[dbo].[MyDataMartTable]') AND type in (N'U'))

    DROP TABLE [dbo].[MyDataMartTable]

    GO

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    SET ANSI_PADDING ON

    GO

    CREATE TABLE [dbo].[MyDataMartTable](

    [CID] [int] NOT NULL,

    [ProductCode] [varchar](10) NULL,

    [ReportYearMonth] [int] NULL,

    [IncEPMix] [money] NOT NULL,

    [IncECMix] [money] NOT NULL,

    [IncUPMix] [money] NOT NULL,

    [IncPaid] [money] NOT NULL,

    [IncIAmt] [money] NOT NULL,

    [MoEPMix] [money] NOT NULL,

    [MoECMix] [money] NOT NULL,

    [MoUPMix] [money] NOT NULL,

    [MoPaid] [money] NOT NULL,

    [MoIAmt] [money] NOT NULL,

    [CNumber] [varchar](19) NOT NULL,

    [PIS] [char](2) NULL,

    [PID] [int] NOT NULL,

    [PName] [varchar](50) NULL,

    [AID] [int] NOT NULL,

    [AName] [varchar](50) NULL,

    [DMIKey] [int] NULL,

    [DMPKey] [int] NULL,

    [DMAKey] [int] NULL,

    [RCol] [bit] NOT NULL,

    [Source] [varchar](5) NOT NULL,

    [CovEKey] [int] NOT NULL,

    [DMloadts] [datetime] NOT NULL

    ) ON [PRIMARY]

    GO

    SET ANSI_PADDING OFF

    GO

    CREATE NONCLUSTERED INDEX [IDX_MyDataMartTable_AID_ReportYearMonth]

    ON [dbo].[MyDataMartTable]

    (

    [AID] ASC,

    [ReportYearMonth] ASC

    )

    INCLUDE ( [MoEPMix],

    [MoECMix],

    [MoPaid],

    [PID],

    [PName],

    [AName],

    [DMPKey],

    [CID],

    [ProductCode],

    [MoUPMix],

    [MoIAmt],

    [PIS],

    [DMIKey],

    [DMAKey],

    [IncEPMix],

    [IncECMix],

    [IncUPMix],

    [IncPaid],

    [IncIAmt],

    [RCol]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,

    SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF,

    ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    GO

    CREATE NONCLUSTERED INDEX [IDX_MyDataMartTable_CID_ReportYearMonth]

    ON [dbo].[MyDataMartTable]

    (

    [ReportYearMonth] ASC,

    [CID] ASC,

    [RCol] ASC

    )

    INCLUDE ( [IncEPMix],

    [IncIAmt]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,

    SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF,

    ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    GO

    CREATE UNIQUE NONCLUSTERED INDEX [IDX_MyDataMartTable_CovEKey]

    ON [dbo].[MyDataMartTable]

    (

    [CovKey] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,

    SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF,

    ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    GO

    CREATE NONCLUSTERED INDEX [IDX_MyDataMartTable_DMAKey]

    ON [dbo].[MyDataMartTable]

    (

    [DMAKey] ASC,

    [RCol] ASC,

    [Source] ASC

    )

    INCLUDE ( [MoEPMix],

    [MoECMix],

    [MoUPMix],

    [MoPaid],

    [MoIAmt],

    [DMIKey],

    [DMPKey],

    [ReportYearMonth],

    [IncEPMix],

    [IncECMix],

    [IncUPMix],

    [IncPaid],

    [IncIAmt]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,

    SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF,

    ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    GO

    CREATE NONCLUSTERED INDEX [IDX_MyDataMartTable_ReportYearMonth_DMPKey_RCol_Source]

    ON [dbo].[MyDataMartTable]

    (

    [ReportYearMonth] ASC,

    [DMPKey] ASC,

    [RCol] ASC,

    [Source] ASC

    )

    INCLUDE ( [IncEPMix],

    [IncECMix],

    [IncUPMix],

    [IncPaid],

    [IncIAmt],

    [DMIKey],

    [DMAKey]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,

    SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF,

    ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    GO

    CREATE NONCLUSTERED INDEX [IDX_MyDataMartTable_ReportYearMonth_RCol_Source]

    ON [dbo].[MyDataMartTable]

    (

    [ReportYearMonth] ASC,

    [RCol] ASC,

    [Source] ASC

    )

    INCLUDE ( [CID],

    [IncEPMix],

    [IncECMix],

    [IncUPMix],

    [IncPaid],

    [IncIAmt],

    [MoEPMix],

    [MoECMix],

    [MoUPMix],

    [MoPaid],

    [MoIAmt],

    [DMIKey],

    [DMPKey],

    [DMAKey]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,

    SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF,

    ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    GO

    CREATE NONCLUSTERED INDEX [IDX_MyDataMartTable_Source]

    ON [dbo].[MyDataMartTable]

    (

    [Source] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,

    SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF,

    DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON,

    ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    GO

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • djj (3/24/2015)


    Brandie,

    A side note not related to the problem. What if you use the following? @YrMoStrg CHAR(6) = CONVERT(VARCHAR(6), DATEADD(m,-1,GETDATE()), 112)

    Oh, will you please stop making sense? I like my tortured jumping-through-hoops bit of coding! 😀

    Thanks. I forgot about the 112 style.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • Could we get a scrubbed query that matches the columns in the table?

    It looks like there's no covering index, in which case SQL will do a table scan if the query is estimated to affect more than ~1% of the table (and with variables it's going to estimate very poorly)

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (3/24/2015)


    Could we get a scrubbed query that matches the columns in the table?

    Sorry. I should have thought about that. Here it is.

    DECLARE @YrMo INT,

    @YrMoStrg CHAR(6) = (SELECT CONVERT(CHAR(6),DATEADD(m,-1,GETDATE()), 112));

    SELECT @YrMo = CONVERT(INT,@YrMoStrg);

    SELECT deaP.CID,

    deaP.ProductCode,

    deaP.ReportYearMonth,

    deaP.IncEPMix,

    deaP.IncECMix,

    deaP.IncUPMix,

    deaP.IncPaid,

    deaP.IncIAmt,

    deaP.MoEPMix,

    deaP.MoECMix,

    deaP.MoUPMix,

    deaP.MoPaid,

    deaP.MoIAmt,

    deaP.CNumber,

    deaP.PIS,

    deaP.PID,

    deaP.PName,

    deaP.AID,

    deaP.AName,

    deaP.DMIKey,

    deaP.DMPKey,

    deaP.DMAKey,

    deaP.RCol,

    deaP.[Source],

    deaP.CovEKey,

    deaP.DMloadts

    FROM dbo.MyDataMartTable deaP

    WHERE ReportYearMonth = @YrMo;

    It looks like there's no covering index, in which case SQL will do a table scan if the query is estimated to affect more than ~1% of the table (and with variables it's going to estimate very poorly)

    Hmmm. I wonder if adding the covering index will help or hurt the reports they are pulling off the table. It shouldn't interfere with the other indexes, right? Other than possibly using the covering index over the other indexes... But a covering index for this specific query would include every column in the table. I'm not sure that's a good idea.

    Based on your comment, though I just tried swapping the ReportYearMonth to the first column of the query and running it. Execution plan doesn't appear to have changed at first glance. Still using a Table Scan. I'll compare it to the other plan and see if the numbers have changed at all.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • The only two things that changed in the Execution Plan was the complexity of my variable calls (expected) and the CompileMemory went from 256 to 240. Everything else is the same.

    So the column order swap seems pretty insignificant.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • Order of columns within a query is irrelevant. What in my comment suggested otherwise?

    Covering index will help. Looking at what you have, the easiest would be to widen one of the existing indexes that has ReportYearMonth as the leading column of the key and add into that index's include column list any columns that are in your select clause and not already in the index,

    A quick look suggests that one of these would be easiest to widen

    CREATE NONCLUSTERED INDEX [IDX_MyDataMartTable_ReportYearMonth_DMPKey_RCol_Source]

    ON [dbo].[MyDataMartTable]

    (

    [ReportYearMonth] ASC,

    [DMPKey] ASC,

    [RCol] ASC,

    [Source] ASC

    )

    INCLUDE ( [IncEPMix],

    [IncECMix],

    [IncUPMix],

    [IncPaid],

    [IncIAmt],

    [DMIKey],

    [DMAKey])

    CREATE NONCLUSTERED INDEX [IDX_MyDataMartTable_ReportYearMonth_RCol_Source]

    ON [dbo].[MyDataMartTable]

    (

    [ReportYearMonth] ASC,

    [RCol] ASC,

    [Source] ASC

    )

    INCLUDE ( [CID],

    [IncEPMix],

    [IncECMix],

    [IncUPMix],

    [IncPaid],

    [IncIAmt],

    [MoEPMix],

    [MoECMix],

    [MoUPMix],

    [MoPaid],

    [MoIAmt],

    [DMIKey],

    [DMPKey],

    [DMAKey])

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • I don't see a clustered index on the table (but maybe I just missed it).

    If you most often / always query the table based on YrMo, cluster the table on YrMo. Then you won't need all those nonclustered indexes that together likely more than double the size of the table.

    You may very well want an need additional column(s) in the clustered index, to reduce the number of dup keys, but I can't tell what that/those column/s should be just from what I've seen so far. We'd have to look at least at index missing and index usage stats.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • ScottPletcher (3/25/2015)


    I don't see a clustered index on the table (but maybe I just missed it).

    There isn't one. I don't know why they don't have one, except that they really don't have a single unique column on which to put it and I'm not sure a composite would work because it would require lots of columns for uniqueness. On the other hand, I should discuss surrogate keys with them and see if an IDENTITY would work.

    GilaMonster (3/24/2015)


    Order of columns within a query is irrelevant. What in my comment suggested otherwise?

    Nothing specific. Your comment triggered a memory of a SQL Saturday event on indexes I attended and some imperfect memory of the speaker mentioning that an index wouldn't work properly if the index columns were in a different order than something in the query. I think he referenced the WHERE clause, but again, imperfect memory and I figured that it only took 2 seconds for me to alter the SELECT and see if it applied there as well.

    Which it didn't. But hey, at least I tried it to see if it was something I had missed.

    I will communicate with the report team to see if they have a problem changing up their index. They probably won't, but they will if I don't ask first. Thank you for the suggestions.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • Brandie Tarvin (3/27/2015)


    ScottPletcher (3/25/2015)


    I don't see a clustered index on the table (but maybe I just missed it).

    There isn't one. I don't know why they don't have one, except that they really don't have a single unique column on which to put it and I'm not sure a composite would work because it would require lots of columns for uniqueness. On the other hand, I should discuss surrogate keys with them and see if an IDENTITY would work.

    You don't need unique columns, just whatever columns are most commonly used to access the data, as long as they don't extreme fragmentation. SQL will "uniquify" the clustered index itself. Identity is a terrible crutch if you have a natural column(s) that match the queries you use.

    Edit: My best guess doing an extremely quick look at existing indexes the clustered index could start with ReportYearMonth. Best would be to first review existing missing index and index usage stats before making a decision on the clustering index key(s).

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • ScottPletcher (3/27/2015)


    Brandie Tarvin (3/27/2015)


    ScottPletcher (3/25/2015)


    I don't see a clustered index on the table (but maybe I just missed it).

    There isn't one. I don't know why they don't have one, except that they really don't have a single unique column on which to put it and I'm not sure a composite would work because it would require lots of columns for uniqueness. On the other hand, I should discuss surrogate keys with them and see if an IDENTITY would work.

    You don't need unique columns, just whatever columns are most commonly used to access the data, as long as they don't extreme fragmentation. SQL will "uniquify" the clustered index itself. Identity is a terrible crutch if you have a natural column(s) that match the queries you use.

    And as I said, the table wasn't designed with a natural key in mind. It doesn't look like it from what I posted of the table, but the data is terribly generic. The columns that are supposed to be natural have a nasty tendancy to be duplicated by our end consumers.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

Viewing 13 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply