Need some help with a large data volume optimization

  • Alright, here's the very short form of a long story. I've got a ton of files I need to import in batches, because each batch is associated with a data snapshot at that point in time, and if I miss a day of processing I need to know what that snapshot looked like.

    I cannot get this thing to run in a reasonable time span. I've included my runtimes at the tail of this.

    Now, I've overloaded the structure on purpose, primarily because I'm trying to backfeed a year of the setup data instead of the standard 15-day batch drop. Portions of this system mimic a vendor system, so bear with me if the naming seems a bit odd.

    I have full dbo control of the DB in question and can make whatever mods I like. Also note that yes, I am doing a cursor by batch to limit the data volume being pushed to about 20% of the actual loaded data at a time. It's an unfortunate necessity, because I keep blowing up the tempdb log on our dev box if I don't. The cursor just walks a parameter list fed from a table variable and is not involved at the computational level in any way. The clustered index and idx_PortSecAgg_staging_3 drops are for the same reason.
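    For context, a minimal sketch of that batching pattern (names are illustrative; the real loop body is in the attached script):

    DECLARE @BatchList TABLE (BatchGroup varchar(100));
    INSERT INTO @BatchList
    SELECT DISTINCT BatchGroup FROM dbo.PortSecAgg_Staging;

    DECLARE @BatchGroup varchar(100);
    DECLARE batch_cur CURSOR LOCAL FAST_FORWARD FOR
        SELECT BatchGroup FROM @BatchList;
    OPEN batch_cur;
    FETCH NEXT FROM batch_cur INTO @BatchGroup;
    WHILE @@FETCH_STATUS = 0
    BEGIN
        -- The delete/update/insert steps run here, filtered on @BatchGroup,
        -- so only ~20% of the staged rows are in flight at once.
        FETCH NEXT FROM batch_cur INTO @BatchGroup;
    END
    CLOSE batch_cur;
    DEALLOCATE batch_cur;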

    So, on with the show. The DDL of the tables involved:

    /****** Object: Table [dbo].[PortSecAgg] Script Date: 12/08/2010 12:49:39 ******/
    SET ANSI_NULLS ON
    GO
    SET QUOTED_IDENTIFIER ON
    GO
    SET ANSI_PADDING ON
    GO
    CREATE TABLE [dbo].[PortSecAgg](
        [PortfolioID] [int] NOT NULL,
        [SecurityID] [int] NOT NULL,
        [AsOfDate] [datetime] NOT NULL,
        [IsShortPosition] [bit] NOT NULL,
        [PortfolioCode] [varchar](10) NULL,
        [Symbol] [varchar](20) NULL,
        [Quantity] [numeric](18, 4) NULL,
        [SecTypeCode] [varchar](5) NULL,
        [Cusip] [varchar](15) NULL,
        [UnitCost] [money] NULL,
        [TotalCost] [money] NULL,
        [Price] [money] NULL,
        [MarketValue] [money] NULL,
        [AccruedInterest] [money] NULL,
        [UnrealizedGL] [money] NULL,
        [IsSubAccount] [bit] NULL,
        CONSTRAINT [PK_PortSecAgg] PRIMARY KEY CLUSTERED
        (
            [AsOfDate] ASC,
            [PortfolioID] ASC,
            [SecurityID] ASC,
            [IsShortPosition] ASC
        ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    ) ON [PRIMARY]
    GO

    /****** Object: Table [dbo].[PortSecAgg_Staging] Script Date: 12/08/2010 12:50:12 ******/
    SET ANSI_NULLS ON
    GO
    SET QUOTED_IDENTIFIER ON
    GO
    SET ANSI_PADDING ON
    GO
    CREATE TABLE [dbo].[PortSecAgg_Staging](
        [PortfolioID] [int] NULL,
        [SecurityID] [int] NULL,
        [PortfolioCode] [varchar](12) NULL,
        [Quantity] [numeric](18, 4) NULL,
        [SecTypeCode] [varchar](5) NULL,
        [Symbol] [varchar](20) NULL,
        [Cusip] [varchar](40) NULL,
        [UnitCost] [money] NULL,
        [TotalCost] [money] NULL,
        [Price] [money] NULL,
        [MarketValue] [money] NULL,
        [AccruedInterest] [money] NULL,
        [UnrealizedGL] [money] NULL,
        [IsSubAccount] [varchar](4) NULL,
        [AsOfDate] [datetime] NULL,
        [LoadFileName] [varchar](400) NULL,
        [BatchGroup] [varchar](100) NULL,
        [IsShortPosition] AS (case when [Quantity]<(0.0) then (1) else (0) end) PERSISTED NOT NULL
    ) ON [PRIMARY]
    GO
    SET ANSI_PADDING OFF
    GO

    SET ARITHABORT ON
    SET CONCAT_NULL_YIELDS_NULL ON
    SET QUOTED_IDENTIFIER ON
    SET ANSI_NULLS ON
    SET ANSI_PADDING ON
    SET ANSI_WARNINGS ON
    SET NUMERIC_ROUNDABORT OFF

    /****** Object: Index [idx_PortSecAgg_staging] Script Date: 12/08/2010 12:50:12 ******/
    CREATE CLUSTERED INDEX [idx_PortSecAgg_staging] ON [dbo].[PortSecAgg_Staging]
    (
        [AsOfDate] ASC,
        [PortfolioID] ASC,
        [SecurityID] ASC,
        [IsShortPosition] ASC,
        [BatchGroup] ASC
    ) WITH (PAD_INDEX = ON, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 15) ON [PRIMARY]
    GO

    /****** Object: Index [idx_PortSecAgg_staging_2] Script Date: 12/08/2010 12:50:12 ******/
    CREATE NONCLUSTERED INDEX [idx_PortSecAgg_staging_2] ON [dbo].[PortSecAgg_Staging]
    (
        [Symbol] ASC,
        [SecTypeCode] ASC,
        [BatchGroup] ASC
    ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    GO

    /****** Object: Index [idx_PortSecAgg_staging_3] Script Date: 12/08/2010 12:50:12 ******/
    CREATE NONCLUSTERED INDEX [idx_PortSecAgg_staging_3] ON [dbo].[PortSecAgg_Staging]
    (
        [BatchGroup] ASC
    )
    INCLUDE ([PortfolioID], [AsOfDate])
    WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    GO

    /****** Object: Table [dbo].[SecByBatch] Script Date: 12/08/2010 12:51:18 ******/
    SET ANSI_NULLS ON
    GO
    SET QUOTED_IDENTIFIER ON
    GO
    SET ANSI_PADDING ON
    GO
    CREATE TABLE [dbo].[SecByBatch](
        [SecurityID] [int] NULL,
        [SecTypeBaseCode] [varchar](10) NULL,
        [PrimaryCurrencyCode] [varchar](10) NULL,
        [Symbol] [varchar](20) NULL,
        [BatchGroup] [varchar](100) NULL,
        [secTypecode] [varchar](10) NULL
    ) ON [PRIMARY]
    GO
    SET ANSI_PADDING OFF
    GO

    /****** Object: Index [idx_SecByBatch] Script Date: 12/08/2010 12:51:18 ******/
    CREATE CLUSTERED INDEX [idx_SecByBatch] ON [dbo].[SecByBatch]
    (
        [BatchGroup] ASC,
        [secTypecode] ASC,
        [Symbol] ASC
    ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    GO

    /****** Object: Table [dbo].[vPortfolio] Script Date: 12/08/2010 12:53:33 ******/
    SET ANSI_NULLS ON
    GO
    SET QUOTED_IDENTIFIER ON
    GO
    SET ANSI_PADDING ON
    GO
    CREATE TABLE [dbo].[vPortfolio](
        [PortfolioID] [int] NOT NULL,
        [PortfolioCode] [varchar](32) NOT NULL,
        [PortfolioGUID] [uniqueidentifier] NOT NULL,
        [PrimaryContactID] [int] NULL,
        [OwnerContactID] [int] NULL,
        [OwnerAddressID] [int] NULL,
        [OwnerPhoneID] [int] NULL,
        [BillingContactID] [int] NULL,
        [BillingAddressID] [int] NULL,
        [BillingPhoneID] [int] NULL,
        [BankContactID] [int] NULL,
        [BankAddressID] [int] NULL,
        [BankPhoneID] [int] NULL,
        [ProcessingGroupID] [int] NOT NULL,
        [ReconciliationCloseDate] [datetime] NULL,
        [ShortName] [varchar](72) NOT NULL,
        [PortfolioStatus] [varchar](72) NOT NULL,
        [TaxNumber] [varchar](16) NOT NULL,
        [TaxStatus] [varchar](72) NOT NULL,
        [PortfolioTypeCode] [varchar](72) NOT NULL,
        [InvestmentGoal] [varchar](72) NOT NULL,
        [InitialValue] [money] NULL,
        [IsPositionOnly] [bit] NOT NULL,
        [IsExcludedFromGroupRules] [bit] NOT NULL,
        [DocumentLink] [varchar](520) NOT NULL,
        [varchar](260) NOT NULL,
        [IsIncomplete] [bit] NOT NULL
    ) ON [PRIMARY]
    GO
    SET ANSI_PADDING OFF
    GO

    /****** Object: Index [idx_vPortfolio] Script Date: 12/08/2010 12:53:33 ******/
    CREATE CLUSTERED INDEX [idx_vPortfolio] ON [dbo].[vPortfolio]
    (
        [PortfolioID] ASC
    ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    GO

    /****** Object: Index [idx_vPortfolio_1] Script Date: 12/08/2010 12:53:33 ******/
    CREATE NONCLUSTERED INDEX [idx_vPortfolio_1] ON [dbo].[vPortfolio]
    (
        [PortfolioCode] ASC
    ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    Table Rowcounts:

    select
        o.name,
        i.rowcnt
    from
        sysobjects o
    join
        sysindexes i
        on o.id = i.id
    where
        i.indid IN (0, 1)
        AND o.name IN ('vPortfolio', 'SecByBatch', 'PortSecAgg', 'PortSecAgg_staging')

    name                 rowcnt
    -------------------- --------------------
    PortSecAgg           36294437
    vPortfolio           37752
    PortSecAgg_Staging   48581536
    SecByBatch           576077

    Attached you'll find the SQLplans and actual query.

    Below you'll see my times. I expect a change I just made for vPortfolio to deal with a portion of the time for Update PortfolioIDs. I somehow missed that it didn't have a proper nonclustered index, but I hadn't been focusing on that step until this last run, when it just blew up for some reason; I assume resource competition.

    The rest are about what I usually see.

    Step - Drop Indexes: 576 seconds.

    Step - Update PortfolioIDs: 21622 seconds.

    Step - Rebuild clustered index psa_stg: 1073 seconds.

    Step - Rebuild idx3 psa_stg: 67 seconds.

    Step - Declare and open cursor: 0 seconds.

    Step - Insert into @DelPortList for Batch 0001: 4 seconds.

    Step - Delete from PortSecAgg for Batch 0001: 41 seconds.

    Step - Update PortSecAgg for Batch 0001: 445 seconds.

    Step - Insert New to PortSecAgg for Batch 0001: 672 seconds.

    Step - Insert into @DelPortList for Batch 0002: 4 seconds.

    Step - Delete from PortSecAgg for Batch 0002: 590 seconds.

    Step - Update PortSecAgg for Batch 0002: 454 seconds.

    Step - Insert New to PortSecAgg for Batch 0002: 507 seconds.

    Step - Insert into @DelPortList for Batch 0003: 4 seconds.

    Step - Delete from PortSecAgg for Batch 0003: 544 seconds.

    Step - Update PortSecAgg for Batch 0003: 443 seconds.

    Step - Insert New to PortSecAgg for Batch 0003: 553 seconds.

    Step - Insert into @DelPortList for Batch 0004: 7 seconds.

    Step - Delete from PortSecAgg for Batch 0004: 960 seconds.

    Step - Update PortSecAgg for Batch 0004: 549 seconds.

    Step - Insert New to PortSecAgg for Batch 0004: 599 seconds.

    Step - Insert into @DelPortList for Batch 0005: 4 seconds.

    Step - Delete from PortSecAgg for Batch 0005: 535 seconds.

    Step - Update PortSecAgg for Batch 0005: 597 seconds.

    Step - Insert New to PortSecAgg for Batch 0005: 717 seconds.

    There are a few particular things I'd like some assistance with. The first is that the DELETE portion seems like it's just going to be a cost of doing business. It's a lot of rows and there's just not a lot I can do about that, but I'm open to suggestions.

    The UPDATE seems WAY more expensive than it needs to be, and I believe it's because of how long it's carrying the rows. Is this entirely because of the scalar? I know the plan shows most of the work as a scan on the staging table, but I can't help but feel I'm missing something.

    The insert is basically why they created MERGE, but I can't use MERGE in 2k5, so I'm stuck with the old-fashioned left anti-semi join. Again, looking for suggestions to thin this down.
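    For reference, the shape of that pattern looks roughly like this (column list abbreviated; the actual statement is in the attached script):

    INSERT INTO dbo.PortSecAgg (AsOfDate, PortfolioID, SecurityID, IsShortPosition /* , ... */)
    SELECT psa_s.AsOfDate, psa_s.PortfolioID, psa_s.SecurityID, psa_s.IsShortPosition /* , ... */
    FROM dbo.PortSecAgg_Staging AS psa_s
    LEFT JOIN dbo.PortSecAgg AS psa
        ON  psa.AsOfDate        = psa_s.AsOfDate
        AND psa.PortfolioID     = psa_s.PortfolioID
        AND psa.SecurityID      = psa_s.SecurityID
        AND psa.IsShortPosition = psa_s.IsShortPosition
    WHERE psa_s.BatchGroup = @BatchGroup
      AND psa.PortfolioID IS NULL  -- left anti-semi join: keep only unmatched rows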

    I thank anyone who takes a gander at this for me and can make some detailed recommendations and/or point out something I missed.

    The .txt file is a .sql file, just rename it. I had to swap it because the uploader doesn't like .sql.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.


    Twitter: @AnyWayDBA

  • The first part that strikes me as a target for optimization is the delete.

    DELETE psa
    FROM
        PortSecAgg AS psa
    JOIN
        @DelPortList AS dpl
        ON  psa.PortfolioID = dpl.PortfolioID
        AND psa.AsOfDate = dpl.AsOfDate
    LEFT JOIN
        PortSecAgg_staging AS psa_s
        ON  psa.PortfolioID = psa_s.PortfolioID
        AND psa.SecurityID = psa_s.SecurityID
        AND psa.AsOfDate = psa_s.AsOfDate
        AND psa.IsShortPosition = psa_s.IsShortPosition
        AND psa_s.BatchGroup = @BatchGroup
    WHERE
        psa_s.PortfolioCode IS NULL

    You're getting a clustered index scan on the PortSecAgg_staging table per the execution plan. An index with BatchGroup as the leading edge, and the rest of the columns from the JOIN ON clause following, will almost certainly speed up the deletes. It's going to look an awful lot like your clustered index, but with BatchGroup as the leading edge instead of last. (That's probably why it's scanning the clustered index.)
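    Roughly along these lines (the index name and the INCLUDE are illustrative, not a tested definition):

    CREATE NONCLUSTERED INDEX idx_PortSecAgg_staging_delete  -- hypothetical name
    ON dbo.PortSecAgg_Staging (BatchGroup, AsOfDate, PortfolioID, SecurityID, IsShortPosition)
    INCLUDE (PortfolioCode)  -- covers the psa_s.PortfolioCode IS NULL test in the WHERE clause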

    I don't have time to look at the rest of it right now (meeting in 2 minutes), but that would be my first target.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • GSquared (12/8/2010)


    You're getting a clustered index scan on the PortSecAgg_staging table per the execution plan. An index with BatchGroup as the leading edge, and the rest of the columns from the JOIN ON clause following, will almost certainly speed up the deletes. It's going to look an awful lot like your clustered index, but with BatchGroup as the leading edge instead of last. (That's probably why it's scanning the clustered index.)

    I don't have time to look at the rest of it right now (meeting in 2 minutes), but that would be my first target.

    I believe I tried this, but it certainly couldn't hurt. I'll adapt idx_3 for that purpose, since it already leads on BatchGroup. Rerunning now; I'll get back to you with results from that mod.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.


    Twitter: @AnyWayDBA

  • How do you populate the PortSecAgg_staging table?

    Do you really need to update portfolioID and securityID for each and every row?

    That basically means joining 36 million rows to 48 million rows with no chance of an index seek...



    Lutz
    A pessimist is an optimist with experience.


  • LutzM (12/8/2010)


    How do you populate the PortSecAgg_staging table?

    Do you really need to update portfolioID and securityID for each and every row?

    That basically means joining 36 million rows to 48 million rows with no chance of an index seek...

    Sadly, yes. The data being sent to me carries business keys only, and the rest of the data relies on the surrogate keys. Add to that the fact that the business key can change its form from day to day but still needs to stay associated with the surrogate key, and you can see the mess.

    Population of PortSecAgg is a foreach-looped SSIS package that goes through an FTP-directory equivalent and drops all the files into the table. BatchGroup is determined from the file naming, so that all files delivered at the same time get the same BatchGroup.

    I don't see any way to avoid the each-and-every-row update there; it's pretty much stuck that way.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.


    Twitter: @AnyWayDBA

  • Craig Farrell (12/8/2010)


    LutzM (12/8/2010)


    How do you populate the PortSecAgg_staging table?

    Do you really need to update portfolioID and securityID for each and every row?

    That basically means joining 36 million rows to 48 million rows with no chance of an index seek...

    Sadly, yes. The data being sent to me carries business keys only, and the rest of the data relies on the surrogate keys. Add to that the fact that the business key can change its form from day to day but still needs to stay associated with the surrogate key, and you can see the mess.

    Population of PortSecAgg is a foreach-looped SSIS package that goes through an FTP-directory equivalent and drops all the files into the table. BatchGroup is determined from the file naming, so that all files delivered at the same time get the same BatchGroup.

    I don't see any way to avoid the each-and-every-row update there; it's pretty much stuck that way.

    Correct me if I'm wrong, but don't you have the value of s.securityID inside your ForEach loop container as an output value of filling the SecByBatch table? If so, you could use that value when populating the PortSecAgg_Staging table and avoid the additional join in the update statement afterwards...

    An additional option would be to get the PortfolioID value inside the c.u.r.s.o.r. instead of a separate update of the whole table. Due to the use of DISTINCT, I'd expect you'd have fewer rows to deal with.

    Also, you might want to consider a temp table instead of a table variable inside the *cough* loop, to benefit from statistics.
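    For instance (a hypothetical shape for the list; the real column set may differ):

    CREATE TABLE #DelPortList (
        PortfolioID int      NOT NULL,
        AsOfDate    datetime NOT NULL
    );
    -- Unlike a table variable, #DelPortList gets real statistics,
    -- so the optimizer can estimate the join cardinality properly.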



    Lutz
    A pessimist is an optimist with experience.


  • LutzM (12/8/2010)


    Correct me if I'm wrong, but don't you have the value of s.securityID inside your ForEach loop container as an output value of filling the SecByBatch table? If so, you could use that value when populating the PortSecAgg_Staging table and avoid the additional join in the update statement afterwards...

    SecByBatch is actually a separate package load/loop altogether. Although, it's an interesting idea: load SecByBatch FIRST, then Merge Join in the SSIS from SecByBatch as the information gets loaded. I've never had much luck with Merge Join operations in SSIS performing incredibly well, mostly because it's a pain to see their execution plans in order to optimize them.

    An additional option would be to get the PortfolioID value inside the c.u.r.s.o.r. instead of a separate update of the whole table. Due to the use of DISTINCT, I'd expect you'd have fewer rows to deal with.

    DISTINCT? *hunts through his code* Oh, right, those two. The first one's for the batch identification list. No biggie there. Then there's the one that gets the deletion list on a separate key from the one the staging table is stored on. This is because my deltas don't show removals, so I have to reload that portfolio/AsOfDate combination when I see any entries for it.

    I had tried doing the PortfolioID update in the loop as well. It actually did basically the same thing 5 times, and did all sorts of nastiness to my clustered index along the way. It's actually quicker to just hit it all at once and get it done and over with.

    Also, you might want to consider a temp table instead of a table variable inside the *cough* loop, to benefit from statistics.

    Hmmmmm.... Dere's value in dem dar words. I reckin' yer right, pardner.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.


    Twitter: @AnyWayDBA

  • Sorry for being imprecise... 😉

    Right at the beginning of your *cough* c.u.r.s.o.r. you populate the WannaBeTempTable @DelPortList:

    INSERT INTO @DelPortList
    SELECT DISTINCT psa_s.PortfolioID, psa_s.AsOfDate
    FROM PortSecAgg_staging AS psa_s
    WHERE psa_s.BatchGroup = @BatchGroup

    I'd expand this table to hold the value for PortfolioCode, insert the PortfolioCode instead of PortfolioID (eliminating the need to update the staging table with this value), and add the PortfolioID values using a separate join on vPortfolio. Since PortfolioCode within DelPortList should be unique, you should definitely get better performance than with the join currently used.
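    A rough sketch of the idea, assuming the temp-table version of the list (column types are guesses from the DDL above):

    CREATE TABLE #DelPortList (
        PortfolioID   int NULL,             -- filled in by the join below
        PortfolioCode varchar(32) NOT NULL,
        AsOfDate      datetime NOT NULL
    );

    INSERT INTO #DelPortList (PortfolioCode, AsOfDate)
    SELECT DISTINCT psa_s.PortfolioCode, psa_s.AsOfDate
    FROM PortSecAgg_staging AS psa_s
    WHERE psa_s.BatchGroup = @BatchGroup;

    UPDATE dpl
    SET    dpl.PortfolioID = vp.PortfolioID
    FROM   #DelPortList AS dpl
    JOIN   vPortfolio AS vp
           ON vp.PortfolioCode = dpl.PortfolioCode;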

    Regarding SecByBatch: I'm not sure you'd need to use a Merge Join... what would prevent you from using a Lookup? (The process itself is still somewhat vague to me, I admit...)



    Lutz
    A pessimist is an optimist with experience.


  • LutzM (12/8/2010)


    Sorry for being imprecise... 😉

    Right at the beginning of your *cough* c.u.r.s.o.r. you populate the WannaBeTempTable @DelPortList

    I'd expand this table to hold the value for PortfolioCode, insert the PortfolioCode instead of PortfolioID (eliminating the need to update the staging table with this value), and add the PortfolioID values using a separate join on vPortfolio. Since PortfolioCode within DelPortList should be unique, you should definitely get better performance than with the join currently used.

    Bear with me here. PortfolioCode and PortfolioID will both repeat for different AsOfDates; they're pretty much surrogate key/business key. Avoiding updating this in the staging table, though, means I can't link to PortSecAgg on the surrogate key, which is my main cluster for other, later operations. PortSecAgg ends up as one of three hearts of a reporting structure.

    What I'm not following is adding the PortfolioID values via a join when we've just removed them. What's the purpose of that, besides adding another join into the mix when I go merrily deleting?

    Regarding SecByBatch: I'm not sure you'd need to use a Merge Join... what would prevent you from using a Lookup? (The process itself is still somewhat vague to me, I admit...)

    Hm, Good question, and I don't have an immediate answer for you. I need to explore the lookup component before I can answer that.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.


    Twitter: @AnyWayDBA

    G'dang... that Lookup component is QUICK. It almost keeps up with the import itself (175k rows/file × 6 files, under 2 minutes).

    Great recommendation, that's one headache out of the way. Thank you for that.

    It runs nicely even without dropping/rebuilding the staging table's indexes. I'm half tempted to leave them in place, but I'd still have to do a statistics rebuild and defrag afterwards.

    To the other recommendations:

    The change to idx_3 with BatchGroup as the leading edge didn't perform as hoped. Not really a surprise: 20% of the table is in each batch, so the leading edge isn't as useful as one would hope, and none of those columns is selective enough on its own. I may try forcing it with an index hint; currently the index is ignored.

    I swapped to a #tmp table instead of the @Tmp table variable for DelPortList. No significant difference.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.


    Twitter: @AnyWayDBA

    After re-reading the code you provided, I agree that my recommendation regarding PortfolioCode/PortfolioID doesn't make that much sense. I totally overlooked that you need it for the insert, and that you use the original table for that rather than the temp table. The only question that remains from my side regarding those two columns: why do you have a surrogate key and a business key? If you had PortfolioCode only (as a natural key) you wouldn't need the update in the first place... But I'm sure there are valid reasons for it... 😉

    Another option for the mass update might be a TOP-based batch version, as described for instance here (maybe even as a replacement for the BatchGroup-based loop). Since you posted in the 2K5 forum, a SET ROWCOUNT-based version might be an option too, but I'd vote against it, since ROWCOUNT is deprecated as a way of influencing INSERT/UPDATE/DELETE.
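    A minimal sketch of the TOP-based idea, assuming the staging rows start with PortfolioID unset (the batch size is illustrative):

    WHILE 1 = 1
    BEGIN
        UPDATE TOP (50000) psa_s
        SET    psa_s.PortfolioID = vp.PortfolioID
        FROM   PortSecAgg_staging AS psa_s
        JOIN   vPortfolio AS vp
               ON vp.PortfolioCode = psa_s.PortfolioCode
        WHERE  psa_s.PortfolioID IS NULL;   -- only rows not yet keyed

        IF @@ROWCOUNT = 0 BREAK;            -- stop once a pass touches nothing
    END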

    Regarding the Lookup component: I'm glad at least one of my approaches made a difference for the better. 😀



    Lutz
    A pessimist is an optimist with experience.


  • LutzM (12/9/2010)


    The only question that remains from my side regarding those two columns: why do you have a surrogate key and a business key? If you had PortfolioCode only (as a natural key) you wouldn't need the update in the first place... But I'm sure there are valid reasons for it... 😉

    Vendor annoyance. Only reason. The stupid system is TOO customizable, and the natural keys can get modified without updating the historical rows, which still need to stay associated with them; thus the surrogates.

    Another option for the mass update might be a TOP-based batch version, as described for instance here (maybe even as a replacement for the BatchGroup-based loop). Since you posted in the 2K5 forum, a SET ROWCOUNT-based version might be an option too, but I'd vote against it, since ROWCOUNT is deprecated as a way of influencing INSERT/UPDATE/DELETE.

    Yep, I'm in 2k5. That huge-update component looks like its intention is to avoid blocking, and I've done similar in the past when required. In this case, I'm half tempted to just take an XLOCK TABLOCK and force it to avoid mucking with locks at all. The entire DB is dedicated to the processing I'm doing.

    I haven't seen that method improve speed before, though. A single scan vs. a billion seeks: the scan seems to win after a certain point.

    Regarding the Lookup component: I'm glad at least one of my approaches made a difference for the better. 😀

    You'd be amazed at HOW much difference that is making. It's a little disturbing that I hadn't realized the power of that previously.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.


    Twitter: @AnyWayDBA

    Regarding the lookup: I found Jamie's blog very helpful. It might add some more benefit to your task performance, too.



    Lutz
    A pessimist is an optimist with experience.


  • Which edition of SQL Server is this? Standard?

  • Paul White NZ (12/11/2010)


    Which edition of SQL Server is this? Standard?

    Enterprise, AFAIK. I'll double-check Monday when I'm back in the office.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.


    Twitter: @AnyWayDBA
