Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase

Performance hit on Indexing (I Think) Expand / Collapse
Author
Message
Posted Thursday, November 15, 2012 3:17 AM
SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Monday, November 17, 2014 2:27 AM
Points: 418, Visits: 652
Hi guys,

I am looking after / building a solution that loads data from flat files using bulk insert.
The file contains a unique reference (PolicyID) which is in each file but the system architect has requested that each file load is uniquely identified. To do this we have added Validfrom and ValidTo columns to the table. The Bulk Insert now uses a format file to skip these columns as they are not in the source file.

The whole process is

Update the current records which have a dummy end date (31st Dec 2099) with today's date.

Bulk insert the file with the ValidFrom and ValidTo dates being NULL.

Update the new records with ValidFrom = todays date and ValidTo = dummy date.

I have an index built (Unique but not a PK and not clustered) PolicyID (ASC),ValidFrom(ASC),ValidTo(ASC)

The data load seems to be reasonably quick; But I think the index update is taking a huge performance hit. The file is about 750K records and the first update will be updating records at the end of the index pages, then the load will load with NULL values which I am assuming are going into the front of the index, then being rewritten to go to the end of the index. It this a reasonable assumption and if so,

how do I measure it whilst the script is running, and is it worth turning off the index whist doing the load and then rebuilding it. The data will be loaded on a weekly basis and I need to replicate the process across about 20 files of differing sizes.

I will normally only be reporting on records from the most current insert, looking for a specific policyID, so should I change the order or direction of the index fields: PolicyID and either ValidFrom or ValidTo should be enough to give me a unique key. Is it worth dropping either of the fields from the index.



Post #1385034
Posted Thursday, November 15, 2012 4:04 AM


SSC-Forever

SSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-Forever

Group: General Forum Members
Last Login: Today @ 1:45 AM
Points: 40,430, Visits: 36,881
It's quite common to drop or disable nonclustered indexes before a data load and rebuild or recreate them afterwards.


Gail Shaw
Microsoft Certified Master: SQL Server 2008, MVP
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Post #1385071
Posted Thursday, November 15, 2012 4:33 AM
SSC Eights!

SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!

Group: General Forum Members
Last Login: Yesterday @ 9:31 AM
Points: 922, Visits: 2,524
Are the ValidFrom and ValidTo Date or DateTime, as you couldput default constraints on them.

That way you dont have to run the update after the load and only have to rebuild the index after the load is complete.


_________________________________________________________________________
SSC Guide to Posting and Best Practices
Post #1385084
Posted Thursday, November 15, 2012 4:56 AM
SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Monday, November 17, 2014 2:27 AM
Points: 418, Visits: 652
Thanks Jason, I am going to investigate that option a bit further. It seems to make a lot of sense as it involved only hitting the record once (when it is created)

My concern is that all 20 files need to load with the same FROM date even if the load runs over midnight (I have no idea when it will be run or how long it will take on the production server) and although I can use standard CASE WHEN THEN END logic to roll the date back if we go over midnight, I really want to be able to set the default from a variable value (which you obviously can't do because then it wouldnt be a default )

The Bulk Insert script is called 20 times from an SSIS package and the From Date would be the package start date

Post #1385091
Posted Thursday, November 15, 2012 5:06 AM
SSC Eights!

SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!

Group: General Forum Members
Last Login: Yesterday @ 9:31 AM
Points: 922, Visits: 2,524
If it can load over midnight then you might have a problem.

Just a thought but cant you alter the default constraint at the start of each load process so that rather than doing a GetDate() it uses a fixed date for that batch.


_________________________________________________________________________
SSC Guide to Posting and Best Practices
Post #1385093
Posted Thursday, November 15, 2012 5:52 AM
SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Monday, November 17, 2014 2:27 AM
Points: 418, Visits: 652
@Jason

I don't think so, the default has to be set as part of the DDL for the table and you can't use variables in that situation.

The SSIS package calls a stored procedure which builds a dynamic Bulk Insert script which is then executed with exec sp_executesql() I do this because the date formats in the source file are DMY which you can set in t-SQL but not with the Bulk Insert Task in SSIS. I opted not to go down the Data Transfomation task route as I have already built the format files and the only transform I need to do is on the date; it didn't seem worth it to confuse future maintainers (I am a contrator) with having to remap the SSIS tasks each time the data changes - much easier to edit the BCP format files.

Post #1385106
Posted Thursday, November 15, 2012 6:23 AM
SSC Eights!

SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!

Group: General Forum Members
Last Login: Yesterday @ 9:31 AM
Points: 922, Visits: 2,524
Fair point Aaron, it would mean dipping into Dynamic SQL to inject the value into the Alter table script.

_________________________________________________________________________
SSC Guide to Posting and Best Practices
Post #1385118
« Prev Topic | Next Topic »

Add to briefcase

Permissions Expand / Collapse