May 5, 2005 at 10:18 am
Hello,
I need a bit of help. I have a large data load issue. The data comes from a legacy system. The data load consists of two stages in one DTS package: Stage 1 loads the legacy data into a staging database (Database 1), and Stage 2 runs a SQL task that migrates the staging records into the data warehouse (Database 2).
Occasionally, and I mean occasionally, Stage 2 will fail due to a set of bad records in which the parent record is missing or the child key has been mistyped. DTS does not report the exact record that fails in the "INSERT INTO ... SELECT *" statement in the SQL task in the final stage (Stage 2) of the data migration.
Without using a cursor in a stored proc and calling that proc to run the migration from Staging into the DW, is there a way to migrate the staging records in Database 1 to the DW in Database 2 and identify the exact row that fails (if any do)?
Any help/insight on how to capture the failed record during the migration into the DW using DTS would be appreciated.
May 5, 2005 at 10:25 am
There isn't a good way to do this. You could use a stored procedure and run the insert in batches, say 100 or 10 rows at a time. That would give you an idea of which batch failed, but you'd still have to troubleshoot from there.
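Something along these lines, as a rough sketch only; the database, table, and key names (StagingDB.dbo.ChildStage, DW.dbo.ChildFact, ChildKey) are placeholders for your actual objects, and it assumes the staging table has a numeric key you can walk through in ranges:

-- Hypothetical batched migration: insert a small range of rows at a time,
-- so a failure narrows the problem to one key range instead of the whole load.
DECLARE @BatchSize int, @MinKey int, @MaxKey int, @Err int
SET @BatchSize = 100

SELECT @MinKey = MIN(ChildKey), @MaxKey = MAX(ChildKey)
FROM StagingDB.dbo.ChildStage

WHILE @MinKey <= @MaxKey
BEGIN
    INSERT INTO DW.dbo.ChildFact
    SELECT *
    FROM StagingDB.dbo.ChildStage
    WHERE ChildKey BETWEEN @MinKey AND @MinKey + @BatchSize - 1

    -- FK violations terminate the statement, not the batch, so @@ERROR
    -- tells you which key range needs a closer look.
    SET @Err = @@ERROR
    IF @Err <> 0
        PRINT 'Batch starting at key ' + CAST(@MinKey AS varchar(20))
            + ' failed with error ' + CAST(@Err AS varchar(10))

    SET @MinKey = @MinKey + @BatchSize
END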
You might also want to build some queries that look for the conditions that cause the job to fail, and run them before the insert.
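For example, a pre-insert check for the missing-parent case described above might look something like this (again, the staging table and key names are made up):

-- Hypothetical pre-insert check: list child rows whose parent key
-- has no matching parent record, so they can be fixed before the load.
SELECT c.*
FROM StagingDB.dbo.ChildStage c
LEFT JOIN StagingDB.dbo.ParentStage p
    ON p.ParentKey = c.ParentKey
WHERE p.ParentKey IS NULL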
May 5, 2005 at 5:03 pm
If you use the datapump task instead of the SQL task, then you can set a limit on the number of failing rows and log the error rows to a text file.
But instead of doing this, you should really be scrubbing your data before inserting into the datawarehouse. This is one of the main reasons for having a staging database in the first place. Once you have a clean set of data you shouldn't get any insert errors.
--------------------
Colt 45 - the original point and click interface
May 5, 2005 at 8:44 pm
I realize that I need to scrub my data. Unfortunately, scrubbing the system on a field-by-field basis is going to be a nightmare. This database has over 30 years of data in it. It has more than 80 tables with somewhere near 5,000 fields in the entire database.
Could someone please point me to an example of scrubbing in which I can log any incorrect records to a text file so the data can be massaged and reimported at a later date?
Thanks
May 5, 2005 at 9:28 pm
Don't be misled by things like table and field counts. Data scrubbing is done in many data warehouses that hold hundreds of gigabytes of data.
You need to isolate the points where bad/missing data can cause problems with the insert: things like missing keys, bad dates, etc. Once you have these, the process of scrubbing the data becomes a bunch of T-SQL UPDATE statements. Take a look at some of the articles over at http://www.rkimball.com
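As an illustration only (all table, column, and file names below are invented), the scrub pass for the orphaned-child case might log the bad rows to an error table so they can be massaged and reimported later, remove them from staging, and fix anything correctable with plain UPDATEs:

-- Hypothetical scrub pass against the staging database before the DW insert.

-- 1. Log child rows with no matching parent to an error table.
INSERT INTO StagingDB.dbo.ChildStage_Errors
SELECT c.*
FROM StagingDB.dbo.ChildStage c
WHERE NOT EXISTS (SELECT 1
                  FROM StagingDB.dbo.ParentStage p
                  WHERE p.ParentKey = c.ParentKey)

-- 2. Remove the logged rows so the DW insert runs clean.
DELETE c
FROM StagingDB.dbo.ChildStage c
WHERE NOT EXISTS (SELECT 1
                  FROM StagingDB.dbo.ParentStage p
                  WHERE p.ParentKey = c.ParentKey)

-- 3. Example of a correctable problem: default obviously bad dates
--    (assuming the legacy system delivers dates as character data).
UPDATE StagingDB.dbo.ChildStage
SET LoadDate = '19000101'
WHERE ISDATE(LoadDate) = 0

-- The error table can then be dumped to a text file, e.g. with bcp:
-- bcp "SELECT * FROM StagingDB.dbo.ChildStage_Errors" queryout errors.txt -c -T -S YourServer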
Are you able to use the Datapump task instead of the ExecuteSQL task? As mentioned in my previous post, you can specify the maximum number of rows in error to allow and log those rows to a text file.
--------------------
Colt 45 - the original point and click interface