SQL Clone
SQLServerCentral is supported by Redgate
Log in  ::  Register  ::  Not logged in

Matching a flat file to its source db

Matching a flat file to its source db

The other White Meat
The other White Meat
SSC Rookie
SSC Rookie (40 reputation)SSC Rookie (40 reputation)SSC Rookie (40 reputation)SSC Rookie (40 reputation)SSC Rookie (40 reputation)SSC Rookie (40 reputation)SSC Rookie (40 reputation)SSC Rookie (40 reputation)

Group: General Forum Members
Points: 40 Visits: 53
I am having issues, I will explain what I am doing and what I want.

I am have a database, and a flat file, and I want to check them against one another. The flat file is from an extract of a report generated by the database. The database has all the elements in the flat file minus some decoded values. So I load my flat file (nvarchar), I have changed all them eta data to match the data types and sizes in the database(char). I then use Multihash and return a hash value of all the columns. The other side I make my SQL, with the needed joins to get the decoded values and Multihash that as well.

I only care about unmatched, so from the merge join i use a conditional split to separate based on ISNULL(Hash_FlatFile)m ISNULL(Hash_DB), and the defualt is Match. I do nothing with the match except terminate to a multicast for debugging. The results should show me what isn't matched from the database and what isn't matched from the Flat file. the count should be even the entire time, so barring any hiccups the file and db should be equal.

I have noticed a few things, One is metadata related, one is data type related too.

The metadata for the database connection seems to get goofed up, I have some inner joins in the SQL command to grab decoded values, but the lose their position, instead of being the third column in the select it moves to the end. I have tried to sort, and a conversion transform to no avail. This lead to the columns not being in the same order when I hash the row. the flat file hash will have (Col A, Col B , Col C) the DB hash will have ( Col A, Col C, Col B) the only reason that Col B is at the end from what i think is that it is from a inner join, and some internal order of operations?

The second issue seems to be some data type things, the flat file is a pipe delimited file with padding to the right(some times) and all the fields stored as nvarchars. I have seen a case where a number was converted to a char with spaces padding to the left. So I have ~ 50 extracts to map, and the padding is unknown, Any suggestions to get apples to apples with the least amount of pain?

Am i going at this the wrong way? I could try a lookup transformation, but if my hashes aren't equal then I am lost too. I discovered the left padding on the number converted using fuzzy lookup, but that's not an option as I am not in an enterprise server.




You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum