Matching a flat file to its source db

  • I am having issues, I will explain what I am doing and what I want.

    I am have a database, and a flat file, and I want to check them against one another. The flat file is from an extract of a report generated by the database. The database has all the elements in the flat file minus some decoded values. So I load my flat file (nvarchar), I have changed all them eta data to match the data types and sizes in the database(char). I then use Multihash and return a hash value of all the columns. The other side I make my SQL, with the needed joins to get the decoded values and Multihash that as well.

    I only care about unmatched, so from the merge join i use a conditional split to separate based on ISNULL(Hash_FlatFile)m ISNULL(Hash_DB), and the defualt is Match. I do nothing with the match except terminate to a multicast for debugging. The results should show me what isn't matched from the database and what isn't matched from the Flat file. the count should be even the entire time, so barring any hiccups the file and db should be equal.

    I have noticed a few things, One is metadata related, one is data type related too.

    The metadata for the database connection seems to get goofed up, I have some inner joins in the SQL command to grab decoded values, but the lose their position, instead of being the third column in the select it moves to the end. I have tried to sort, and a conversion transform to no avail. This lead to the columns not being in the same order when I hash the row. the flat file hash will have (Col A, Col B , Col C) the DB hash will have ( Col A, Col C, Col B) the only reason that Col B is at the end from what i think is that it is from a inner join, and some internal order of operations?

    The second issue seems to be some data type things, the flat file is a pipe delimited file with padding to the right(some times) and all the fields stored as nvarchars. I have seen a case where a number was converted to a char with spaces padding to the left. So I have ~ 50 extracts to map, and the padding is unknown, Any suggestions to get apples to apples with the least amount of pain?

    Am i going at this the wrong way? I could try a lookup transformation, but if my hashes aren't equal then I am lost too. I discovered the left padding on the number converted using fuzzy lookup, but that's not an option as I am not in an enterprise server.

    Thanks

    A

Viewing 0 posts

You must be logged in to reply to this topic. Login to reply