SQL Query Needed

  • Hi I am new at this. I am getting stuck with this SQL query. We are paasing from AS 400 database to informtica. And there are like arnd a million records. These records also have some duplicates. Now there is a column by the name RRN which is somewhat like row ID in oracle. The filter condition shd be in such a way that the duplicates should not be loaded and only 1 row from the dupluicates shd be loaded. The one which has the MAX RRN Value . So for eg there are 4 records which are duplicate then only one with the MAX RRN value shd be loaded. As that was the latest inserted . I need an SQL query which shd just pass rows without duplicates and amongs the duplicates only those with MAX RRN . Any help is highly appreciated.

     

  • Sorry if I'm mistaken, but you should be able to identify the rows that create uniqueness when you use informatica. That is what is for.

    select foo, Bar

    from mytable

    join (select min(foo) as firstfoo

          from mytable

          Group by bar) T1 on firstfoo = foo

    (in the Group by statement is where you put all the fields which cause uniqueness.)

  • I am sorry but I got confused with the uniqueness .. I will explain again as to whats happening.. There are 1 million records which are we are importing from the AS 400 database to Informatica via Source Qualifier .. And this is the initialization .so it would run only once. The SQL query would be used in the Source Qualifier which should then import data which is duplicate free. As there would be multiple sets of duplicate rows..The query should just pick 1 row from all duplcate sets and filter out the others. And this row shd be the one with MAX RRN Value among each set.

     

    The fields are RRN which could be called as the primary key , Product ID, Product Name , SSNNO, SSNType, Product description.

     

    So if there are 4 rows which are duplicates with RRN values 100, 200, 300, 666 then it should only pick the row with RRN value 666 and filter out the remaining. I appreciate ur prompt reply.. I hope I have made the question simpler for you guys .. reply awaited

  • I assume RRn values only will be differed and remaining columns have same values in the duplicate rows. In that case write Sql statement like this

    select max(RRN) as RRN, ProductID, ProductName, SSNNO, SSNType, productdescription from <table_name>

    group by  ProductID, ProductName, SSNNO, SSNType, productdescription

  • THANKS A LOT GUYS .. IT SEEMS TO RUN .. WILL LET U GUYS KNOW IF THERES A PROBLEM..THANKS AGAIN

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply