Design Pattern: Create CSV list from N individual data flow rows

  • Hi guys,

    I am trying to generate a performance optimised package to get a limited set of records from a remote system written in Progress and carry out SCD management using Lookup on the SQL database.

    The challenge is that the source dataset is 350M records over 50M policies and does not have an index that allows me to order the data at source on the fields I require. My delta will normally be about 80K policies but may increase to 500K at peak times. I don't want to read all 350M records to identify changed and new records, but to generate 80K queries to the remote server is slow and kills the performance on that server. I have a proof of concept that allows me to take 5 rows from the delta and create a CSV list.

    Using Konenans Row Number Transformation, give each incoming row a row number.

    Multicast 5 ways

    For each Multicast output

    >>do a conditional split to filter rows based on the mod of the row number

    >>Add another row number transformation ( now have 5 rows with the same RNT2 value)

    >> Sort the data by RNT2

    >>Merge Join MC1 and MC2 on RNT2, allocating the MC2 values to new fields

    >> repeat the Merge join for MJ1 to MC3, MJ2 to MC4, MF3 to MC5

    I now have 5 dft rows combined into one row

    Derived column to collate the 5 values into a CSV string

    Output the CSV string to an package scoped object variable using Recordset destination

    Now use the object collection as the enumerator for a For-Each Loop and pass the CSV into an expression for the remote SQL call and the SQL server Lookup call and sort only a limited amount of data in the dataflow.

    This works, but performance testing indicates that I can get reasonable performance from the remote server with 350 policy numbers in the list. I could extend the proof of concept to do the same thing 350 times (YUK!) or go through a number of iterations. I.e. extent the POC to build a CSV of 7, then take that results set and do the same to append 7x7 then do it again to append 7x7x7 = 343. but both of these methods seem clunky.

    Is there a more flexible solution to let me convert N dft records into a CSV string. This way I can manipulate N from a config file as a parameter.

    TIA

    Obiron

Viewing 0 posts

You must be logged in to reply to this topic. Login to reply