How to load detail records for a select batch of header records

  • I have a large Sybase database that has 150M+ claim (header) records and 190M claim line (detail) records. We want to bring over the last 3 years of data into a new data warehouse. The initial data pull will be by date of service and then incrementally by last activity date.

    The problem is that the detail table only links back to the header table with an ID field. There is no date on the detail table that I can use to pull the detail records. The source data is not structured well enough to join the header and detail records together and pull the data from both tables into one source recordset.

    I can pull all the claim IDs from the header table for the date range but I don't know how to set up SSIS to pull all the records from the detail source table for those Claim IDs. Can anyone give some ideas on how to accomplish this?

    Peggy C.

  • peggy.carstens (2/23/2016)


    I have a large Sybase database that has 150M+ claim (header) records and 190M claim line (detail) records. We want to bring over the last 3 years of data into a new data warehouse. The initial data pull will be by date of service and then incrementally by last activity date.

    The problem is that the detail table only links back to the header table with an ID field. There is no date on the detail table that I can use to pull the detail records. The source data is not structured well enough to join the header and detail records together and pull the data from both tables into one source recordset.

    I can pull all the claim IDs from the header table for the date range but I don't know how to set up SSIS to pull all the records from the detail source table for those Claim IDs. Can anyone give some ideas on how to accomplish this?

    Peggy C.

    First of all, forget about SSIS for the moment and decide whether you can do it in SQL.

    If it were SQL Server, you could put all of the selected Ids into a temp table (with a clustered index on Id for performance).

    Then

    select d.col1, d.col2, ...

    from detail d join temp t on d.Id = t.Id

    I can't see how you could get any better performance than that.


  • That was my first inclination but it takes a long time because the Sybase database is not structured very well and I have no control over it.

    It takes an average of 4 minutes to transfer 20,000 records. Buffer size changed to 20K hoping to speed it up a little but that didn't help as much as I wanted. With approximately 850,000 records in one month that will take about 2.8 hours.

    Seems like a long time... but maybe that's what I have to put up with because of the source data.

  • peggy.carstens (2/23/2016)


    I can pull all the claim IDs from the header table for the date range but I don't know how to set up SSIS to pull all the records from the detail source table for those Claim IDs.

    I don't see what's the problem.

    Do you know how to write SELECT statement? Do you know how to use INNER JOIN in that SELECT statement?


    Alex Suprun

  • I can write the SQL to pull both the header and detail records (duh!). It's the time it takes to pull the data that's the issue. Is there a faster way to get the data than the join in the source since the source is really slow when doing a join? I have no idea why, but it's probably a combination of Sybase, the schema, indexes and number of records in the two tables.

  • peggy.carstens (2/24/2016)


    I can write the SQL to pull both the header and detail records (duh!). It's the time it takes to pull the data that's the issue. Is there a faster way to get the data than the join in the source since the source is really slow when doing a join? I have no idea why, but it's probably a combination of Sybase, the schema, indexes and number of records in the two tables.

    Unless it's actually faster to bring all the detail rows over unfiltered and then filter them in SQL Server, I can't think of a better way.


  • Thank you.

Viewing 7 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply