Merging data from 3 sources

  • I have 3 sets of the same mostly complete telecom data that I am trying to merge into one compete set. I have the DTS set up to import each into it’s own table and a sp that will fill the main table from each. One file is priority (extra data) and each of the other two both contain some exclusive data. I will need to recreate the complete table monthly. My first attempt was to load all into one table and then use a derived table query to remove the duplicate data. This proved to be too system intensive by way of 1.5 mil records. My next though was to load the priority first and then query to add missing data, but again I realize that this would also be a heavy chore. Any ideas for the best method to complete this task would be appreciated.


    MISfIT

  • Well, without knowing too much other than what you've described:

    I would try to avoid loading (potential) duplicate records in the final table, and then having to perform processing to remove the dups. It would be better to insert only the data you need into the final table!  

    First, if you're working with large tables (sounds like you are), make sure your indices are set up carefully on each of the 3 intermediate tables for the queries that you'll need to run. 

    Then you should be able to do a big: 

    INSERT INTO FinalTable()

        SELECT ...

    pulling data from the 3 intermediate tables, using a combination of INNER and/or OUTER JOINS as appropriate.  You can use the "EXISTS()" or "NOT EXISTS()" logic in your WHERE-clause to check if specific records exist/don't exist in any one of the 3 tables.   This to help ensure you get only the records you want.

    Good luck.

    - john 

  • Thanks for the reply John. After rethinking this again, I realize that the best approach might be in the DTS by way of loading the primary table first, then loading the other two tables based on info not in the primary load. This should allow me to simple merge the three tables into a complete data set.


    MISfIT

  • Hi,

    I am trying to implement the Telecom data analytics and strougling for the datamodel of Telecom.

    If possible can you please help me on this.

    Thanks in advance.

    VaraM

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply