Merge 2 identical table based on timestamp and column values

  • I have two identical tables with same columns. Only values are different. Both tables will get updated regularly with time stamp.

    Requirement is to merge these two tables and insert only the latest updated records into new table based on time stamp.

    Another condition is both tables have 3 columns called ‘BiosID’, ’Serial number’ and ‘Hostname’.

    If BiosID is not null, then based on BiosID and timestamp I should get the latest records.

    If BiosID is null, then it should check for Serial number and should get latest records.

    If both BiosID and Serial number is null, then it should check for Hostname to get the latest records.

    If all 3 columns are null, then it should ignore that row.

    Tables are:

    TEST1:

    MachineID |Timestamp |Account |BiosID |Serial number |Hostname |

    1000 | 10-01-2016 |A | Abcd1111 | null | null |

    1001 | 11-01-2016 |B | null | 7890 | XYZ |

    1002 | 12-02-2016 |C | null | null | PQR |

    1003 | 13-01-2016 |D | null | null | DEF |

    Test2:

    MachineID |Timestamp |Account |BiosID |Serial number |Hostname |

    1000 | 10-02-2016 |W | Abcd1111 |1234 |ABC |

    1001 | 11-02-2016 |X | null | 7890 |null |

    1002 | 12-01-2016 |Y | null | null |null |

    1003 | 13-02-2016 |Z | null | null |DEF |

    Can anyone help me how to do this?

    Thanks in advance.

  • Hi and welcome to the forums.

    Which one is the target table? Test1? Test2?

    Please post table scripts, sample data in a consumable format and the expected results.

    Guidelines here: http://spaghettidba.com/2015/04/24/how-to-post-a-t-sql-question-on-a-public-forum/

    -- Gianluca Sartori

  • Hello,

    Both Test1 and Test2 are source tables and result should be stored in a different destination table.

    I even posted the table structure with values.

    Result tale should be:

    Result:

    MachineID |Timestamp |Account |BiosID |Serial number |Hostname |

    1000 |10-02-2016 |W |Abcd1111 |1234 |ABC |

    1001 |11-02-2016 |X |Null |7890 |null |

    1002 |12-02-2016 |C |null |null |PQR |

    1003 |13-02-2016 |Z |null |null |DEF |

  • I'm assuming a lot of things here that you did not say.

    1) I'm assuming MachineID is the primary key of all your tables

    2) I'm assuming you already have some data in your destination table

    3) I'm assuming you want to update the TimeStamp and Account columns with the latest data

    4) I'm assuming you always want to update existing data instead of appending new MachineIDs

    Since you said

    I even posted the table structure with values.

    I would like you to know that I had to assume data types and primary keys and I had to convert your dates to a format usable by SQL Server. Do you see why I asked for table and data scripts? Did you read the posting guidelines I pointed you to? I had to do this because you refused to do so. Next time, help us help you.

    USE tempdb;

    IF OBJECT_ID('DestinationTable') IS NOT NULL

    DROP TABLE DestinationTable;

    CREATE TABLE DestinationTable (

    MachineID INT PRIMARY KEY

    ,Timestamp DATE

    ,Account VARCHAR(3)

    ,BiosID VARCHAR(10)

    ,Serial_number VARCHAR(6)

    ,Hostname VARCHAR(6)

    );

    -- Assuming you already have some data in the destination table...

    INSERT INTO DestinationTable(MachineID,Timestamp,Account,BiosID,Serial_number,Hostname) VALUES (1000,'20160101','P','Abcd1111',NULL,NULL);

    INSERT INTO DestinationTable(MachineID,Timestamp,Account,BiosID,Serial_number,Hostname) VALUES (1001,'20160101','Q',NULL,'7890','XYZ');

    INSERT INTO DestinationTable(MachineID,Timestamp,Account,BiosID,Serial_number,Hostname) VALUES (1002,'20160101','R',NULL,NULL,'PQR');

    INSERT INTO DestinationTable(MachineID,Timestamp,Account,BiosID,Serial_number,Hostname) VALUES (1003,'20160101','S',NULL,NULL,'DEF');

    DECLARE @test1 TABLE (

    MachineID INT PRIMARY KEY

    ,Timestamp DATE

    ,Account VARCHAR(3)

    ,BiosID VARCHAR(10)

    ,Serial_number VARCHAR(6)

    ,Hostname VARCHAR(6)

    );

    INSERT INTO @test1(MachineID,Timestamp,Account,BiosID,Serial_number,Hostname) VALUES (1000,'20160110','A','Abcd1111',NULL,NULL);

    INSERT INTO @test1(MachineID,Timestamp,Account,BiosID,Serial_number,Hostname) VALUES (1001,'20160111','B',NULL,'7890','XYZ');

    INSERT INTO @test1(MachineID,Timestamp,Account,BiosID,Serial_number,Hostname) VALUES (1002,'20160212','C',NULL,NULL,'PQR');

    INSERT INTO @test1(MachineID,Timestamp,Account,BiosID,Serial_number,Hostname) VALUES (1003,'20160113','D',NULL,NULL,'DEF');

    DECLARE @Test2 TABLE (

    MachineID INT PRIMARY KEY

    ,Timestamp DATE

    ,Account VARCHAR(3)

    ,BiosID VARCHAR(10)

    ,Serial_number VARCHAR(6)

    ,Hostname VARCHAR(6)

    );

    INSERT INTO @Test2(MachineID,Timestamp,Account,BiosID,Serial_number,Hostname) VALUES (1000,'20160210','W','Abcd1111','1234','ABC');

    INSERT INTO @Test2(MachineID,Timestamp,Account,BiosID,Serial_number,Hostname) VALUES (1001,'20160211','X',NULL,'7890',NULL);

    INSERT INTO @Test2(MachineID,Timestamp,Account,BiosID,Serial_number,Hostname) VALUES (1002,'20160112','Y',NULL,NULL,NULL);

    INSERT INTO @Test2(MachineID,Timestamp,Account,BiosID,Serial_number,Hostname) VALUES (1003,'20160213','Z',NULL,NULL,'DEF');

    UPDATE DST

    SET TimeStamp = SRC.TimeStamp,

    Account = SRC.Account

    FROM DestinationTable AS DST

    CROSS APPLY (

    SELECT TOP(1) *

    FROM (

    SELECT *

    FROM @test1

    UNION ALL

    SELECT *

    FROM @Test2

    ) AS data

    WHERE MachineID = DST.MachineID

    AND 1 =

    CASE

    WHEN DST.BiosID = data.BiosID THEN 1

    WHEN DST.BiosID IS NULL AND DST.Serial_number = data.Serial_number THEN 1

    WHEN DST.BiosID IS NULL AND DST.Serial_number IS NULL AND DST.Hostname = data.Hostname THEN 1

    END

    ORDER BY Timestamp DESC

    ) AS SRC

    SELECT *

    FROM DestinationTable

    -- Gianluca Sartori

  • Sorry.

    My source tables are read only which are structured by some one else.

    MachineID is not primary key. There is no primary key in both source tables.

    Destination table does not have any values.

    Based on latest timestamp out of those 2 source table, and based on biosid, hostname, serial number field values I have to fetch the records from them.

  • Still unclear what you're asking.

    How are matching the rows? By machineId + biosid, hostname, serial number or just by biosid, hostname, serial number?

    -- Gianluca Sartori

  • INSERT INTO Table3 (MachineID, Timestamp, Account, BiosID, Serial number, Hostname )

    select MachineID, Timestamp, Account, BiosID, Serial number, Hostname

    FROM Table1

    WHERE NOT EXISTS (select * from Table3 where Table3.UniqueKeyCol1 = Table1.UniqueKeyCol1

    and Table3.UniqueKeyCol2 = Table1.UniqueKeyCol2 ...

    union all

    select MachineID, Timestamp, Account, BiosID, Serial number, Hostname

    from Table2

    WHERE NOT EXISTS (select * from Table3 where Table3.UniqueKeyCol1 = Table2.UniqueKeyCol1

    and Table3.UniqueKeyCol2 = Table2.UniqueKeyCol2 ...

    UniqueKey is a set of columns which uniquely identify records in the tables.

    _____________
    Code for TallyGenerator

  • This was my initial query where i was considering only Bios ID. Based on latest time stamp, i was able to merge both source table.

    WITH Combined As(

    SELECT MachineID,TimeStamp,Account,BiosID,SerialNumber,HostName

    FROM TEST1

    UNION

    SELECT MachineID,TimeStamp,Account,BiosID,SerialNumber,HostName

    FROM TEST2

    )

    , Ordered AS(

    SELECT MachineID,TimeStamp,Account,BiosID,SerialNumber,HostName,

    ROW_NUMBER() OVER(PARTITION BY BiosID ORDER BY TimeStamp DESC) AS rn

    FROM Combined

    )

    INSERT INTO TEST3(MachineID,TimeStamp,Account,BiosID,SerialNumber,HostName)

    SELECT MachineID,TimeStamp,Account,BiosID,SerialNumber,HostName FROM Ordered

    WHERE rn=1

    Problem is BiosID is having null values, so i have to put a condition like, if BiosID is null, then i should check for SerialNumber. If SerialNumber is also null, then i should check for HostName.

    Hope now i can get some solution.

  • Maybe something like this will work. Just splitting up your ORs into separate queries.

    IF OBJECT_ID('tempdb..#Test1') IS NOT NULL

    DROP TABLE #Test1

    IF OBJECT_ID('tempdb..#Test2') IS NOT NULL

    DROP TABLE #Test2

    IF OBJECT_ID('tempdb..#DestinationTbl') IS NOT NULL

    DROP TABLE #DestinationTbl

    IF OBJECT_ID('tempdb..#tempUnionTbl') IS NOT NULL

    DROP TABLE #tempUnionTbl

    /* your base tables */

    CREATE TABLE #Test1 (MachineID INT, [Timestamp] DATETIME, Account VARCHAR(3), BiosID VARCHAR(10), SerialNumber VARCHAR(6), Hostname VARCHAR(6));

    CREATE TABLE #Test2 (MachineID INT, [Timestamp] DATETIME, Account VARCHAR(3), BiosID VARCHAR(10), SerialNumber VARCHAR(6), Hostname VARCHAR(6));

    /* your destination table */

    CREATE TABLE #DestinationTbl (MachineID INT, [Timestamp] DATETIME, Account VARCHAR(3), BiosID VARCHAR(10), SerialNumber VARCHAR(6), Hostname VARCHAR(6));

    /* insert sample test data */

    INSERT INTO #Test1(MachineID, Timestamp, Account, BiosID, SerialNumber, Hostname) VALUES (1000, '20160110', 'A', 'abcd1111', NULL, NULL);

    INSERT INTO #Test1(MachineID, Timestamp, Account, BiosID, SerialNumber, Hostname) VALUES (1001, '20160111', 'B', NULL, '7890', 'XYZ');

    INSERT INTO #Test1(MachineID, Timestamp, Account, BiosID, SerialNumber, Hostname) VALUES (1002, '20160212', 'C', NULL, NULL, 'PQR');

    INSERT INTO #Test1(MachineID, Timestamp, Account, BiosID, SerialNumber, Hostname) VALUES (1003, '20160113', 'D', NULL, NULL, 'DEF');

    INSERT INTO #Test2(MachineID ,Timestamp, Account, BiosID, SerialNumber, Hostname) VALUES (1000, '20160210', 'W', 'abcd1111', '1234', 'ABC');

    INSERT INTO #Test2(MachineID ,Timestamp, Account, BiosID, SerialNumber, Hostname) VALUES (1001, '20160211', 'X', NULL, '7890', NULL);

    INSERT INTO #Test2(MachineID ,Timestamp, Account, BiosID, SerialNumber, Hostname) VALUES (1002, '20160112', 'Y', NULL, NULL, NULL);

    INSERT INTO #Test2(MachineID ,Timestamp, Account, BiosID, SerialNumber, Hostname) VALUES (1003, '20160213', 'Z', NULL, NULL, 'DEF');

    /* ======================================================== */

    /* start of a possible solution */

    /* temp table used to combine (UNION ALL) both base tables */

    CREATE TABLE #tempUnionTbl (MachineID INT, [Timestamp] DATETIME, Account VARCHAR(3), BiosID VARCHAR(10), SerialNumber VARCHAR(6), Hostname VARCHAR(6));

    /* insert data from base tables (Test1 and Test2) into temp table */

    INSERT INTO #tempUnionTbl (MachineID, [Timestamp], Account, BiosID, SerialNumber, Hostname)

    SELECTMachineID, [Timestamp], Account, BiosID, SerialNumber, Hostname

    FROM#test1

    UNION ALL

    SELECTMachineID, [Timestamp], Account, BiosID, SerialNumber, Hostname

    FROM#test2

    /* Insert BiosID data into destination table */

    INSERT INTO #DestinationTbl(MachineID, [Timestamp], Account, BiosID, SerialNumber, Hostname)

    SELECTMachineID

    ,[Timestamp]

    ,Account

    ,BiosID

    ,SerialNumber

    ,Hostname

    FROM(SELECTMachineID

    ,[Timestamp]

    ,Account

    ,BiosID

    ,SerialNumber

    ,Hostname

    ,ROW_NUMBER() OVER(PARTITION BY machineId ORDER BY [timestamp] DESC) AS rowNbr

    FROM#tempUnionTbl

    /* If BiosID is not null, then based on BiosID and timestamp I should get the latest records. */

    WHEREbiosID IS NOT NULL

    ) AS x

    WHERE rowNbr = 1

    /* If BiosID is null, insert Serial number data into destination table */

    INSERT INTO #DestinationTbl(MachineID, [Timestamp], Account, BiosID, SerialNumber, Hostname)

    SELECTMachineID

    ,[Timestamp]

    ,Account

    ,BiosID

    ,SerialNumber

    ,Hostname

    FROM(SELECTMachineID

    ,[Timestamp]

    ,Account

    ,BiosID

    ,SerialNumber

    ,Hostname

    ,ROW_NUMBER() OVER(PARTITION BY machineId ORDER BY [timestamp] DESC) AS rowNbr

    FROM#tempUnionTbl

    /* If BiosID is null, then it should check for Serial number and should get latest records. */

    WHEREbiosID IS NULL

    ANDSerialNumber IS NOT NULL

    ) AS x

    WHERE rowNbr = 1

    /* If both BiosID and Serial number is null, insert Hostname data into destination table */

    INSERT INTO #DestinationTbl(MachineID, [Timestamp], Account, BiosID, SerialNumber, Hostname)

    SELECTMachineID

    ,[Timestamp]

    ,Account

    ,BiosID

    ,SerialNumber

    ,Hostname

    FROM(SELECTMachineID

    ,[Timestamp]

    ,Account

    ,BiosID

    ,SerialNumber

    ,Hostname

    ,ROW_NUMBER() OVER(PARTITION BY machineId ORDER BY [timestamp] DESC) AS rowNbr

    FROM#tempUnionTbl

    /* If both BiosID and Serial number is null, then it should check for Hostname to get the latest records. */

    WHEREbiosID IS NULL

    ANDSerialNumber IS NULL

    ANDhostname IS NOT NULL

    ) AS x

    WHERE rowNbr = 1

    /* contents of temp table */

    SELECT*

    FROM#tempUnionTbl

    /* If all 3 columns are null, then it should ignore that row. */

    /* records inserted into destination table */

    SELECTMachineID

    ,[Timestamp]

    ,Account

    ,BiosID

    ,SerialNumber

    ,Hostname

    FROM#DestinationTbl

  • Helpseeker (2/28/2016)


    if BiosID is null, then i should check for SerialNumber. If SerialNumber is also null, then i should check for HostName.

    Check for what?

    What exactly should be the condition?

    _____________
    Code for TallyGenerator

  • Read your post. I copied those statements from your post word for word. Figured those were your query conditions.

  • Probably just

    DECLARE @test1 TABLE (

    MachineID INT

    ,Timestamp DATE

    ,Account VARCHAR(3)

    ,BiosID VARCHAR(10)

    ,Serial_number VARCHAR(6)

    ,Hostname VARCHAR(6)

    );

    INSERT INTO @test1(MachineID,Timestamp,Account,BiosID,Serial_number,Hostname)

    VALUES (1000,'20160110','A','Abcd1111',NULL,NULL)

    ,(1001,'20160111','B',NULL,'7890','XYZ')

    ,(1002,'20160212','C',NULL,NULL,'PQR')

    ,(1003,'20160113','D',NULL,NULL,'DEF')

    ,(1000,'20160310','UU','Abcd1111','UU',NULL)

    ;

    DECLARE @Test2 TABLE (

    MachineID INT

    ,Timestamp DATE

    ,Account VARCHAR(3)

    ,BiosID VARCHAR(10)

    ,Serial_number VARCHAR(6)

    ,Hostname VARCHAR(6)

    );

    INSERT INTO @Test2(MachineID,Timestamp,Account,BiosID,Serial_number,Hostname)

    VALUES (1000,'20160210','W','Abcd1111','1234','ABC')

    ,(1001,'20160211','X',NULL,'7890',NULL)

    ,(1002,'20160112','Y',NULL,NULL,NULL)

    ,(1003,'20160213','Z',NULL,NULL,'DEF')

    ;

    with input as (

    select * from @test1

    union all

    select * from @Test2 )

    , filter as (

    select

    MachineID

    ,Timestamp

    ,Account

    ,BiosID

    ,Serial_number

    ,Hostname

    , rn = case when BiosID is not null then

    ROW_NUMBER() OVER(PARTITION BY BiosID ORDER BY TimeStamp DESC)

    when Serial_number is not null then

    ROW_NUMBER() OVER(PARTITION BY Serial_number ORDER BY TimeStamp DESC)

    when Hostname is not null then

    ROW_NUMBER() OVER(PARTITION BY Hostname ORDER BY TimeStamp DESC)

    else -1 end

    from input

    )

    select *

    from filter

    where rn = 1;

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply