how to archive data based on on archive it self?

  • Hi

    I have an archive tabel in Oracle. We copy this to a ms sql 2008 r2 database for datawarehouse purpose.

    The problem is that it only contains 90 days of records. I now need to archive this archive table. 😀

    There is no unique key in the table. It log our load_id but one id may come several times a day even several days.

    I need to take a copy of the archive as is, and each day make it run thru the new transfer from oracle to figure out what has cahnged and add those lines to my copy.

    How to solve this the best way?

    BR

    Dan

  • What I've done in the past with rows like that is to use all the columns to compose a "key" (although I really don't use sql features like making it a primary key, I just treat all columns as the criteria for selecting and reporting that particular row).

    Now in the cases that I can have duplicate rows (the same value for all columns), I add a "count" column, and only store one copy of the row with an integer representing the "count" of the rows containing the exact same values and then I again have a table with distinct values. So if in a previous copy of an archive I have a row that has occurred twice for a given set of values (count column has value 2), and the new copy of the archive this row occurs 3 times, then I still know what has changed.

    Every copy of the archive I save gets either a unique id for all rows that belong to it, or a unique timestamp (often both values although that could be considered redundant).

  • Hi

    Sorry for late reply. 🙂 Days just fly by.

    I have created a key for each record by simply concatenating each field.

    I have then made an archive table and now I need to add to it the new posts found in the daily update.

    Query below tries to fetch the history_key from the load filed. The criteria is that it does not exist in the archive_load

    select T1.[History_Key],......

    FROM

    LOAD_SCHED as T1

    WHERE NOT EXISTS (SELECT distinct T3.[Archive_Key] FROM Archive_LOAD_SCHED AS T3)

    I know there are posts in T1 that do not exist in the T3, but the result of my query is allways 0.

    If I run the T1 it self and grabs one of the newest keys and tries to ask for it in the T3 query it will not return anything.

    So that spesific T1 key and field should then be displayed.

    As usual, when I think it will be easy, it turns out to be a pain in the ....:w00t:

    BR

    Dan

  • You need to match the keys in both tables!

    select T1.[History_Key]

    FROM

    LOAD_SCHED as T1

    WHERE NOT EXISTS (SELECT distinct T3.[Archive_Key] FROM Archive_LOAD_SCHED AS T3 where T1.history_key = T3.archive_key)

  • Hi.

    I missed those two fields in my code yes

    But even when I add them it does not appear to give the result I need.

    select-- count([LOAD_ID])

    [LOAD_ID]

    ,[ROUTE]

    ,[STATUS]

    ,[ORDER_DEADLINE]

    ,[SHIP_DT]

    ,[STAGE_DT]

    ,[EST_STAGE_DT]

    ,[STG_LOC]

    ,[RC_STARTED]

    ,[CREATE_DT]

    ,[WHO_CREATE]

    ,[UPDATE_DT]

    ,[WHO_UPDATE]

    ,[COMPANY]

    ,[DELETE_DT]

    ,[WHO_DELETE]

    ,[History_Key]

    FROM

    [LOAD_SCHED] as T1-- 9404 records, only 200 of them do not exist in the archive

    WHERE NOT EXISTS (SELECT T2.[Arkiv_Key] FROM [Arch_LOAD_SCHED] AS T2 where T2.[Arkiv_Key] = T1.[History_Key])

    BR

    Dan

  • The query looked ok to me (well except for the 'count' clause, not sure how thats working in isolation without an accompanying 'group by'). You might want to double check the keys you built (any columns with null data will not play nicely with concatenated columns), run on some smaller subsets of your data, etc...

  • The COUNT function above is excluded because it's commented out with a --.

    Let's first strip the query down a bit. What does this return?

    SELECT COUNT(*)

    FROM load_sched t1

    WHERE NOT EXISTS (SELECT 1

    FROM arch_load_sched t2

    WHERE t2.arkiv_key = t1.history_key);

    Next, what are the data types of load_sched.history_key and arch_load_sched.arkiv_key? I'm asking because they should be the same if you're going to try to join them.

    Lastly, do you have duplicates or NULLs in the key field you're matching on in either table?

  • Hi

    As it turns out two of the datefields returns NULL values. They are not supposed to do so, but they do.

    So I have added an isnull function and now it seems to be ok.

    Thanks for your time and input.

    Br

    Dan

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply