Is it possible to restrict the datetime range of data loaded into a staging table?

  • Hi all,

    I have created an SSIS package. In the control flow I have an Execute SQL task that truncates my staging table, followed by a Data Flow task that loads the source data into the staging table. Currently, my Data Flow consists of only an OLE DB Source and an OLE DB Destination. However, I would like to limit the range of data held in my staging table, and I do not know how to achieve that. That is, I want to extract only a specified range of data from the source and load it into the staging table. In my source there are two varchar fields, EVENT_D and EVENT_T. For example, if I run my package at 26/12/12 11:00:00, the data to be extracted and loaded into staging is the data with EVENT_D + EVENT_T between 23/12/12 07:00:00 and 26/12/12 08:00:00,

    and if I run my package at 27/12/12 14:00:00, my loaded data will be between:

    24/12/12 07:00:00 and 27/12/12 08:00:00 (that is, 27 Dec 0800 minus 73 hours, up to 27 Dec 0800),

    and if I run my package at 27/12/12 07:59:00 (which is very rare), my loaded data will be between:

    23/12/12 07:00:00 and 26/12/12 08:00:00

    Is this achievable? I guess one of the problems will be that my two fields are varchar, and together they are in the format DD/MM/YYYY hh:mm:ss.

    Thanks,

    10e5x

  • You can use a Conditional Split transformation in the Data Flow tab. You can write any number of conditions in this transformation, and each condition has its own output, so the rows that satisfy a given condition can be directed to whichever next step you want.

    You can also use functions to change the data types of your columns as needed.

  • Depending on the nature of your OLEDB source (is it an RDBMS?), the most efficient way would be to CAST the varchar date/time columns to a single column with a datetime datatype and use a select query with an appropriate WHERE clause to provide the data (and not just the whole table).

    An even better way would be to add a proper datetime column to the source data, but I'm assuming that's not allowed? If there's a lot of data in this table and it's growing, any method is going to gradually grind to a halt, unless you are somehow able to get a useful index on it (eg, in SQL Server, on a computed datetime column added to the base table).
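
    As a rough illustration of the computed-column idea, here is a minimal T-SQL sketch. It assumes a SQL Server base table called dbo.SourceEvents (a hypothetical name), that EVENT_D holds DD/MM/YYYY and EVENT_T holds hh:mm:ss, and that altering the table is permitted. None of that is confirmed in this thread.

    ```sql
    -- Hypothetical table name; EVENT_D / EVENT_T are the varchar columns from the question.
    -- Style 103 tells CONVERT to read the string as dd/mm/yyyy (with an optional time part).
    ALTER TABLE dbo.SourceEvents
        ADD EventDateTime AS CONVERT(datetime, EVENT_D + ' ' + EVENT_T, 103) PERSISTED;

    -- An index on the computed column lets a range filter seek instead of scanning every row.
    CREATE INDEX IX_SourceEvents_EventDateTime
        ON dbo.SourceEvents (EventDateTime);
    ```

    With something like this in place, the source query could filter directly on EventDateTime rather than converting the varchar columns for every row.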

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • Hi Phil,

    Once again, thanks for replying and helping. Yes, my source is an RDBMS. Your suggestions are too complicated for me, so I am trying a simpler way: maybe first add two new derived columns from EVENT_D and EVENT_T as datetime, and then use a Conditional Split. By the way, are you able to help me with the expression?

  • Is the source a SQL Server database?

  • It's an Oracle view.

  • OK, then you need help from an Oracle developer to design your SELECT statement for the OLEDB source.

    Select col1, col2
    from table
    where [convert varchar date and time to datetime] between [startdate] and [enddate]

    The problem with trying to do this all in SSIS is that you will always have to process all of the rows in the source table. If the source table is growing, as I mentioned before, your process will get slower and slower.

  • You are right, there will definitely be overhead. I will try to get it working first before looking at the efficiency issue. Actually, my problem is defining startDate and endDate. Thanks, Phil.

  • 10e5x (12/27/2012)


    You are right, there will definitely be overhead. I will try to get it working first before looking at the efficiency issue. Actually, my problem is defining startDate and endDate. Thanks, Phil.

    OK, I've looked at your original post again. I'm not sure I understand the logic for setting the start and end dates - can you explain it?

  • Phil Parkin (12/27/2012)


    OK, I've looked at your original post again. I'm not sure I understand the logic for setting the start and end dates - can you explain it?

    The start date will always be 73 hours before the end date. The end date will be the most recent 8am that is already in the past. E.g.:

    Datetime when package run: 24/12/12 0900
    Startdate: 21/12/12 0700
    Enddate: 24/12/12 0800

    Datetime when package run: 25/12/12 2300
    Startdate: 22/12/12 0700
    Enddate: 25/12/12 0800

    Datetime when package run: 26/12/12 0759
    Startdate: 22/12/12 0700
    Enddate: 25/12/12 0800

    Datetime when package run: 26/12/12 0800
    Startdate: 22/12/12 0700
    Enddate: 25/12/12 0800

    Datetime when package run: 26/12/12 0805
    Startdate: 23/12/12 0700
    Enddate: 26/12/12 0800

    As you can see, I want the data loaded from my source into staging to span 73 hours. I want 73 hours' worth of event data, so EVENT_D + EVENT_T should be between the Startdate and the Enddate.

    Thanks in Advance,

    10e5x

  • You may write the expression something like this in the conditional split

    (DT_DBTIMESTAMP)(event_d + " " + event_t) >= DATEADD("HH",-73,GETDATE()) && (DT_DBTIMESTAMP)(event_d + " " + event_t) <= DATEADD("HH",8,(DT_DBDATE)(GETDATE()))

  • Here's my pseudo-code expanded a bit to accommodate the start and end date bits:

    declare @StartDate datetime
           ,@EndDate datetime

    -- End date: the most recent 08:00 at or before the current time
    set @EndDate = dateadd(hour, 8, DATEADD(dd, 0, DATEDIFF(dd, 0, DATEADD(HOUR, - 8, getdate()))))

    -- Start date: 73 hours before the end date
    set @StartDate = dateadd(hour, - 73, @EndDate)

    select col1
          ,col2
    from table
    where [convert varchar date and time to datetime] between @StartDate and @EndDate

    That's how it works in T-SQL. Just need to convert that to Oracle.

    --Edit: fixed typo.
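
    For the Oracle side, a rough translation of the same logic might look like the sketch below. It is only a sketch under assumptions: source_view, col1 and col2 are placeholder names, and the format mask assumes EVENT_D holds DD/MM/YYYY and EVENT_T holds a 24-hour hh:mm:ss value. In Oracle, adding a number to a DATE adds that many days, so 8/24 is 8 hours and 73/24 is 73 hours.

    ```sql
    -- Placeholder object names; check the format mask against the real data.
    -- TRUNC(SYSDATE - 8/24) + 8/24 is the most recent 08:00 at or before now
    -- (SYSDATE is the database server's clock, not the SSIS server's).
    SELECT col1,
           col2
    FROM   source_view
    WHERE  TO_DATE(EVENT_D || ' ' || EVENT_T, 'DD/MM/YYYY HH24:MI:SS')
           BETWEEN (TRUNC(SYSDATE - 8/24) + 8/24) - 73/24
               AND (TRUNC(SYSDATE - 8/24) + 8/24)
    ```

    A query of this shape could be used as the SQL command of the OLE DB Source, so that only the 73-hour window leaves the database.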

  • Thanks to both of you. I am currently out of the office and can't test your solutions, but I will test them as soon as I get back. By the way, from what I read of both your solutions, I wonder whether you took care of the scenario where the package is executed before or at 8am, as in these examples:

    Datetime when package run: 26/12/12 0759
    Startdate: 22/12/12 0700
    Enddate: 25/12/12 0800

    Datetime when package run: 26/12/12 0800
    Startdate: 22/12/12 0700
    Enddate: 25/12/12 0800

    Just clarifying. Thanks a lot.

  • I wonder whether you took care of the scenario where the package is executed before or at 8am

    My solution handles that.
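
    One way to check this claim is to run the sample times from earlier in the thread through the same end-date expression, with getdate() swapped for a fixed run time. The query below is a sketch for SQL Server 2008 or later; the r, e and RunAt names are made up for the illustration.

    ```sql
    -- Sample run times taken from the examples earlier in the thread.
    SELECT r.RunAt,
           DATEADD(HOUR, -73, e.EndDate) AS StartDate,
           e.EndDate
    FROM (VALUES (CAST('2012-12-24T09:00:00' AS datetime)),
                 (CAST('2012-12-25T23:00:00' AS datetime)),
                 (CAST('2012-12-26T07:59:00' AS datetime)),
                 (CAST('2012-12-26T08:00:00' AS datetime)),
                 (CAST('2012-12-26T08:05:00' AS datetime))) AS r (RunAt)
    -- Same expression as in the solution above, applied to RunAt instead of getdate().
    CROSS APPLY (SELECT DATEADD(HOUR, 8, DATEADD(dd, 0, DATEDIFF(dd, 0, DATEADD(HOUR, -8, r.RunAt)))) AS EndDate) AS e;
    ```

    Worked through by hand, the 0900, 2300, 0759 and 0805 run times give the start and end dates listed in the examples. A run at exactly 08:00:00 gives an end date of 08:00 on the same day, whereas the example expects 08:00 on the previous day. If that boundary matters, one option is to shift the run time back by slightly more than 8 hours before truncating to the day.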

  • Phil Parkin (12/28/2012)


    My solution handles that.

    Thanks! 😀
