Getting imported file create date and storing it with the data

  • Hi all,

    I am new to SSIS and have very very limited scripting knowledge so please be nice and simple with any help you can and I know will provide. My skills all reside purely in the database.

    I am using SSIS 2008R2.

    What I have is an SSIS package that loops through a directory importing the flat files into a staging area prior to processing into the data mart. Across the flat files that I am importing it is possible (more likely probable) that a record that I am trying to process exists in more than one file. The issue is that the data for the record may differ across the flat files as the system they are collected from is reading directly from production, so the record can be updated between generation of the first and last files that are sent to me to process. Therefore the record that I would want to promote from staging to data mart would be the version with the most recent date. The only place I can get the date is the file creation date. So I am trying to add the file creation date into the import process.

    The problem I have is my lack of knowledge on scripting.

    My thinking is that in the staging environment I add an additional column, SoruceFileCreateDate. I then import the data into the staging environment then run an "Execute SQL" task to update all records in the staging table that do not have the SourceFileCreateDate populated to bring in the create date of the file being imported. But how do I get hold of the date? I have used variables in an execute SQL task but I just don't understand scripting well enough to get the file create date into a variable to used in the execute SQL task.

    I am hoping that in the solution I would have a Foreach Loop that has a data flow task to import the flat file into the staging table, then a script task to get the file creation date, an Execute SQL task to set the SourceFileCreateDate in the staging DB and then a File System task to move the file to a processed directory.

    Can someone please help me with the script to get the create date?

    Mark

    /***** Edit ****/

    I am using a variable (of course) within the foreach loop container to dynamically pass the file name and ideally the script would use this Variable as an input to then get the correct file create date.

  • The code is easier than you might think:

    File.GetCreationTime Method

    I would store the result in an SSIS variable and populate a column in the data flow using a derived column and this variable.

    That way you won't have to execute an additional Execute SQL Task.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Koen Verbeeck (2/11/2014)


    The code is easier than you might think:

    File.GetCreationTime Method

    I would store the result in an SSIS variable and populate a column in the data flow using a derived column and this variable.

    That way you won't have to execute an additional Execute SQL Task.

    I'm glad you posted this - I didn't have time. That's exactly how I would do it too.

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • Woops, clicked the bottom one first hence two thanks ticks. That's just so easy I can't believe it. I knew that SSIS was built to help dummies like me but that's just too easy.

  • Koen and Phil (and anybody else who is able and willing to help),

    I am trying and it is really showing my lack of knowledge, I have a book on order but it won't arrive for another week or two and this is the missing piece in my data import puzzle.

    I'm only hours away from being able to commence testing and it is the importing of this file and using the file date variable that is stopping me from finishing.

    I'll let you know what I have tried so far and if someone can point me to a resource that will highlight why I am wrong and what I really should be doing ...

    So I created a variable called FileDate with a package scope and data type date time. I then went to the variable properties, set the "EvaluateAsExpression" property to "true", went into the expression editor, and place the following as the expression to use

    Function GetCreationTime ( @[User::FileName] As String ) As DateTime end function

    This failed as "Attempt to parse function failed". Ok so I know I'm getting something wrong.

    I then thought I'd had an epiphany and that what I needed was a script task that got called wtihin the derived column task.

    So within the data flow I thought I should add a script task to populate the variable, then used the derived column task to use the variable and pass it to the destination. But there was no script task, just the script component. I look at this and did not think it was what I was after so I halted there and still have no understanding of the script component.:ermm:

    OK maybe the script task should be before the data flow within the foreach loop container. I decided that the script task needed to be before the data flow so that the variable value would be set and available to the data flow.

    So within the Foreach Loop I added a script task as the first task in the flow, followed by the data flow and finaly the move to processed files system task. Now it is time to configure the script task.

    So in the script task editor I selected VB2008 as the scripting language, set the read only variables to be the filename variable used within the foreach loop container and the ReadWrite to be the FileDate variable. I then clicked "Edit Script" to copy and paste the code linked by Koen. (Remember I am use to only using SSIS to pump data into a staging environment and then using stored procs to manipulate the date). So the code in the script task window now look like

    "Public Shared Function GetCreationTime(ByVal path As String) As DateTime

    End Function

    "

    I then went to the variable window, clicked on the FileDate variable and started setting its properties. I set the EvaluateAsExpression to "True" then clicked on the elipse against expression. I then entered GetCreationTime( @[User::FileName]) in the Expression window, clicked OK and got errored, "The function "GetCreationTime" was not recognized. Either the function name is incorrect or does not exist."

    The scope for the FileName and FileDate variables are both set to "Package".

    Thanks

    Mark

  • dogramone (2/11/2014)


    Koen and Phil (and anybody else who is able and willing to help),

    I am trying and it is really showing my lack of knowledge, I have a book on order but it won't arrive for another week or two and this is the missing piece in my data import puzzle.

    It's all good. The explanation of what you've tried and explored below definately shows this.

    So I created a variable called FileDate with a package scope and data type date time.

    So far, so good.

    I then went to the variable properties, set the "EvaluateAsExpression" property to "true", went into the expression editor....

    Definately not this. As you concluded below, you want to do this via a script task.

    I then thought I'd had an epiphany and that what I needed was a script task that got called wtihin the derived column task.

    You're overcomplicating it this way.

    OK maybe the script task should be before the data flow within the foreach loop container.

    I decided that the script task needed to be before the data flow so that the variable value would be set and available to the data flow.

    So within the Foreach Loop I added a script task as the first task in the flow, followed by the data flow and finaly the move to processed files system task. Now it is time to configure the script task.

    On target.

    So in the script task editor I selected VB2008 as the scripting language, set the read only variables to be the filename variable used within the foreach loop container and the ReadWrite to be the FileDate variable.

    What setting did you use in the for each task here for the Retrieve File Name? What you'll want is Fully qualified to get the full path. The variable accessibility is correct, I just want to make sure you've got the full thing.

    I then clicked "Edit Script" to copy and paste the code linked by Koen.

    That's not what you wanted to do. If you look further down the page, you'll see examples of usage, like so:

    Imports System.IO

    Module Module1

    Sub Main()

    Dim fileCreatedDate As DateTime = File.GetCreationTime("C:\Example\MyTest.txt")

    Console.WriteLine("file created: " + fileCreatedDate)

    End Sub

    End Module

    So, here's basically the code you're looking for:

    Imports System

    Imports System.Data

    Imports System.Math

    Imports Microsoft.SqlServer.Dts.Runtime

    Imports System.IO

    <System.AddIn.AddIn("ScriptMain", Version:="1.0", Publisher:="", Description:="")> _

    <System.CLSCompliantAttribute(False)> _

    Partial Public Class ScriptMain

    Inherits Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase

    Enum ScriptResults

    Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success

    Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure

    End Enum

    Public Sub Main()

    Dim filepath As String

    filepath = Dts.Variables("ForEachLoopVar_FilePathAndName").Value.ToString

    Dts.Variables("FileCreationDate").Value = File.GetCreationTime(filepath)

    Dts.TaskResult = ScriptResults.Success

    End Sub

    End Class

    I copied everything so you could see it, but you really only want to change it to include the Imports System.IO at the top and then the Sub MAIN() function.

    Now, you use a derived column and add the variable in the data flow there.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • Thanks Craig,

    I am now getting data typeing errors and it is the one thing that really frustrates me with integration into databases. I love the data types in DB's and know them pretty well, but I do struggle with Excel and VBA.

    The filepath variable is the full quailified path,

    So this is what I have now using your piece of code, and I think the modifications are correct.

    ' Microsoft SQL Server Integration Services Script Task

    ' Write scripts using Microsoft Visual Basic 2008.

    ' The ScriptMain is the entry point class of the script.

    Imports System

    Imports System.Data

    Imports System.Math

    Imports Microsoft.SqlServer.Dts.Runtime

    Imports System.IO

    <System.AddIn.AddIn("ScriptMain", Version:="1.0", Publisher:="", Description:="")> _

    <System.CLSCompliantAttribute(False)> _

    Partial Public Class ScriptMain

    Inherits Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase

    Enum ScriptResults

    Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success

    Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure

    End Enum

    Public Sub Main()

    Dim filepath As String

    filepath = Dts.Variables("User::FileName").Value.ToString 'Modified here to use my variable name

    Dts.Variables("User::FileDate").Value = File.GetCreationTime(filepath) 'Modified here to use my variable name

    Dts.TaskResult = ScriptResults.Success

    End Sub

    End Class

    I get the error message as below

    Error: 0xC001F009 at Package: The type of the value being assigned to variable "User::FileDate" differs from the current variable type. Variables may not change type during execution. Variable types are strict, except for variables of type Object.

    Error: 0xC001C012 at Process Case Files: ForEach Variable Mapping number 2 to variable "User::FileDate" cannot be applied.

    The variable FileName has a data type of string and the variable FileDate has a data type of DateTime. I beleive I am missing a level of casting somewher, but where>

    Sorry to be such a newbie.:unsure:

    Mark

  • If any one is reading this I think I now realise where I've got it wrong, putting the quotes around the variable names. Unfortunately its now 10 pm down under so I'll have to wait until tomorrow morning to check and update.

    Thanks Craig, Koen and Phil for the help and advice. I's why I've been a member here for so long and kept across things even whilst spending 6 years as a project manager. I love data so much and and really enjoy the banter and friendliness for the people on the site. Hell I even bought the first book when it came out and still treasure it.

  • So I finally got there with some pain around data typing. I know my solution is not the best but it is working for me and I can acheive the business results using this mehtod

    I now have my SSIS package FileDate variable with a string data type. I declared a variable within the VB code of type date, set this variable to the GetCreationTime results, then place this variable into the DTS package FileDate variable.

    Here is the updated piece of VB

    Public Sub Main()

    '

    Dim filepath As String

    Dim filedatedatetime As Date

    filepath = Dts.Variables("FilePAth").Value.ToString

    Debug.Print(filepath)

    filedatedatetime = File.GetCreationTime(filepath)

    Dts.Variables("FileDate").Value = filedatedatetime.ToString

    Debug.Print(Dts.Variables("FileDate").Value)

    Dts.TaskResult = ScriptResults.Success

    I can now get a value written to the SQL table, whereas without this method the DTS Package FileDate variable was being set to the enpty string and populating the DB with the default date.

    Luckily I am not considered an SSIS expert but now I can get the data into the DB correctly I am back into my comfort zone.

  • Glad you got it working. Yes, datatypes can be frustrating sometimes in SSIS but honestly, you'll get used to them (and the errors, due to seeing them a few hundred times until you are) and it won't be as big a deal anymore.

    Good luck!


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply