Need to check file name before it get process

  • Hi,

    I need help in checking the file name before i load it to staging tables.

    I have all the file stored in C:\Database\SourceFiles\ABCSourceFiles

    and file name like ABC_YYYYMMDDHHMMSS

    Example

    SourceFile1_20140803073209.txt

    SourceFile2_20130904071210.txt

    SourceFile3_20120703041530.txt

    For i took foreach loop container and point to this location and only selected .txt files and file name put it into @Filename variable.

    I took data flow task on control flow and take flat file connection i am using this variable in flat file connection order to pull the data from that local location.

    So my question is how i will do the file validation for file name.

    We need to reject the file if the file name format is not valid

    Like SourceFile1_2014073209.txt --Reject this file because --it is not YYYYMMDDHHMMSS

    and also we need to reject this file if the file name like SourceFile1_uuu2019hhj --- Reject this file because it is not Filename_YYYYMMDDHHMMSS.

    I am not sure how i will do it before i process this file.

    I appreciate for your help in advanced.

    Thanks,

    Unnati

  • Hi Unnati, see below.

    1) Have a SQL Function to validate the file name. Make sure the sql function returns either 0 or 1 upon failure or success.

    2) Pass the file name you are getting through the loop to the sql function using a Execute sql task.

    3) If the return value is 0, do nothing but if its 1 add whatever task you want to do with that file. I would assume a Data flow task.

    4) Using precedence constraints, check the variable to pass the values to the next tasks.

    another method, not a great one though.

    Have all file names into a sql table and preprocess them with the same function and keep only ones you want to use. Rest you can delete from the table and use the file names that are needed only directly.

    Good Luck 🙂 .. Visit www.sqlsaga.com for more t-sql code snippets and BI related how to articles.

  • Hi,

    Thank you for quick reply but i do not understand the SQL function for this?

    Can you please tell me about that function?

    I appreciate if you can give me steps for that.

    Thanks,

    Unnati

  • Unnati, see this.. You will be needing to use a Select statement like this and capture the result.

    DECLARE @String VARCHAR(MAX) = 'SourceFile1_20140803073209.txt'

    DECLARE @String1 VARCHAR(MAX) ='SourceFile1_2014073209.txt'

    SELECT @String AS [String] , CASE WHEN LEN(PARSENAME(REPLACE(@String, '_', '.'),2)) = 14 THEN 1 ELSE 0 END AS [Return_Value]

    , @String1 AS [String1] , CASE WHEN LEN(PARSENAME(REPLACE(@String1, '_', '.'),2)) = 14 THEN 1 ELSE 0 END AS [Return Value]

    Good Luck 🙂 .. Visit www.sqlsaga.com for more t-sql code snippets and BI related how to articles.

  • Hi, Thank you very much for the code

    But I need to check that if the Date is valid or not

    Lets say if we pass this SourceFile1_99999999999999.txt then should not return 1.

    We need check the condition YYYY should be between 1950 to Current year month is between 1 to 12 and day between 1 to 31

    My File format like FileName_YYYYMMDDHHMISS.txt

    I appreciate if you give me idea on that.

    Thanks,

    Unnati

  • unnati.patel513 (4/10/2014)


    Hi, Thank you very much for the code

    But I need to check that if the Date is valid or not

    Lets say if we pass this SourceFile1_99999999999999.txt then should not return 1.

    We need check the condition YYYY should be between 1950 to Current year month is between 1 to 12 and day between 1 to 31

    My File format like FileName_YYYYMMDDHHMISS.txt

    I appreciate if you give me idea on that.

    Thanks,

    Unnati

    Yes, It was just a code I gave you to see how you should be writing it. see below.. feel free to add more based on that..

    DECLARE @String VARCHAR(MAX) = 'SourceFile1_20140403073209.txt'

    DECLARE @String1 VARCHAR(MAX) ='SourceFile1_2014073209.txt'

    SELECT @String, CASE WHEN CONVERT(date, LEFT(PARSENAME(REPLACE(@String, '_', '.'), 2), 8), 112) <= GETDATE() THEN 1 ELSE 0 END

    Good Luck 🙂 .. Visit www.sqlsaga.com for more t-sql code snippets and BI related how to articles.

  • a4apple (4/10/2014)


    Unnati, see this.. You will be needing to use a Select statement like this and capture the result.

    DECLARE @String VARCHAR(MAX) = 'SourceFile1_20140803073209.txt'

    DECLARE @String1 VARCHAR(MAX) ='SourceFile1_2014073209.txt'

    SELECT @String AS [String] , CASE WHEN LEN(PARSENAME(REPLACE(@String, '_', '.'),2)) = 14 THEN 1 ELSE 0 END AS [Return_Value]

    , @String1 AS [String1] , CASE WHEN LEN(PARSENAME(REPLACE(@String1, '_', '.'),2)) = 14 THEN 1 ELSE 0 END AS [Return Value]

    Repeatedly hitting the database engine with queries which have nothing to do with data is not a good practice, in my opinion. You are generating unnecessary network traffic and needlessly adding to the load on the DB server.

    Nothing you have done here is difficult to do within a Script Task - keeping all of the processing entirely within SSIS and not hitting the DB server at all.

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • Thank you very much for your help.

    Yes this is good solution.I will put this code in store procedure and run it through Execute sql task.

    Thanks,

    Unnati

  • unnati.patel513 (4/10/2014)


    Hi,

    I need help in checking the file name before i load it to staging tables.

    I have all the file stored in C:\Database\SourceFiles\ABCSourceFiles

    and file name like ABC_YYYYMMDDHHMMSS

    Example

    SourceFile1_20140803073209.txt

    SourceFile2_20130904071210.txt

    SourceFile3_20120703041530.txt

    For i took foreach loop container and point to this location and only selected .txt files and file name put it into @Filename variable.

    I took data flow task on control flow and take flat file connection i am using this variable in flat file connection order to pull the data from that local location.

    So my question is how i will do the file validation for file name.

    We need to reject the file if the file name format is not valid

    Like SourceFile1_2014073209.txt --Reject this file because --it is not YYYYMMDDHHMMSS

    and also we need to reject this file if the file name like SourceFile1_uuu2019hhj --- Reject this file because it is not Filename_YYYYMMDDHHMMSS.

    This is a classic case for using script task in SSIS for the load. Using the Script task, create a list of valid file names that you want to parse using a basic REGEX expression or a format validation. Use a FEL (For each loop) to iterate the variable and load the data...this is a very high level approach

    Raunak J

  • Phil Parkin (4/13/2014)

    Repeatedly hitting the database engine with queries which have nothing to do with data is not a good practice, in my opinion. You are generating unnecessary network traffic and needlessly adding to the load on the DB server.

    Nothing you have done here is difficult to do within a Script Task - keeping all of the processing entirely within SSIS and not hitting the DB server at all.

    Phil,

    What you said is absolutely true. A script task would be a best choice to do this. I provided a way to get started. I only provided the TS a tip to get started and from there make it effective. Thanks for sharing.. 🙂

    Good Luck 🙂 .. Visit www.sqlsaga.com for more t-sql code snippets and BI related how to articles.

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply