Rename file - prefix with file date

  • I receive three versions of a file each day, which I then import. But the file names are such that my ForEach loop processes the files out of sequence. I receive the files like this-

    PRICES_AP_20160623.TXT

    PRICES_EU_20160623.TXT

    PRICES_AM_20160623.TXT

    The files are created in the above order (AP, then EU, then AM), and I want to process them in that order. But because the ForEach loop processes the files alphabetically, my package is processing the files in the wrong order (AM, AP, EU). This is especially a problem if several days worth of files have stacked up. Then the ForEach processes them like this-

    PRICES_AM_20160622.TXT

    PRICES_AM_20160623.TXT

    PRICES_AP_20160622.TXT

    PRICES_AP_20160623.TXT

    PRICES_EU_20160622.TXT

    PRICES_EU_20160623.TXT

    I want to prefix the create date and time (not the current date/time) to the file name of each file, so that they're processed in their creation order.

  • jeffe_verde (6/25/2016)


    I receive three versions of a file each day, which I then import. But the file names are such that my ForEach loop processes the files out of sequence. I receive the files like this-

    PRICES_AP_20160623.TXT

    PRICES_EU_20160623.TXT

    PRICES_AM_20160623.TXT

    The files are created in the above order (AP, then EU, then AM), and I want to process them in that order. But because the ForEach loop processes the files alphabetically, my package is processing the files in the wrong order (AM, AP, EU). This is especially a problem if several days worth of files have stacked up. Then the ForEach processes them like this-

    PRICES_AM_20160622.TXT

    PRICES_AM_20160623.TXT

    PRICES_AP_20160622.TXT

    PRICES_AP_20160623.TXT

    PRICES_EU_20160622.TXT

    PRICES_EU_20160623.TXT

    I want to prefix the create date and time (not the current date/time) to the file name of each file, so that they're processed in their creation order.

    How does your suggested solution resolve this requirement:

    The files are created in the above order (AP, then EU, then AM), and I want to process them in that order

    ?

    Would they not still be processed in AM, AP, EU order?

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • I don't know much about SSIS but it seems to me that the only way to process things in the correct order would be to "take control" of the situation. With that thought in mind, here's the basic concept for how to create a "Control" table. I can't explain how to use it in SSIS but it seems like it should be an easy thing to "step through" because it would be easy to do in a T-SQL only solution.

    DROP TABLE #Control

    GO

    --===== Simulate a process that captures all of the

    -- file names in a table with an extra column

    -- for processing order.

    SELECT d.PricesFileName

    ,ProcessOrder = CAST(0 AS INT)

    INTO #Control

    FROM

    (

    SELECT 'PRICES_AM_20160622.TXT' UNION ALL

    SELECT 'PRICES_AM_20160623.TXT' UNION ALL

    SELECT 'PRICES_AP_20160622.TXT' UNION ALL

    SELECT 'PRICES_AP_20160623.TXT' UNION ALL

    SELECT 'PRICES_EU_20160622.TXT' UNION ALL

    SELECT 'PRICES_EU_20160623.TXT'

    ) d (PricesFileName)

    ;

    --===== Parse the file names to establish the process order and update the control table

    -- with the correct order.

    WITH cteCreateProcessOrder AS

    (

    SELECT ProcessOrder

    ,RN = ROW_NUMBER() OVER (ORDER BY SUBSTRING(PricesFileName,11,8)

    ,CASE

    WHEN SUBSTRING(PricesFileName,8,2) = 'AP' THEN 1

    WHEN SUBSTRING(PricesFileName,8,2) = 'EU' THEN 2

    WHEN SUBSTRING(PricesFileName,8,2) = 'AM' THEN 3

    ELSE 'UNKNOWN TYPE OF FILE DETECTED'

    END)

    FROM #Control

    )

    UPDATE cteCreateProcessOrder

    SET ProcessOrder = RN

    ;

    --==== Let's see what we have in the #Control table.

    SELECT * FROM #Control ORDER BY ProcessOrder

    ;

    The content of the control table, sorted by the ProcessOrder column looks like this...

    PricesFileName ProcessOrder

    ---------------------- ------------

    PRICES_AP_20160622.TXT 1

    PRICES_EU_20160622.TXT 2

    PRICES_AM_20160622.TXT 3

    PRICES_AP_20160623.TXT 4

    PRICES_EU_20160623.TXT 5

    PRICES_AM_20160623.TXT 6

    (6 row(s) affected)

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • The files have data for the Asia-Pacific, EU, and the Americas financial markets (thus the AP, EU, and AM in the file names), and each file is created after close of business for that region. So the create TIME for the EU file is about 8 hours after the AP file, and the AM file is about 8 hours after the EU file.

    Right now, I check the FTP throughout the day, to catch each file as it's created.

  • jeffe_verde (6/26/2016)


    The files have data for the Asia-Pacific, EU, and the Americas financial markets (thus the AP, EU, and AM in the file names), and each file is created after close of business for that region. So the create TIME for the EU file is about 8 hours after the AP file, and the AM file is about 8 hours after the EU file.

    Right now, I check the FTP throughout the day, to catch each file as it's created.

    It seems, then, that you need something that will list the CREATED DATE/TIME for each of the files. Are you allowed to use xp_CmShell or other "thing" that will allow the interrogation of that information at the OS level? I could be wrong but I don't believe that SSIS itself has anything that would return the date and time of creation of files. Being able to do such a thing would allow total automation.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (6/27/2016)


    jeffe_verde (6/26/2016)


    The files have data for the Asia-Pacific, EU, and the Americas financial markets (thus the AP, EU, and AM in the file names), and each file is created after close of business for that region. So the create TIME for the EU file is about 8 hours after the AP file, and the AM file is about 8 hours after the EU file.

    Right now, I check the FTP throughout the day, to catch each file as it's created.

    It seems, then, that you need something that will list the CREATED DATE/TIME for each of the files. Are you allowed to use xp_CmShell or other "thing" that will allow the interrogation of that information at the OS level? I could be wrong but I don't believe that SSIS itself has anything that would return the date and time of creation of files. Being able to do such a thing would allow total automation.

    On this occasion, you are wrong, Jeff 🙂

    I have not had the time to create sample code, but this can be achieved using a Script Task containing just a few lines of code.

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • Thank you both for the excellent suggestions. I ended up configuring a ForEach Loop container to enumerate by file date, as described in this article.

    For those that have the freedom to install 3rd-party custom controls, the author of the above article has published a packaged ForEach Sorted Loop solution that looks pretty slick.

  • Phil Parkin (6/28/2016)


    Jeff Moden (6/27/2016)


    jeffe_verde (6/26/2016)


    The files have data for the Asia-Pacific, EU, and the Americas financial markets (thus the AP, EU, and AM in the file names), and each file is created after close of business for that region. So the create TIME for the EU file is about 8 hours after the AP file, and the AM file is about 8 hours after the EU file.

    Right now, I check the FTP throughout the day, to catch each file as it's created.

    It seems, then, that you need something that will list the CREATED DATE/TIME for each of the files. Are you allowed to use xp_CmShell or other "thing" that will allow the interrogation of that information at the OS level? I could be wrong but I don't believe that SSIS itself has anything that would return the date and time of creation of files. Being able to do such a thing would allow total automation.

    On this occasion, you are wrong, Jeff 🙂

    I have not had the time to create sample code, but this can be achieved using a Script Task containing just a few lines of code.

    Looking forward to see that, Phil. 😉 I'm all in favor of learning something different.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • jeffe_verde (7/4/2016)


    Thank you both for the excellent suggestions. I ended up configuring a ForEach Loop container to enumerate by file date, as described in this article.

    For those that have the freedom to install 3rd-party custom controls, the author of the above article has published a packaged ForEach Sorted Loop solution that looks pretty slick.

    To be honest, I try to avoid such things like the plague. Although they are certainly useful, they're frequently not supported through various revisions and are frequently not allowed by various companies that I work for. If something goes wrong with them, who fixes them? Probably not the people that use them. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (7/4/2016)


    Phil Parkin (6/28/2016)


    Jeff Moden (6/27/2016)


    jeffe_verde (6/26/2016)


    The files have data for the Asia-Pacific, EU, and the Americas financial markets (thus the AP, EU, and AM in the file names), and each file is created after close of business for that region. So the create TIME for the EU file is about 8 hours after the AP file, and the AM file is about 8 hours after the EU file.

    Right now, I check the FTP throughout the day, to catch each file as it's created.

    It seems, then, that you need something that will list the CREATED DATE/TIME for each of the files. Are you allowed to use xp_CmShell or other "thing" that will allow the interrogation of that information at the OS level? I could be wrong but I don't believe that SSIS itself has anything that would return the date and time of creation of files. Being able to do such a thing would allow total automation.

    On this occasion, you are wrong, Jeff 🙂

    I have not had the time to create sample code, but this can be achieved using a Script Task containing just a few lines of code.

    Looking forward to see that, Phil. 😉 I'm all in favor of learning something different.

    As you asked so nicely.

    As you may know, Script Tasks execute C#, so if you know the language, they can be a powerful tool.

    I created a folder on my PC called c:\SSISFileTest and put some files in there. Several .txt files and one or two other file types.

    If we assume that we wish to rename all .txt files in this folder by prefixing the file names with datetime of creation (in format YYYYMMDD MMss), here is the code to do it.

    In order for it to work, you'll probably need to add using System.IO; to your 'Using' block.

    string fld = @"c:\SSISFileTest";

    string fileType = "*.txt";

    FileInfo fi;

    string newName;

    foreach (string f in Directory.GetFiles(fld, fileType))

    {

    fi = new FileInfo(f);

    newName = Path.Combine(fld, fi.CreationTime.ToString("yyyymmdd HHmm") + " " + Path.GetFileName(f));

    //Do the rename

    File.Move(f, newName);

    {

    Aside from the declarations, that's just three statements in a loop.

    If this were 'properly' implemented in SSIS, I would expect the folder and file type to come in as SSIS parameters, but that is a trivial change.

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply