Load folder full of csv files

  • I have a folder with csv files that has 2 columns a datetime, and a integer value. I want to load them all to a SQL table, but I need to create a 3 column in my table called filename. In this filename column I need to capture the filename of the file it's loading, so they can tie the data to the file.

    THanks.

  • You can use a script for this sort of processing. I would use Powershell.

    First download Out-DataTable from:

    https://github.com/RamblingCookieMonster/PowerShell/blob/master/Out-DataTable.ps1

    Then write something like:

    <#
    LoadCSVs.ps1

    LoadCSVs -Folder <path> -db <db>

    eg

    . C:\Batch\Scripts\LoadCSVs.ps1

    LoadCSVs -Folder C:\Batch\Source -db YourDB
    #>

    . "C:\Batch\Scripts\Out-DataTable.ps1"

    Function LoadCSVs
    {
    Param($Folder, $db)
    Try
    {
    $ConnectionString = "Data Source=YourServer; Database=$db; Trusted_Connection=True;";

    Write-Output "$((Get-Date -Format s)) Loading $Folder/*.csv into $db."

    $Files = Get-ChildItem -Path $Folder | Where-Object{!($_.PSIsContainer) -and ($_.Name -like "*.csv")} | Sort-Object Name
    $bulkCopy = new-object ("Data.SqlClient.SqlBulkCopy") $ConnectionString
    $connection = New-Object System.Data.SqlClient.SqlConnection($ConnectionString)
    $connection.Open()
    $sqlCommand = $connection.CreateCommand()

    $Table = "YourTable"
    $sqlCommand.CommandText = "TRUNCATE TABLE dbo." + $Table
    $result = $sqlCommand.ExecuteNonQuery()

    Foreach ($File in $Files)
    {
    $FileName = $File.Name
    Write-Output "$((Get-Date -Format s)) Loading $FileName"
    $csvDataTable = Import-CSV -Path $File.FullName | Select-Object *,@{Name='Filename';Expression={$FileName}}| Out-DataTable
    $bulkCopy.DestinationTableName = $Table
    $bulkCopy.WriteToServer($csvDataTable)
    }
    }
    Catch
    {
    $ErrorMessage = $_.Exception.Message
    $FailedItem = $_.Exception.ItemName
    $emailFrom = "xxx"
    $emailTo = "xxx"
    $subject = "xxx"
    $body = "$ErrorMessage"
    $smtpServer = "xxx"
    $smtp = new-object Net.Mail.SmtpClient($smtpServer)
    $smtp.Send($emailFrom, $emailTo, $subject, $body)

    Throw $ErrorMessage
    Break
    }
    Finally
    {
    $bulkCopy.Close()
    $connection.Close()
    Write-Output "$((Get-Date -Format s)) Load complete."
    }
    }
  • This is what the data looks like I'm trying to process. The first line is the header...

    "stamp_time","wirespeed"

    "3/25/2020 7:31:47 AM","226"

     

    Thanks any changes to script to handle input?

     

     

  • Bruin wrote:

    I have a folder with csv files that has 2 columns a datetime, and a integer value. I want to load them all to a SQL table, but I need to create a 3 column in my table called filename. In this filename column I need to capture the filename of the file it's loading, so they can tie the data to the file.

    THanks.

    Do you need the full path name including the file name and extension, just the filenname and extension, or just the file name?

    Also, which version of SQL Server are you using and are you hell bent to use PowerShell or are you open to suggestions?

    Also, what do the file names look like?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Sql version is 2012 and the files are named like

    f300-line1

    Powershell seemed like a easier solution to getting the filename into the table.

     

  • Bruin wrote:

    Sql version is 2012 and the files are named like

    f300-line1

    Powershell seemed like a easier solution to getting the filename into the table.

    Thanks for the information.  If you're happy with the PowerShell solution, then case closed, right?  Or not?

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Doesn't seem to be working...

    2021-06-02T12:03:27 Loading C:\Line_Dumps/*.csv into process.

    2021-06-02T12:03:28 Loading F306-Line5.csv

    2021-06-02T12:07:28 Load complete.

    Exception calling "WriteToServer" with "1" argument(s): "The given value of type String from the data source cannot be

    converted to type datetime of the specified target column."

    At C:\PowerShell\LoadCSVs.ps1:48 char:3

    + Throw $ErrorMessage

    + ~~~~~~~~~~~~~~~~~~~

    + CategoryInfo : OperationStopped: (Exception calli...target column.":String) [], RuntimeException

    + FullyQualifiedErrorId : Exception calling "WriteToServer" with "1" argument(s): "The given value of type String from th

    e data source cannot be converted to type datetime of the specified target column."

     

    SQL table:

    date_recorded  --datetime

    speed    -- int

    File_Name  Nvarchar(35)

     

  • I found the problem when data was extracted it was a point in time, and on one of the files the last record was partial written.

    Is there any any to check that and drop that record from the load process?

    Thanks.

  • does this process have to read the entire file before loading?  Is there a SQL solution without using SSIS?

    Thanks.

  • Bruin wrote:

    I found the problem when data was extracted it was a point in time, and on one of the files the last record was partial written.

    Is there any any to check that and drop that record from the load process?

    Thanks.

    Since there doesn't seem to be any PII or proprietary info in these files, can you attach that file in its entirety?

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • This is what the failed file looked like:

    "stamp_time","wirespeed"

    "1/2/2019 5:53:45 PM","187"

    "1/2/2019 5:54:18 PM","187"

    "1/2/2019 5:54:51 PM","188"

    "1/2/2019 5:55:23 PM","187"

    "1/2/2019 5:55

    When I was dumping data it caught it in the middle of a write...

     

     

     

  • Ok... thanks.  I'm at work and have an appointment at 5PM that's going to last for a couple of hours but I'll see what I can do.  And, no... it won't be using SSIS or PowerShell, if that's alright with you.

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • p.s. Tell me what the various parts of the file name like f300-line1 mean.  I have an idea to help with other things that you may not yet be planning on.

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • The machine is the first part F300-line1  F300-LIne15

    machine - 300

    Line - 1

    it could have up to 15 lines..

    As far as the process used, as long as it's fast..

    Many Thanks.

  • any update?

    THanks

Viewing 15 posts - 1 through 15 (of 45 total)

You must be logged in to reply to this topic. Login to reply