parse string and check for valid date

  • I have a flat file with a bar (|) delimiter. It has no headers, but I want to check columns 11, 12, and 13 for valid dates. If an invalid date is found in any of those fields, display the
    entire record and fail the process.

    Without a header, how do I validate the sample below (Date) as a check for column 11?

       if (-not ($row.date -as [datetime])) {
           Write-Host "Row $($csvData.IndexOf($row) + 2): $($row.date) is invalid"
           Write-Output $row
       }

     Thanks.

  • Something like this should do it:

    $badLines = 0
    Get-Content .\test.txt | ForEach-Object {
        $currentLine = $_
        $cols = $_.Split('|')
        Try {
            # Note columns are 0-based
            [DateTime]::ParseExact($cols[10], 'dd/MM/yyyy', $null) | Out-Null
            [DateTime]::ParseExact($cols[11], 'dd/MM/yyyy', $null) | Out-Null
            [DateTime]::ParseExact($cols[12], 'dd/MM/yyyy', $null) | Out-Null
        } Catch {
            Write-Output $currentLine
            $badLines++
        }
    }
    if ($badLines -gt 0)
    {
        throw "$($badLines) lines contained invalid dates"
    }

  • It throws an error at the end:

        + throw "$($badLines) lines contained invalid dates"
        + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        + CategoryInfo          : OperationStopped: (97 lines contained invalid dates:String) [], RuntimeException
        + FullyQualifiedErrorId : 97 lines contained invalid dates

     Thanks.

    Can it first make sure there is a date to check? If the fields are empty/null, that may not be an error I want to trap; only
    if a date is present should it be checked against the rules.

     Thanks again

  • Any idea what caused this error?

     Thanks.

  • You can wrap the calls to ParseExact in if statements to exclude the cases you don't want to check. What have you tried that isn't working?
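    A minimal sketch of that wrapping, assuming the same test.txt, columns, and dd/MM/yyyy format as the earlier example (TryParseExact is used here instead of Try/Catch so empty fields can be skipped without raising an exception):

        # Sketch only: checks columns 11-13 (0-based indexes 10-12), skipping empty fields
        $badLines = 0
        $lineNo   = 0
        Get-Content .\test.txt | ForEach-Object {
            $lineNo++
            $cols = $_.Split('|')
            foreach ($i in 10, 11, 12) {
                $value = $cols[$i]
                if ([string]::IsNullOrWhiteSpace($value)) {
                    # Empty/null passes the check, but log the line number for review
                    Write-Verbose "Line $($lineNo): column $($i + 1) is empty"
                    continue
                }
                $parsed = [DateTime]::MinValue
                if (-not [DateTime]::TryParseExact($value, 'dd/MM/yyyy', $null,
                        [System.Globalization.DateTimeStyles]::None, [ref]$parsed)) {
                    Write-Output $_
                    $badLines++
                }
            }
        }
        if ($badLines -gt 0) { throw "$badLines lines contained invalid dates" }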

  • I need an example of how to wrap the ParseExact call in an if statement that checks whether a date exists and conforms to the format dd-MM-yyyy (null or empty can pass the check, but maybe log the line number).

     When I run the script above it throws this error:

        + throw "$($badLines) lines contained invalid dates"
        + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        + CategoryInfo          : OperationStopped: (97 lines contained invalid dates:String) [], RuntimeException
        + FullyQualifiedErrorId : 97 lines contained invalid dates

  • Bruin - Wednesday, November 21, 2018 7:00 AM

    I need an example of how to wrap the ParseExact call in an if statement that checks whether a date exists and conforms to the format dd-MM-yyyy (null or empty can pass the check, but maybe log the line number).

     When I run the script above it throws this error:

        + throw "$($badLines) lines contained invalid dates"
        + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        + CategoryInfo          : OperationStopped: (97 lines contained invalid dates:String) [], RuntimeException
        + FullyQualifiedErrorId : 97 lines contained invalid dates

    To be honest, it sounds like you're trying to rewrite BCP or BULK INSERT in PowerShell. Why not just import the data into a table and check the data there?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I was looking for a generic insert that would read a CSV, dynamically create a SQL table based on the number of columns in the CSV, and then do the
    ETL in the SQL table.

  • Bruin - Friday, November 23, 2018 6:05 AM

    I was looking for a generic insert that would read a CSV, dynamically create a SQL table based on the number of columns in the CSV, and then do the
    ETL in the SQL table.

    If you are doing a generic import, how would you know which columns to check for valid dates?

  • I know which fields in csv contain dates

  • Bruin - Friday, November 23, 2018 5:21 PM

    I know which fields in csv contain dates

    For every possible file you will be importing?

  • Bruin - Friday, November 23, 2018 6:05 AM

    I was looking for a generic insert that would read a CSV, dynamically create a SQL table based on the number of columns in the CSV, and then do the
    ETL in the SQL table.

    You can do this... have a stored procedure that reads the first row (the one with the field names) as a blob and splits it so you get a numbered list of the field names.  Use that as the source of dynamic SQL to build your target table, and then import the file to that new target table.
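    A rough sketch of that idea driven from PowerShell instead of a stored procedure, assuming the SqlServer module (Invoke-Sqlcmd) is installed and a header row exists; the server, database, and table names are placeholders:

        # Read the first row and turn each field name into a VARCHAR column
        $firstLine = Get-Content .\test.txt -TotalCount 1
        $names     = $firstLine.Split('|')
        $colDefs   = ($names | ForEach-Object { "[$_] VARCHAR(8000) NULL" }) -join ",`n    "
        $ddl       = "CREATE TABLE dbo.Staging (`n    $colDefs`n);"

        Invoke-Sqlcmd -ServerInstance 'MyServer' -Database 'MyDb' -Query $ddl

        # Then bulk load the file into the new table, skipping the header row
        $load = "BULK INSERT dbo.Staging FROM 'C:\data\test.txt' " +
                "WITH (FIELDTERMINATOR = '|', FIRSTROW = 2);"
        Invoke-Sqlcmd -ServerInstance 'MyServer' -Database 'MyDb' -Query $load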

    --Jeff Moden



  • How could that be accomplished (an example?) if no header is available?

     Thanks.

  • Bruin - Monday, November 26, 2018 9:45 AM

    How could that be accomplished (an example?) if no header is available?

     Thanks.

    Read the first row anyway just to get a count of the columns and then assign generic column names.
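    In PowerShell that might look something like this (a sketch, assuming the same bar-delimited test.txt; the generic Col1, Col2, ... names are invented here):

        # Count the fields in the first row and generate generic column names
        $firstLine = Get-Content .\test.txt -TotalCount 1
        $colCount  = $firstLine.Split('|').Count
        $headers   = 1..$colCount | ForEach-Object { "Col$_" }

        # Import-Csv can then treat the headerless file as if it had that header,
        # so columns can be addressed by name, e.g. $data[0].Col11
        $data = Import-Csv .\test.txt -Delimiter '|' -Header $headers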

    --Jeff Moden



  • I have been trying to use PS to load text files to SQL using the script from the article "Import tricky CSVs". The script is very good,
    but speed seems to be the issue. Sometimes you have to give up speed for accuracy and field parsing, since data is not always
    clean, which that script addresses nicely.

    Issues I have found with the script processing ETL:

    delimiter
    header\no header
    error handling
    speed
    memory consumption

    I like the idea of having the ETL in a script, but in many of the examples I have seen, speed always seems to be an issue. I have files
    that can range from 50 records to millions, and row sizes from 10 fields to 200. I have used SSIS, but it is sometimes not flexible when you have to change
    the criteria or add\remove fields.

    I would like to hear\see other solutions people are using to accomplish file loading to SQL.

    Thanks again.
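    One common approach to the speed issue is streaming the file into SQL Server with SqlBulkCopy. A minimal sketch, assuming the target table already exists and its column count matches the file; the connection string, table name, and 13-column layout are placeholders:

        Add-Type -AssemblyName System.Data

        $conn  = 'Server=MyServer;Database=MyDb;Integrated Security=True'
        $table = New-Object System.Data.DataTable
        1..13 | ForEach-Object { [void]$table.Columns.Add("Col$_") }

        $bulk = New-Object System.Data.SqlClient.SqlBulkCopy($conn)
        $bulk.DestinationTableName = 'dbo.Staging'
        $bulk.BatchSize = 50000

        # Buffer rows and flush in batches to keep memory consumption flat
        Get-Content .\test.txt | ForEach-Object {
            [void]$table.Rows.Add($_.Split('|'))
            if ($table.Rows.Count -ge $bulk.BatchSize) {
                $bulk.WriteToServer($table)
                $table.Clear()
            }
        }
        if ($table.Rows.Count -gt 0) { $bulk.WriteToServer($table) }
        $bulk.Close()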
