Error converting string to date

  • Hi,

    I've got a really annoying problem in that I have a char column (loaded from csv) that contains dates in multiple formats.

    Some are 'YYYY-MM-DD HH:mm:SS' others are 'DD/MM/YYYY HH:mm'

    If I run a where statement like CAST([StupidColumn] AS DATETIME) BETWEEN @FromDate AND @ToDate I get an error in the cast.

    I thought to myself, that's easy...

    set dateformat 'dmy'

    ;with cteTxns AS (

    select ProcessedIndicator, [Transaction Datetime]

    from myTable

    where isdate([Transaction Datetime]) = 1

    )

    update cteTxns

    set ProcessedIndicator = 0

    where CAST([Transaction Datetime] AS DATETIME) = @SomeDate

    set dateformat 'ymd'

    ;with cteTxns AS (

    select ProcessedIndicator, [Transaction Datetime]

    from myTable

    where isdate([Transaction Datetime]) = 1

    )

    update cteTxns

    set ProcessedIndicator = 0

    where CAST([Transaction Datetime] AS DATETIME) = @SomeDate

    but no... isdate appears to only consider the set dateformat when it feels like it or when your system language is set to Italian or whatever lol BOL isn't much help on this subject or I'm just not reading it right.

    Anyone else come across this? I might end up having to write an update program to detect the date format and convert it unless anyone else has any bright ideas? he asks hopefully...

    Cheers guys.

    Ben

    ^ Thats me!

    ----------------------------------------
    01010111011010000110000101110100 01100001 0110001101101111011011010111000001101100011001010111010001100101 01110100011010010110110101100101 011101110110000101110011011101000110010101110010
    ----------------------------------------

  • Would you consider '2011-01-01 17:15:55' as YYYY-MM-DD HH:MM:SS or YYYY-DD-MM HH:MM:SS ?

    Would you consider '01/01/1995 12:52' as DD/MM/YYYY HH:MM or MM/DD/YYYY HH:MM ?


    Forever trying to learn
    My blog - http://www.cadavre.co.uk/
    For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
    For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • ooh interesting... I've just checked again and all the 2011-01-01 type dates returned true by isdate (im expecting yyyy-mm-dd) are returning records where the dd part is between 1 and 12.

    This just makes things worse... I ask it for dmy and it looks for ydm.

    Seriously starting to think writing a program, maybe a php script, to do it for me is the best idea...

    Ben

    ^ Thats me!

    ----------------------------------------
    01010111011010000110000101110100 01100001 0110001101101111011011010111000001101100011001010111010001100101 01110100011010010110110101100101 011101110110000101110011011101000110010101110010
    ----------------------------------------

  • BenWard (5/2/2013)


    ooh interesting... I've just checked again and all the 2011-01-01 type dates returned true by isdate (im expecting yyyy-mm-dd) are returning records where the dd part is between 1 and 12.

    This just makes things worse... I ask it for dmy and it looks for ydm.

    Seriously starting to think writing a program, maybe a php script, to do it for me is the best idea...

    The problem is knowing the date format. It's relatively trivial if you know that the date format is always one of "YYYY-MM-DD" or "DD/MM/YYYY". The difficulty comes when it could be "YYYY-DD-MM" or "YYYY-MM-DD" because programatically there is no way to tell what format 2011-01-02 belongs to as it is a valid date as either YYYY-DD-MM or YYYY-MM-DD. If at all possible, get whoever generated the file to do it again with a consistent format or with a "date format" column. Failing that, you would need to make a decision about whether you consider 2011-01-02 to be YYYY-DD-MM or YYYY-MM-DD and whether you consider 12/01/1995 to be DD/MM/YYYY or MM/DD/YYYY. After which, we can use some PATINDEX magic and a CASE statement to ensure that the correct formatted dates go in to the "convert".


    Forever trying to learn
    My blog - http://www.cadavre.co.uk/
    For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
    For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • alas this table is built from daily data files going back over a number of months so getting the software house to re-export some of those csv files isn't an option.

    I can confirm that ALL of the data is either ymd or dmy there is no ydm or mdy.

    Ben

    ^ Thats me!

    ----------------------------------------
    01010111011010000110000101110100 01100001 0110001101101111011011010111000001101100011001010111010001100101 01110100011010010110110101100101 011101110110000101110011011101000110010101110010
    ----------------------------------------

  • BenWard (5/2/2013)


    alas this table is built from daily data files going back over a number of months so getting the software house to re-export some of those csv files isn't an option.

    I can confirm that ALL of the data is either ymd or dmy there is no ydm or mdy.

    Something like this: -

    SELECT *,

    CAST(

    CASE WHEN PATINDEX('[1-2][0-9][0-9][0-9]-[0-1][0-9]-[0-3][0-9]', dates) > 0 THEN dates

    WHEN PATINDEX('[1-2][0-9][0-9][0-9]/[0-1][0-9]/[0-3][0-9]', dates) > 0 THEN REPLACE(dates,'/','-')

    WHEN PATINDEX('[0-3][0-9]-[0-1][0-9]-[1-2][0-9][0-9][0-9]', dates) > 0 OR

    PATINDEX('[0-3][0-9]/[0-1][0-9]/[1-2][0-9][0-9][0-9]', dates) > 0 THEN SUBSTRING(dates,7,4)+'-'+SUBSTRING(dates,4,2)+'-'+SUBSTRING(dates,1,2)

    ELSE NULL END AS DATETIME)

    FROM (

    --== SAMPLE DATA ==--

    SELECT '2012-02-01'

    UNION ALL SELECT '2012/03/01'

    UNION ALL SELECT '01-05-1995'

    UNION ALL SELECT '01/06/1995'

    ) a(dates);

    You can make it more robust by including all of the different patterns for a valid date (e.g. for September, we'd want something like CASE WHEN PATINDEX('19[0-9][0-9]-09-30', dates) > 0 OR PATINDEX('19[0-9][0-9]-09-[1-2][0-9]', dates) > 0 OR PATINDEX('19[0-9][0-9]-09-0[1-9]', dates) > 0 OR PATINDEX('19[0-1][0-9][0-9]-09-30', dates) > 0 OR PATINDEX('19[0-9][0-9]-09-[1-2][0-9]', dates) > 0 OR PATINDEX('19[0-9][0-9]-09-0[1-9]', dates) > 0).


    Forever trying to learn
    My blog - http://www.cadavre.co.uk/
    For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
    For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • You sir, are a legend.

    Thanks very much - I've not used PATINDEX before, I'll take a look into that in more detail.

    Ben

    ^ Thats me!

    ----------------------------------------
    01010111011010000110000101110100 01100001 0110001101101111011011010111000001101100011001010111010001100101 01110100011010010110110101100101 011101110110000101110011011101000110010101110010
    ----------------------------------------

  • I usually don't expect the ISDATE function to return predictable results because it is dependent on the specific date datatype you are converting to and often gives different results (0/1) with the same date-like looking string passed in if the types are different. So I use specific date validation procedures I've cobbled together. The TRY CATCH method is best but those don't work in functions so unfortunately a procedure must be built to implement the TRY CATCH validation method. Right now what I have to offer is RBAR and I apologize for that. If someone can convert my solution to still use TRY CATCH and get rid of the cursor loop I'll be grateful for the contribution.

    First thing for testing is to build some random dates. I'm including a date format function just so I can input dates in various forms and then build a sample table.

    CREATE FUNCTION [dbo].[tvfFormatDateWithMask]

    (

    @date AS DATETIME

    ,@format_string VARCHAR(50)

    )

    RETURNS TABLE

    AS

    RETURN

    (

    WITH fourRows(N)

    AS (

    SELECT 1

    UNION ALL

    SELECT 2

    UNION ALL

    SELECT 3

    UNION ALL

    SELECT 4

    )

    ,cteTally(N)

    AS (

    SELECT TOP (50)

    ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

    FROM

    fourRows AS A

    CROSS JOIN fourRows AS B

    CROSS JOIN fourRows AS C

    ORDER BY 1

    )

    ,tokenizedString

    AS (

    SELECT

    N

    ,C

    ,groupId = DENSE_RANK() OVER (ORDER BY C, _groupId)

    FROM

    (

    SELECT

    N

    ,SUBSTRING(@format_string COLLATE Latin1_General_CS_AS,N,1) AS C

    ,_groupId = N - ROW_NUMBER() OVER (PARTITION BY SUBSTRING(@format_string COLLATE Latin1_General_CS_AS,N,1) ORDER BY N)

    FROM

    cteTally

    WHERE

    N <= LEN(@format_string)

    ) AS fs)

    SELECT

    formattedDate =

    (SELECT

    CASE REPLICATE(MIN(C),COUNT(*))

    WHEN 'YYYY' THEN RIGHT('0000' + CAST(YEAR(@date) AS NVARCHAR(4)),4)

    WHEN 'YY' THEN RIGHT('00' + CAST(YEAR(@date) AS NVARCHAR(4)),2)

    WHEN 'Y' THEN CAST(CAST(RIGHT('00' + CAST(YEAR(@date) AS NVARCHAR(4)),1) AS INT) AS NVARCHAR(2))

    WHEN 'MMMM' THEN DATENAME(month,@date)

    WHEN 'MM' THEN RIGHT('00' + CAST(MONTH(@date) AS NVARCHAR(2)),2)

    WHEN 'M' THEN CAST(MONTH(@date) AS NVARCHAR(2))

    WHEN 'DDDD' THEN DATENAME(weekday,@date)

    WHEN 'DD' THEN RIGHT('00' + CAST(DAY(@date) AS NVARCHAR(2)),2)

    WHEN 'D' THEN CAST(DAY(@date) AS NVARCHAR(2))

    WHEN 'HH' THEN RIGHT('00' + CAST(DATEPART(hour,@date) AS NVARCHAR(2)),2)

    WHEN 'H' THEN CAST(DATEPART(hour,@date) AS NVARCHAR(2))

    WHEN 'hh' THEN RIGHT('00' + CAST(

    CASE DATEPART(hour,@date)

    WHEN 12 THEN 12

    ELSE DATEPART(hour,@date) % 12

    END AS NVARCHAR(2)),2)

    WHEN 'h' THEN

    CASE WHEN LEFT(

    RIGHT('00' + CAST(

    CASE DATEPART(hour,@date)

    WHEN 12 THEN 12

    ELSE DATEPART(hour,@date) % 12

    END AS NVARCHAR(2)),2),1) = 0 THEN

    RIGHT('00' + CAST(

    CASE DATEPART(hour,@date)

    WHEN 12 THEN 12

    ELSE DATEPART(hour,@date) % 12

    END AS NVARCHAR(2)),1)

    ELSE

    RIGHT('00' + CAST(

    CASE DATEPART(hour,@date)

    WHEN 12 THEN 12

    ELSE DATEPART(hour,@date) % 12

    END AS NVARCHAR(2)),2)

    END

    WHEN 'mm' THEN RIGHT('00' + CAST(DATEPART(minute,@date) AS NVARCHAR(2)),2)

    WHEN 'm' THEN CAST(DATEPART(minute,@date) AS NVARCHAR(2))

    WHEN 'ss' THEN RIGHT('00' + CAST(DATEPART(second,@date) AS NVARCHAR(2)),2)

    WHEN 's' THEN CAST(DATEPART(second,@date) AS NVARCHAR(2))

    WHEN 'fff' THEN RIGHT('000' + CAST(DATEPART(millisecond,@date) AS NVARCHAR(3)),3)

    WHEN 'f' THEN CAST(DATEPART(millisecond,@date) AS NVARCHAR(3))

    WHEN 'tt' THEN

    CASE

    WHEN DATEPART(hour,@date) >= 12 THEN N'PM'

    ELSE N'AM'

    END

    WHEN 't' THEN

    CASE

    WHEN DATEPART(hour,@date) >= 12 THEN N'P'

    ELSE N'A'

    END

    WHEN 'x' THEN

    CASE

    WHEN CAST(DAY(@date) AS INT) IN (1,21,31) THEN N'st'

    WHEN CAST(DAY(@date) AS INT) IN (3,23) THEN N'rd'

    ELSE N'th'

    END

    ELSE MIN(C)

    END

    FROM

    tokenizedString

    GROUP BY

    groupId

    ORDER BY

    MIN(N)

    FOR

    XML PATH('')

    ,TYPE

    ).value('(./text())[1]','nvarchar(50)')

    )

    /*

    SELECT FormattedDate FROM dbo.tvfFormatDateWithMask(GETDATE(),'DD/MM/YY') AS CurrDate

    or

    SELECT

    FormattedDate

    FROM

    (

    SELECT

    GETDATE() AS UnformattedDate

    ) Result

    CROSS APPLY

    dbo.tvfFormatDateWithMask(Result.UnformattedDate,'DD/MM/YY')

    'YYYY'- full year with century

    'YY'- year without century

    'Y'- last digit of year

    'MMMM'- month name

    'MM'- month number with leading zero

    'M'- month number without leading zero)

    'DDDD'- day name

    'DD'- day number with leading zero

    'D'- day number without leading zero

    'HH'- hour with leading zero (24 hr format)

    'H'- hour without leading zero

    'hh'- hour with leading zero (12 hr format)

    'h'- hour without leading zero (12 hr format)

    'mm'- minutes with leading zero

    'm'- minutes without leading zero

    'ss'- seconds with leading zero

    's'- seconds without leading zero

    'fff'- milliseconds with leading zeros

    'f'- milliseconds without leading zeros

    'tt'- AM or PM

    't'- A or P

    'x'- st, nd, or th

    */

    Now create the date validation procedure itself:

    CREATE PROCEDURE [dbo].[IsValidDate]

    @sDate VARCHAR(50)

    ,@sDateFormat CHAR(3) = 'DMY' -- MDY, DMY, YMD, YDM, MYD, DYM

    AS

    BEGIN

    SET NOCOUNT ON

    SET DATEFORMAT @sDateFormat

    DECLARE

    @dStdDate DATETIME2 --use the date datatype you want to check against

    ,@bIsValidDate BIT

    ,@ERROR BIT

    SET @dStdDate = NULL

    BEGIN TRY

    SET @dStdDate = CONVERT(DATETIME2,@sDate)

    SET @bIsValidDate = 1

    END TRY

    BEGIN CATCH

    SET @bIsValidDate = 0

    SET @dStdDate = NULL

    END CATCH

    SET @ERROR = @bIsValidDate

    SELECT

    @sDate AS InputDate

    ,@dStdDate AS StdDate

    ,@ERROR AS IsValidDate

    /*

    EXAMPLES:

    EXEC dbo.IsValidDate '01-07-2001' -- Valid date

    EXEC dbo.IsValidDate '1/7/2001' -- Valid date

    EXEC dbo.IsValidDate '07-01-2001' -- Valid date

    EXEC dbo.IsValidDate '7/1/2001' -- Valid date

    EXEC dbo.IsValidDate '29-12-2013','DMY' -- Valid date in DMY format

    EXEC dbo.IsValidDate '29-12-2013','MDY' -- Invalid date in MDY format

    EXEC dbo.IsValidDate '32-12-2013' -- ERROR: Date out of range

    EXEC dbo.IsValidDate '29-02-2013' -- ERROR: Not a leap year

    EXEC dbo.IsValidDate 'xyz' -- ERROR: Invalid date

    EXEC dbo.IsValidDate '1234' -- Invalid std date/valid umc date

    EXEC dbo.IsValidDate '2013-01-08 15:44:12' -- Valid date

    EXEC dbo.IsValidDate '2013-01-08 15:44:12.000' -- Valid date

    EXEC dbo.IsValidDate '2013-01-08 15:44:12.208' -- Valid date

    EXEC dbo.IsValidDate '2013-01-08 15:44:12.2081606' -- Invalid std date/valid umc date

    EXEC dbo.IsValidDate '2013-01-08 15:44:12.2081606 +05:30' -- Invalid std date/valid umc date

    */

    END

    Now for the script to create 10,000 rows of sample dates with some bad dates sprinkled in.

    IF OBJECT_ID('tempdb..#TempTable') IS NOT NULL

    DROP TABLE #TempTable

    CREATE TABLE #TempTable (

    [ID] INT IDENTITY(1,1) NOT NULL,

    [SomeDate] VARCHAR(50) NULL,

    PRIMARY KEY (ID))

    IF OBJECT_ID('tempdb..#DateValidationResults') IS NOT NULL

    DROP TABLE #DateValidationResults

    CREATE TABLE #DateValidationResults (

    [ID] INT IDENTITY(1,1) NOT NULL,

    [InputDate] VARCHAR(50) NULL,

    [ConvertedDate] VARCHAR(50) NULL,

    [IsValidDate] BIT NULL,

    PRIMARY KEY (ID))

    SET NOCOUNT ON

    BEGIN

    INSERT INTO #TempTable

    SELECT TOP 10000

    (CASE

    WHEN t1.N % 50 = 0

    THEN 'ABC-DEFG-9999-XYZ' -- BAD DATE

    WHEN t1.N % 5 = 0

    THEN (SELECT formattedDate FROM dbo.tvfFormatDateWithMask(CAST(RAND(CHECKSUM((SELECT MyNewID FROM dbo.iFunction)))*3653.0+(SELECT MyRand*100000 FROM dbo.iFunction) AS DATETIME),'DD/MM/YYYY hh:mm'))

    WHEN t1.N % 3 = 0

    THEN (SELECT formattedDate FROM dbo.tvfFormatDateWithMask(CAST(RAND(CHECKSUM((SELECT MyNewID FROM dbo.iFunction)))*3653.0+(SELECT MyRand*100000 FROM dbo.iFunction) AS DATETIME),'YY/MM/DD'))

    WHEN t1.N % 2 = 0

    THEN (SELECT formattedDate FROM dbo.tvfFormatDateWithMask(CAST(RAND(CHECKSUM((SELECT MyNewID FROM dbo.iFunction)))*3653.0+(SELECT MyRand*100000 FROM dbo.iFunction) AS DATETIME),'YYYY-MM-DD hh:mm:ss.fff +hh:mm'))

    ELSE

    (SELECT formattedDate FROM dbo.tvfFormatDateWithMask(CAST(RAND(CHECKSUM((SELECT MyNewID FROM dbo.iFunction)))*3653.0+(SELECT MyRand*100000 FROM dbo.iFunction) AS DATETIME),'YYYY-MM-DD hh:mm:ss'))

    END) AS SomeDate

    FROM dbo.Tally t1

    Finally, run the validation script:

    DECLARE

    @ID INT

    ,@chkDate VARCHAR(50)

    DECLARE DateValidation CURSOR

    FOR

    SELECT

    ID

    ,SomeDate

    FROM

    #TempTable

    OPEN DateValidation

    FETCH NEXT FROM DateValidation INTO

    @ID, @chkDate

    WHILE @@FETCH_STATUS = 0

    BEGIN

    BEGIN TRY

    INSERT INTO #DateValidationResults

    (InputDate, ConvertedDate, IsValidDate)

    EXEC dbo.IsValidDate @chkDate,'YMD'

    END TRY

    BEGIN CATCH

    INSERT INTO #DateValidationResults

    (InputDate, ConvertedDate, IsValidDate)

    VALUES (@ID,@chkDate,0)

    END CATCH

    SET @chkDate = NULL

    FETCH NEXT FROM DateValidation INTO

    @ID, @chkDate

    END

    CLOSE DateValidation

    DEALLOCATE DateValidation

    SELECT

    *

    FROM

    #DateValidationResults

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply