GetDateInString

  • Comments posted to this topic are about the item GetDateInString

  • hi

    let me know below query satisfy requirement or not .

    thanks

    Jayanth

    CREATE FUNCTION [Global].[fn_GetDateInString]

    (

    @InputString AS NVARCHAR(500)

    )

    RETURNS NVARCHAR(50)

    AS

    BEGIN

    DECLARE @output AS nvarchar(100)

    SELECT @output = Convert(varchar,convert(Date, @InputString, 107),107)

    return @output

    END

  • That would work if all you had was the date in different formats. However, the main purpose that I had to write the scripts was because we have a database that is varchar(500) and the user enters with the narrative a data value. I'm extracting if found this value.

  • -- Convert to an inline table-valued function: more efficient.

    -- minimise the work done to identify a date within a string.

    DECLARE @InputString NVARCHAR(500) = 'ABC01/01/2013XYZ'

    SELECT

    DateString = CASE WHEN ISDATE(PatternString) = 1 THEN PatternString ELSE NULL END

    FROM (

    -- look for a date pattern in the string, stop if found.

    SELECT PatternString = CASE

    -- date format as 01/01/2012

    WHEN PATINDEX('%[0-9][0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9]%',@InputString) > 0

    THEN LEFT(SUBSTRING(@InputString,PATINDEX('%[0-9][0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9]%',@InputString),StringLen),10)

    -- date format as 1/01/2012

    WHEN PATINDEX('%[0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9]%',@InputString) > 0

    THEN LEFT(SUBSTRING(@InputString,PATINDEX('%[0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9]%',@InputString),StringLen),9)

    -- date format as 01/1/2012

    WHEN PATINDEX('%[0-9][0-9]/[0-9]/[0-9][0-9][0-9][0-9]%',@InputString) > 0

    THEN LEFT(SUBSTRING(@InputString,PATINDEX('%[0-9][0-9]/[0-9]/[0-9][0-9][0-9][0-9]%',@InputString),StringLen),9)

    -- date format as 1/1/2012

    WHEN PATINDEX('%[0-9]/[0-9]/[0-9][0-9][0-9][0-9]%',@InputString) > 0

    THEN LEFT(SUBSTRING(@InputString,PATINDEX('%[0-9]/[0-9]/[0-9][0-9][0-9][0-9]%',@InputString),StringLen),8)

    -- date format as 01/11/12

    WHEN PATINDEX('%[0-9][0-9]/[0-9][0-9]/[0-9][0-9]%',@InputString) > 0

    THEN LEFT(SUBSTRING(@InputString,PATINDEX('%[0-9][0-9]/[0-9][0-9]/[0-9][0-9]%',@InputString),StringLen),8)

    -- date format as 01/1/12

    WHEN PATINDEX('%[0-9][0-9]/[0-9]/[0-9][0-9]%',@InputString) > 0

    THEN LEFT(SUBSTRING(@InputString,PATINDEX('%[0-9][0-9]/[0-9]/[0-9][0-9]%',@InputString),StringLen),8)

    -- date format as 1/11/12

    WHEN PATINDEX('%[0-9]/[0-9][0-9]/[0-9][0-9]%',@InputString) > 0

    THEN LEFT(SUBSTRING(@InputString,PATINDEX('%[0-9]/[0-9][0-9]/[0-9][0-9]%',@InputString),StringLen),7)

    -- date format as 1/1/12

    WHEN PATINDEX('%[0-9]/[0-9]/[0-9][0-9]%',@InputString) > 0

    THEN LEFT(SUBSTRING(@InputString,PATINDEX('%[0-9]/[0-9]/[0-9][0-9]%',@InputString),StringLen),6)

    -- another 80 or so options...

    ELSE NULL END

    FROM (SELECT StringLen = LEN(@InputString)) s

    ) d

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • I would never write a so complicated script for this simple functionality. It's too hard to maintain.

  • bli-963763 (1/17/2013)


    I would never write a so complicated script for this simple functionality. It's too hard to maintain.

    I'd very much like to see your simpler version.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Hard to maintain it's really not an option when it comes to working with databases, you got to do what you got to do. Data entered by users can be very tricky and whether you like it or not as a DBA your job is to maintain it, easy or difficult it's part of the job.

    What you have to consider is what's the simplest and better out of all the options. ChrisM's version believe it or not is easier and simpler to maintain than the version I wrote.

    BTW thanks ChrisM.

  • HildaJ (1/17/2013)


    Hard to maintain it's really not an option when it comes to working with databases, you got to do what you got to do. Data entered by users can be very tricky and whether you like it or not as a DBA your job is to maintain it, easy or difficult it's part of the job.

    What you have to consider is what's the simplest and better out of all the options. ChrisM's version believe it or not is easier and simpler to maintain than the version I wrote.

    BTW thanks ChrisM.

    You're welcome HildaJ. I can easily understand why you wrote this piece of code. Finding a date within a string is trivial if it's in a fixed format.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • ChrisM@Work (1/17/2013)


    bli-963763 (1/17/2013)


    I would never write a so complicated script for this simple functionality. It's too hard to maintain.

    I'd very much like to see your simpler version.

    I might try something like this (trying to taking advantage of SQL Server's IsDate function):

    -- Find all possible dates in a string column

    ;with Tally as (

    -- Celko-style derived numbers table

    select a.n

    + b.n * 10

    + c.n * 100

    + 1 as n

    from (select 0 as n union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) a

    , (select 0 as n union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) b

    , (select 0 as n union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) c

    )

    select i.n as SubstringIndex

    , l.n as SubstringLen

    , substring(t.MyColumn, i.n, l.n) as CandidateSubstring

    , cast(substring(t.MyColumn, i.n, l.n) as datetime) as InterpretedDateTime

    from Tally i

    cross join Tally l

    cross join MyTable t

    where 1 = 1

    and i.n + l.n - 1 <= len(t.MyColumn)

    and isdate(substring(t.MyColumn, i.n, l.n)) = 1

    order by l.n desc, i.n

    This is close to the simplest form that I can come up with, though it could be optimized with smarter where/join clauses (what is the max len of a date, etc.). Also, using replace for month names/"st"/"th" might be smart as well, as would using Jeff Moden's Tally table instead of the CTE above. And again, we're at the mercy of IsDate, so would have to massage the data (possibly using replace and such) to fit the mold.

    That said, the OP has a great function. Fun stuff. Thanks!

  • Nice work Hilda - I have fought that dragon too with embedded dates in large text fields.... I like your solution. Since I'm also a dot.net guy my approach is via a CLR. I appreciate your work, thank you 🙂

    Tom in Sacramento - For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/

  • Nice work on the first script and for the follow ons. I have usually managed this on the .net c# side with no considerations for doing it in the SQL world. This adds another tool to the arsenal when we seek to slay the dragon. Thanks.

    M.

    Not all gray hairs are Dinosaurs!

  • Nice one Hilda,

    More like a braintrainer, than for any actual purpose I tried to come up with something that could do the same task with a little less elaborate code. I haven't managed to completely mimick your functionality in the 1,5 hours spent. For example this doesn't support formats with month names or abbreviations in it, but I ran out of time before I could do that. Still I did come up with something I'd like to share with you, maybe it gives you some ideas to further improve your function.

    declare @InputString nvarchar(500);

    select @InputString = '123456789012/31/20132/1/201312345678902013-01-182004-02-29';

    with cteNumbers as (

    select row_number() over (order by (select null)) as n

    from sys.syscolumns sc1

    cross join sys.syscolumns sc2

    ),

    ctePatterns as (

    select N'[mm]/[dd]/[yyyy]' as pattern, 10 as len, 1 as ms, 2 as ml, 4 as ds, 2 as dl, 7 as ys, 4 as yl, convert(smallint, 0x07) as checks

    union all

    select N'[m]/[dd]/[yyyy]', 9, 1, 1, 3, 2, 6, 4, 0x07 as checks

    union all

    select N'[m]/[d]/[yyyy]', 8, 1, 1, 3, 1, 5, 4, 0x07 as checks

    union all

    select N'[mm]/[d]/[yyyy]', 9, 1, 2, 4, 1, 6, 4, 0x07 as checks

    union all

    select N'[mm]/[dd]/[yy]', 8, 1, 2, 4, 2, 7, 2, 0x0B as checks

    union all

    select N'[m]/[dd]/[yy]', 7, 1, 1, 3, 2, 6, 2, 0x0B as checks

    union all

    select N'[m]/[d]/[yy]', 6, 1, 1, 3, 1, 5, 2, 0x0B as checks

    union all

    select N'[mm]/[d]/[yy]', 7, 1, 2, 4, 1, 6, 2, 0x0B as checks

    union all

    select N'[yyyy]-[mm]-[dd]', 10, 6, 2, 9, 2, 1, 4, 0x07 as checks

    union all

    select N'[yyyy]-[m]-[dd]', 9, 6, 1, 8, 2, 1, 4, 0x07 as checks

    union all

    select N'[yyyy]-[m]-[d]', 8, 6, 1, 8, 1, 1, 4, 0x07 as checks

    union all

    select N'[yyyy]-[mm]-[d]', 9, 6, 2, 9, 1, 1, 4, 0x07 as checks

    )

    select n.n as position,

    pat.len,

    row_number() over (partition by n.n + pat.len order by pat.len desc),

    substring(inp.txt, n.n, pat.len),

    pat.pattern,

    -- inp.txt,

    x.[year],

    x.[month],

    x.[day]

    from (

    select @inputString as txt

    ) inp

    cross join cteNumbers n

    cross join ctePatterns pat

    cross apply (

    select replace(replace(replace(replace(replace(replace(

    pat.pattern,

    '[yyyy]', '[0-9][0-9][0-9][0-9]'),

    '[yy]', '[0-9][0-9]'),

    '[dd]', '[0-9][0-9]'),

    '[d]', '[0-9]'),

    '[mm]', '[0-9][0-9]'),

    '[m]', '[0-9]') as pattern,

    convert(int, substring(inp.txt, -1 + n.n + pat.ms, pat.ml)) as [month],

    convert(int, substring(inp.txt, -1 + n.n + pat.ds, pat.dl)) as [day],

    convert(int, substring(inp.txt, -1 + n.n + pat.ys, pat.yl)) as [year]

    ) x

    where n.n < datalength(inp.txt) / 2

    and patindex(x.pattern, substring(inp.txt, n.n, pat.len)) > 0

    and not exists (

    select 1 where pat.checks & 0x01 <> 0x00 and not convert(int, substring(inp.txt, -1 + n.n + pat.ms, pat.ml)) between 1 and 12

    union all

    select 1 where pat.checks & 0x02 <> 0x00 and (

    convert(int, substring(inp.txt, -1 + n.n + pat.ds, pat.dl)) < 1

    or convert(int, substring(inp.txt, -1 + n.n + pat.ds, pat.dl)) >

    case

    when convert(int, substring(inp.txt, -1 + n.n + pat.ms, pat.ml)) in (1,3,5,7,8,10,12) then 31

    when convert(int, substring(inp.txt, -1 + n.n + pat.ms, pat.ml)) in (4,6,9,11) then 30

    when convert(int, substring(inp.txt, -1 + n.n + pat.ms, pat.ml)) = 2 then

    case when convert(int, substring(inp.txt, -1 + n.n + pat.ms, pat.ml)) % 4 = 0 then 29

    else 28

    end

    end)

    union all

    select 1 where pat.checks & 0x04 <> 0x00 and not convert(int, substring(inp.txt, -1 + n.n + pat.ys, pat.yl)) between 1900 and 9999

    union all

    select 1 where pat.checks & 0x08 <> 0x00 and not convert(int, substring(inp.txt, -1 + n.n + pat.ys, pat.yl)) between 0 and 99

    )

    order by n.n + pat.len,

    3;

    The output is as follows:

    positionlen(No column name)(No column name)patternyearmonthday

    118112/31/20[mm]/[dd]/[yy]201231

    1110112/31/2013[mm]/[dd]/[yyyy]20131231

    21612/1/20[m]/[d]/[yy]2021

    21812/1/2013[m]/[d]/[yyyy]201321

    39912013-01-1[yyyy]-[mm]-[d]201311

    391012013-01-18[yyyy]-[mm]-[dd]2013118

    49912004-02-2[yyyy]-[mm]-[d]200422



    Posting Data Etiquette - Jeff Moden[/url]
    Posting Performance Based Questions - Gail Shaw[/url]
    Hidden RBAR - Jeff Moden[/url]
    Cross Tabs and Pivots - Jeff Moden[/url]
    Catch-all queries - Gail Shaw[/url]


    If you don't have time to do it right, when will you have time to do it over?

  • Very cool, and I must ask, did you do this for fun? 😀 Pretty amazing. The more I learn the more I feel there's so much out there that I don't know. I'll have to make time to incorporate everyones suggestions into mine.

    Thanks.

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply