Why does these return different results?

  • As part of an earlier post - someone was asking how to find capitalized letters, and I found that these 2 applications of Patindex return different results, which doesn't really make sense to me. What am I missing about this syntax?

    the code:

    declare @fun nvarchar(200)

    set @fun='asdfw qweoiu-qwer aslfkjas Ajsdjskwiwoelriutoty'

    select patindex('%[ABCDEFGHIJKLMNOPQRSTUVWXYZ]%',@fun COLLATE Latin1_General_CS_AI) --returns 28

    select patindex('%[A-Z]%',@fun COLLATE Latin1_General_CS_AI) --returns 2

    Now - having dealt with Regex-like syntax, I would usually tend to write the second syntax (which would mean the A through Z range in shorthand), except that it's not returning what I'd think is the "correct" answer. What am I missing?

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • PATINDEX does not use Regular Expression Syntax so your second example is matching on 'A', '-', 'Z' and so the only matches found are 'A' and 'Z'

  • Jack Corbett (4/8/2008)


    PATINDEX does not use Regular Expression Syntax so your second example is matching on 'A', '-', 'Z' and so the only matches found are 'A' and 'Z'

    Then why does BOL advocate using the "range wildcard" as described here?

    http://msdn2.microsoft.com/en-us/library/ms179884.aspx

    the example is using that exact syntax, but with numbers [0-9]. That seems to work, but letter "ranges" fail - why?

    Also - even if that were the case - then why is it returning 2? there's no capital A or Z or - in that position....

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Well, count this as one of the times I made a fool of myself. I actually looked up PatIndex in BOL expecting it to give an example using the Regular Expression Syntax and when I did not see it I assumed (yeah I know what that means) it was not supported. Of course I also did not notice the [] in the first list. Of course now that I feel like a fool I need to figure out the right answer:P. With limited testing if you do:

    declare @fun nvarchar(200)

    set @fun='asdfw qweoiu-qwer aslfkjas Ajsdjskwiwoelriutoty'

    select patindex('%[ABCDEFGHIJKLMNOPQRSTUVWXYZ]%',@fun COLLATE Latin1_General_CS_AI) --returns 28

    select patindex('%[A-B]%',@fun COLLATE Latin1_General_CS_AI) --returns 28

    select patindex('%[A-C]%',@fun COLLATE Latin1_General_CS_AI) --returns 28

    select patindex('%[A-c]%',@fun COLLATE Latin1_General_CS_AI) --returns 28

    select patindex('%[A-D]%',@fun COLLATE Latin1_General_CS_AI) --returns 2

    This does not provide an answer, but adds to the question.

  • Jack Corbett (4/8/2008)


    Well, count this as one of the times I made a fool of myself. I actually looked up PatIndex in BOL expecting it to give an example using the Regular Expression Syntax and when I did not see it I assumed (yeah I know what that means) it was not supported. Of course I also did not notice the [] in the first list. Of course now that I feel like a fool I need to figure out the right answer:P. With limited testing if you do:

    declare @fun nvarchar(200)

    set @fun='asdfw qweoiu-qwer aslfkjas Ajsdjskwiwoelriutoty'

    select patindex('%[ABCDEFGHIJKLMNOPQRSTUVWXYZ]%',@fun COLLATE Latin1_General_CS_AI) --returns 28

    select patindex('%[A-B]%',@fun COLLATE Latin1_General_CS_AI) --returns 28

    select patindex('%[A-C]%',@fun COLLATE Latin1_General_CS_AI) --returns 28

    select patindex('%[A-c]%',@fun COLLATE Latin1_General_CS_AI) --returns 28

    select patindex('%[A-D]%',@fun COLLATE Latin1_General_CS_AI) --returns 2

    This does not provide an answer, but adds to the question.

    Oh don't feel bad. You're right that I shouldn't have used the "Regex syntax" term, since it only supports ranges and range exclusions and not all of the other stuff Regex does....

    I was feeling like someone stole my teddy bear a little earlier, since I thought I understood how it worked before this. I actually may need to go check that some of my code isn't fubar'ed due to this.

    Thanks for giving it a shot!

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Matt Miller (4/8/2008)

    Oh don't feel bad. You're right that I shouldn't have used the "Regex syntax" term, since it only supports ranges and range exclusions and not all of the other stuff Regex does....

    I was feeling like someone stole my teddy bear a little earlier, since I thought I understood how it worked before this. I actually may need to go check that some of my code isn't fubar'ed due to this.

    Thanks for giving it a shot!

    Never said I felt bad:D.

    I think I have found something interesting though. When using the range wildcard syntax it looks like PATINDEX is resetting position when a character changes in the searched expression. Basically it looks like it is checking 'as' then 'sd', etc... try this:

    declare @fun nvarchar(200)

    set @fun='asdAfw qweoiu-qwer aslfkjas Ajsdjskwiwoelriutoty'

    select patindex('%[ABCDEFGHIJKLMNOPQRSTUVWXYZ]%',@fun COLLATE Latin1_General_CS_AI) --returns 4

    select patindex('%[A-Z]%',@fun COLLATE Latin1_General_CS_AI) --returns 2

    set @fun='Aasdfw qweoiu-qwer aslfkjas Ajsdjskwiwoelriutoty'

    select patindex('%[ABCDEFGHIJKLMNOPQRSTUVWXYZ]%',@fun COLLATE Latin1_General_CS_AI) --returns 1

    select patindex('%[A-Z]%',@fun COLLATE Latin1_General_CS_AI) --returns 1

    You can use any upper case character to replace the first 'A' and still get the same results.

    Wish I had an answer to why it behaves that way.

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply