PatReplace8K

  • Comments posted to this topic are about the item PatReplace8K

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • Why is the CASE statement necessary?

  • sequelgarrett (6/29/2015)


    Why is the CASE statement necessary?

    I have to go through my notes, I can't remember at the moment and I don't have access to a PC. It had something to do with the behavior of how the string is re-concatenated using XML PATH (''). I recently changed the code and it may not be necessary anymore. I will have to re-test without the CASE statement.

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • Nice job, Alan.

    One question: what's the CHECKSUM() for? It doesn't appear to change the result from ROW_NUMBER().

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • ChrisM@Work (6/30/2015)


    Nice job, Alan.

    One question: what's the CHECKSUM() for? It doesn't appear to change the result from ROW_NUMBER().

    Same question.

    Don Simpson



    I'm not sure about Heisenberg.

  • DonlSimpson (6/30/2015)


    ChrisM@Work (6/30/2015)


    Nice job, Alan.

    One question: what's the CHECKSUM() for? It doesn't appear to change the result from ROW_NUMBER().

    Same question.

    PatExclude8K and PatReplace8K evolved from the thread at the end of this comment where Lowell was looking for tips to develop an itvf_strip_nonnumeric function. (Check it out if you have not seen the thread - it's a great stuff, I learned a ton).

    The original code for the tally table part looked like this:

    SELECT TOP (LEN(@String)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E1 a....)

    Eirikur changed the tally table part to deal with the implicit conversation in the TOP clause like this:

    SELECT TOP (CONVERT(BIGINT,LEN(@String),0)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E1 a....)

    Jeff Moden further enhanced it by removing the conversion to BIGINT in the TOP clause and doing the CHECKSUM over ROW_NUMBER(). He explains the change in the comment section of his updated function:

    2. CHECKSUM returns an INT and will return the exact number given if given an INT to begin with. It's also faster than a CAST or CONVERT and is used as a performance enhancer by changing the BIGINT of ROW_NUMBER() to a more appropriately sized INT.

    Based on my testing, the function appears to perform a bit faster using CHECKSUM over ROW_NUMBER() instead of the conversion to BIGINT in the TOP clause.

    http://www.sqlservercentral.com/Forums/Topic1585850-391-3.aspx

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • Alan.B (6/30/2015)


    Based on my testing, the function appears to perform a bit faster using CHECKSUM over ROW_NUMBER() instead of the conversion to BIGINT in the TOP clause.

    http://www.sqlservercentral.com/Forums/Topic1585850-391-3.aspx

    So it should only be an issue if there is a chance that row_number would exceed the boundaries of an int, correct?

    Don Simpson



    I'm not sure about Heisenberg.

  • Alan.B (6/30/2015)


    ...Based on my testing, the function appears to perform a bit faster using CHECKSUM over ROW_NUMBER() instead of the conversion to BIGINT in the TOP clause. ...

    So CHECKSUM over ROW_NUMBER() is used to convert the result of ROW_NUMBER() from BIGINT to INT because it's faster than CAST or CONVERT. Cool.

    TOP (without percent) uses BIGINT so an INT value passed in will be converted internally to BIGINT. I'd expect an internal conversion to be faster than an explicit conversion but if your testing shows otherwise then that's cool to know too.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • DonlSimpson (6/30/2015)


    Alan.B (6/30/2015)


    Based on my testing, the function appears to perform a bit faster using CHECKSUM over ROW_NUMBER() instead of the conversion to BIGINT in the TOP clause.

    http://www.sqlservercentral.com/Forums/Topic1585850-391-3.aspx

    So it should only be an issue if there is a chance that row_number would exceed the boundaries of an int, correct?

    Sorry that I missed this (the wife and I had a baby a few weeks ago).

    Yes, correct. And since we're dealing with an 8000 character limit/1 row_number per character, it will never get near BIGINT levels.

    Again, credit to Jeff Moden and Eirikur for that little trick.

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • Hi Alan,

    Excellent script and thanks a lot for this. I will definitely use it a lot. I would like to go through the script at leisure time and might have some questions later if that is OK with you. Can do some learning here. Now time is tight just about had time to test your script:-P:-P:-P:-P:-P:-P

    Manie Verster
    Developer
    Johannesburg
    South Africa

    I am happy because I choose to be happy.
    I just love my job!!!

  • manie (6/17/2016)


    Hi Alan,

    Excellent script and thanks a lot for this. I will definitely use it a lot. I would like to go through the script at leisure time and might have some questions later if that is OK with you. Can do some learning here. Now time is tight just about had time to test your script:-P:-P:-P:-P:-P:-P

    Thanks Marie. Questions are encouraged.

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • Hi Alan, great thanks for very useful function.

    Please, fix variables names for Case Sensitive instance: change @String on @string (or vice versa) and @Pattern on @pattern.

    I fixed this issue  here: https://github.com/ktaranov/sqlserver-kit/commit/b263d9aa7d30f35ff324ee26ca8b91172df72713

    Also, for CS instance, please fix PatExclude8K iTally alias.

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply