String Character Count

  • Comments posted to this topic are about the item String Character Count

  • Well done on posting your script to share your knowledge.

    I would, however, like to suggest that 1) low row or character counts should never be used as a justification for slow code and 2) never use any form of RBAR for something so simple.

    We all know why the WHILE loop will be slow. It's RBAR. What a lot of people don't know is that using a recursive CTE to count is usually worse than even a WHILE loop and should never be used even for the smallest of counts. For more on that, please see the following article.

    http://www.sqlservercentral.com/Authors/Articles/Jeff_Moden/80567/

    Here's a method of doing what you did without the performance problems that exist when using a recursive CTE or WHILE loop. The code will easily handle an entire table column so you also don't have the logistic problems of expressing the RBAR of a recursive CTE or WHILE loop over more than one row.

    SELECT CharacterSymbol = SUBSTRING(st.SomeString,t.N,1)

    , CharacterCount = COUNT(*)

    FROM dbo.SomeTable st

    JOIN dbo.Tally t

    ON t.N <= DATALENGTH(st.SomeString)

    GROUP BY SUBSTRING(st.SomeString,t.N,1)

    ORDER BY CharacterSymbol

    ;

    If you don't know what a Tally Table is or how it can be used to replace certain types of loops, please see the following article.

    http://www.sqlservercentral.com/articles/T-SQL/62867/

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thanks Jeff. I will check the useful links for the Tally table. I have never used it before. Thanks again for enlightening me that these two methods will be slow but like I said the question came out of the blue and I had to provide a solution immediately. I really appreciate your effort in explaining this and I must admit I have read quite a lot of your articles and I find them enterprising.

  • I kinda figured it was something like that. Thanks for the feedback, Benson. And thanks for the reads on my articles. I aim to please. I sometimes miss but I'm always aiming. 😀

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Should be "FROM Listings AS s", I think

  • Seems to be missing the AS s aliasing.

  • Thanks I have updated and re-submitted the script.

  • Agreed, thanks for posting. I'm always happy to see someone introduced to the tally table.

    And thank you Jeff, for keeping an eye out for the opportunities to make those introductions.

    One point I'd like to make is that it makes a difference when the tally table is zero-based vs. one-based. Mine happens to be zero-based, so I need to filter out the 0 in this case:

    SELECT CharacterSymbol = SUBSTRING(st.Message,t.N,1)

    ,CharacterCount = COUNT(*)

    FROM (SELECT '1234567890,1234567890' AS Message) st

    JOIN [dbo].[Tally] t

    ON t.N BETWEEN 1 AND DATALENGTH(st.Message)

    GROUP BY SUBSTRING(st.Message,t.N,1)

    ORDER BY CharacterSymbol

  • Thanks, this could be very handy!

    Not all gray hairs are Dinosaurs!

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply