tsql query - Count the number of spaces in a string

  • How do I write a query that tells me how many spaces (...or any character for that matter) are in a particular string?

    Example:

    David H Rogers would return a 2 since there are 2 spaces.

  • This?

    DECLARE @String VARCHAR(100)

    ,@CharToFind VARCHAR(1)

    SET @String = 'AAAA BBBCB NNNNN NEEEEE ERERERERERE '

    SET @CharToFind = ' '

    SELECT CountOfCharsInTheString = DATALENGTH (@String) - DATALENGTH(REPLACE(@String,@CharToFind,''))

  • Or this.

    DECLARE @TestStr VARCHAR(20),

    @CharToFind CHAR(1);

    SET @TestStr = 'This is a string';

    SET @CharToFind = ' ';

    WITH

    e1(n) AS (SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1), -- 10 rows

    e2(n) AS (SELECT a.n FROM e1 a CROSS JOIN e1 b), -- 100 rows

    e4(n) AS (SELECT a.n FROM e2 a CROSS JOIN e2 b), -- 10,000 rows

    tally(n) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM e4)

    SELECT

    COUNT(SUBSTRING(@TestStr, n, 1))

    FROM

    tally

    WHERE

    SUBSTRING(@TestStr, n, 1) = @CharToFind

    AND n <= DATALENGTH(@TestStr);

  • Hmmm.... I wonder which is faster.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Shall we take a look at performance?

    SET NOCOUNT ON;

    IF object_id('tempdb..#testEnvironment') IS NOT NULL

    BEGIN

    DROP TABLE #testEnvironment;

    END;

    SELECT TOP 1000000 IDENTITY(INT,1,1) AS ID,

    REPLACE(CAST(NEWID() AS VARCHAR(36)),'-',' ') AS string

    INTO #testEnvironment

    FROM master.dbo.syscolumns sc1, master.dbo.syscolumns sc2, master.dbo.syscolumns sc3;

    CREATE CLUSTERED INDEX idx_clu_testEnvironment_ID ON #testEnvironment (ID);

    DECLARE @findMe CHAR(1) = ' ', @HOLDER INT;

    PRINT '========== BASELINE ==========';

    SET STATISTICS IO ON;

    SET STATISTICS TIME ON;

    SELECT @HOLDER = COUNT(*)

    FROM #testEnvironment;

    SET STATISTICS TIME OFF;

    SET STATISTICS IO OFF;

    PRINT REPLICATE('=',80);

    PRINT '========== DATALENGTH - DATALENGTH REPLACE ==========';

    SET STATISTICS IO ON;

    SET STATISTICS TIME ON;

    SELECT @HOLDER = DATALENGTH (string) - DATALENGTH(REPLACE(string,@findMe,''))

    FROM #testEnvironment;

    SET STATISTICS TIME OFF;

    SET STATISTICS IO OFF;

    PRINT REPLICATE('=',80);

    PRINT '========== TALLY ==========';

    SET STATISTICS IO ON;

    SET STATISTICS TIME ON;

    WITH e1(n) AS (SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1), -- 10 rows

    e2(n) AS (SELECT a.n FROM e1 a CROSS JOIN e1 b), -- 100 rows

    e4(n) AS (SELECT a.n FROM e2 a CROSS JOIN e2 b), -- 10,000 rows

    tally(n) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM e4)

    SELECT @HOLDER = COUNT(SUBSTRING(string, n, 1))

    FROM tally

    CROSS APPLY #testEnvironment

    WHERE SUBSTRING(string, n, 1) = @findMe

    AND n <= DATALENGTH(string);

    SET STATISTICS TIME OFF;

    SET STATISTICS IO OFF;

    PRINT REPLICATE('=',80);

    Results:

    ========== BASELINE ==========

    Table '#testEnvironment'. Scan count 5, logical reads 6818, physical reads 0, read-ahead reads 12, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 78 ms, elapsed time = 36 ms.

    ================================================================================

    ========== DATALENGTH - DATALENGTH REPLACE ==========

    Table '#testEnvironment'. Scan count 1, logical reads 6818, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 5641 ms, elapsed time = 5647 ms.

    ================================================================================

    ========== TALLY ==========

    Table 'Worktable'. Scan count 10000, logical reads 3283222, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table '#testEnvironment'. Scan count 1, logical reads 6818, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 22890 ms, elapsed time = 8734 ms.

    ================================================================================

    Of course, if we're going to look at performance, perhaps we should remember the bug with replace.

    With that in mind, I propose changing ColdCoffee's solution to this: -

    DECLARE @String VARCHAR(100), @CharToFind VARCHAR(1);

    SET @String = 'AAAA BBBCB NNNNN NEEEEE ERERERERERE ';

    SET @CharToFind = ' '

    SELECT CountOfCharsInTheString = DATALENGTH(@String) - DATALENGTH(REPLACE(@String COLLATE Latin1_General_BIN2, @CharToFind, ''));

    Let's check out the performance gain.

    SET NOCOUNT ON;

    IF object_id('tempdb..#testEnvironment') IS NOT NULL

    BEGIN

    DROP TABLE #testEnvironment;

    END;

    SELECT TOP 1000000 IDENTITY(INT,1,1) AS ID,

    REPLACE(CAST(NEWID() AS VARCHAR(36)),'-',' ') AS string

    INTO #testEnvironment

    FROM master.dbo.syscolumns sc1, master.dbo.syscolumns sc2, master.dbo.syscolumns sc3;

    CREATE CLUSTERED INDEX idx_clu_testEnvironment_ID ON #testEnvironment (ID);

    DECLARE @findMe CHAR(1) = ' ', @HOLDER INT;

    PRINT '========== BASELINE ==========';

    SET STATISTICS IO ON;

    SET STATISTICS TIME ON;

    SELECT @HOLDER = COUNT(*)

    FROM #testEnvironment;

    SET STATISTICS TIME OFF;

    SET STATISTICS IO OFF;

    PRINT REPLICATE('=',80);

    PRINT '========== DATALENGTH - DATALENGTH REPLACE ==========';

    SET STATISTICS IO ON;

    SET STATISTICS TIME ON;

    SELECT @HOLDER = DATALENGTH (string) - DATALENGTH(REPLACE(string,@findMe,''))

    FROM #testEnvironment;

    SET STATISTICS TIME OFF;

    SET STATISTICS IO OFF;

    PRINT REPLICATE('=',80);

    PRINT '========== DATALENGTH - DATALENGTH REPLACE COLLATE Latin1_General_BIN2 ==========';

    SET STATISTICS IO ON;

    SET STATISTICS TIME ON;

    SELECT @HOLDER = DATALENGTH (string) - DATALENGTH(REPLACE(string COLLATE Latin1_General_BIN2,@findMe,''))

    FROM #testEnvironment;

    SET STATISTICS TIME OFF;

    SET STATISTICS IO OFF;

    PRINT REPLICATE('=',80);

    ========== BASELINE ==========

    Table '#testEnvironment'. Scan count 5, logical reads 6818, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 109 ms, elapsed time = 35 ms.

    ================================================================================

    ========== DATALENGTH - DATALENGTH REPLACE ==========

    Table '#testEnvironment'. Scan count 1, logical reads 6818, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 5594 ms, elapsed time = 5643 ms.

    ================================================================================

    ========== DATALENGTH - DATALENGTH REPLACE COLLATE Latin1_General_BIN2 ==========

    Table '#testEnvironment'. Scan count 1, logical reads 6818, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 922 ms, elapsed time = 922 ms.

    ================================================================================

    The huge difference remains no matter what order you execute or how many times you run it.


    Forever trying to learn
    My blog - http://www.cadavre.co.uk/
    For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
    For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Nice one Cadavre!

    That's the same collation issue that I was trying to address the other day when timing REPLACE on another thread.

    Just one question though. Did Jeff Moden put you up to this? He's been telling me that he was hoping other people would take up for him and run this kind of performance testing instead of him always having to do it. 🙂

    I like your test harnesses.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • dwain.c (6/20/2012)


    Nice one Cadavre!

    That's the same collation issue that I was trying to address the other day when timing REPLACE on another thread.

    Just one question though. Did Jeff Moden put you up to this? He's been telling me that he was hoping other people would take up for him and run this kind of performance testing instead of him always having to do it. 🙂

    I like your test harnesses.

    He didn't put me up to it, no. But I started my career at the same time as I signed up to this site (within a couple of months), so I picked up a couple of good habits from him and a lot of others along the way.


    Forever trying to learn
    My blog - http://www.cadavre.co.uk/
    For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
    For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Cadavre (6/20/2012)


    dwain.c (6/20/2012)


    Nice one Cadavre!

    That's the same collation issue that I was trying to address the other day when timing REPLACE on another thread.

    Just one question though. Did Jeff Moden put you up to this? He's been telling me that he was hoping other people would take up for him and run this kind of performance testing instead of him always having to do it. 🙂

    I like your test harnesses.

    He didn't put me up to it, no. But I started my career at the same time as I signed up to this site (within a couple of months), so I picked up a couple of good habits from him and a lot of others along the way.

    I'm finding I'm picking up lots of good habits from this site too. 🙂

    And dissing a few bad ones.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Well, I doubt I'd be the developer I am today (or have the job I currently have) without Jeff and a lot of other people from this site (way too many to mention and I'd be bound to miss someone).

    This forum is a god-send, because it introduces you to so many extremely talented people that can in turn introduce you to new ideas.


    Forever trying to learn
    My blog - http://www.cadavre.co.uk/
    For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
    For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Cadavre (6/20/2012)


    Well, I doubt I'd be the developer I am today (or have the job I currently have) without Jeff and a lot of other people from this site (way too many to mention and I'd be bound to miss someone).

    This forum is a god-send, because it introduces you to so many extremely talented people that can in turn introduce you to new ideas.

    I completely agree with you Cadavre.

    I've just started my career in SQL Server. Joined this Forum and made my first post a month after I started working on SQL server.

    I am that young when it comes to Sql Server and am still learning.

    But, I must say that I have learnt a lot of(ie: a whole lot of) things from SSC posts and solutions provided by the very learned Gurus of SSC. 🙂

    Vinu Vijayan

    For better and faster solutions please check..."How to post data/code on a forum to get the best help" - Jeff Moden[/url] 😉

  • vinu512 (6/20/2012)


    Cadavre (6/20/2012)


    Well, I doubt I'd be the developer I am today (or have the job I currently have) without Jeff and a lot of other people from this site (way too many to mention and I'd be bound to miss someone).

    This forum is a god-send, because it introduces you to so many extremely talented people that can in turn introduce you to new ideas.

    I completely agree with you Cadavre.

    I've just started my career in SQL Server. Joined this Forum and made my first post a month after I started working on SQL server.

    I am that young when it comes to Sql Server and am still learning.

    But, I must say that I have learnt a lot of(ie: a whole lot of) things from SSC posts and solutions provided by the very learned Gurus of SSC. 🙂

    That is why Jeff is SQL God 😀

    p.s. I too learn/learned a lot from Jeff, bet this brings tear to his eye 😉

    Far away is close at hand in the images of elsewhere.
    Anon.

  • I'll need to study this one. Wow, thanks for your effort and time.

  • Cadavre (6/20/2012)


    Shall we take a look at performance?

    Out-freakin'-standing, Craig! Well done and spot on especially with bringing up the collation problem. If no one rose to the occasion for testing, I was going to but, like Dwain said, I've been trying to get someone else besides me to do the testing. Since you've done this so often, I could have guessed it would be you to take up the ol' "A Developer must not guess, a Developer must KNOW" banner. Very well done and, yeah, it brings a tear of happiness to this ol' man's eyes to see someone do it so well. My hat's off to you.

    I guess I'm going to have to ask the "which is faster" question more often instead of always doing it myself. Thank you VERY much for the effort you put into the test.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • David Burrows (6/20/2012)


    vinu512 (6/20/2012)


    Cadavre (6/20/2012)


    Well, I doubt I'd be the developer I am today (or have the job I currently have) without Jeff and a lot of other people from this site (way too many to mention and I'd be bound to miss someone).

    This forum is a god-send, because it introduces you to so many extremely talented people that can in turn introduce you to new ideas.

    I completely agree with you Cadavre.

    I've just started my career in SQL Server. Joined this Forum and made my first post a month after I started working on SQL server.

    I am that young when it comes to Sql Server and am still learning.

    But, I must say that I have learnt a lot of(ie: a whole lot of) things from SSC posts and solutions provided by the very learned Gurus of SSC. 🙂

    That is why Jeff is SQL God 😀

    p.s. I too learn/learned a lot from Jeff, bet this brings tear to his eye 😉

    Oh my... you folks are way too kind :blush:. I'm just passing what I can forward. Thank you for the kind thoughts. If it weren't for this forum, I'd likely be in the dark ages myself.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • That is why Jeff is SQL God

    +10

    [font="Courier New"]Looking for a Deadlock Victim Support Group..[/font]

Viewing 15 posts - 1 through 15 (of 21 total)

You must be logged in to reply to this topic. Login to reply