CHARINDEX doesn't find double spaces

  • Hi all

    I have a very strange situation where I have a table containing data that definitely has records having double spaces.

    Take the example below. (btw, the table contains addresses and associated data ie. postcodes, localities.)

    Record ctr Address_Detail

    ============================

    7271 LOT 42 ROBERTSON RD

    3205 LOT 16 SCOMAZZON RD

    One small issue too...I just realised when I complete this post, there must be some smart checking going on that removes the double spaces between 42 ROBERTSON RD and 16 SCOMAZZON RD. (so trust me, there are doubles spaces here!!)

    If I use the syntax:

    select address_ctr, address_detail, CHARINDEX(' ',address_detail) as position from usrAddrWork where CHARINDEX(' ',address_detail)>0

    I get no records returned.

    However, If I execute the following statement but substitute the actual string into the charindex call:

    select charindex(' ','LOT 42 ROBERTSON RD')

    I get the output of 7 which is where the double space is.

    One last thing.

    If I change my script to

    select address_ctr, address_detail, CHARINDEX(' ',address_detail) as position from usrAddrWork where CHARINDEX('BERT',address_detail)>0

    I get quite a number of records returned so it seems the problem is confined to finding double spaces.

    I've also tried using CHAR(32)+CHAR(32) and a number of other alternatives to name a few...

    I'm running SQL Server Management Console 2008 R2 and the collation for the database is "Latin1_General_CI_AS"

    Has anyone else come across this problem....

    Other details regarding build follow...

    Microsoft SQL Server Management Studio10.50.1600.1

    Microsoft Data Access Components (MDAC)6.1.7600.16385

    Microsoft MSXML3.0 4.0 6.0

    Microsoft Internet Explorer8.0.7600.16385

    Microsoft .NET Framework2.0.50727.4927

    Operating System6.1.7600

    Grant WILLIAMS

    Technical Consultant - TechnologyOne

    Queensland AUSTRALIA

  • Would it be possible for you to post some example DDL and sample data INSERTs? This will allow us to help you in the specific context of your situation.

    --SJT--

  • Using the following I could not duplicate your problem.

    CREATE TABLE #TestTempTab

    (Col1 varchar(50) COLLATE Latin1_General_CI_AS)

    INSERT INTO #TestTempTab

    SELECT 'LOT 42 ROBERTSON RD'

    --------1234567890'

    select Col1, charindex(' ',Col1) FROM #TestTempTab

    Results:

    Col1 (No column name)

    LOT 42 ROBERTSON RD7

    Because tempdb uses the default server collation, one additional question... what collation is being used by your TempDB

    from BOL

    Specify that the temporary table column use the default collation of the user database, not tempdb. This enables the temporary table to work with similarly formatted tables in multiple databases, if that is required of your system.

    Sorry I could not be any real assistance.....

    If everything seems to be going well, you have obviously overlooked something.

    Ron

    Please help us, help you -before posting a question please read[/url]
    Before posting a performance problem please read[/url]

  • Hi all

    I've found the issue so here's the answer with very RED cheeks

    When I output (using a select statement) those records that "appear" to have a double space into a query grid, I copy the output into the query window and the output looks like it has two spaces.

    I tried running the following code but this time, I extracted the field using a sql select and found that the double space is in fact char(13) char(10).

    btw, the code below simply returns the ascii value of each character in the record.

    Many appologies.

    declare @STR nvarchar(100), @i int, @C char(1)

    select @STR = address_detail

    from usrAddrWork

    where address_ctr in (3205)

    -- where address_ctr in (957)

    -- where address_ctr in (3220)

    -- where address_ctr in (4999)

    -- where address_ctr in (7271)

    select @STR

    select @i = 1

    select @C = LEFT(@str,1)

    while @i<LEN(@str)+1

    begin

    select ascii(@c) as ascii_val, @C as char, @i as pos

    select @i = @i + 1

    select @C = SUBSTRING(@str,@i,1)

    end

  • Hi Grant,

    I know your code is for a one-off test so performance may not matter, but I thought I'd show you a way to get rid of the While loop because, someday, it could matter on other things.

    --===== This just sets up the test

    DECLARE @WhackoString VARCHAR(8000)

    ;

    SELECT @WhackoString = '

    This is one sentance.

    This is another but it has a couple of tabs in it.

    '

    ;

    --===== This does similar to what the While loop does.

    SELECT Position = t.N,

    [Character] = SUBSTRING(@WhackoString,t.N,1),

    Ascii_Value = ASCII(SUBSTRING(@WhackoString,t.N,1))

    FROM dbo.Tally t

    WHERE t.N BETWEEN 1 AND DATALENGTH(@WhackoString)

    ORDER BY t.N

    ;

    If you don't know what a Tally Table is or you don't know how it can be used to replace certain WHILE loops, please see the following article...

    http://www.sqlservercentral.com/articles/T-SQL/62867/

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thank you for this - after an hour of bashing my head against the wall struggling with this I also solved this issue.

    For me the issue was that SSMS Results To Grid outputted a genuine double space. I almost always use this output. Eventually I found that Results To Text outputted a CRLF.

    ASCII "20 20" in grid view versus "0D 0A" in text... Now I know!
    Hope this saves you time

  • Oli Winfield - Monday, July 17, 2017 5:25 AM

    Thank you for this - after an hour of bashing my head against the wall struggling with this I also solved this issue.

    For me the issue was that SSMS Results To Grid outputted a genuine double space. I almost always use this output. Eventually I found that Results To Text outputted a CRLF.

    ASCII "20 20" in grid view versus "0D 0A" in text... Now I know!
    Hope this saves you time

    CRLF is not the only cause of similar issues.
    TAB's, special formatting from Word and Excel, other non-printable characters - they all can cause similar confusions.

    When I face an oddity with a string which cannot be explained I add CONVERT(VARBINARY(NNN), String) to my query - and it usually explains everything.

    _____________
    Code for TallyGenerator

  • Sergiy - Tuesday, August 1, 2017 10:32 PM

    Oli Winfield - Monday, July 17, 2017 5:25 AM

    Thank you for this - after an hour of bashing my head against the wall struggling with this I also solved this issue.

    For me the issue was that SSMS Results To Grid outputted a genuine double space. I almost always use this output. Eventually I found that Results To Text outputted a CRLF.

    ASCII "20 20" in grid view versus "0D 0A" in text... Now I know!
    Hope this saves you time

    CRLF is not the only cause of similar issues.
    TAB's, special formatting from Word and Excel, other non-printable characters - they all can cause similar confusions.

    When I face an oddity with a string which cannot be explained I add CONVERT(VARBINARY(NNN), String) to my query - and it usually explains everything.

    The one that most regularly gets me is character 160 - the non-breaking space.  It looks like a space, is the same length as a space, but doesn't match when compared to a space...

    Thomas Rushton
    blog: https://thelonedba.wordpress.com

  • ThomasRushton - Wednesday, August 2, 2017 4:36 AM

    Sergiy - Tuesday, August 1, 2017 10:32 PM

    Oli Winfield - Monday, July 17, 2017 5:25 AM

    Thank you for this - after an hour of bashing my head against the wall struggling with this I also solved this issue.

    For me the issue was that SSMS Results To Grid outputted a genuine double space. I almost always use this output. Eventually I found that Results To Text outputted a CRLF.

    ASCII "20 20" in grid view versus "0D 0A" in text... Now I know!
    Hope this saves you time

    CRLF is not the only cause of similar issues.
    TAB's, special formatting from Word and Excel, other non-printable characters - they all can cause similar confusions.

    When I face an oddity with a string which cannot be explained I add CONVERT(VARBINARY(NNN), String) to my query - and it usually explains everything.

    The one that most regularly gets me is character 160 - the non-breaking space.  It looks like a space, is the same length as a space, but doesn't match when compared to a space...

    I've run into that exact character myself, usually from data originating from a web page somewhere along the line.

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply