More than 1 alphanumeric chars in a string

  • Luis Cazares (3/12/2014)


    I might be trying to kill a fly with a cannon but this could be an option. It could be turned into an inLine Table-valued Function if needed to avoid writing everything if this code is needed somewhere else.

    CREATE TABLE #employee(Surname varchar(50), GivenName varchar(50)) ;

    WITH Employee as

    (

    select 'ABC' as Surname, 'x.yz' as GivenName union all

    select 'ABC', 'X.*YZ' union all

    select 'A*.BC', 'xyz' union all

    select 'ABc', 'x.y.z'

    )

    INSERT INTO #employee

    SELECT *

    FROM Employee;

    WITH E1(N) AS(

    SELECT N FROM (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0))E(N)

    ),

    E2(N) AS(

    SELECT a.N FROM E1 a, E1 b

    ),

    E4(N) AS(

    SELECT a.N FROM E2 a, E2 b

    )

    SELECT e.*

    FROM #employee e

    CROSS APPLY( SELECT TOP( SELECT MAX(strLen)

    FROM( VALUES(LEN( e.Surname)), (LEN( e.Surname))) x(strLen))

    ROW_NUMBER() OVER(ORDER BY (SELECT NULL))

    FROM E4) Tally(N)

    WHERE (SUBSTRING( e.Surname, N, 1) LIKE '[^0-9a-zA-z]'

    AND SUBSTRING( e.Surname, N + 1, 1) LIKE '[^0-9a-zA-z]')

    OR (SUBSTRING( e.GivenName, N, 1) LIKE '[^0-9a-zA-z]'

    AND SUBSTRING( e.GivenName, N + 1, 1) LIKE '[^0-9a-zA-z]') ;

    GO

    DROP TABLE #employee;

    CROSS APPLY was my first thought too but it doesn't seem to work for this case:

    with E1(N) AS(

    SELECT N FROM (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0))E(N)

    ),

    E2(N) AS(

    SELECT a.N FROM E1 a, E1 b

    ),

    E4(N) AS(

    SELECT a.N FROM E2 a, E2 b

    ),

    Employee as

    (

    select 'ABC' as Surname, 'x.yz' as GivenName union all

    select 'A.BC' as Surname, 'X.*YZ' as GivenName union all

    select 'A.D.*C' as Surname, 'X.*YZ' as GivenName union all

    select 'A*.BC.*.%.%.%' as Surname, 'xyz' as GivenName

    )

    SELECT e.*

    FROM Employee e

    CROSS APPLY( SELECT TOP( SELECT MAX(strLen)

    FROM( VALUES(LEN( e.Surname)), (LEN( e.Surname))) x(strLen))

    ROW_NUMBER() OVER(ORDER BY (SELECT NULL))

    FROM E4) Tally(N)

    WHERE (SUBSTRING( e.Surname, N, 1) LIKE '[^0-9a-zA-z]'

    AND SUBSTRING( e.Surname, N + 1, 1) LIKE '[^0-9a-zA-z]')

    OR (SUBSTRING( e.GivenName, N, 1) LIKE '[^0-9a-zA-z]'

    AND SUBSTRING( e.GivenName, N + 1, 1) LIKE '[^0-9a-zA-z]') ;


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • This also works, using the PatternSplitCM FUNCTION that Chris Morris contributed to the 4th article in my signature links.

    with Employee as

    (

    select 'ABC' as Surname, 'x.yz' as GivenName union all

    select 'A.BC' as Surname, 'X.*YZ' as GivenName union all

    select 'A.D.*C' as Surname, 'X.*YZ' as GivenName union all

    select 'A*.BC.*.%.%.%' as Surname, 'xyz' as GivenName

    )

    SELECT a.Surname, a.GivenName

    FROM Employee a

    WHERE EXISTS

    (

    SELECT 1

    FROM dbo.PatternSplitCM(a.Surname, '[^a-zA-Z0-9]')

    WHERE [Matched]=1 AND LEN(Item) > 1

    )

    OR EXISTS

    (

    SELECT 1

    FROM dbo.PatternSplitCM(a.GivenName, '[^a-zA-Z0-9]')

    WHERE [Matched]=1 AND LEN(Item) > 1

    );

    Timing between my two submissions also looks pretty close:

    SET STATISTICS TIME ON;

    with Employee as

    (

    select 'ABC' as Surname, 'x.yz' as GivenName union all

    select 'A.BC' as Surname, 'X.*YZ' as GivenName union all

    select 'A.D.*C' as Surname, 'X.*YZ' as GivenName union all

    select 'A*.BC.*.%.%.%' as Surname, 'xyz' as GivenName

    ),

    Tally (n) AS

    (

    SELECT TOP (SELECT CASE WHEN MAX(LEN(Surname)) > MAX(LEN(GivenName)) THEN MAX(LEN(Surname)) ELSE MAX(LEN(GivenName)) END FROM Employee)

    ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

    FROM sys.all_columns

    )

    SELECT a.Surname, a.GivenName

    FROM Employee a, Employee c, Employee d, Employee e, Employee f, Employee g, Employee h, Employee i, Employee j

    WHERE EXISTS

    (

    SELECT 1

    FROM Tally

    WHERE n < LEN(a.Surname) - 1 AND

    PATINDEX('%[^a-zA-Z0-9]%', SUBSTRING(a.Surname, n, 1)) > 0 AND

    PATINDEX('%[^a-zA-Z0-9]%', SUBSTRING(a.Surname, n+1, 1)) > 0

    )

    OR EXISTS

    (

    SELECT 1

    FROM Tally

    WHERE n < LEN(a.GivenName) - 1 AND

    PATINDEX('%[^a-zA-Z0-9]%', SUBSTRING(a.GivenName, n, 1)) > 0 AND

    PATINDEX('%[^a-zA-Z0-9]%', SUBSTRING(a.GivenName, n+1, 1)) > 0

    );

    SET STATISTICS TIME OFF;

    SET STATISTICS TIME ON;

    with Employee as

    (

    select 'ABC' as Surname, 'x.yz' as GivenName union all

    select 'A.BC' as Surname, 'X.*YZ' as GivenName union all

    select 'A.D.*C' as Surname, 'X.*YZ' as GivenName union all

    select 'A*.BC.*.%.%.%' as Surname, 'xyz' as GivenName

    )

    SELECT a.Surname, a.GivenName

    FROM Employee a, Employee c, Employee d, Employee e, Employee f, Employee g, Employee h, Employee i, Employee j

    WHERE EXISTS

    (

    SELECT 1

    FROM dbo.PatternSplitCM(a.Surname, '[^a-zA-Z0-9]')

    WHERE [Matched]=1 AND LEN(Item) > 1

    )

    OR EXISTS

    (

    SELECT 1

    FROM dbo.PatternSplitCM(a.GivenName, '[^a-zA-Z0-9]')

    WHERE [Matched]=1 AND LEN(Item) > 1

    );

    SET STATISTICS TIME OFF;


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • DECLARE @tbl TABLE

    (

    ID INT IDENTITY(1,1),

    Char1 NVARCHAR(100)

    )

    INSERT INTO @tbl

    select 'ABC x.yz'

    UNION

    select 'A.BC X.*YZ'

    UNION

    select 'A.BC *X.YZ'

    DECLARE @tblNumber TABLE

    (

    Number INT

    )

    INSERT INTO @tblNumber

    SELECT TOP 1000 ROW_NUMBER() OVER(ORDER BY S.Object_id )

    from sys.objects s,sys.objects s1

    select Char1 from @tbl,@tblNumber where PATINDEX('%[^A-Z0-9 ]%',SUBSTRING(Char1,Number,1))>0 AND Number<=LEN(Char1)

    GROUP BY Char1

    HAVING COUNT(Distinct Number)>1

    Regards,
    Mitesh OSwal
    +918698619998

  • Mitesh Oswal (3/13/2014)


    DECLARE @tbl TABLE

    (

    ID INT IDENTITY(1,1),

    Char1 NVARCHAR(100)

    )

    INSERT INTO @tbl

    select 'ABC x.yz'

    UNION

    select 'A.BC X.*YZ'

    UNION

    select 'A.BC *X.YZ'

    DECLARE @tblNumber TABLE

    (

    Number INT

    )

    INSERT INTO @tblNumber

    SELECT TOP 1000 ROW_NUMBER() OVER(ORDER BY S.Object_id )

    from sys.objects s,sys.objects s1

    select Char1 from @tbl,@tblNumber where PATINDEX('%[^A-Z0-9 ]%',SUBSTRING(Char1,Number,1))>0 AND Number<=LEN(Char1)

    GROUP BY Char1

    HAVING COUNT(Distinct Number)>1

    Unfortunately this does not meet the requirements. The OP is trying to find when there are 2 nonalphanumeric characters in a row. The excellent code you posted will only find those that have 2 non alphanumeric characters.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • DECLARE @tbl TABLE

    (

    ID INT IDENTITY(1,1),

    Char1 NVARCHAR(100)

    )

    INSERT INTO @tbl

    select 'ABC x.yz'

    UNION

    select 'A.BC X.*YZ'

    UNION

    select 'A.BC *X.YZ'

    DECLARE @tblNumber TABLE

    (

    Number INT

    )

    INSERT INTO @tblNumber

    SELECT TOP 1000 ROW_NUMBER() OVER(ORDER BY S.Object_id )

    from sys.objects s,sys.objects s1

    ;with cte as

    (

    select Char1,Number,ROW_NUMBER() OVER(PARTITION BY CHAR1 ORDER BY Number)RN from @tbl,@tblNumber where PATINDEX('%[^A-Z0-9 ]%',SUBSTRING(Char1,Number,1))>0 AND Number<=LEN(Char1)

    )

    select Distinct C.Char1 from cte c INNER JOIN Cte C2 ON C.Char1 = C2.Char1 AND C.RN+1 = C2.RN AND C.NUmber+1 = C2.NUmber

    Regards,
    Mitesh OSwal
    +918698619998

  • Try:

    SELECT

    *

    FROM

    #employee

    WHERE

    Surname LIKE '%[^0-9a-zA-Z][^0-9a-zA-Z]%'

    OR GivenName LIKE '%[^0-9a-zA-Z][^0-9a-zA-Z]%';

    GO

  • Crud post on my part... took it down. :blush:

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • hunchback (4/8/2014)


    Try:

    SELECT

    *

    FROM

    #employee

    WHERE

    Surname LIKE '%[^0-9a-zA-Z][^0-9a-zA-Z]%'

    OR GivenName LIKE '%[^0-9a-zA-Z][^0-9a-zA-Z]%';

    GO

    Heh... simplicity at it's best. The only thing you need to add is a % between the calculated characters and a COLLATE clause on each line. Nicely done and good reminder to keep it simple. Thanks.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 8 posts - 16 through 22 (of 22 total)

You must be logged in to reply to this topic. Login to reply