SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Removing non-alphanumeric characters from a column using tally tables


Removing non-alphanumeric characters from a column using tally tables

Author
Message
sj999
sj999
SSC Rookie
SSC Rookie (41 reputation)SSC Rookie (41 reputation)SSC Rookie (41 reputation)SSC Rookie (41 reputation)SSC Rookie (41 reputation)SSC Rookie (41 reputation)SSC Rookie (41 reputation)SSC Rookie (41 reputation)

Group: General Forum Members
Points: 41 Visits: 33
I have a table TXNS
Column A is int (3)
Column B is varchar 10
I wish to remove all non-alphanumeric characters from Column B
using a tally table.
Lets set a limit of 20 rows to the tally table.
Any ideas?
surmountfear
surmountfear
Valued Member
Valued Member (58 reputation)Valued Member (58 reputation)Valued Member (58 reputation)Valued Member (58 reputation)Valued Member (58 reputation)Valued Member (58 reputation)Valued Member (58 reputation)Valued Member (58 reputation)

Group: General Forum Members
Points: 58 Visits: 20
PatIndex can be used
Cadavre
Cadavre
SSCrazy Eights
SSCrazy Eights (9.1K reputation)SSCrazy Eights (9.1K reputation)SSCrazy Eights (9.1K reputation)SSCrazy Eights (9.1K reputation)SSCrazy Eights (9.1K reputation)SSCrazy Eights (9.1K reputation)SSCrazy Eights (9.1K reputation)SSCrazy Eights (9.1K reputation)

Group: General Forum Members
Points: 9138 Visits: 8492
sj999 (10/16/2012)
I have a table TXNS
Column A is int (3)
Column B is varchar 10
I wish to remove all non-alphanumeric characters from Column B
using a tally table.
Lets set a limit of 20 rows to the tally table.
Any ideas?


You only need 10 rows, because your B column is VARCHAR(10).

Here's the sample data I used: -
SELECT X AS A, N AS B
INTO TXNS
FROM (VALUES(1,'r^$%^edsad'),(2,'fwsa38#'))a(X,N);



And here's the solution: -
WITH Tally(N) AS (SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) 
FROM(VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0))a(N))
SELECT a.A, a.B, (SELECT stripped
FROM TXNS b
CROSS APPLY (SELECT CASE WHEN PATINDEX('[A-Za-z0-9]',SUBSTRING(B,N,1)) = 1
THEN SUBSTRING(B,N,1)
ELSE '' END
FROM Tally
WHERE N <= LEN(B)) c(stripped)
WHERE a.A = b.A
FOR XML PATH(''), TYPE
).value('.', 'VARCHAR(10)') AS stripped
FROM TXNS a;



Note that I've used an inline CTE tally table rather than a physical one. This can be replaced if required.


Forever trying to learn

For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/

For better, quicker answers on SQL Server performance related questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/



If you litter your database queries with nolock query hints, are you aware of the side effects?
Try reading a few of these links...

(*) Missing rows with nolock
(*) Allocation order scans with nolock
(*) Consistency issues with nolock
(*) Transient Corruption Errors in SQL Server error log caused by nolock
(*) Dirty reads, read errors, reading rows twice and missing rows with nolock


Craig Wilkinson - Software Engineer
LinkedIn
Usman Butt
Usman Butt
SSC Eights!
SSC Eights! (914 reputation)SSC Eights! (914 reputation)SSC Eights! (914 reputation)SSC Eights! (914 reputation)SSC Eights! (914 reputation)SSC Eights! (914 reputation)SSC Eights! (914 reputation)SSC Eights! (914 reputation)

Group: General Forum Members
Points: 914 Visits: 820
Nice. I would have thought it the same way :-) But I think the self join can be omitted

;WITH Tally(N) AS (SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) 
FROM(VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0))a(N))
SELECT A,B,stripped
FROM TXNS b
CROSS APPLY (SELECT (SELECT CASE WHEN PATINDEX('[A-Za-z0-9]',SUBSTRING(B,N,1)) = 1
THEN SUBSTRING(B,N,1)
ELSE '' END
FROM Tally
WHERE N <= LEN(B)
FOR XML PATH(''), TYPE
).value('.', 'VARCHAR(10)') ) c(stripped)



For easier readability we can omit TYPE and value combination as well as we are dealing with only alphanumeric values.
Tom Thomson
Tom Thomson
One Orange Chip
One Orange Chip (26K reputation)One Orange Chip (26K reputation)One Orange Chip (26K reputation)One Orange Chip (26K reputation)One Orange Chip (26K reputation)One Orange Chip (26K reputation)One Orange Chip (26K reputation)One Orange Chip (26K reputation)

Group: General Forum Members
Points: 26068 Visits: 12500
All these methods using [A-Za-z0-9] recognise 139 characters (with the Latin1_General_ci_as collation).

Now personally I think that's right (although it's a rather complicated pattern to use for the job), but I have met a lot of people (all Americans, I think) who claim that there are only 62 alphanumeric characters and lose their cool when anyone suggests that there are more; I guess these people must insist on using the bin collation instead of a dictionary one, and use the complicated pattern because the simple one delivers 75 characters even with the bin collation.

Tom

dwain.c
dwain.c
SSCoach
SSCoach (17K reputation)SSCoach (17K reputation)SSCoach (17K reputation)SSCoach (17K reputation)SSCoach (17K reputation)SSCoach (17K reputation)SSCoach (17K reputation)SSCoach (17K reputation)

Group: General Forum Members
Points: 17949 Visits: 6431
L' Eomot Inversé (10/16/2012)
All these methods using [A-Za-z0-9] recognise 139 characters (with the Latin1_General_ci_as collation).

Now personally I think that's right (although it's a rather complicated pattern to use for the job), but I have met a lot of people (all Americans, I think) who claim that there are only 62 alphanumeric characters and lose their cool when anyone suggests that there are more; I guess these people must insist on using the bin collation instead of a dictionary one, and use the complicated pattern because the simple one delivers 75 characters even with the bin collation.


There are only 62 alphanumeric characters. The others in your signature don't count. :-P


My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.


Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
Splitting strings based on patterns can be fast!
My temporal SQL musings: Calendar Tables, an Easter SQL, Time Slots and Self-maintaining, Contiguous Effective Dates in Temporal Tables
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search