Recent PostsRecent Posts Popular TopicsPopular Topics
 Home Search Members Calendar Who's On

 Removing non-alphanumeric characters from a column using tally tables Rate Topic Display Mode Topic Options
Author
 Message
 Posted Tuesday, October 16, 2012 12:31 AM
 Forum Newbie Group: General Forum Members Last Login: Wednesday, October 17, 2012 12:49 AM Points: 7, Visits: 33
 I have a table TXNSColumn A is int (3)Column B is varchar 10I wish to remove all non-alphanumeric characters from Column Busing a tally table.Lets set a limit of 20 rows to the tally table.Any ideas?
Post #1373048
 Posted Tuesday, October 16, 2012 1:01 AM
 SSC Rookie Group: General Forum Members Last Login: Friday, February 28, 2014 2:00 AM Points: 47, Visits: 18
 PatIndex can be used
Post #1373054
 Posted Tuesday, October 16, 2012 1:12 AM
 SSCrazy Group: General Forum Members Last Login: Thursday, April 17, 2014 7:06 AM Points: 2,404, Visits: 7,311
 sj999 (10/16/2012)I have a table TXNSColumn A is int (3)Column B is varchar 10I wish to remove all non-alphanumeric characters from Column Busing a tally table.Lets set a limit of 20 rows to the tally table.Any ideas?You only need 10 rows, because your B column is VARCHAR(10).Here's the sample data I used: -`SELECT X AS A, N AS BINTO TXNSFROM (VALUES(1,'r^\$%^edsad'),(2,'fwsa38#'))a(X,N);`And here's the solution: -`WITH Tally(N) AS (SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) FROM(VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0))a(N))SELECT a.A, a.B, (SELECT stripped FROM TXNS b CROSS APPLY (SELECT CASE WHEN PATINDEX('[A-Za-z0-9]',SUBSTRING(B,N,1)) = 1 THEN SUBSTRING(B,N,1) ELSE '' END FROM Tally WHERE N <= LEN(B)) c(stripped) WHERE a.A = b.A FOR XML PATH(''), TYPE ).value('.', 'VARCHAR(10)') AS strippedFROM TXNS a;`Note that I've used an inline CTE tally table rather than a physical one. This can be replaced if required. Not a DBA, just trying to learnFor better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/If you litter your database queries with nolock query hints, are you aware of the side effects?Try reading a few of these links...(*) Missing rows with nolock(*) Allocation order scans with nolock(*) Consistency issues with nolock(*) Transient Corruption Errors in SQL Server error log caused by nolock(*) Dirty reads, read errors, reading rows twice and missing rows with nolockLinkedIn | Blog coming soon (for sufficiently large values of "soon" )!
Post #1373057
 Posted Tuesday, October 16, 2012 3:08 AM
 SSC-Addicted Group: General Forum Members Last Login: Thursday, March 14, 2013 11:00 AM Points: 444, Visits: 807
 Nice. I would have thought it the same way But I think the self join can be omitted `;WITH Tally(N) AS (SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) FROM(VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0))a(N))SELECT A,B,stripped FROM TXNS b CROSS APPLY (SELECT (SELECT CASE WHEN PATINDEX('[A-Za-z0-9]',SUBSTRING(B,N,1)) = 1 THEN SUBSTRING(B,N,1) ELSE '' END FROM Tally WHERE N <= LEN(B) FOR XML PATH(''), TYPE ).value('.', 'VARCHAR(10)') ) c(stripped)`For easier readability we can omit TYPE and value combination as well as we are dealing with only alphanumeric values.
Post #1373087
 Posted Tuesday, October 16, 2012 8:44 PM
 SSCrazy Eights Group: General Forum Members Last Login: Yesterday @ 8:37 AM Points: 8,287, Visits: 8,738
 All these methods using [A-Za-z0-9] recognise 139 characters (with the Latin1_General_ci_as collation). Now personally I think that's right (although it's a rather complicated pattern to use for the job), but I have met a lot of people (all Americans, I think) who claim that there are only 62 alphanumeric characters and lose their cool when anyone suggests that there are more; I guess these people must insist on using the bin collation instead of a dictionary one, and use the complicated pattern because the simple one delivers 75 characters even with the bin collation. Tom
Post #1373603
 Posted Tuesday, October 16, 2012 11:36 PM
 Hall of Fame Group: General Forum Members Last Login: Today @ 2:24 AM Points: 3,591, Visits: 5,100
 L' Eomot Inversé (10/16/2012)All these methods using [A-Za-z0-9] recognise 139 characters (with the Latin1_General_ci_as collation). Now personally I think that's right (although it's a rather complicated pattern to use for the job), but I have met a lot of people (all Americans, I think) who claim that there are only 62 alphanumeric characters and lose their cool when anyone suggests that there are more; I guess these people must insist on using the bin collation instead of a dictionary one, and use the complicated pattern because the simple one delivers 75 characters even with the bin collation.There are only 62 alphanumeric characters. The others in your signature don't count. My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!My thought question: Have you ever been told that your query runs too fast?My advice:INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.Need to UNPIVOT? Why not CROSS APPLY VALUES instead?Since random numbers are too important to be left to chance, let's generate some!Learn to understand recursive CTEs by example.Splitting strings based on patterns can be fast!
Post #1373631

 Permissions