Remove Leading Zero(s) only if needed
Posted Wednesday, February 27, 2013 5:04 PM
Forum Newbie

Group: General Forum Members
Last Login: Friday, August 8, 2014 3:33 PM
Points: 3, Visits: 131
I have four tables with a varchar field that is populated by imports or by daily user updates. The field is an account number that always comes in with a length of 10, including leading zeros. My current code is:

UPDATE table.Statements
SET field03 = REPLACE(LTRIM(REPLACE(field03, '0', ' ')), ' ', '0')

The issue is that this runs against all four tables and takes around 5 minutes. After the code runs, I can see that it touched every record, which I don't want. I just want it to update the rows where this field actually needs to change and skip the rows that have already been trimmed.

I know I need some kind of IF statement to accomplish this, but I'm not sure how.

TIA, Scott
Post #1424814
Posted Wednesday, February 27, 2013 7:02 PM
Ten Centuries

Group: General Forum Members
Last Login: Yesterday @ 11:51 AM
Points: 1,040, Visits: 3,106
Try
UPDATE table.Statements
SET field03 = REPLACE(LTRIM(REPLACE(field03, '0', ' ')), ' ', '0')
WHERE field03 like '0%'

This will only attempt the update where field03 starts with a '0'. An index on this column would help.
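
If the column isn't already indexed, a minimal sketch of such an index (the schema, table, and index names below are just placeholders based on the original post):

-- Hypothetical supporting index; adjust the names to your environment
CREATE NONCLUSTERED INDEX IX_Statements_field03
ON [table].Statements (field03);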

However if you only have numbers in the string, I would probably go for a statement more like

UPDATE table.Statements
SET field03 = CAST(CAST(field03 as bigint) as varchar(20))
WHERE field03 like '0%'

Otherwise you could do

UPDATE table.Statements
SET field03 = SUBSTRING(field03,PATINDEX('%[^0]%',field03),20)
WHERE field03 like '0%'
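
For anyone not familiar with PATINDEX, a quick illustration of what the '%[^0]%' pattern returns (the sample value is made up):

-- PATINDEX finds the position of the first character that is not a zero
SELECT PATINDEX('%[^0]%', '0000123456');                              -- returns 5
SELECT SUBSTRING('0000123456', PATINDEX('%[^0]%', '0000123456'), 20); -- returns '123456'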

Post #1424842
Posted Wednesday, February 27, 2013 9:34 PM


SSC-Dedicated

Group: General Forum Members
Last Login: Yesterday @ 11:28 PM
Points: 35,263, Visits: 31,750
Scott Milburn (2/27/2013)
I have four tables with a varchar field that is populated by imports or by daily user updates. The field is an account number that always comes in with a length of 10, including leading zeros. My current code is:

UPDATE table.Statements
SET field03 = REPLACE(LTRIM(REPLACE(field03, '0', ' ')), ' ', '0')

The issue is that this runs against all four tables and takes around 5 minutes. After the code runs, I can see that it touched every record, which I don't want. I just want it to update the rows where this field actually needs to change and skip the rows that have already been trimmed.

I know I need some kind of IF statement to accomplish this, but I'm not sure how.

TIA, Scott


If Field03 is an account number column, I'm thinking that you probably have an index or two on it. The real problems may be that 1) you're updating both the table and the index(es) and 2) you're fragmenting the hell out of the index(es).

I'd recommend dropping the index(es), doing the update, and then re-adding the index(es). You're likely going to have to rebuild them after this anyway.
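
A rough sketch of that pattern (the index definition here is hypothetical; script out your real index definitions before dropping anything):

-- Hypothetical example using the placeholder names from the original post
DROP INDEX IX_Statements_field03 ON [table].Statements;

UPDATE [table].Statements
SET field03 = REPLACE(LTRIM(REPLACE(field03, '0', ' ')), ' ', '0')
WHERE field03 LIKE '0%';

CREATE NONCLUSTERED INDEX IX_Statements_field03
ON [table].Statements (field03);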


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1424882
Posted Wednesday, February 27, 2013 9:52 PM


SSC-Dedicated

Group: General Forum Members
Last Login: Yesterday @ 11:28 PM
Points: 35,263, Visits: 31,750
mickyT (2/27/2013)
However if you only have numbers in the string, I would probably go for a statement more like

UPDATE table.Statements
SET field03 = CAST(CAST(field03 as bigint) as varchar(20))
WHERE field03 like '0%'

Otherwise you could do

UPDATE table.Statements
SET field03 = SUBSTRING(field03,PATINDEX('%[^0]%',field03),20)
WHERE field03 like '0%'



Why? Please don't say "for performance reasons" until you've actually tested it. I tested all of the solutions offered so far and there's virtually no difference. Try it yourself. Here's a million row test table...

--===== Create and populate a large test table on-the-fly
SELECT TOP 1000000
Field03 = RIGHT('0000000000'+CAST(ABS(CHECKSUM(NEWID()))%2147483647+1 AS VARCHAR(10)),10)
INTO #TestTable
FROM sys.all_columns ac1
CROSS JOIN sys.all_columns ac2
;
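
To compare the candidates yourself, one simple approach is to time each one against a fresh copy of the test table; a minimal sketch (SET STATISTICS TIME ON is another option):

-- Rough timing sketch: copy the test data, time one candidate, repeat per method
SELECT Field03 INTO #Copy1 FROM #TestTable;

DECLARE @Start DATETIME2 = SYSDATETIME();

UPDATE #Copy1
SET Field03 = REPLACE(LTRIM(REPLACE(Field03, '0', ' ')), ' ', '0')
WHERE Field03 LIKE '0%';

SELECT DATEDIFF(ms, @Start, SYSDATETIME()) AS ElapsedMs;

DROP TABLE #Copy1;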



--Jeff Moden
Post #1424888
Posted Wednesday, February 27, 2013 10:43 PM
Ten Centuries

Group: General Forum Members
Last Login: Yesterday @ 11:51 AM
Points: 1,040, Visits: 3,106

Why? Please don't say "for performance reasons" until you've actually tested it. I tested all of the solutions offered so far and there's virtually no difference. Try it yourself. Here's a million row test table...


Sorry Jeff, I wasn't suggesting them for performance reasons, just as other ways of doing it. I found the method the OP originally posted difficult to decipher at first glance.
Post #1424905
Posted Wednesday, February 27, 2013 11:10 PM


SSC-Dedicated

Group: General Forum Members
Last Login: Yesterday @ 11:28 PM
Points: 35,263, Visits: 31,750
mickyT (2/27/2013)

Why? Please don't say "for performance reasons" until you've actually tested it. I tested all of the solutions offered so far and there's virtually no difference. Try it yourself. Here's a million row test table...


Sorry Jeff, I wasn't suggesting them for performance reasons, just other ways of doing it. I found the method the OP put up originally difficult to decipher at first glance.


Ah. Got it. Thanks for the feedback.


--Jeff Moden
Post #1424916
Posted Tuesday, August 20, 2013 2:43 PM


Mr or Mrs. 500

Group: General Forum Members
Last Login: Wednesday, September 24, 2014 11:34 AM
Points: 522, Visits: 235
Have you tried Val?
Post #1486441
Posted Tuesday, August 20, 2013 3:12 PM
Hall of Fame

Group: General Forum Members
Last Login: 2 days ago @ 11:09 PM
Points: 3,108, Visits: 11,502
Scott Milburn (2/27/2013)
I have four tables with a varchar field that is populated by imports or by daily user updates. The field is an account number that always comes in with a length of 10, including leading zeros. My current code is:

UPDATE table.Statements
SET field03 = REPLACE(LTRIM(REPLACE(field03, '0', ' ')), ' ', '0')

The issue is that this runs against all four tables and takes around 5 minutes. After the code runs, I can see that it touched every record, which I don't want. I just want it to update the rows where this field actually needs to change and skip the rows that have already been trimmed.

I know I need some kind of IF statement to accomplish this, but I'm not sure how.

TIA, Scott


If the majority of rows do not need to be updated, then you could just do this:
UPDATE table.Statements
SET field03 = REPLACE(LTRIM(REPLACE(field03, '0', ' ')), ' ', '0')
WHERE field03 <> REPLACE(LTRIM(REPLACE(field03, '0', ' ')), ' ', '0')

It would still have to look at every row, but the actual update would be avoided and that is the most expensive part of the process.
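
If you want a feel for how many rows would actually be touched before running it, a quick check along the same lines (using the placeholder names from the original post):

-- Count the rows whose value would actually change
SELECT COUNT(*) AS RowsNeedingTrim
FROM [table].Statements
WHERE field03 <> REPLACE(LTRIM(REPLACE(field03, '0', ' ')), ' ', '0');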



Post #1486447
Posted Tuesday, August 20, 2013 10:54 PM


Hall of Fame

Group: General Forum Members
Last Login: Yesterday @ 9:07 PM
Points: 3,420, Visits: 5,347
If anybody cares, here's my vote:

WITH CTE AS (
    SELECT Field03
    FROM #TestTable
    WHERE Field03 LIKE '0%')
UPDATE CTE
SET Field03 = REPLACE(LTRIM(REPLACE(Field03 COLLATE LATIN1_GENERAL_BIN, '0', ' ')), ' ', '0');


Using Jeff's test harness, the CTE seemed to give a slight performance boost over methods that didn't use the CTE to limit the target row set.

And the Binary collation sequence is a known "feature" of REPLACE:
http://connect.microsoft.com/SQLServer/feedback/details/512459/replace-function-extremely-slow-with-non-binary-windows-collation
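
Applied to the table from the original post, the same pattern would look something like this (schema, table, and column names are the OP's placeholders):

WITH CTE AS (
    SELECT field03
    FROM [table].Statements
    WHERE field03 LIKE '0%')
UPDATE CTE
SET field03 = REPLACE(LTRIM(REPLACE(field03 COLLATE LATIN1_GENERAL_BIN, '0', ' ')), ' ', '0');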

Edit: Attached my test harness in case anyone wants to offer a critique.



My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.


Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
Splitting strings based on patterns can be fast!


  Post Attachments 
UPDATE with Binary Collation.txt (0 views, 2.22 KB)
Post #1486510