August 10, 2004 at 1:00 pm
Hi,
I'm relatively new to T-SQL as an intern at an insurance company. I try to do most of my learning through Google, but I wasn't getting the results I needed so I'm going to try this forum and hopefully you pros can answer my questions.
I have a table which simply contains NAME and ID. I have a series of filters that I need to apply to the name fields, i.e. remove certain formatting, words, etc. I'm not sure whether to use cursors or a loop seeing how so many sources speak of cursor's negative performance impacts. I can't even find a good example - most of the ones out there perform complex operations, while I'm trying to do something very simple.
Any ideas? Just point me in the right direction and I'll take over from there.
Todd
August 10, 2004 at 10:25 pm
Actually, with doing string manipulations on the NAME field and comparing, you're gonna be doing table scans anyways for most operations.
It all depends on what you're most comfortable with. I personally would just do a stored proc that ran a bunch of UPDATE queries as needed.
i.e.,
create procedure FixName as
update mytable set name=replace(name, '#','');
update mytable set name = null where ltrim(name) = '';
/* yada yada yada...*/
go
August 11, 2004 at 2:29 am
From your question, I presume the ID column is unique else pls modify the following code to push the data to a temporary column with an identity column and loop through.
Here is my code.
Declare @ID INT
WHILE
1 = 1
BEGIN
SELECT @ID= MIN(ID)
FROM Table_Name
WHERE (ID_Num > @ID_Num)
IF (@@rowcount = 0) OR (@ID IS NULL)
BREAK
SELECT NAME FROM Table_Name Where ID = @ID
END
-- WHILE Loop through each name
Thanks,
Ganesh
August 11, 2004 at 3:29 am
The broad heuristic is that if what you want to do can be done with set-based operations, do it with set-based operations; only use a cursor if you have to.
Depending on how simple the filters you want to apply are, you may or may not be able to code them in T-SQL to go into an UPDATE statement. And you may end up with lots of UPDATE statements to do all the work you want doing.
Having said to prefer set-based operations, if this is a one-time operation - cleaning data after an initial import, say - then don't be afraid to use a cursor if it looks like the easiest thing to do. If you're only running code once, it doesn't matter if its performance is not optimal
August 11, 2004 at 6:17 am
Thanks for the responses. You are correct,the ID field is the unique identifier. Also, time is not really of the issue, as I can easily start the procedure when I leave, which gives it all night to run (I have no idea how long this will be for ~5 million rows, but I'm going to be doing some testing on smaller tables to estimate it).
Is there actually a replace function, or did you just use that as if I had already coded it? I will look that up today - if there is, my life just got easier.
August 11, 2004 at 1:14 pm
I was able to do 90% of the corrections with the Replace command. However, there are a few corrections that have me stumped that I think I will need to implement cursors for. For instance, I need to change 'I B M' to 'IBM'. Any ideas?
August 11, 2004 at 1:25 pm
How many 'I B M' s do you have? Can you do a visual check to be sure no
' SANDI B McCOY ' or ' I B Martin ' records exist? Perhaps a replace where you placed a space before the 'I', and one after the 'M'? An additional update covering ' I B M,'?
Viewing 7 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply