Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase

Fastest way to blank all records in one char field Expand / Collapse
Author
Message
Posted Tuesday, October 30, 2012 3:15 PM
Mr or Mrs. 500

Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500

Group: General Forum Members
Last Login: Yesterday @ 5:25 AM
Points: 564, Visits: 1,457
We have a table with hundreds of thousands of social security numbers, which we want to blank. Table has 60 million records. I'm thinking changing the field from char(9) to char(1) then back again might be quick and just leave one digit in there. Quicker than an update statement? I'm running the script below on a test copy but expect it to take a very long time.

use dba_test
go
SET DEADLOCK_PRIORITY LOW

while (SELECT count(*) FROM [history] WHERE owner1ssn<>'')>0
begin
select top 500 idcode,owner1ssn into #tempdelete from otts_history where owner1ssn<>''
begin transaction
begin try
-- blank owner1ssn
update [history] set owner1ssn='' where idcode in (select idcode from #tempdelete)

commit
IF OBJECT_ID('tempdb..#tempdelete') IS NOT NULL
drop table #tempdelete
WAITFOR DELAY '00:00:10'
end try
begin catch
rollback
continue
end catch

end
go



Post #1379049
Posted Tuesday, October 30, 2012 5:22 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 2:32 PM
Points: 32,906, Visits: 26,792
What's the situation on idexes for the owner1ssn column? Do you have one and is it unique? Also, what is the PK of the table? That could turn out to be important depending on which method you use for deletion.

Last but not least, does the column have any foreign keys on it?


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/

For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
Post #1379090
Posted Tuesday, October 30, 2012 5:31 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 2:32 PM
Points: 32,906, Visits: 26,792
Also, here's what's going to happen if you try to change the column to CHAR(1) from T-SQL...

Msg 8152, Level 16, State 14, Line 1
String or binary data would be truncated.

The statement has been terminated.


If you do it in the designer mode from SSMS, it will copy the entire table which, for 60 million rows, will take a long time possbily blow your log file sky high.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/

For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
Post #1379091
Posted Tuesday, October 30, 2012 6:17 PM
Mr or Mrs. 500

Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500

Group: General Forum Members
Last Login: Yesterday @ 5:25 AM
Points: 564, Visits: 1,457
No indexes on that field. I decided to try just dropping the column and recreating it. Only issue was that it had to be recreated with NULLs allowed so I'd have to set all records to "" (blank) before an alter table command could be used to set it to Not NULL.




Post #1379111
Posted Wednesday, October 31, 2012 8:15 AM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 2:32 PM
Points: 32,906, Visits: 26,792
Indianrock (10/30/2012)
No indexes on that field. I decided to try just dropping the column and recreating it. Only issue was that it had to be recreated with NULLs allowed so I'd have to set all records to "" (blank) before an alter table command could be used to set it to Not NULL.



There's a more insidious problem with that. The reason why dropping the column appears to run so quickly is that it doesn't actually drop the column or remove any of the data. It simply makes it unavailable to queries and the like. The data is still there and could be extracted fairly easily (even from a backup file) by someone intent on doing so. In order to get rid of the actual data, you'd have to rebuild the clustered index. If there is no clustered index, you'd have to build one and then drop it (although building a narrow unique clustered index could help the other indexes a whole lot)


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/

For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
Post #1379334
Posted Wednesday, October 31, 2012 8:49 AM
Mr or Mrs. 500

Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500

Group: General Forum Members
Last Login: Yesterday @ 5:25 AM
Points: 564, Visits: 1,457
Thanks Jeff. There is a clustered primary key so I'll rebuild that. Since this will take a while I should probably drop all of the columns in question first ( 2 SSN fields, 2 date of birth fields and 2 driver's license fields -- only SSN has significant population ) , recreate them, then rebuild the clustered index. I have to do this on several DEV sql servers which have copies of our prod database, then on production. Yes having this data all over the dev environments is a big problem, and Safenet encryption is on its way, although a large part of the problem will be wiped out by what we're discussing here.

I think I can safely ignore the fact that the newly created columns are allowing nulls since this table contains data converted from our legacy environment, is never written to and rarely read. Plus, our application does not display the fields in question. The only other possible issue is that when the fields are recreated they wind up "at the bottom" of the column order for the table. Theoretically that shouldn't be an issue but I'll discuss it with our application developers.



Post #1379365
Posted Wednesday, October 31, 2012 2:38 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 2:32 PM
Points: 32,906, Visits: 26,792
There may be another problem with doing that. If any of the apps have inserts without the "insert column list", the columns will literally have changed position as well as ordinal number and the inserts will either insert into incorrect columns or cause the apps to fail if they are written in such a manner.

For production and considering all of the unknowns, it may very well be easier and, certainly, safer to build a crawler to do the updates a bit faster than the one you currently have.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/

For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
Post #1379529
Posted Wednesday, October 31, 2012 3:47 PM
Mr or Mrs. 500

Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500

Group: General Forum Members
Last Login: Yesterday @ 5:25 AM
Points: 564, Visits: 1,457
Thanks again. I think I'll hold off on changing the production database until this last conversion from legacy finishes November 30th. After that there will never be another insert into the table in question.
Most of our application code generates sql via a .NET ORM and stored procedures are not used for the most part.

After dropping, recreating the columns and rebuilding the clustered index on my test system, I logged into our browser-based application in the environment using that database and no problems pulling up records in that table.

thanks,
Randy



Post #1379554
Posted Wednesday, October 31, 2012 7:10 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 2:32 PM
Points: 32,906, Visits: 26,792
Good enough, Randy. Just wanted to make sure you knew all of the things that could go wrong so you could test them.

--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/

For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
Post #1379576
« Prev Topic | Next Topic »

Add to briefcase

Permissions Expand / Collapse