Click here to monitor SSC
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Tip: Calculated Table Truncate


Tip: Calculated Table Truncate

Author
Message
simonholzman 14059
simonholzman 14059
Grasshopper
Grasshopper (13 reputation)Grasshopper (13 reputation)Grasshopper (13 reputation)Grasshopper (13 reputation)Grasshopper (13 reputation)Grasshopper (13 reputation)Grasshopper (13 reputation)Grasshopper (13 reputation)

Group: General Forum Members
Points: 13 Visits: 31
I have a large Production Database that I need to occasionally copy to Development, Test and Training Databases.

However, the Production Database is MUCH too big for those purposes and has data in it that should not be visible to Trainees etc.

Therefore, I wrote an 'Anonymization' script to anonymize some of the real data (much more representative and complete than the Test data is) and then to remove the remaining real data.

Initially, I just used DELETE FROM xxx WHERE [...] to remove the remaining data, but this was slow and created huge log files which caused the script to fail if it were not run in stages.

However, although TRUNCATE is much faster and does not log the changes, it does not normally allow any data to be kept.

The following script snippet shows how a table can have the data that you want copied to another table and then copied back into the original table after it has been Truncated. It should handle foreign key constraints okay since the Truncate does not check them but I make no promises ! When running this on multiple tables, ensure that the tables are processed in the correct order - usually the Primary table should be handled first and then the Secondary ones etc. Thus, in the example shown, the assumption is that the Person table has already been processed before processing the Sales table.


/*
SQL Script to conditionally Truncate a Table

It is straightforward to wrap a table selection
query around this so that it works for multiple
tables.

Valid @Update_Type values are;
I Ignore Table - Do not change the Table

T Trim Table - Remove the unnecessary records from the
Table (Copy, Truncate and Copy back)

uses @Column_Name If this is blank, the table will have
all records removed but will not be dropped
If this has a value, it will use @Linked_Table
and @Linked_Column to check if the value in
@Column_Name exists and will keep those records.

X Delete Table - First Truncate the Table and then Drop the Table
Faster than just Dropping the Table
since it does not Log the Transaction

Written By: S Holzman
Date: 08/22/2013
Amendments:

*/


DECLARE @Table_Name VARCHAR(128),
@Column_Name VARCHAR(128),
@Linked_Table VARCHAR(128),
@Linked_Column VARCHAR(128),
@Update_Type CHAR(1),
@Table_Order INTEGER,
@Field_Name VARCHAR(50),
@SQL_Script VARCHAR(MAX),
@Fields VARCHAR(MAX),
@Row_Count INTEGER


-- Specify the Database name to avoid the chance of
-- running this against the Production database

DECLARE @Database_Name VARCHAR(20)

SET @Database_Name = 'Test'


-- Setup an example

SET @Update_Type = 'T'
SET @Table_Name = 'Sales'
SET @Column_Name = 'Customer_ID'
SET @Linked_Table = 'Person'
SET @Linked_Column = 'Person_ID'
SET @Table_Order = 1


-- The following could be wrapped into a Cursor
-- to process multiple tables

SET @Fields = ''

DECLARE Field_Cursor CURSOR FOR
SELECT Name
FROM SysColumns WITH (nolock)
WHERE id IN (SELECT id FROM SysObjects WITH (nolock)
WHERE Name = @Table_Name
AND XType = 'U')
AND Name <> 'row_timestamp'
ORDER BY ColOrder

OPEN Field_Cursor


-- Perform the first fetch

FETCH NEXT FROM Field_Cursor
INTO @Field_Name

-- Check @@FETCH_STATUS to see if there are any more rows to fetch.

WHILE @@FETCH_STATUS = 0
BEGIN

SET @Fields = @Fields + '"' + @Field_Name + '", '

FETCH NEXT FROM Field_Cursor
INTO @Field_Name
END

CLOSE Field_Cursor
DEALLOCATE Field_Cursor

-- X - Truncate the Table and then Drop the Table
-- T with a blank Column Name - Truncate the Table
-- T with a valid Column Name - Copy Data to _Temp_Anon, Truncate the Table and then copy Data back

IF @Update_Type = 'T' AND @Column_Name IS NOT NULL AND DB_Name() = @Database_Name
BEGIN
SET @SQL_Script = 'SELECT ' + LEFT(@Fields, (LEN(@Fields) - 1))
+ ' INTO _Temp_Anon'
+ ' FROM ' + @Table_Name + ' WITH (nolock)'
+ ' WHERE ' + @Column_Name + ' IN (SELECT ' + @Linked_Column
+ ' FROM ' + @Linked_Table + ' WITH (nolock))'
PRINT @SQL_Script
EXEC (@SQL_Script)
END

IF DB_Name() = @Database_Name
BEGIN
SET @SQL_Script = 'TRUNCATE TABLE ' + @Table_Name
PRINT @SQL_Script
EXEC (@SQL_Script)
END

IF @Update_Type = 'X' AND DB_Name() = @Database_Name
BEGIN
SET @SQL_Script = 'DROP TABLE ' + @Table_Name
PRINT @SQL_Script
EXEC (@SQL_Script)
END

IF @Update_Type = 'T' AND @Column_Name IS NOT NULL
BEGIN
SELECT @Row_Count = COUNT(*) FROM _Temp_Anon WITH (nolock)

/*
-- Used for processing multiple Table;
-- the _Anonymizer_Tables table stores a list of Tables to process
-- and this updates it to ensure that each table is processed
-- in turn and to store the status

IF @Row_Count <> 0
BEGIN
UPDATE _Anonymizer_Tables
SET Records_Left = @Row_Count
WHERE Table_Name = @Table_Name
AND Column_Name = @Column_Name
AND Linked_Table = @Linked_Table
AND Linked_Column = @Linked_Column
AND Update_Type = @Update_Type
AND Table_Order = @Table_Order
AND Records_Left = 0
END
*/
END

IF @Update_Type = 'T' AND @Column_Name IS NOT NULL AND DB_Name() = @Database_Name
BEGIN
IF @Row_Count > 0
BEGIN
SET @SQL_Script = 'INSERT INTO ' + @Table_Name + ' (' + LEFT(@Fields, (LEN(@Fields) - 1)) + ')'
+ ' SELECT * FROM _Temp_Anon'
PRINT @SQL_Script
EXEC (@SQL_Script)

SET @SQL_Script = 'TRUNCATE TABLE _Temp_Anon'
PRINT @SQL_Script
EXEC (@SQL_Script)
END

SET @SQL_Script = 'DROP TABLE _Temp_Anon'
PRINT @SQL_Script
EXEC (@SQL_Script)
END


Jeff Moden
Jeff Moden
SSC-Forever
SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)

Group: General Forum Members
Points: 45191 Visits: 39925
It should handle foreign key constraints okay since the Truncate does not check them but I make no promises !


You can't truncate a table that has foreign keys involved on it. It won't let you.

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
Although they tell us that they want it real bad, our primary goal is to ensure that we dont actually give it to them that way.
Although change is inevitable, change for the better is not.
Just because you can do something in PowerShell, doesnt mean you should. Wink

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
jonysuise
jonysuise
SSC Journeyman
SSC Journeyman (78 reputation)SSC Journeyman (78 reputation)SSC Journeyman (78 reputation)SSC Journeyman (78 reputation)SSC Journeyman (78 reputation)SSC Journeyman (78 reputation)SSC Journeyman (78 reputation)SSC Journeyman (78 reputation)

Group: General Forum Members
Points: 78 Visits: 166
Hi simonholzman 14059, and welcome to the forum :-D

To delete a large number of rows switch the recovery model of your database (in your development environment) to bulk-log, run your script and switch back to full recovery. This should do with the log issues.

Anyway, if your tables are referenced by FKs you'll need to drop them before deleting rows and recreate them after, or, another approach would be to alter those constraints with the delete cascade option.

Hope it helps. Cool

Jonathan Bernardez Bernardez
___________________________________________________________
DBD. MCSA SQL Server 2012
Jeff Moden
Jeff Moden
SSC-Forever
SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)

Group: General Forum Members
Points: 45191 Visits: 39925
jonysuise (10/20/2013)
Hi simonholzman 14059, and welcome to the forum :-D

To delete a large number of rows switch the recovery model of your database (in your development environment) to bulk-log, run your script and switch back to full recovery. This should do with the log issues.


Careful now... while that may (as in might) deal with what you perceive to be "log issues", switching to "Bulk Logged" will possibly create other log issues insofar as the ability to do "point in time recovery" should you need to ever do a restore.

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
Although they tell us that they want it real bad, our primary goal is to ensure that we dont actually give it to them that way.
Although change is inevitable, change for the better is not.
Just because you can do something in PowerShell, doesnt mean you should. Wink

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
jonysuise
jonysuise
SSC Journeyman
SSC Journeyman (78 reputation)SSC Journeyman (78 reputation)SSC Journeyman (78 reputation)SSC Journeyman (78 reputation)SSC Journeyman (78 reputation)SSC Journeyman (78 reputation)SSC Journeyman (78 reputation)SSC Journeyman (78 reputation)

Group: General Forum Members
Points: 78 Visits: 166
Jeff Moden (10/20/2013)
jonysuise (10/20/2013)
Hi simonholzman 14059, and welcome to the forum :-D

To delete a large number of rows switch the recovery model of your database (in your development environment) to bulk-log, run your script and switch back to full recovery. This should do with the log issues.


Careful now... while that may (as in might) deal with what you perceive to be "log issues", switching to "Bulk Logged" will possibly create other log issues insofar as the ability to do "point in time recovery" should you need to ever do a restore.


Hi Jeff, thank you for your reply. I indeed forgot to mention that it's strongly recommended to backup the transaction log before and after the bulk operation, in order to enable point-in-time recovery.

Cheers ! Cool

Jonathan Bernardez Bernardez
___________________________________________________________
DBD. MCSA SQL Server 2012
Sean Lange
Sean Lange
SSCoach
SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)

Group: General Forum Members
Points: 16626 Visits: 17024
simonholzman 14059 (10/19/2013)


However, although TRUNCATE is much faster and does not log the changes, it does not normally allow any data to be kept.


This is not true. TRUNCATE is a fully logged operation. However unlike a delete it logs the page deallocations instead of rows. It must be logged in order to maintain ACID. Since it is logging page deallocations instead of rows it is much faster.

_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Moden's splitter.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs
Understanding and Using APPLY (Part 1)
Understanding and Using APPLY (Part 2)
Koen Verbeeck
Koen Verbeeck
SSCoach
SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)

Group: General Forum Members
Points: 16467 Visits: 13207
Jeff Moden (10/20/2013)
jonysuise (10/20/2013)
Hi simonholzman 14059, and welcome to the forum :-D

To delete a large number of rows switch the recovery model of your database (in your development environment) to bulk-log, run your script and switch back to full recovery. This should do with the log issues.


Careful now... while that may (as in might) deal with what you perceive to be "log issues", switching to "Bulk Logged" will possibly create other log issues insofar as the ability to do "point in time recovery" should you need to ever do a restore.


In this particular case - where the data is a copy from production and only part of the data is needed - I don't believe point in time restore is necessary. Personally I would even switch to simply recovery model (on the training databases that is, of course).



How to post forum questions.
Need an answer? No, you need a question.
What’s the deal with Excel & SSIS?

Member of LinkedIn. My blog at SQLKover.

MCSA SQL Server 2012 - MCSE Business Intelligence
Kurt W. Zimmerman
Kurt W. Zimmerman
Ten Centuries
Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)

Group: General Forum Members
Points: 1062 Visits: 1396
Koen Verbeeck (10/21/2013)
Jeff Moden (10/20/2013)
jonysuise (10/20/2013)
Hi simonholzman 14059, and welcome to the forum :-D

To delete a large number of rows switch the recovery model of your database (in your development environment) to bulk-log, run your script and switch back to full recovery. This should do with the log issues.


Careful now... while that may (as in might) deal with what you perceive to be "log issues", switching to "Bulk Logged" will possibly create other log issues insofar as the ability to do "point in time recovery" should you need to ever do a restore.


In this particular case - where the data is a copy from production and only part of the data is needed - I don't believe point in time restore is necessary. Personally I would even switch to simply recovery model (on the training databases that is, of course).


Since most development and UAT boxes often don't contain mission critical data I've always made it a practice to change the recovery model to Simple.

It is always a struggle to decide which way to go to provide enough good data in DEV & UAT boxes. Personally I like to take a complete copy of the production database and move my new changes in it to make a UAT box. For DEV I have either stripped out all transaction data or leave just enough behind to give enough for the developers to perform their unit testing.

I have utilized a variety of solutions to move VLD from server to server. If you can afford the room on your production box, and it is robust enough to deal with the purging process I'd consider making a copy of the database on the production box, change the recovery model of it to SIMPLE then start whacking and hacking the data down to what is needed elsewhere. When the purge is finished, backup/restore THAT version to the various locations as needed.

This process could get quite complex based on the direction you take. You can create a complete custom SSIS package that restores the limited set of data to your various locations.

There are quite a number of solutions to address. Choosing the one that meets your requirements is going to drive you into the right solution.

Kurt

Kurt W. Zimmerman
SR DBA
Lefrak Organization
New York, NY

http://www.linkedin.com/in/kurtwzimmerman
simonholzman 14059
simonholzman 14059
Grasshopper
Grasshopper (13 reputation)Grasshopper (13 reputation)Grasshopper (13 reputation)Grasshopper (13 reputation)Grasshopper (13 reputation)Grasshopper (13 reputation)Grasshopper (13 reputation)Grasshopper (13 reputation)

Group: General Forum Members
Points: 13 Visits: 31
Thank you all for your replies and comments. While my job title is DBA, I am more of a SQL Programmer and true Database Administration is a rare task for me, as I suspect it is for most people with that job title ! There simply aren't that many occasions where it is needed when one is managing a single production database.

Thus, please accept my apologies for any misunderstandings I have over exactly what is logged and which contstraints etc are affected by specific SQL Statements.

The purpose of the script is to easily and quickly remove most of the data, but not all of it, from a non-Production database that has been copied from Production.

Originally, I had used something along the lines of the following on about 30 tables;

DELETE FROM Sales
WHERE Customer_Person_ID NOT IN (SELECT Person_ID FROM Person WITH (nolock))



However, since the vast majority of the records are being deleted, this is very slow and, since it is logging every single record deleted (and there are millions of records in most of the tables), it was usually running out of log space unless I ran it in smaller chunks. It took about 24 hours in total to run and needed regular monitoring.

Using the 'Calculated Truncate' approach, it is only selecting the few records that are being kept and so it runs against thousands of tables and takes an hour to run in a single chunk.

Thus, it is faster, more reliable and MUCH more effective in shrinking the amount of data left in the database.

My hope is that this approach will help others with a similar problem.

However, if certain constraints will cause issues for some systems using the script as I have written it, please suggest improvements to the script so that it can dynamically remove those constraints and reimpose them once the data has been replaced in the original tables.
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search