SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 

DBCC CLONEDATABASE and sensitive data

Just over 2 months ago Service Pack 2 for SQL server 2014 was released. Included in this Service Pack is a new DBCC command, DBCC CLONEDATABASE. As the name suggest this command can be used to make a clone of your user database without the table data. Instead it creates a schema and statistics only copy of your database. For now it is only available in SQL server 2014 SP2 but my bet is it will probably be introduced in SQL Server 2016 as well when SP1 gets released.

Microsoft documented the use of the DBCC CLONEDATABASE command on MSDN (https://support.microsoft.com/en-us/kb/3177838) and it seems like a great option to create a copy of a user database for troubleshooting purposes. There is a catch however which prompted me to write this article, but before we get into that, let’s look at what DBCC CLONEDATABASE actually does.

DBCC CLONEDATABASE

What DBCC CLONEDATABASE does is create a read-only copy of a user database that only contains the database schema (tables, indexes, etc) and statistics. The fact that statistics are copied over to the clone database is incredibly useful for testing problematic queries that, for one reason or the other, are effected by statistics. The testing of performance issues related to the query optimizer seems to be one, if not the, reason Microsoft created this command as we can read on the MSDN page:

“Microsoft Customer Support Services may ask you to generate a clone of a database by using DBCC CLONEDATABASE in order to investigate a performance issue related to the query optimizer.”

To perform the cloning operation the following actions are performed by DBCC CLONEDATABASE:

  • Create a new database. The file layout of the source database is used but the file sizes are reset to their default values as specified in the model database;
  • Creates an internal snapshot of the source database;
  • Copies the internal metadata information from the source to the new target database;
  • Copies all schema information of all objects in the source database to the new database;
  • Copies statistics from the source to the new database.

By looking at these actions you can immediately see the advantages of using the command. It creates an exact copy of your user database regarding tables, indexes and statistics (and some other objects you can find on the MSDN page) but without table data and at a minimal file size.

Let’s look how we can run the command and take a look at the results.

The command expects only two parameters: source database and target database. By issuing the following command through Management Studio we can clone the AdventureWorks2014 database to the new AdventureWorks2014_CLONE database.

DBCC CLONEDATABASE (N'AdventureWorks2014', N'AdventureWorks2014_CLONE');

After issuing the command we get a nice message reminding us this is a cloned database that should not be used in a production environment.

clonedb_01

After refreshing the databases inside SSMS you can see a cloned copy is created. As you can see in the figure below, the database is set to Read-Only.

clonedb_02

If you so wish you can change the database to Read/Write and, for instance, change the compatibility level so you can test your query performance against a different level of the Cardinality Estimator since we would only need the statistics for that which are included in the cloned database.

Let’s select some data against the source database and the new cloned version of the database to check if no table data was cloned.

-- Select some records from the original database
USE AdventureWorks2014
GO
SELECT LoginID
FROM HumanResources.Employee
-- Select some records from the cloned database
USE AdventureWorks2014_CLONE
GO
SELECT LoginID
FROM HumanResources.Employee

This returned the following results:

clonedb_03

So far so good! We can run a query against the cloned database and just as promised to table data is inside the cloned database.

If we compare the indexes and statistics between the two databases we see that those are left intact in the cloned database.

clonedb_04

Now let’s move on to the “catch” of using this command….

The catch

Remembered that paragraph in the MSDN article of DBCC CLONEDATABASE by Microsoft? Don’t worry if you don’t here it is again:

“Microsoft Customer Support Services may ask you to generate a clone of a database by using DBCC CLONEDATABASE in order to investigate a performance issue related to the query optimizer.”

I can understand using DBCC CLONEDATABASE to supply a vendor with a copy of my database without any of my data inside it for troubleshooting purposes, and so far it looks like it’s a very safe way of giving my schema data without providing the possibly sensitive data inside my user database. BUT….by using DBCC CLONEDATABASE I still run the risk of providing a 3rd party with user data.

Since DBCC CLONEDATABASE also clones the statistics from the source database I can very easily extract information from the statistics about the data that was in the original table. Just take a look at the example below that requests statistics information against the cloned database for a specific index.

DBCC SHOW_STATISTICS ('HumanResources.Employee', 'AK_Employee_LoginID')

By just looking at the statistics histogram I can see exactly what data was inside the column the index is created on.

clonedb_05

As you can see I can read a maximum of 200 rows worth of data (the maximum amount of statistics histogram steps) that was inside the original table but removed in the cloned database.

In this case I looked at the loginID’s of AdventureWorks users, information that can pose a security risk, but imagine if I can look at even more sensitive information? Like addresses? Or phone numbers? As you can probably guess by now, if a column has an index, we can grab data from it through the cloned database.

Conclusion

DBCC CLONEDATABASE is a great new command to create a cloned copy of your database for troubleshooting and testing purposes without row data. The key point I want you to take from this article is that it isn’t necessarily an entirely safe options of providing a 3rd party with a copy of your database for troubleshooting purposes. Even though the row data is removed, through the statistics histogram it is still possible to retrieve row data. Please keep this in mind when you are asked to provide a cloned database through DBCC CLONEDATABASE.

DotNine SQL Server and more

I am a database specialist from the Netherland with my main focus on Microsoft SQL Server. My IT career began 14 years ago as a System Administrator. I have been working with SQL Server for more then 10 years now and a full time DBA for 5 years.

Comments

Leave a comment on the original post [www.dotnine.net, opens in a new window]

Loading comments...