SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Two tables, exact same structure but with different names, behave differently while performing mass...


Two tables, exact same structure but with different names, behave differently while performing mass updates.

Author
Message
Jay-45986
Jay-45986
SSC-Enthusiastic
SSC-Enthusiastic (180 reputation)SSC-Enthusiastic (180 reputation)SSC-Enthusiastic (180 reputation)SSC-Enthusiastic (180 reputation)SSC-Enthusiastic (180 reputation)SSC-Enthusiastic (180 reputation)SSC-Enthusiastic (180 reputation)SSC-Enthusiastic (180 reputation)

Group: General Forum Members
Points: 180 Visits: 598
Hi all,

This has me stumped. I'm performing mass updates to a table. I'm using Data Masker by Redgate to de-identify the data. It's basically a wide update, involving 8+ columns. The table has around 662K+ rows. I have the constraints disabled, along with the indexes, except for the clustered indexed pk column. The tool takes around 22 rows/sec executation rate to updates all the data. Now, what has me stumped is that when I make an EXACT copy of the same table, with the exact number of indexes, rows and constraints and try the exactly same update, it takes 645 rows/sec execution rate.

So to summarize. One table, 662K rows, 22 rows/sec to update all the records. Exact same copy of the aforementioned table, the performance boost is ten folds and it's executing at much much higher rate of 645 rows/sec. Could there be a buffer issue related to the existing table? or anything else you all could think of?

Thanks.
Lynn Pettis
Lynn Pettis
SSC Guru
SSC Guru (273K reputation)SSC Guru (273K reputation)SSC Guru (273K reputation)SSC Guru (273K reputation)SSC Guru (273K reputation)SSC Guru (273K reputation)SSC Guru (273K reputation)SSC Guru (273K reputation)

Group: General Forum Members
Points: 273095 Visits: 41199
Jay-45986 - Tuesday, December 26, 2017 12:38 PM
Hi all,

This has me stumped. I'm performing mass updates to a table. I'm using Data Masker by Redgate to de-identify the data. It's basically a wide update, involving 8+ columns. The table has around 662K+ rows. I have the constraints disabled, along with the indexes, except for the clustered indexed pk column. The tool takes around 22 rows/sec executation rate to updates all the data. Now, what has me stumped is that when I make an EXACT copy of the same table, with the exact number of indexes, rows and constraints and try the exactly same update, it takes 645 rows/sec execution rate.

So to summarize. One table, 662K rows, 22 rows/sec to update all the records. Exact same copy of the aforementioned table, the performance boost is ten folds and it's executing at much much higher rate of 645 rows/sec. Could there be a buffer issue related to the existing table? or anything else you all could think of?

Thanks.


You could still be comparing apples to oranges. The data in the "fast" table may actually be stored differently than the data in the "slow" table. We have very little information other than the number of rows of data and updates involving 8+ columns. There could be a lot of data movement in the "slow" table and little data movement in the "fast" table.

Cool
Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
Jay-45986
Jay-45986
SSC-Enthusiastic
SSC-Enthusiastic (180 reputation)SSC-Enthusiastic (180 reputation)SSC-Enthusiastic (180 reputation)SSC-Enthusiastic (180 reputation)SSC-Enthusiastic (180 reputation)SSC-Enthusiastic (180 reputation)SSC-Enthusiastic (180 reputation)SSC-Enthusiastic (180 reputation)

Group: General Forum Members
Points: 180 Visits: 598
Lynn Pettis - Tuesday, December 26, 2017 1:00 PM
Jay-45986 - Tuesday, December 26, 2017 12:38 PM
Hi all,

This has me stumped. I'm performing mass updates to a table. I'm using Data Masker by Redgate to de-identify the data. It's basically a wide update, involving 8+ columns. The table has around 662K+ rows. I have the constraints disabled, along with the indexes, except for the clustered indexed pk column. The tool takes around 22 rows/sec executation rate to updates all the data. Now, what has me stumped is that when I make an EXACT copy of the same table, with the exact number of indexes, rows and constraints and try the exactly same update, it takes 645 rows/sec execution rate.

So to summarize. One table, 662K rows, 22 rows/sec to update all the records. Exact same copy of the aforementioned table, the performance boost is ten folds and it's executing at much much higher rate of 645 rows/sec. Could there be a buffer issue related to the existing table? or anything else you all could think of?

Thanks.


You could still be comparing apples to oranges. The data in the "fast" table may actually be stored differently than the data in the "slow" table. We have very little information other than the number of rows of data and updates involving 8+ columns. There could be a lot of data movement in the "slow" table and little data movement in the "fast" table.

Thank you for your reply Lynn.

To provide you more information. This is in a database that's on a sandbox and no other users and/or applications accessing it. It's just myself and I Smile To add more, both of these tables belong to the exact same file groups as well. There is 0 other activity in this database and into these tables, except for when I run queries to load and test.

Same # of rows, same exact structure (pks, indexes, constraints, filegroups etc.), 0 activity. Just the names are different.

Lynn Pettis
Lynn Pettis
SSC Guru
SSC Guru (273K reputation)SSC Guru (273K reputation)SSC Guru (273K reputation)SSC Guru (273K reputation)SSC Guru (273K reputation)SSC Guru (273K reputation)SSC Guru (273K reputation)SSC Guru (273K reputation)

Group: General Forum Members
Points: 273095 Visits: 41199
Jay-45986 - Tuesday, December 26, 2017 1:18 PM
Lynn Pettis - Tuesday, December 26, 2017 1:00 PM
Jay-45986 - Tuesday, December 26, 2017 12:38 PM
Hi all,

This has me stumped. I'm performing mass updates to a table. I'm using Data Masker by Redgate to de-identify the data. It's basically a wide update, involving 8+ columns. The table has around 662K+ rows. I have the constraints disabled, along with the indexes, except for the clustered indexed pk column. The tool takes around 22 rows/sec executation rate to updates all the data. Now, what has me stumped is that when I make an EXACT copy of the same table, with the exact number of indexes, rows and constraints and try the exactly same update, it takes 645 rows/sec execution rate.

So to summarize. One table, 662K rows, 22 rows/sec to update all the records. Exact same copy of the aforementioned table, the performance boost is ten folds and it's executing at much much higher rate of 645 rows/sec. Could there be a buffer issue related to the existing table? or anything else you all could think of?

Thanks.


You could still be comparing apples to oranges. The data in the "fast" table may actually be stored differently than the data in the "slow" table. We have very little information other than the number of rows of data and updates involving 8+ columns. There could be a lot of data movement in the "slow" table and little data movement in the "fast" table.

Thank you for your reply Lynn.

To provide you more information. This is in a database that's on a sandbox and no other users and/or applications accessing it. It's just myself and I Smile To add more, both of these tables belong to the exact same file groups as well. There is 0 other activity in this database and into these tables, except for when I run queries to load and test.

Same # of rows, same exact structure (pks, indexes, constraints, filegroups etc.), 0 activity. Just the names are different.


Well, we can't see what you see, and if EVERYTHING is identical, I don't buy it. Can you duplicate the problem in another sandbox database on the same server, a different server? Don't use detach/attach or backup/restore. Create the tables from scripts and move the data using T-SQL (same server) and Import/Export Wizard (SSIS, different servers).

Cool
Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
Jay-45986
Jay-45986
SSC-Enthusiastic
SSC-Enthusiastic (180 reputation)SSC-Enthusiastic (180 reputation)SSC-Enthusiastic (180 reputation)SSC-Enthusiastic (180 reputation)SSC-Enthusiastic (180 reputation)SSC-Enthusiastic (180 reputation)SSC-Enthusiastic (180 reputation)SSC-Enthusiastic (180 reputation)

Group: General Forum Members
Points: 180 Visits: 598
These are quiet large databases, terabytes of data. So recreating it by scripting out the objects and then loading the data using SSIS or TSQL can be extremely time consuming and error-prone.

The idea is to de-identify this data in the lower environments (dev, staging, uat etc) I will be given a backup of a prod db, which is this exact database as the db/table(s) in question. I restore it and then use the tool to de-identify any PII or PHI data. I simply do not have the luxury of recreating and reloading everything, as these multiple requests would be coming in down the pipeline to restore/purge data on daily basis.

I have however, restored the exact same db and repeated the steps detailed in the original post on totally different box, with the same results.

I just wanted to know, whether anybody has dealt with something similar in the past? So let's just say for arguments sake that I am correct with having two tables (one existing, one new), same exact structure, same # of rows, but they behave differently while being mass loaded. Does it make sense? I just want to make sure that I haven't left a stone unturned with respect to troubleshooting this.

If you would like to see the DDL of the tables, I could share that as well.
Lynn Pettis
Lynn Pettis
SSC Guru
SSC Guru (273K reputation)SSC Guru (273K reputation)SSC Guru (273K reputation)SSC Guru (273K reputation)SSC Guru (273K reputation)SSC Guru (273K reputation)SSC Guru (273K reputation)SSC Guru (273K reputation)

Group: General Forum Members
Points: 273095 Visits: 41199
The only time I have seen two identical tables behave differently came down to the fact the data was not stored identical between the two tables. The data distribution was different even though the data was the same.

Cool
Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
Subramaniam Chandrasekar
Subramaniam Chandrasekar
SSCommitted
SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)

Group: General Forum Members
Points: 1696 Visits: 493
Jay-45986 - Tuesday, December 26, 2017 1:56 PM
These are quiet large databases, terabytes of data. So recreating it by scripting out the objects and then loading the data using SSIS or TSQL can be extremely time consuming and error-prone.

The idea is to de-identify this data in the lower environments (dev, staging, uat etc) I will be given a backup of a prod db, which is this exact database as the db/table(s) in question. I restore it and then use the tool to de-identify any PII or PHI data. I simply do not have the luxury of recreating and reloading everything, as these multiple requests would be coming in down the pipeline to restore/purge data on daily basis.

I have however, restored the exact same db and repeated the steps detailed in the original post on totally different box, with the same results.

I just wanted to know, whether anybody has dealt with something similar in the past? So let's just say for arguments sake that I am correct with having two tables (one existing, one new), same exact structure, same # of rows, but they behave differently while being mass loaded. Does it make sense? I just want to make sure that I haven't left a stone unturned with respect to troubleshooting this.

If you would like to see the DDL of the tables, I could share that as well.

Were they any rebuild indexes that had happened on the original table ? (22 rows/sec), If not, Check for any backup jobs or maintenance jobs activity was happened at the same time of execution ? Could you please retry your operation and let us know whether you've got the same results on Original (22 rows/sec) & Duplicate ( 645 rows/sec).

Jay-45986
Jay-45986
SSC-Enthusiastic
SSC-Enthusiastic (180 reputation)SSC-Enthusiastic (180 reputation)SSC-Enthusiastic (180 reputation)SSC-Enthusiastic (180 reputation)SSC-Enthusiastic (180 reputation)SSC-Enthusiastic (180 reputation)SSC-Enthusiastic (180 reputation)SSC-Enthusiastic (180 reputation)

Group: General Forum Members
Points: 180 Visits: 598
Lynn Pettis - Tuesday, December 26, 2017 2:03 PM
The only time I have seen two identical tables behave differently came down to the fact the data was not stored identical between the two tables. The data distribution was different even though the data was the same.

Would you mind explaining why should that make any difference?

To answer your question, no it's not stored any differently. Both of the tables have the data stored ordered by the bigint PK. The rows are ordered and match between the two tables based on the identity column.

Jay-45986
Jay-45986
SSC-Enthusiastic
SSC-Enthusiastic (180 reputation)SSC-Enthusiastic (180 reputation)SSC-Enthusiastic (180 reputation)SSC-Enthusiastic (180 reputation)SSC-Enthusiastic (180 reputation)SSC-Enthusiastic (180 reputation)SSC-Enthusiastic (180 reputation)SSC-Enthusiastic (180 reputation)

Group: General Forum Members
Points: 180 Visits: 598
subramaniam.chandrasekar - Tuesday, December 26, 2017 10:55 PM
Jay-45986 - Tuesday, December 26, 2017 1:56 PM
These are quiet large databases, terabytes of data. So recreating it by scripting out the objects and then loading the data using SSIS or TSQL can be extremely time consuming and error-prone.

The idea is to de-identify this data in the lower environments (dev, staging, uat etc) I will be given a backup of a prod db, which is this exact database as the db/table(s) in question. I restore it and then use the tool to de-identify any PII or PHI data. I simply do not have the luxury of recreating and reloading everything, as these multiple requests would be coming in down the pipeline to restore/purge data on daily basis.

I have however, restored the exact same db and repeated the steps detailed in the original post on totally different box, with the same results.

I just wanted to know, whether anybody has dealt with something similar in the past? So let's just say for arguments sake that I am correct with having two tables (one existing, one new), same exact structure, same # of rows, but they behave differently while being mass loaded. Does it make sense? I just want to make sure that I haven't left a stone unturned with respect to troubleshooting this.

If you would like to see the DDL of the tables, I could share that as well.

Were they any rebuild indexes that had happened on the original table ? (22 rows/sec), If not, Check for any backup jobs or maintenance jobs activity was happened at the same time of execution ? Could you please retry your operation and let us know whether you've got the same results on Original (22 rows/sec) & Duplicate ( 645 rows/sec).

Indexes are disabled on BOTH of the tables, including any and all fk constraints.

drew.allen
drew.allen
SSC-Forever
SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)

Group: General Forum Members
Points: 45187 Visits: 14686
Jay-45986 - Wednesday, December 27, 2017 7:17 AM
Lynn Pettis - Tuesday, December 26, 2017 2:03 PM
The only time I have seen two identical tables behave differently came down to the fact the data was not stored identical between the two tables. The data distribution was different even though the data was the same.

Would you mind explaining why should that make any difference?

To answer your question, no it's not stored any differently. Both of the tables have the data stored ordered by the bigint PK. The rows are ordered and match between the two tables based on the identity column.

You're forgetting about fragmentation. The old table is likely to be fragmented and the new table is not. Even with exactly the same indexes and rows, the two tables are not the same.

Drew


J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
How to post data/code on a forum to get the best help.
How to Post Performance Problems
Make sure that you include code in the appropriate IFCode tags, e.g. [code=sql]<your code here>[/code]. You can find the IFCode tags under the INSERT options when you are writing a post.
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum








































































































































































SQLServerCentral


Search