DTS and Primary Keys
teareal_1
SSC-Enthusiastic (123 reputation)
Group: General Forum Members
Points: 123 Visits: 22
Here is my scenario:

I am converting records from various systems/databases into an MS SQL database. I retrieve a copy of the data on Friday afternoon and run my conversion process locally over the weekend. I then create a .bak file of the converted data, transfer it to the client machine, and restore it into a staging database. From there I DTS the data into the empty production database for the client to start using on Tuesday. They continue to use their old system over the weekend, so more data will have been entered after the Friday pull, and that data needs to be converted as well. Once they start using the new system they stop using the old one, and I pull from the old system one last time, processing only the records added after the Friday pull. I check the MAX(identity field) value from each table I will be loading and then reseed on my local machine with a gap of about 1,000 (based on the average number of records they process per day, usually fewer than 100). I process this new data set locally, transfer a .bak file to the client machine, restore it to the staging database, and then DTS that data into the production database.

My question: suppose their MAX() value was 10,250 and I reseed my local machine to 11,250 to accommodate any records added while I pull and process the final day's worth of data, and I end up with MAX() = 11,750. Now say the DTS process takes 30 seconds to run, and partway through, 250 records have transferred with 250 still to go. If a user enters a new record in that window, will SQL assign the new record an identity value of 10,251 (since the DTS process has not completed yet), or will it assign an identity value of 11,500, falling within the range of the data set I am transferring?

I know this may be a confusing scenario, but it is very important to know whether I should seek an alternate method of transferring this "gap" data.
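For reference, the reseed step I describe would look roughly like this (the table and column names here are just placeholders for my actual schema):

```sql
-- 1. On the client's production server, find the current high-water mark.
SELECT MAX(ArrestID) AS MaxID FROM dbo.Arrest;   -- e.g. 10250

-- 2. On my local conversion machine, reseed with a ~1000-value gap so
--    locally generated identity values cannot collide with rows the
--    client keeps adding to the old system over the weekend.
DBCC CHECKIDENT ('dbo.Arrest', RESEED, 11250);
-- The next identity value generated locally will be 11251.
```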
Sue_H
SSC-Forever (49K reputation)
Group: General Forum Members
Points: 49195 Visits: 12347

You can insert identity values into gaps that don't exist in the table by using SET IDENTITY_INSERT ON. Once IDENTITY_INSERT is OFF, the table goes back to using the value you set with the reseed.
The only time the seed changes from what you set is when an inserted value is greater than your reseed value. In that case it increments from the highest value.
All of that is explained in the help topic for SET IDENTITY_INSERT (Transact-SQL).
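A minimal illustration of that behavior (the table is made up for the example):

```sql
CREATE TABLE dbo.Demo (ID int IDENTITY(1,1) PRIMARY KEY, Val varchar(10));

INSERT dbo.Demo (Val) VALUES ('a');          -- generated ID = 1

DBCC CHECKIDENT ('dbo.Demo', RESEED, 100);   -- next generated ID will be 101

SET IDENTITY_INSERT dbo.Demo ON;
INSERT dbo.Demo (ID, Val) VALUES (50, 'b');  -- explicit value BELOW the seed
SET IDENTITY_INSERT dbo.Demo OFF;

INSERT dbo.Demo (Val) VALUES ('c');          -- still gets 101: 50 < 100, so
                                             -- the seed is unchanged

SET IDENTITY_INSERT dbo.Demo ON;
INSERT dbo.Demo (ID, Val) VALUES (200, 'd'); -- explicit value ABOVE the seed
SET IDENTITY_INSERT dbo.Demo OFF;

INSERT dbo.Demo (Val) VALUES ('e');          -- gets 201: the identity now
                                             -- increments from the highest
                                             -- explicit value inserted
```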

Sue




Steve Jones
SSC Guru (467K reputation)
Group: Administrators
Points: 467586 Visits: 20599
teareal_1 - Saturday, April 1, 2017 12:15 PM (quoted above)

I'm slightly unclear. You grab the MAX() in a query, then in the next statement do a reseed to 11,250, correct? Using DBCC CHECKIDENT to reseed? Once you do that, new records added using the identity property (not SET IDENTITY_INSERT) will start at the new seed value. This isn't affected by the work you perform with DTS (SSIS?). I assume you're using SET IDENTITY_INSERT ON in your process to avoid the identity overlap.


Follow me on Twitter: @way0utwest
Forum Etiquette: How to post data/code on a forum to get the best help
My Blog: www.voiceofthedba.com
teareal_1
SSC-Enthusiastic (123 reputation)
Group: General Forum Members
Points: 123 Visits: 22
Steve Jones - SSC Editor - Sunday, April 2, 2017 3:21 AM (quoted above)

Correct. I reseed using DBCC on my local copy. That way users can still add records on their production machine while I am processing the remainder of the records, and I understand that the new records I process locally will start with the new seed number. I push those records into the production environment on their machine with IDENTITY_INSERT ON. I just want to know whether, while the DTS process is running and adding those records, a user entering a record through the front-end application could be assigned a primary key value that falls within the range of numbers I am pushing up. It is very important because if I have three name records 1, 2, 3 that are linked to arrest records 4, 5, 6, I cannot use this method if the user could enter a name record after my record 1 is processed but before my record 2, creating a duplicate identity. It would throw a duplicate primary key error and, I assume, roll the database back to before the DTS run... but would that also delete the record entered by the user?
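On the rollback worry: one way to contain a failure, sketched here with placeholder table names, is to wrap the push in an explicit transaction. A duplicate-key error then rolls back only the load itself; a row the front-end application committed in its own transaction is unaffected, because SQL Server only rolls back work belonging to the failing transaction.

```sql
BEGIN TRY
    BEGIN TRANSACTION;

    SET IDENTITY_INSERT dbo.Name ON;
    INSERT dbo.Name (NameID, LastName)
        SELECT NameID, LastName FROM Staging.dbo.Name;
    SET IDENTITY_INSERT dbo.Name OFF;

    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    -- A PK violation here undoes only this load. A user's insert,
    -- committed separately by the application, is never rolled
    -- back along with it.
    IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION;
    THROW;
END CATCH;
```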

Steve Jones
SSC Guru (467K reputation)
Group: Administrators
Points: 467586 Visits: 20599
If I understand the timing, then yes. I don't believe a batch insert "reserves" any numbers, so it's possible.

If you are concerned, I'd do this:
- DBCC reseed to MAX + 1010 (to allow for some activity between querying MAX() and running the DBCC)
- run DTS
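That guard could be sketched like this (table and column names are placeholders; the MAX() query runs against the client's production database and the reseed runs on the local copy):

```sql
-- Capture the client's current maximum, then reseed the local copy
-- with a 1010-value cushion for inserts that happen in between.
DECLARE @max int, @newseed int;
SELECT @max = MAX(ArrestID) FROM dbo.Arrest;   -- on the client server
SET @newseed = @max + 1010;
DBCC CHECKIDENT ('dbo.Arrest', RESEED, @newseed);  -- on the local copy
```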

Follow me on Twitter: @way0utwest
Forum Etiquette: How to post data/code on a forum to get the best help
My Blog: www.voiceofthedba.com