Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase 12»»

Maintaining the Value of My Primary Key Question Expand / Collapse
Author
Message
Posted Monday, November 25, 2013 9:45 AM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Tuesday, August 5, 2014 1:52 PM
Points: 143, Visits: 412
I find the hardest rule to follow for Primary Keys is that once a record has its ID it can never change. It's all easy for me if the table is loaded incrementally, or is static. It is when I have to do a truncate and repopulate process that things get complicated for me. Most of the time it isn't an issue because I have some ID field I can use, but once in a while I come across a table where I need to create a surrogate key. I need to make sure that surrogate key never changes for that record.

I have a table of 200,000 records. The natural primary key for this table is the combination of columns A1, B2, C3 and D4. I want to create a surrogate key so that I can have one column for my primary key. Right now to do this, I concatenate the four columns that make up the natural primary keys. If those columns are char(20), then my surrogate key is now char(80).

The big question is: Is there another way to do this so that my surrogate key is smaller?

I know I could use the Identity option, but over time some records are added and some are deleted for whatever reason. I need to be able to recreate the table. If I recreate the table, I need to be able to ensure that the surrogate key is still assigned to the correct record and I can't do that if I use the Identity option.

Are there any functions out there that can take the values of those four columns and assign them a hexadecimal value or something?

Any suggestions?

Thanks
Post #1517346
Posted Monday, November 25, 2013 9:55 AM
Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Monday, August 18, 2014 3:04 AM
Points: 1,362, Visits: 15,269
adams.squared (11/25/2013)


I have a table of 200,000 records. The natural primary key for this table is the combination of columns A1, B2, C3 and D4. I want to create a surrogate key so that I can have one column for my primary key. Right now to do this, I concatenate the four columns that make up the natural primary keys. If those columns are char(20), then my surrogate key is now char(80).


Why do you want to create a surrogate key from these four columns instead of using them as they are and saving space?

adams.squared (11/25/2013)

I know I could use the Identity option, but over time some records are added and some are deleted for whatever reason. I need to be able to recreate the table. If I recreate the table, I need to be able to ensure that the surrogate key is still assigned to the correct record and I can't do that if I use the Identity option.


You don't have to rebuild the table and loose the primary key if its an identity. You can use IDENTITY_INSERT.
Post #1517349
Posted Monday, November 25, 2013 9:58 AM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Tuesday, August 5, 2014 1:52 PM
Points: 143, Visits: 412
The reason I am creating a surrogate is because I need a KEY field.

I can't use the identity because now and then records are added or removed. Using identity will not guarantee that the ID field is the same after the rebuild.
Post #1517351
Posted Monday, November 25, 2013 10:01 AM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Tuesday, August 5, 2014 1:52 PM
Points: 143, Visits: 412
Let me add a note here. This has to do with an ETL process. I'm taking the data from the source and creating a table on the destination. I want my key on the destination. I also do not have any control over the source.
Post #1517353
Posted Monday, November 25, 2013 10:16 AM
SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Yesterday @ 10:08 AM
Points: 448, Visits: 3,353
Use ROW_NUMBER() for surrogate key assignment and order on the natural key. As long as you receive the same data set each time the surrogates will be the same.

Better still, don't truncate your target table. Why would you want to do that? E.g.:
1. Populate a "landing" table with your snapshot of source data.
2. Join to the target table to identify only those rows that are newly arriving in the data set. Assign surrogates to the new rows.
3. Insert new rows to the target table.


David
Post #1517357
Posted Monday, November 25, 2013 10:21 AM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Tuesday, August 5, 2014 1:52 PM
Points: 143, Visits: 412
sqlvogel (11/25/2013)
Use ROW_NUMBER() for surrogate key assignment and order on the natural key. As long as you receive the same data set each time the surrogates will be the same.

Better still, don't truncate your target table. Why would you want to do that? E.g.:
1. Populate a "landing" table with your snapshot of source data.
2. Join to the target table to identify only those rows that are newly arriving in the data set. Assign surrogates to the new rows.
3. Insert new rows to the target table.


I'm not always receiving the same data set, so using row number does not work. For the second part, sometimes we have to rebuild the table.

The question is:
Is there a way to get a single value for the four columns, aside from concatenation and not using identity?


Post #1517359
Posted Monday, November 25, 2013 12:31 PM
SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Yesterday @ 10:08 AM
Points: 448, Visits: 3,353
adams.squared (11/25/2013)

The question is:
Is there a way to get a single value for the four columns, aside from concatenation and not using identity?


Sorry but if you care about what the numbers are then you clearly can't/shouldn't be using IDENTITY to generate them. If you want to rebuild the table while retaining the same numbers then for most practical purposes it doesn't make sense to use IDENTITY. IDENTITY is best avoided in data warehouse / data integration situations - at least as a way of generating keys for a target table. A SEQUENCE is somewhat better because you can generate the values independently of the table.

SEQUENCE or not, this is a common situation and it really isn't a difficult problem to solve but I'm not sure why you think you have to use an IDENTITY column to do it.


David
Post #1517426
Posted Monday, November 25, 2013 12:45 PM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Tuesday, August 5, 2014 1:52 PM
Points: 143, Visits: 412
sqlvogel (11/25/2013)
adams.squared (11/25/2013)

The question is:
Is there a way to get a single value for the four columns, aside from concatenation and not using identity?


Sorry but if you care about what the numbers are then you clearly can't/shouldn't be using IDENTITY to generate them. If you want to rebuild the table while retaining the same numbers then for most practical purposes it doesn't make sense to use IDENTITY. IDENTITY is best avoided in data warehouse / data integration situations - at least as a way of generating keys for a target table. A SEQUENCE is somewhat better because you can generate the values independently of the table.

SEQUENCE or not, this is a common situation and it really isn't a difficult problem to solve but I'm not sure why you think you have to use an IDENTITY column to do it.


I do not want to use Identity.
Post #1517431
Posted Monday, November 25, 2013 1:39 PM
SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Yesterday @ 4:13 PM
Points: 2,087, Visits: 3,137
You could create a separate "key master" table, containing only an identity column and the four natural key columns, clustered on the natural key columns, NOT the identity.

Then, the first time a given combination of
A1, B2, C3 and D4
values are to be inserted to the main table, you first insert a row into the "key master" table; then, when inserting to the main table, you lookup the identity key value to be used from the "key master" table.

The main table can also be clustered on the natural key columns, if that works best for the queries, while the PK could still be the single identity value.


SQL DBA,SQL Server MVP('07, '08, '09)

"While in these days of quiet desperation /
As I wander through the world in which I live /
I search everywhere for some new inspiration /
But it's more than cold reality can give /
If I need a cause for celebration /
Or a comfort I can use to ease my mind /
I rely on my imagination /
And I dream of an imaginary time" : the inimitable Mr. Billy Joel
Post #1517448
Posted Monday, November 25, 2013 2:46 PM
SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Yesterday @ 10:08 AM
Points: 448, Visits: 3,353
What Scott suggested is the way I would do it.

"Intelligent" numeric keys derived directly from other data (without any lookups) can only work in very limited circumstances I think - mainly those where the full set of data values is predictable and is within the limits of what can be enumerated by a numeric type. Calendar tables with numeric keys based on dates are an exceptional example.

Keys based on hashes of data are possible and maybe that is an option for you if you want another alternative.


David
Post #1517479
« Prev Topic | Next Topic »

Add to briefcase 12»»

Permissions Expand / Collapse