To Delete / Truncate / Drop & Create

Question

To Delete / Truncate / Drop & Create

Viewing 15 posts - 31 through 45 (of 49 total)

You must be logged in to reply to this topic. Login to reply

Ed Wagner SSC Guru Points: 287024 More actions · Answer 1

Jeff Moden - Monday, May 15, 2017 6:45 PM
jcelko212 32090 - Monday, May 15, 2017 1:34 PM
The major difference between delete and truncate is that delete is a standard SQL command, but it does logging so that the deletion can be backed out. Truncation, on the other hand, is an old Sybase extension that moves a pointer to the very start of the physical storage of the data, so the storage can be overwritten. It doesn't do any logging. It also doesn't do any porting that was important to you.
You might want to reconsider using identity as a key. It can never be a proper relational key since it's a count of physical insertion attempts on one machine, to one table in one particular SQL product. You'd be better off using the CREATE SEQUENCE structure instead. Essentially, this creates a generator for well-controlled numeric values in an increasing sequence. It is not a table property! Since its external it can be referenced in other places in your SQL.
You need to study SQL Server and T-SQL a bit more before saying such things, Joe. Truncate does, in fact, do logging. It logs the fact that the pages have been de-allocated and, if used in a transaction, can indeed be rolled back as with any logged action.
For single table usage, a SEQUENCE is no better than using IDENTITY and, in fact, can be worse because everyone has to remember to get the next value for a SEQUENCE. Of course, a SEQUENCE does make the infamous Invoice/Invoice Detail problem a bit easier and is the absolute berries when using a common incremental value across multiple tables, but that would also fly in your face as to what a surrogate key should actually be used for (you normally say it shouldn't generally be used at all).
And, to date, you have not identified what would be used as a natural key for a "personel" table that relies on no externally available key such as a license or certification number that would actually survive the test of time as a Primary Key.
As for true portability, that's a myth.

I'd enjoy seeing a realistic example of when a natural key for an employees (or personnel) table would beat an identity column as well. Keeping in mind the unique, ever-increasing, narrow and non-volatile practices, I don't know of one.

As for true portability being a myth, +2,000,000.

Tally Tables - Performance Personified
String Splitting with True Performance
Best practices on how to ask questions

RonKyle SSC-Dedicated Points: 31537 More actions · Answer 2

I'd enjoy seeing a realistic example of when a natural key for an employees (or personnel) table would beat an identity column as well.

I'm not sure the issue here is one "beating" the other. Often there is a natural key for employees or people. Often there is not one. If the natural key exists, it should be used. If it doesn't, use an identity. I'm not sure that you're suggesting that you should always use an identity because it's faster. I also agree with others, however, that the comment that an identity should never be used is simply unrealistic. And for OLAP environments, it's not even desirable.

jcelko212 32090 SSCrazy Eights Points: 9294 More actions · Answer 3

RonKyle - Thursday, May 18, 2017 7:08 AM
I'd enjoy seeing a realistic example of when a natural key for an employees (or personnel) table would beat an identity column as well.
I'm not sure the issue here is one "beating" the other. Often there is a natural key for employees or people. Often there is not one. If the natural key exists, it should be used. If it doesn't, use an identity. I'm not sure that you're suggesting that you should always use an identity because it's faster. I also agree with others, however, that the comment that an identity should never be used is simply unrealistic. And for OLAP environments, it's not even desirable.

First of all, most people in the data model play a role. I've never seen a system where people are generic; this is as silly as having a data model where things are generic and need a Kabbalah number. But let's say instead of doing employees, prisoners, students or other roles that come with identifiers defined by law, and we have to make up our own identifier. Then all of those things about validation, verification types of scales and all that other good stuff about how you design data come into play. Why would using a table property on one table, in one particular hardware implementation of one particular database product be a good idea? Would you think that instead of using the identity property, which is based on a sequential tape file from UNIX, should be replaced with a cylinder and track model from disk systems? They'd be the same thing; based on the physical locator for the physical storage of the data.

Please post DDL and follow ANSI/ISO standards when asking for help.

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 4

jcelko212 32090 - Sunday, May 21, 2017 3:06 PM
RonKyle - Thursday, May 18, 2017 7:08 AM
I'd enjoy seeing a realistic example of when a natural key for an employees (or personnel) table would beat an identity column as well.
I'm not sure the issue here is one "beating" the other. Often there is a natural key for employees or people. Often there is not one. If the natural key exists, it should be used. If it doesn't, use an identity. I'm not sure that you're suggesting that you should always use an identity because it's faster. I also agree with others, however, that the comment that an identity should never be used is simply unrealistic. And for OLAP environments, it's not even desirable.
First of all, most people in the data model play a role. I've never seen a system where people are generic; this is as silly as having a data model where things are generic and need a Kabbalah number. But let's say instead of doing employees, prisoners, students or other roles that come with identifiers defined by law, and we have to make up our own identifier. Then all of those things about validation, verification types of scales and all that other good stuff about how you design data come into play. Why would using a table property on one table, in one particular hardware implementation of one particular database product be a good idea? Would you think that instead of using the identity property, which is based on a sequential tape file from UNIX, should be replaced with a cylinder and track model from disk systems? They'd be the same thing; based on the physical locator for the physical storage of the data.

Once again I will say to please leave these forums Joe.

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

Sean Lange SSC Guru Points: 286573 More actions · Answer 5

jcelko212 32090 - Sunday, May 21, 2017 3:06 PM
RonKyle - Thursday, May 18, 2017 7:08 AM
I'd enjoy seeing a realistic example of when a natural key for an employees (or personnel) table would beat an identity column as well.
I'm not sure the issue here is one "beating" the other. Often there is a natural key for employees or people. Often there is not one. If the natural key exists, it should be used. If it doesn't, use an identity. I'm not sure that you're suggesting that you should always use an identity because it's faster. I also agree with others, however, that the comment that an identity should never be used is simply unrealistic. And for OLAP environments, it's not even desirable.
First of all, most people in the data model play a role. I've never seen a system where people are generic; this is as silly as having a data model where things are generic and need a Kabbalah number. But let's say instead of doing employees, prisoners, students or other roles that come with identifiers defined by law, and we have to make up our own identifier. Then all of those things about validation, verification types of scales and all that other good stuff about how you design data come into play. Why would using a table property on one table, in one particular hardware implementation of one particular database product be a good idea? Would you think that instead of using the identity property, which is based on a sequential tape file from UNIX, should be replaced with a cylinder and track model from disk systems? They'd be the same thing; based on the physical locator for the physical storage of the data.

Not disagreeing with your comments here but given the nature of people what do you propose as a viable alternative? You can't use SSN as that needs to be encrypted. You can't use email or name as both are subject to change (and names are not unique). Since there is nothing you can use as a natural key the only option is to use a surrogate key. And why not use an identity? It already handles all the nasty parts of concurrency and such that nearly everybody gets wrong. A sequence would work but it requires more code and provides no tangible benefit when being used on a single table like employees. And seriously....does it matter what the history of the identity property is? Does it somehow invalidate the usefulness because it was based on sequential tapes? You can trace the history of nearly anything with computers back to something that is archaic.

_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

Jeff Moden SSC Guru Points: 1003869 More actions · Answer 6

RonKyle - Thursday, May 18, 2017 7:08 AM
Often there is a natural key for employees or people. Often there is not one. If the natural key exists, it should be used. If it doesn't, use an identity.

It sounds like you may have run into an "employee" table that does, in fact, have a natural key. I've run into a couple myself but none of the could withstand the test of time as a true unique row key. What have you seen?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

RonKyle SSC-Dedicated Points: 31537 More actions · Answer 7

RonKyle - Thursday, May 18, 2017 7:08 AM
Often there is a natural key for employees or people. Often there is not one. If the natural key exists, it should be used. If it doesn't, use an identity.

It sounds like you may have run into an "employee" table that does, in fact, have a natural key. I've run into a couple myself but none of the could withstand the test of time as a true unique row key. What have you seen?

I have been able to use the actual employee number before. There was no chance for duplication and I was able to enter years worth of employees who had come an gone. The only downside was that is was possible to override the system and somethings a number with a different format was entered. The unique was still enforced, but instead of being the same length as the others stuck out for one reason or the other. The companies used ADP software, which I would imagine is in widespread use. Also was able to use valid technician and installer numbers in other systems. For a decade now, there has been no issue.

On the other hand, sometimes the system doesn't issue one or one that can be reused. About a decade ago I set up an ETL for one of those that could be reused. The reuse frequency wasn't high, so to make it unique (necessary for a 2SCD OLAP dimension) I appended the users initials to it. So a duplicate was possible, but not likely. But about a year ago there was a duplicate. Two people with the same initials. I'm about to upgrade that system from 2005 to 2016, so I may consider whether a sequence would be a better way to do this. (An identity is out because it's not a table property).

Bottom line, sometimes there's one available. If there is, I use it. If there's not, I use an identity. Just not a one size fits all solution in this case.

Ed Wagner SSC Guru Points: 287024 More actions · Answer 8

RonKyle - Tuesday, May 23, 2017 7:43 AM
RonKyle - Thursday, May 18, 2017 7:08 AM
Often there is a natural key for employees or people. Often there is not one. If the natural key exists, it should be used. If it doesn't, use an identity.

It sounds like you may have run into an "employee" table that does, in fact, have a natural key. I've run into a couple myself but none of the could withstand the test of time as a true unique row key. What have you seen?

I have been able to use the actual employee number before. There was no chance for duplication and I was able to enter years worth of employees who had come an gone. The only downside was that is was possible to override the system and somethings a number with a different format was entered. The unique was still enforced, but instead of being the same length as the others stuck out for one reason or the other. The companies used ADP software, which I would imagine is in widespread use. Also was able to use valid technician and installer numbers in other systems. For a decade now, there has been no issue.

On the other hand, sometimes the system doesn't issue one or one that can be reused. About a decade ago I set up an ETL for one of those that could be reused. The reuse frequency wasn't high, so to make it unique (necessary for a 2SCD OLAP dimension) I appended the users initials to it. So a duplicate was possible, but not likely. But about a year ago there was a duplicate. Two people with the same initials. I'm about to upgrade that system from 2005 to 2016, so I may consider whether a sequence would be a better way to do this. (An identity is out because it's not a table property).

Bottom line, sometimes there's one available. If there is, I use it. If there's not, I use an identity. Just not a one size fits all solution in this case.

What you say sounds eerily familiar. I've seen the same thing used (employee number) at a previous employer. And yes, I also wrote an integration of ADP data with it. It's a surrogate key, but since it's not ever-increasing I wouldn't use it as the clustering key is SQL Server.

Tally Tables - Performance Personified
String Splitting with True Performance
Best practices on how to ask questions

Sean Lange SSC Guru Points: 286573 More actions · Answer 9

RonKyle - Tuesday, May 23, 2017 7:43 AM
RonKyle - Thursday, May 18, 2017 7:08 AM
Often there is a natural key for employees or people. Often there is not one. If the natural key exists, it should be used. If it doesn't, use an identity.

It sounds like you may have run into an "employee" table that does, in fact, have a natural key. I've run into a couple myself but none of the could withstand the test of time as a true unique row key. What have you seen?

I have been able to use the actual employee number before. There was no chance for duplication and I was able to enter years worth of employees who had come an gone. The only downside was that is was possible to override the system and somethings a number with a different format was entered. The unique was still enforced, but instead of being the same length as the others stuck out for one reason or the other. The companies used ADP software, which I would imagine is in widespread use. Also was able to use valid technician and installer numbers in other systems. For a decade now, there has been no issue.

On the other hand, sometimes the system doesn't issue one or one that can be reused. About a decade ago I set up an ETL for one of those that could be reused. The reuse frequency wasn't high, so to make it unique (necessary for a 2SCD OLAP dimension) I appended the users initials to it. So a duplicate was possible, but not likely. But about a year ago there was a duplicate. Two people with the same initials. I'm about to upgrade that system from 2005 to 2016, so I may consider whether a sequence would be a better way to do this. (An identity is out because it's not a table property).

Bottom line, sometimes there's one available. If there is, I use it. If there's not, I use an identity. Just not a one size fits all solution in this case.

That isn't a natural key. It is using a surrogate key from another system.

_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

RonKyle SSC-Dedicated Points: 31537 More actions · Answer 10

That isn't a natural key. It is using a surrogate key from another system.

We may have a difference of definition, but I consider the key generated by another system to be it's natural key. Otherwise most things don't have a natural key. If there is no key available to me, like a list of Cub Scouts, then there is no natural key. I have to add an identity column in this case.

jcelko212 32090 SSCrazy Eights Points: 9294 More actions · Answer 11

RonKyle - Tuesday, May 23, 2017 8:17 AM
That isn't a natural key. It is using a surrogate key from another system.
We may have a difference of definition, but I consider the key generated by another system to be it's natural key.

I often get the feeling I'm the only person who still reads Dr. Codd's papers 🙁 His definition of a surrogate key is that it is unique, generated by the system and hidden from the users. Think about a system that uses hashing, but does not expose the hash.

Please post DDL and follow ANSI/ISO standards when asking for help.

jcelko212 32090 SSCrazy Eights Points: 9294 More actions · Answer 12

RonKyle - Tuesday, May 23, 2017 8:17 AM
That isn't a natural key. It is using a surrogate key from another system.
We may have a difference of definition, but I consider the key generated by another system to be it's natural key.

I often get the feeling I'm the only person who still reads Dr. Codd's papers 🙁 His definition of a surrogate key is that it is unique, generated by the system and hidden from the users. Think about a system that uses hashing, but does not expose the hash.

Please post DDL and follow ANSI/ISO standards when asking for help.

RonKyle SSC-Dedicated Points: 31537 More actions · Answer 13

I often get the feeling I'm the only person who still reads Dr. Codd's papers

You aren't, though it's been years and I can only follow the math to a point. Many times the identity key can be hidden. The surrogate keys in my OLAP designs are completely hidden from the users. But in OLTP environments it can depend. Many times it isn't desirable to hide them. In the case of work order numbers or installer numbers in our systems, the number is the next number in line for that item. That becomes the number by which that work order or installer is known by the users. Should I tell users they can't use them because in theory they're supposed to be hidden? Theory is a good starting point, but sometimes have to give way when solving business problems in reality. I say that with a great deal of reluctance because too many people do not use primary keys and foreign keys and would say the same thing, and they would be wrong almost all the time. I would say know the theory, but when you break away from it, be ready to explain it.

Jeff Moden SSC Guru Points: 1003869 More actions · Answer 14

jcelko212 32090 - Tuesday, May 23, 2017 10:36 AM
RonKyle - Tuesday, May 23, 2017 8:17 AM
That isn't a natural key. It is using a surrogate key from another system.
We may have a difference of definition, but I consider the key generated by another system to be it's natural key.
I often get the feeling I'm the only person who still reads Dr. Codd's papers 🙁 His definition of a surrogate key is that it is unique, generated by the system and hidden from the users. Think about a system that uses hashing, but does not expose the hash.

Heh... Dr. Codd didn't know that Joe Celko wouldn't be able to come up with a durable PK for employee/personel/customer tables. 😉

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1003869 More actions · Answer 15

RonKyle - Tuesday, May 23, 2017 10:49 AM
I often get the feeling I'm the only person who still reads Dr. Codd's papers
You aren't, though it's been years and I can only follow the math to a point. Many times the identity key can be hidden. The surrogate keys in my OLAP designs are completely hidden from the users. But in OLTP environments it can depend. Many times it isn't desirable to hide them. In the case of work order numbers or installer numbers in our systems, the number is the next number in line for that item. That becomes the number by which that work order or installer is known by the users. Should I tell users they can't use them because in theory they're supposed to be hidden? Theory is a good starting point, but sometimes have to give way when solving business problems in reality. I say that with a great deal of reluctance because too many people do not use primary keys and foreign keys and would say the same thing, and they would be wrong almost all the time. I would say know the theory, but when you break away from it, be ready to explain it.

It a real shame that a post can't be "Liked" for more than 1 point per person in this case. I forget who it is on these forums but one of the folks here has the old saying in their signature line of "In theory, practice and theory are the same. In practice, they are not". 😉

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)