To Delete / Truncate / Drop & Create

  • Jeff Moden - Monday, May 15, 2017 6:45 PM

    jcelko212 32090 - Monday, May 15, 2017 1:34 PM

    The major difference between delete and truncate is that delete is a standard SQL command, but it does logging so that the deletion can be backed out. Truncation, on the other hand, is an old Sybase extension that moves a pointer to the very start of the physical storage of the data, so the storage can be overwritten. It doesn't do any logging. It also doesn't do any porting that was important to you.

    You might want to reconsider using identity as a key. It can never be a proper relational key since it's a count of physical insertion attempts on one machine, to one table in one particular SQL product. You'd be better off using the CREATE SEQUENCE structure instead. Essentially, this creates a generator for well-controlled numeric values in an increasing sequence. It is not a table property! Since its external it can be referenced in other places in your SQL.

    You need to study SQL Server and T-SQL a bit more before saying such things, Joe.  Truncate does, in fact, do logging.  It logs the fact that the pages have been de-allocated and, if used in a transaction, can indeed be rolled back as with any logged action.

    For single table usage, a SEQUENCE is no better than using IDENTITY and, in fact, can be worse because everyone has to remember to get the next value for a SEQUENCE.  Of course, a SEQUENCE does make the infamous Invoice/Invoice Detail problem a bit easier and is the absolute berries when using a common incremental value across multiple tables, but that would also fly in your face as to what a surrogate key should actually be used for (you normally say it shouldn't generally be used at all).

    And, to date, you have not identified what would be used as a natural key for a "personel" table that relies on no externally available key such as a license or certification number that would actually survive the test of time as a Primary Key.

    As for true portability, that's a myth.

    I'd enjoy seeing a realistic example of when a natural key for an employees (or personnel) table would beat an identity column as well.  Keeping in mind the unique, ever-increasing, narrow and non-volatile practices, I don't know of one.

    As for true portability being a myth, +2,000,000.

  • I'd enjoy seeing a realistic example of when a natural key for an employees (or personnel) table would beat an identity column as well. 

    I'm not sure the issue here is one "beating" the other.  Often there is a natural key for employees or people.  Often there is not one.  If the natural key exists, it should be used.  If it doesn't, use an identity.  I'm not sure that you're suggesting that you should always use an identity because it's faster.  I also agree with others, however, that the comment that an identity should never be used is simply unrealistic.  And for OLAP environments, it's not even desirable.

  • RonKyle - Thursday, May 18, 2017 7:08 AM

    I'd enjoy seeing a realistic example of when a natural key for an employees (or personnel) table would beat an identity column as well. 

    I'm not sure the issue here is one "beating" the other.  Often there is a natural key for employees or people.  Often there is not one.  If the natural key exists, it should be used.  If it doesn't, use an identity.  I'm not sure that you're suggesting that you should always use an identity because it's faster.  I also agree with others, however, that the comment that an identity should never be used is simply unrealistic.  And for OLAP environments, it's not even desirable.

    First of all, most people in the data model play a role. I've never seen a system where people are generic; this is as silly as having a data model where things are generic and need a Kabbalah number. But let's say instead of doing employees, prisoners, students or other roles that come with identifiers defined by law, and we have to make up our own identifier. Then all of those things about validation, verification types of scales and all that other good stuff about how you design data come into play. Why would using a table property on one table, in one particular hardware implementation of one particular database product be a good idea? Would you think that instead of using the identity property, which is based on a sequential tape file from UNIX, should be replaced with a cylinder and track model from disk systems? They'd be the same thing; based on the physical locator for the physical storage of the data.

    Please post DDL and follow ANSI/ISO standards when asking for help. 

  • jcelko212 32090 - Sunday, May 21, 2017 3:06 PM

    RonKyle - Thursday, May 18, 2017 7:08 AM

    I'd enjoy seeing a realistic example of when a natural key for an employees (or personnel) table would beat an identity column as well. 

    I'm not sure the issue here is one "beating" the other.  Often there is a natural key for employees or people.  Often there is not one.  If the natural key exists, it should be used.  If it doesn't, use an identity.  I'm not sure that you're suggesting that you should always use an identity because it's faster.  I also agree with others, however, that the comment that an identity should never be used is simply unrealistic.  And for OLAP environments, it's not even desirable.

    First of all, most people in the data model play a role. I've never seen a system where people are generic; this is as silly as having a data model where things are generic and need a Kabbalah number. But let's say instead of doing employees, prisoners, students or other roles that come with identifiers defined by law, and we have to make up our own identifier. Then all of those things about validation, verification types of scales and all that other good stuff about how you design data come into play. Why would using a table property on one table, in one particular hardware implementation of one particular database product be a good idea? Would you think that instead of using the identity property, which is based on a sequential tape file from UNIX, should be replaced with a cylinder and track model from disk systems? They'd be the same thing; based on the physical locator for the physical storage of the data.

    Once again I will say to please leave these forums Joe.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • jcelko212 32090 - Sunday, May 21, 2017 3:06 PM

    RonKyle - Thursday, May 18, 2017 7:08 AM

    I'd enjoy seeing a realistic example of when a natural key for an employees (or personnel) table would beat an identity column as well. 

    I'm not sure the issue here is one "beating" the other.  Often there is a natural key for employees or people.  Often there is not one.  If the natural key exists, it should be used.  If it doesn't, use an identity.  I'm not sure that you're suggesting that you should always use an identity because it's faster.  I also agree with others, however, that the comment that an identity should never be used is simply unrealistic.  And for OLAP environments, it's not even desirable.

    First of all, most people in the data model play a role. I've never seen a system where people are generic; this is as silly as having a data model where things are generic and need a Kabbalah number. But let's say instead of doing employees, prisoners, students or other roles that come with identifiers defined by law, and we have to make up our own identifier. Then all of those things about validation, verification types of scales and all that other good stuff about how you design data come into play. Why would using a table property on one table, in one particular hardware implementation of one particular database product be a good idea? Would you think that instead of using the identity property, which is based on a sequential tape file from UNIX, should be replaced with a cylinder and track model from disk systems? They'd be the same thing; based on the physical locator for the physical storage of the data.

    Not disagreeing with your comments here but given the nature of people what do you propose as a viable alternative? You can't use SSN as that needs to be encrypted. You can't use email or name as both are subject to change (and names are not unique). Since there is nothing you can use as a natural key the only option is to use a surrogate key. And why not use an identity? It already handles all the nasty parts of concurrency and such that nearly everybody gets wrong. A sequence would work but it requires more code and provides no tangible benefit when being used on a single table like employees. And seriously....does it matter what the history of the identity property is? Does it somehow invalidate the usefulness because it was based on sequential tapes? You can trace the history of nearly anything with computers back to something that is archaic.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • RonKyle - Thursday, May 18, 2017 7:08 AM

    Often there is a natural key for employees or people.  Often there is not one.  If the natural key exists, it should be used.  If it doesn't, use an identity.

    It sounds like you may have run into an "employee" table that does, in fact, have a natural key.  I've run into a couple myself but none of the could withstand the test of time as a true unique row key.  What have you seen?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • RonKyle - Thursday, May 18, 2017 7:08 AM

    Often there is a natural key for employees or people. Often there is not one. If the natural key exists, it should be used. If it doesn't, use an identity.


    It sounds like you may have run into an "employee" table that does, in fact, have a natural key. I've run into a couple myself but none of the could withstand the test of time as a true unique row key. What have you seen?

    I have been able to use the actual employee number before.  There was no chance for duplication and I was able to enter years worth of employees who had come an gone.  The only downside was that is was possible to override the system and somethings a number with a different format was entered.  The unique was still enforced, but instead of being the same length as the others stuck out for one reason or the other.  The companies used ADP software, which I would imagine is in widespread use.  Also was able to use valid technician and installer numbers in other systems.  For a decade now, there has been no issue. 

    On the other hand, sometimes the system doesn't issue one or one that can be reused.  About a decade ago I set up an ETL for one of those that could be reused.  The reuse frequency wasn't high, so to make it unique (necessary for a 2SCD OLAP dimension) I appended the users initials to it.  So a duplicate was possible, but not likely.  But about a year ago there was a duplicate.  Two people with the same initials.  I'm about to upgrade that system from 2005 to 2016, so I may consider whether a sequence would be a better way to do this.  (An identity is out because it's not a table property).

    Bottom line, sometimes there's  one available.  If there is, I use it.  If there's not, I use an identity.  Just not a one size fits all solution in this case.

  • RonKyle - Tuesday, May 23, 2017 7:43 AM

    RonKyle - Thursday, May 18, 2017 7:08 AM

    Often there is a natural key for employees or people. Often there is not one. If the natural key exists, it should be used. If it doesn't, use an identity.

    It sounds like you may have run into an "employee" table that does, in fact, have a natural key. I've run into a couple myself but none of the could withstand the test of time as a true unique row key. What have you seen?

    I have been able to use the actual employee number before.  There was no chance for duplication and I was able to enter years worth of employees who had come an gone.  The only downside was that is was possible to override the system and somethings a number with a different format was entered.  The unique was still enforced, but instead of being the same length as the others stuck out for one reason or the other.  The companies used ADP software, which I would imagine is in widespread use.  Also was able to use valid technician and installer numbers in other systems.  For a decade now, there has been no issue. 

    On the other hand, sometimes the system doesn't issue one or one that can be reused.  About a decade ago I set up an ETL for one of those that could be reused.  The reuse frequency wasn't high, so to make it unique (necessary for a 2SCD OLAP dimension) I appended the users initials to it.  So a duplicate was possible, but not likely.  But about a year ago there was a duplicate.  Two people with the same initials.  I'm about to upgrade that system from 2005 to 2016, so I may consider whether a sequence would be a better way to do this.  (An identity is out because it's not a table property).

    Bottom line, sometimes there's  one available.  If there is, I use it.  If there's not, I use an identity.  Just not a one size fits all solution in this case.

    What you say sounds eerily familiar.  I've seen the same thing used (employee number) at a previous employer.  And yes, I also wrote an integration of ADP data with it.  It's a surrogate key, but since it's not ever-increasing I wouldn't use it as the clustering key is SQL Server.

  • RonKyle - Tuesday, May 23, 2017 7:43 AM

    RonKyle - Thursday, May 18, 2017 7:08 AM

    Often there is a natural key for employees or people. Often there is not one. If the natural key exists, it should be used. If it doesn't, use an identity.

    It sounds like you may have run into an "employee" table that does, in fact, have a natural key. I've run into a couple myself but none of the could withstand the test of time as a true unique row key. What have you seen?

    I have been able to use the actual employee number before.  There was no chance for duplication and I was able to enter years worth of employees who had come an gone.  The only downside was that is was possible to override the system and somethings a number with a different format was entered.  The unique was still enforced, but instead of being the same length as the others stuck out for one reason or the other.  The companies used ADP software, which I would imagine is in widespread use.  Also was able to use valid technician and installer numbers in other systems.  For a decade now, there has been no issue. 

    On the other hand, sometimes the system doesn't issue one or one that can be reused.  About a decade ago I set up an ETL for one of those that could be reused.  The reuse frequency wasn't high, so to make it unique (necessary for a 2SCD OLAP dimension) I appended the users initials to it.  So a duplicate was possible, but not likely.  But about a year ago there was a duplicate.  Two people with the same initials.  I'm about to upgrade that system from 2005 to 2016, so I may consider whether a sequence would be a better way to do this.  (An identity is out because it's not a table property).

    Bottom line, sometimes there's  one available.  If there is, I use it.  If there's not, I use an identity.  Just not a one size fits all solution in this case.

    That isn't a natural key. It is using a surrogate key from another system.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • That isn't a natural key. It is using a surrogate key from another system.

    We may have a difference of definition, but I consider the key generated by another system to be it's natural key.  Otherwise most things don't have a natural key.  If there is no key available to me, like a list of Cub Scouts, then there is no natural key.  I have to add an identity column in this case. 

  • RonKyle - Tuesday, May 23, 2017 8:17 AM

    That isn't a natural key. It is using a surrogate key from another system.

    We may have a difference of definition, but I consider the key generated by another system to be it's natural key.  

    I often get the feeling I'm the only person who still reads Dr. Codd's papers 🙁 His definition of a surrogate key is that it is unique, generated by the system and hidden from the users. Think about a system that uses hashing, but does not expose the hash.

    Please post DDL and follow ANSI/ISO standards when asking for help. 

  • RonKyle - Tuesday, May 23, 2017 8:17 AM

    That isn't a natural key. It is using a surrogate key from another system.

    We may have a difference of definition, but I consider the key generated by another system to be it's natural key.  

    I often get the feeling I'm the only person who still reads Dr. Codd's papers 🙁 His definition of a surrogate key is that it is unique, generated by the system and hidden from the users. Think about a system that uses hashing, but does not expose the hash.

    Please post DDL and follow ANSI/ISO standards when asking for help. 

  • I often get the feeling I'm the only person who still reads Dr. Codd's papers

    You aren't, though it's been years and I can only follow the math to a point.  Many times the identity key can be hidden.  The surrogate keys in my OLAP designs are completely hidden from the users.  But in OLTP environments it can depend.  Many times it isn't desirable to hide them.  In the case of work order numbers or installer numbers in our systems, the number is the next number in line for that item.  That becomes the number by which that work order or installer is known by the users.  Should I tell users they can't use them because in theory they're supposed to be hidden? Theory is a good starting point, but sometimes have to give way when solving business problems in reality.  I say that with a great deal of reluctance because too many people do not use primary keys and foreign keys and would say the same thing, and they would be wrong almost all the time.  I would say know the theory, but when you break away from it, be ready to explain it.

  • jcelko212 32090 - Tuesday, May 23, 2017 10:36 AM

    RonKyle - Tuesday, May 23, 2017 8:17 AM

    That isn't a natural key. It is using a surrogate key from another system.

    We may have a difference of definition, but I consider the key generated by another system to be it's natural key.  

    I often get the feeling I'm the only person who still reads Dr. Codd's papers 🙁 His definition of a surrogate key is that it is unique, generated by the system and hidden from the users. Think about a system that uses hashing, but does not expose the hash.

    Heh... Dr. Codd didn't know that Joe Celko wouldn't be able to come up with a durable PK for employee/personel/customer tables. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • RonKyle - Tuesday, May 23, 2017 10:49 AM

    I often get the feeling I'm the only person who still reads Dr. Codd's papers

    You aren't, though it's been years and I can only follow the math to a point.  Many times the identity key can be hidden.  The surrogate keys in my OLAP designs are completely hidden from the users.  But in OLTP environments it can depend.  Many times it isn't desirable to hide them.  In the case of work order numbers or installer numbers in our systems, the number is the next number in line for that item.  That becomes the number by which that work order or installer is known by the users.  Should I tell users they can't use them because in theory they're supposed to be hidden? Theory is a good starting point, but sometimes have to give way when solving business problems in reality.  I say that with a great deal of reluctance because too many people do not use primary keys and foreign keys and would say the same thing, and they would be wrong almost all the time.  I would say know the theory, but when you break away from it, be ready to explain it.

    It a real shame that a post can't be "Liked" for more than 1 point per person in this case.  I forget who it is on these forums but one of the folks here has the old saying in their signature line of "In theory, practice and theory are the same.  In practice, they are not". 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 31 through 45 (of 49 total)

You must be logged in to reply to this topic. Login to reply