Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase 12345»»»

Stairway to Database Design - STEP 1: Data Elements Expand / Collapse
Author
Message
Posted Monday, April 26, 2010 11:18 PM


SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Yesterday @ 12:12 PM
Points: 1,945, Visits: 3,180
Comments posted to this topic are about the item Stairway to Database Design - STEP 1: Data Elements

Books in Celko Series for Morgan-Kaufmann Publishing
Analytics and OLAP in SQL
Data and Databases: Concepts in Practice
Data, Measurements and Standards in SQL
SQL for Smarties
SQL Programming Style
SQL Puzzles and Answers
Thinking in Sets
Trees and Hierarchies in SQL
Post #910828
Posted Tuesday, April 27, 2010 12:46 AM


Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Thursday, December 11, 2014 3:46 AM
Points: 1,256, Visits: 773
Strange, this low rating. (So I will rate it a * higher)

Obviously, auto-incrementing values are not identifiers. I will get into that fallacy later.

I have read this remark sometime before, looking forward to read it fully explained.

Could you perhaps suggest an alternative for the ID in [Purchasing].[PurchaseOrderHeader] in AdventureWorks?




Dutch Anti-RBAR League
Post #910850
Posted Tuesday, April 27, 2010 6:23 AM
Valued Member

Valued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued Member

Group: General Forum Members
Last Login: Thursday, December 12, 2013 1:44 PM
Points: 69, Visits: 205
Very good reading. I think we all need to be reminded to adopt and maintain some standard, even if it doesn't follow this exactly.
Post #911024
Posted Tuesday, April 27, 2010 7:52 AM
SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Friday, December 12, 2014 8:24 AM
Points: 2,818, Visits: 2,567
Very interesting article....very structured. More structured I think than I have seen in practice. It is very useful to be reminded that a well defined structure makes many things easier and more understandable in the long run.

I am curious a bit about:
"_id" = identifier. It is unique in the schema and refer to one entity anywhere it appears in the schema. Uniqueness is not the same thing as being an identifier. The number pi is unique, but it does not identify an entity. Never use "<table name>_id"; that is a name based on location and tell you this is probably not a real key at all. Just plain "id" is too vague and screws up your data dictionary when you have a zillion of them. Obviously, auto-incrementing values are not identifiers. I will get into that fallacy later.


If I have a customer table, and an identifier of customer_id, I think both are logically named and clear. Does the article suggest that I change it to something like client_id just to avoid using the same word in the id as in the table? In my opinion there are some sorts of table names and id names that are quite reasonably named the same and don't necessarily indicate a location as much as a useful name. I do also agree that there are many cases where doing this seems a bad idea such as an orderDetails table, I would not likely name the id as orderDetails_id as I would detail_id or something like that.

Is the article suggesting a rigid style that must be followed exactingly, or more a set of useful guidelines that help to clarify what everything means?






Post #911130
Posted Tuesday, April 27, 2010 8:17 AM


Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Friday, December 5, 2014 8:45 AM
Points: 1,035, Visits: 412
One point...

There is NO SUCH THING as unstructured data. Pictures, movies, whatever you can think of in terms of data must be structured for it to be of any use at any time. I get what Mr. Celko is saying, but this is one of my little pet peeves. The difference is that there is usually no need for the dbms to "know" how to decode or make sense of this kind of data, it just needs to store and retrieve it as needed, and from this perspective, very complex data can be considered scalar.



/*****************

If most people are not willing to see the difficulty, this is mainly because, consciously or unconsciously, they assume that it will be they who will settle these questions for the others, and because they are convinced of their own capacity to do this. -Friedrich August von Hayek



*****************/
Post #911162
Posted Tuesday, April 27, 2010 8:22 AM
Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Monday, November 24, 2014 7:25 AM
Points: 395, Visits: 446
I think the idea would be that the table is called “Customers” (plural to represent a collection), and then the key / identifier of a unique customer would be “customer_id”.

Beer's Law: Absolutum obsoletum
"if it works it's out-of-date"
Post #911166
Posted Tuesday, April 27, 2010 8:32 AM


Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Friday, December 5, 2014 8:45 AM
Points: 1,035, Visits: 412
I'm pretty sure it's intended to be very structured.

I have used the "Id" only name for most of my databases because while you might end up with a whole bunch of "Id" columns in the database, each one is given an exact context by the table it belongs to. I always name tables in the singular "Customer" instead of "Customers" and then in any referencing tables such as "Invoice" the Customer.Id column is referenced as a foreign key from the Invoice.CustomerId column.



/*****************

If most people are not willing to see the difficulty, this is mainly because, consciously or unconsciously, they assume that it will be they who will settle these questions for the others, and because they are convinced of their own capacity to do this. -Friedrich August von Hayek



*****************/
Post #911183
Posted Tuesday, April 27, 2010 9:06 AM
Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Monday, November 24, 2014 7:25 AM
Points: 395, Visits: 446
DCPeterson (4/27/2010)
I have used the "Id" only name for most of my databases because while you might end up with a whole bunch of "Id" columns in the database, each one is given an exact context by the table it belongs to. I always name tables in the singular "Customer" instead of "Customers" and then in any referencing tables such as "Invoice" the Customer.Id column is referenced as a foreign key from the Invoice.CustomerId column.


Don't you now have 2 different fields in the schema representing the same thing? CustomerId is in the Invoice field, and Customer.Id (or just Id) in the customer table. I believe the reference to "unique in the schema" is meant to be that the customer, being represented by the ID field, uses the same field name throughout the schema, otherwise it is not unique.

I've implemented that approach in some of my later efforts, specifically using a "k_XXX" approach to reference either the primary key (in the subject table) or the foreign key field in another table. Yes, the naming convention used is against one of the other rules outlined in the article, but I'm learning...


Beer's Law: Absolutum obsoletum
"if it works it's out-of-date"
Post #911216
Posted Tuesday, April 27, 2010 9:07 AM
SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Saturday, May 24, 2014 7:58 PM
Points: 219, Visits: 824
Hi Joe.

Regarding this recommendation:

"Likewise, avoid quoted identifiers with either the Microsoft-only square brackets or the ANSI/ISO double quote marks. This is just sloppy programming done for display formatting in the data base instead of the front end. The one possible exception might be a language translation problem where the Latin-1 alphabet will not work."

Suppose my customers want the following four column names in all the inputs, all the reports, and all the result sets: open, high, low, close.
Currently I have four columns with these names.
If I change my column names to something else and rename columns as they are selected, what benefits shall I get for my additional effort, more complexity in my code, my longer code?

Another question:
"I give the research for this set of conventions in my book SQL PROGRAMMING STYLE, but I will skip the details here."

I have the impression that your research was done a while ago and is not completely relevant to modern realities. Considering a modern team in Silicon Valley, where every team member may come from a different country, and many people simply don't have capital letters in their first languages, how can we apply the research done several decades ago, most likely in a much more homogeneous team?
Post #911218
Posted Tuesday, April 27, 2010 9:20 AM


SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Yesterday @ 12:12 PM
Points: 1,945, Visits: 3,180
>> I have used the "Id" only name for most of my databases because while you might end up with a whole bunch of "Id" columns in the database, each one is given an exact context by the table it belongs to. <<

So sometimes "id" is squid, sometimes "id" is an automobile, etc. and you think this is a good thing? Imagine a creature that changes form as it moves from room to room. Fun for Science Fiction, but not for data modeling.

Did you notice that a VIN is always a VIN? Validated by a regular expression from ISO and verified by the manufacturers (or more likely, the DMV, CarFax or dealership).

>> I always name tables in the singular "Customer" instead of "Customers" and then in any referencing tables such as "Invoice" the Customer.Id column is referenced as a foreign key from the Invoice.CustomerId column. <<

The old IDEF standard used singular names for files because you process the records of the file one at a time. In the RDBMS world, we process an entire set of rows at a time, so we use collective names (or plural if no collective is available). If you do have only one Customer, then that would be the correct table name. Sorry business is so bad :)

Consider this:
Employee = bad, only one guy
Employees = better, but your mindset is still locked into record at a time process; you see individuals, trees, not the forest.
Personnel = Best. This is an abstract, higher level set and it shapes your mindset in that direction.

IDENTITY and other auto-increment features are not attributes of an entity or a relationship. In the case of IDENTITY in the SQL Server family, it is a count of the physical insertion attempts -- not even the successes. It is generated by the local computer and totally without verification or validation. It has nothing to do with a logical data model.

In Kabbalah magic, God puts a Hebrew number on all things in his creation. So if your computer is God and has a Hebrew font, then IDENTITY makes sense in a magical world :)

But did you notice that your insurance company, DMV and automobile chassis all were able to agree on the VIN,even tho your automobile does not have a network connection? Logical identifiers are not anything like physical locators.




Books in Celko Series for Morgan-Kaufmann Publishing
Analytics and OLAP in SQL
Data and Databases: Concepts in Practice
Data, Measurements and Standards in SQL
SQL for Smarties
SQL Programming Style
SQL Puzzles and Answers
Thinking in Sets
Trees and Hierarchies in SQL
Post #911229
« Prev Topic | Next Topic »

Add to briefcase 12345»»»

Permissions Expand / Collapse