Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase «««23456»»

Stairway to Database Design - STEP 1: Data Elements Expand / Collapse
Author
Message
Posted Tuesday, September 4, 2012 9:25 AM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Tuesday, September 23, 2014 2:01 PM
Points: 38, Visits: 538
Thank you for your reply. I understand that I should use a natural key when one exists, however, for my customers, there is no natural key.
e-mail address: This is bricks and mortar so this information is not collected. Even if it was, it doesn't uniquely identify a customer as some people share addresses.
DUNS: The customers are consumers not businesses so this isn't applicable.
Credit Card: This identifies the payer, not the customer. Additionally, people will have multiple credit cards, and cards will be shared between multiple people.

I could continue down the list of options, but the conclusion is that there is nothing our customers can provide that we can guaranty to be unique. We have to provide a unique identifier to our customers. I can write a program to generate random (or sequential) numbers for the customer ID, or I can use an identity column (or something else)?

In this case, why wouldn't I use an identity column? I want to make sure I get my design right, and I am having trouble with tables that have no natural key. What do I use?
Post #1353991
Posted Wednesday, September 5, 2012 1:44 AM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Today @ 7:24 AM
Points: 5,430, Visits: 10,099
If you have to provide that unique identifier to your customers, that's a good reason not to use an identity property to generate it. It makes it too easy to guess one customer's ID given another's.

John
Post #1354362
Posted Monday, April 15, 2013 4:50 AM
Mr or Mrs. 500

Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500

Group: General Forum Members
Last Login: Yesterday @ 5:47 AM
Points: 594, Visits: 3,495
this is an interesting article, and one that i have only just got round to reading.
i have a few questions which i would like to ask, just to expand my understanding.
in the first page of the comments, Celko mentions

The old IDEF standard used singular names for files because you process the records of the file one at a time. In the RDBMS world, we process an entire set of rows at a time, so we use collective names (or plural if no collective is available). If you do have only one Customer, then that would be the correct table name. Sorry business is so bad :)


my query is this: what if you start off with only one Customer, e.g You have exclusivity to one Customer for 12 months. After that, you may get more Customers but there's no real guarantee. depends on how good you are i suppose!
would you name the table Customer or Customers? changing the table name at a later date wouldn't be advisable.

Daniel Bowlin brings up the issue of the article stating you shouldn't use the format <table name>_id for you column name.

DEK46656 states that he believes its about calliong the table 'Customers' and the column 'Customer_id'.
i'm wondering if there is a 'Standards' reply; perhaps Celko knows? i couldn't see a specific response from him about this.
Post #1442249
Posted Wednesday, July 3, 2013 11:18 AM


Valued Member

Valued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued Member

Group: General Forum Members
Last Login: Friday, October 17, 2014 8:47 AM
Points: 54, Visits: 428
After re-reading this, I almost changed my score to Average rather than Good. This is supposed to be step one in Database Design, but instead is a discussion of standards, primary key and identifiers and data element domains. The first step of relational database design is understanding the value and purpose of the database, the scope, the users and contexts for usage. Next is ER-modeling, or "what tables will I need and how are they related?". After that I worry about attributes. Mr. Celko is so physical, he skips past table design to column naming and key selection. His points are valid and I even agree with most of them (I will never like underscores in names, sorry), but he definitely skipped some stuff. What he presents here is like arguing about the quantity and type of materials before designing the floor plan.

Post #1470172
Posted Wednesday, July 3, 2013 12:29 PM
Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: 2 days ago @ 10:26 AM
Points: 357, Visits: 1,972
Many of the concepts discussed are fine in the abstract, but fail, or are difficult to use, in the real world. Let's use temperature, for example. I support a system that has a table that stores temperature data. Users are located around the world. Some enter temperatures in Celsius, some in Fahrenheit, depending on the usage at their location. So, the table has a column for temperature and a column for temperature system that's populated off of user defaults at the time the data is captured (users may be transferred to another part of the world, and change their default scaling). If there is a need to store values in Kelvin or Rankine, there's no need to add columns to the table, just add another value to the table that contains the constraints for the scaling, and carry on. Under the concept in the article, I would have to have a column for each type of temperature scaling, and add logic to determine the column to use when the data is captured.

Another example is vendors. We have vendors that have as many as 20 different locations for making payments. Absent an arbitrary number assigned by the system, there's no easy way to differentiate between them. Using an ever increasing ID makes this simple, and doesn't require any effort by users to track number assignments, which is important when you have 50,000+ vendors in a system.
Post #1470197
Posted Tuesday, July 9, 2013 10:25 AM
SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Yesterday @ 11:01 AM
Points: 2,401, Visits: 1,485
Good to have another opinion. It is a good starting point for those who are not familiar with norms and practices.

I have known of many of these points of view and have followed some. But there are a few that I do not follow. These include

1. I do use generated values as identifiers. The argument or VIN is really a reason to not use the type of key suggested, for parts of a car can be removed from one car and used on another and the VIN can end up as an identifier of any car that holds a part from the original car where a stolen or otherwise procured part is installed. Thus multiple vehicles with the same VIN.

2. A value field is of the same measurement, like the quantity of weight is always pounds, thus if it is not in pounds it is in grams, or it is in stones etc. We have found that this works if you want to bloat your database and structure. However, if you have a value for the number value and a qualifier such as unit of measure you can better model your database, although it can complicate your query selection. In this case we may collect a value and the original unit of measure or collection method and then a standardized value. This is particularly interesting in the collection of spatial data where we can collect in any number of location verification methods, but then convert it to the standard.

3. We use to use an abbreviated suffix to identify the classification of a data element, but have for the past whatever years used the full word such as name, number, address etc. The limitation of the original RDMS tools to 8-18 characters in length caused many to confine the names of data elements to crunched up abbreviated names that were almost useless. A more full name is of higher value since we are seeing more people using the "database" as source data and the business expert using data tools can now attach to the database and read stuff to their hearts content associated with the names that are correct for that type of data.

Thanks, the article as I said is a great starting point, and of use to many.


Not all gray hairs are Dinosaurs!
Post #1471733
Posted Tuesday, July 9, 2013 12:30 PM
Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: 2 days ago @ 10:26 AM
Points: 357, Visits: 1,972
Great examples! I would also add that in a purchasing application, it would be insanity to not use a column for units of measure, in association with a quantity column. The last time I looked, we have 30+ different units of measure values in one of th epurchasing systems, each, lbs, kg, lot, pallet, etc. I can't imagine having to refactor the database every time someone comes up with a new unit of measure requirement.
Post #1471798
Posted Wednesday, July 10, 2013 1:39 AM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Today @ 7:24 AM
Points: 5,430, Visits: 10,099
Ross McMicken (7/9/2013)
Great examples! I would also add that in a purchasing application, it would be insanity to not use a column for units of measure, in association with a quantity column. The last time I looked, we have 30+ different units of measure values in one of th epurchasing systems, each, lbs, kg, lot, pallet, etc. I can't imagine having to refactor the database every time someone comes up with a new unit of measure requirement.

Surely the way to go here is to have a "base" unit (kilograms, say) and then have a separate table that contains a conversion factor between that unit and every other unit? Then your application can do the conversion before it stores data in your main table. The advantages: your main table is one column smaller, and calculations such as aggregates are made a hell of a lot simpler.

John
Post #1471989
Posted Wednesday, July 10, 2013 7:50 AM
Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: 2 days ago @ 10:26 AM
Points: 357, Visits: 1,972
John Mitchell-245523 (7/10/2013)
Ross McMicken (7/9/2013)
Great examples! I would also add that in a purchasing application, it would be insanity to not use a column for units of measure, in association with a quantity column. The last time I looked, we have 30+ different units of measure values in one of th epurchasing systems, each, lbs, kg, lot, pallet, etc. I can't imagine having to refactor the database every time someone comes up with a new unit of measure requirement.

Surely the way to go here is to have a "base" unit (kilograms, say) and then have a separate table that contains a conversion factor between that unit and every other unit? Then your application can do the conversion before it stores data in your main table. The advantages: your main table is one column smaller, and calculations such as aggregates are made a hell of a lot simpler.

John

I don't think using a base measure really helps, and storing the conversion in another table just complicates matters. For the example I gave above, there are no conversion factors that work all the time. What's the conversion factor from pallet to liters?
Post #1472147
Posted Wednesday, July 10, 2013 7:59 AM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Today @ 7:24 AM
Points: 5,430, Visits: 10,099
Ross McMicken (7/10/2013)
What's the conversion factor from pallet to liters?

There isn't one, because they're different entities and therefore should not be stored in the same column. If you do as you suggest, you'll end up with something that looks like an EAV table. Some people are comfortable with that; others think that EAVs should be avoided at all costs. You'll make up your own mind, of course.

John
Post #1472161
« Prev Topic | Next Topic »

Add to briefcase «««23456»»

Permissions Expand / Collapse