Data types and sizing for commonly used fields.

Question

Data types and sizing for commonly used fields.

Paul Coghlan-340072

Grasshopper

Points: 13
More actions
June 18, 2006 at 7:14 am

#169292

I have spent the last 30 minutes trying to find something that gives guidance on the best data types/sizes for common fields (names, address, telephone nunbers, email addresses etc). Also for long/lat fields.
I see multiple posts where people fell into holes because they did something like using ints for ZIPs and telephone numbers or chars for textual fields and had NULL issues so I think there is a need for a recommendation for these common fields.
I am using varchars for all name, address and contact fiedls , including emails, telephone numbers etc but are there thoughts about correct sizing or does it even matter?!
For example my Street fields are varchar(50), does this sound right? IS there any reason not to make it varchar(100) or even (1000)?!
I do understand this is a pretty basic question but bear with someone trying to do it right and asking. The most elegant design will fall over if the fundamentals arent taken care of and having seen a design which had 2 fields for email address (one had the full address [me@yahoo.com], the second had the network [@yahoo.com]) I see there are so called pro's out there that make dubious design decisions so I am trying not to become one of them.
Thanks,
Paul

Viewing 4 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply

Prakash Sawant SSCrazy Points: 2173 More actions · Answer 1

varchar[(n)]

Variable-length non-Unicode character data with length of n bytes. n must be a value from 1 through 8,000. Storage size is the actual length in bytes of the data entered, not n bytes. The data entered can be 0 characters in length. The SQL-92 synonyms for varchar are char varying or character varying.

See BOL for more info.

------------
Prakash Sawant

http://psawant.blogspot.com

stax68 SSChampion Points: 11761 More actions · Answer 2

There is no advantage from the POV of storage or efficiency in reducing the maximum size of varchar fields. The only reason to do so is to avoid overrunning the max row size. So other things equal, a column for which there is no obvious length limit should be defined as long as possible. Increasing length of variable columns is one of the simplest and least disruptive changes you can make to your DB design, so it's not a disaster if you have to.

I would use a string type for any 'numbers' that you wouldn't do arithmetic on and aren't surrogate keys. You should add appropriate CHECK constraints to fields like this to ensure the string is in the right format, has no illegal characters, etc. You can probably find out the standards for postal codes and telephone numbers in your country on the web somewhere.

The email address split wouldn't be a ridiculous idea, if the domain weren't repeated. Generally you should try to break down values as much as possible. I would store area code separately from local phone number, for example. Once you do that, you realise that it might be worth moving area codes into a lookup table and using an FK. An example of why this would be useful: data entry users could choose area code from a list thus reducing the likelihood of data entry errors.

Don't be put off normalising data in this way by concern about overcomplicated queries. Queries will be complicated anyway, and (indexed) views, calc columns and functions can be used to improve retrieval speeds while not confusing the database design by introducing denormalised writable columns.

Lat/Long would be best stored in two fields. I expect you could knock up a UDT containing an ordered pair of coordinates in SQL9 using the CLR, but I wouldn't bother. Give the procedural types their pair of values in the right order and then let them do whatever they like with them, I say.

Tim Wilkinson

"If it doesn't work in practice, you're using the wrong theory"
- Immanuel Kant

Jeff Moden SSC Guru Points: 1004664 More actions · Answer 3

True but you have to be careful not to allow things so large that you have the potential for trying to store more than 8060 bytes (or less) in a row. Like you said, it's good to limit the size.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)