t-sql columns

  • I am setting up a sql server 2008 r2 database with the associated tables, views and stored procedures for one main customer only. I receive files from this customer daily. My company says that I just need to change as they change.

    Basically I am trying to decide what to do with the size of a transaction code field. Up until last week, the field was a number that was a size of 3. For future expansion, I was thinking of making the field a numeric field that was size of 10.

    As of last week, I started to receive values that was 1 alpha character and 2 numbers.

    Thus I am trying to decide how to define the field size. I am thinking of using varchar(10) over char(10) or some smaller sized field? What size of field would you use and why?

  • Personally I would ask the company you recieve the files from for a copy or details of the schema which they export from, so that you can get an exact replica on your side to import into. Then its a case of a bit of fore warning from both parties saying XYZ has changed/is going to change, can you update your systems to be ready for the change.

    From what I remember hearing there is a cut off as to where char and varchar are the same and I believe it was around char(9) or varchar(9) due to the additional bytes that varchar uses to determine the variable length, anything over 9 its cheaper to store strings in a varchar field in case you dont fill the field.

    I my opinion I would load the data into a staging table which is varchar(max) then do some length calculations and issue a number of ALTER TABLE statements to ensure that the data can be imported into the main table with no issues.

  • Can you ask the customer about their rules on that? If not, you're crippled and will be working on "best guess" at best.

    If it's mixed alphanumeric, you'll want to use char, or varchar. Might want to use Unicode on that and make it nvarchar, even, just in case.

    Another possibility, of course, is SQL_Variant datatype. That'll definitely allow them to do whatever they want with their data, but it has some significant drawbacks in other regards.

    As for length allowed, I generally use char/nchar for anything up to 10, and varchar/nvarchar for anything over 10, and usually go with multiples of 25 or powers of 10. Those are rules of thumb, not laws, but I find they work well for me.

    On shorter data, I stick to char/nchar because of the 2-byte storage overhead for varchar/nvarchar. In my experience, storing variable-length data in a fixed-length column at those sizes works just fine. Also helps with capacity planning and with page-split-prevention, so it has some long-term benefits.

    If you're not familiar with the storage: Variable-length string datatypes in SQL Server use 2 bytes per column to store the length of the string. So the string "Hello" is 5 bytes in a char(5) field, 10 bytes in a char(10) field, and 7 bytes in a varchar(5) field or varchar(10) field. In nchar(5), it's 10 bytes, in nchar(10) it's 20 bytes, and in nvarchar(5) or nvarchar(10) it's 12 bytes. See how that works? So, if you have char(5), it's 10 bytes, regardless of whether you have "I" or "12345" stored in it, but if you have varchar(5), it's 3 bytes for "I" and 7 for "12345". That 7 bytes for 5 characters datum is why it's sometimes better to store in fixed-length for variable-length data, and let the application or data access layer worry about trailing spaces. Since it can also be 5 bytes for 1 character in fixed-length, you need to know the distribution of the data to make the decision.

    Can't help you directly on choosing fixed vs variable on this one, since I don't know your data, but hopefully that helps you figure out what to look at. In this particular case, where you may not be able to know before-hand what the data is going to look like from day to day, I'd suggest nvarchar, and be generous and make it something like nvarchar(100) or at least nvarchar(50). At least varchar with one of those lengths, but Unicode allows the customer to go really crazy with their data if they want to, without you having to rebuild the table again.

    Does that help?

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

Viewing 3 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply