June 16, 2013 at 9:48 pm
Two Scenarios
1)A table with 200 fields. OR
2)200 fields spanning across multiple tables? For e.g 10 tables with 20 fields each.
Which one is better in terms of performance and space used and why?
June 17, 2013 at 2:22 am
I'm going to use the classic answer.
"It Depends."
if you ALWAYS want all columns retrieved together then a single table will be faster at getting everything compared with multiple tables.
BUT in practice I doubt that will be true, and updating a wide table like that will be expensive, you will also have to manage lots of nulls I suspect.
However I would suggest a table with 200 columns is very unlikely to be in anything like 3rd normal form (or better) , which means the most likely answer is to redesign it so it is.
If you can post the table DDL, plus some examples of what you are doing you may get some more complete answers.
HTH -
Mike John
June 17, 2013 at 4:43 am
To add to Mike's reply, if you have a lot of fixed length columns that frequently contain a NULL value, you might want to consider putting those columns in a "sister table" so that you don't actually have to take up the space by storing NULLs. You just wouldn't have rows in the "sister table" for the NULLs. It does complicate lookups and inserts a bit but a well formed view can ease that pain especially for the lookups.
--Jeff Moden
Change is inevitable... Change for the better is not.
June 17, 2013 at 6:08 am
All are varchar(50) except a pk which is an int. To clarify further, users enter/ update data through the web to this table. It is kind of necessary to show all fields through web. 45 of them are required field and rest could be blank.
June 17, 2013 at 6:15 am
200 freetext fields and just a PK in addition?
That sounds a pretty unusual requirement, what's it for?
June 17, 2013 at 6:33 am
It stores an account. Users will either and new account or update existing account.
June 17, 2013 at 6:51 am
SQL_Surfer (6/16/2013)
Two Scenarios1)A table with 200 fields. OR
2)200 fields spanning across multiple tables? For e.g 10 tables with 20 fields each.
Which one is better in terms of performance and space used and why?
Have you heard about data normalization?
If OLTP I would at least go to the 3NF
If DSS I would either go for 3NF or Dimensional modeling depending on your "datawarehouse modeling beliefs"
The data model would tell how to group your columns, in how many tables and how they relate to each other.
Hope this helps.
_____________________________________
Pablo (Paul) Berzukov
Author of Understanding Database Administration available at Amazon and other bookstores.
Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.June 17, 2013 at 8:05 am
Each record will be unique though as it will be identified by AccountID.
June 17, 2013 at 8:12 am
SQL_Surfer (6/17/2013)
Each record will be unique though as it will be identified by AccountID.
That isn't the point of normalization. I would guess that if this is an online account you must have things like phone1, phone2, email1, email2, email3. 200 columns is a LOT of information to capture for an online account in a single table.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
June 17, 2013 at 8:44 am
From what you are saying these different columns sound like should be of several different data types, such things as account balances, credit limits, date last ordered, date last accessed, date last password change, password (encrypted?) etc etc come to mind.
Precisely what those columns should be is clearly dependant on exactly what is meant by "account", but storing each of them in a varchar(50) sounds odd, and I suspect a decision that will be regretted if you actually implement that.
Can you post more details of that table and it's relationships with others?
Mike John
June 17, 2013 at 9:05 am
My 2 Cents worth.
If in doubt, create your tables to reflect the entity that it models and use the proper identity, primary and foreign keys to keep the data in line.
As already stated, it sounds like your table is likely to contain lots of Phone1, Phone2, ... Phone(n) type fields.
These should be in a table of contact details
ContactMethodID (PK),CustomerID(FK to customer), ContactTypeID (FK to contactTypes),ContactDetails
This would mean you could have 0, 1 or 100million contactMethod records for each customer. You have very little wasted space in the database and queries are simpler because you are extracting data against one table/column instead of against several.
E.g.
select CustomerID, ContactDetails from ContactMethods where ContactTypeID = 'Phone' and ContactDetails begins ('+44')
Vs.
select CustomerID, Phone1 as 'Phone' where Phone1 begins ('+44')
UNION ALL
select CustomerID, Phone2 as 'Phone' where Phone2 begins ('+44')
UNION ALL
select CustomerID, Phone3 as 'Phone' where Phone3 begins ('+44')
...
select CustomerID, Phone_n as 'Phone' where Phone_n begins ('+44')
The way the data is diaplyed on the screen for the user to fill in, and the validation done on that data may bear no reflection on the representation of that data in the database
June 17, 2013 at 11:13 am
SQL_Surfer (6/17/2013)
All are varchar(50) except a pk which is an int. To clarify further, users enter/ update data through the web to this table. It is kind of necessary to show all fields through web. 45 of them are required field and rest could be blank.
That's a max of 10,000 characters per row... if a substantial number of the columns have near-capacity data in them, your data will go "out of row" and that will slow things down a bit.
--Jeff Moden
Change is inevitable... Change for the better is not.
June 17, 2013 at 12:29 pm
I tend to create 3 tables for this excercise. The first table defines the "Account". A non data related PK, first name, last name. Real simple stuff that help uniquely define an account.
The 2nd table is a list of all the other "Properties" of an account, that have no requirement to be filled in but may be present. I.E. phone 1, phone 2, phone 3, address1, address 2 etc.
Finally, a table that contains the account PK, the property ID from the 2nd table and then the data. This allows for simple indexing, and flexibility and scalability.
But thats just me.
Viewing 13 posts - 1 through 12 (of 12 total)
You must be logged in to reply to this topic. Login to reply