April 15, 2008 at 11:37 am
I was asked to develop some data base recommendations for the development team to reference. One of my recommendations was:
"Consider using BIT data type columns for Boolean values, as opposed to storing "TRUE / FALSE", "yes/no" or other character strings. SQL Server can store up to 8 columns with BIT data type in a single byte."
A recommended best practice in Baya Pavliashvili's best practice guide.
I am being chastised for making this recommendation by another team member -- his reasons are:
1. bit fields, contrary to what the above says, can be indexed (not through enterprise manager which does not let you, but through query analyzer or straight SQL, it can be indexed) but the performance is atrocious, so the mandate "they can not be indexed" is as good as true.
2. bit fields to accomodate the "don't know" value, require the use of NULLs and there are a ton of articles against the SQL problems ran into using NULLs -- the use of NULLs should be minimized wherever reasonable.
3. the use of NULL bit fields causes problems in Access, and for that matter, the use of BIT fields in access, even without NULLs causes confusion, because the values are displayed as "0" or "-1" (where "-1" is the same as the SQL bit value "1" !!) not "0" or "1". There is use of our tables occasionally by Access developers on the business side or support side. BIT fields can be confusing. Look at the below screen prints to show how confusing they are:
4. Instead of using bit fields, recommend the use of a smallint field with a constraint on to only allow the truth values you want (and it's better to use -1 for "don't know" than NULL). An index on the smallint field will provide far better performance.
5. If you disagree, I could create a table of significant size with a bit field and a smallint field and then index both fields flush the cache and show the performance hit one takes using a bit field. But I prefer not to belabor this point. It's pretty obvious. A bit field requires overhead code to get to the individual bits that is not required to just look at the field if it is a smallint, and it is very significant if that bit field is part of an index. Also, bit fields are very confusing for Microsoft Access people (how many people think of "-1" as meaning "TRUE" or "YES")?
I've being using bit data types without incident for the past 15 years... is there something I'm missing?
He references these articles:
http://www.ssw.com.au/ssw/Standards/Rules/RulesToBetterSQLServerDatabases.aspx#BitFields
http://www.ssw.com.au/ssw/KB/KB.aspx?KBID=Q482854
**Can I have nulls in Yes/No Column?
Answer:
SQL Server 7 only allows Yes or No in bit field. SQL 2000 introduced the ability to also store Null.
Often you don't want this behaviour.
Note: It causes a bug if your front end if Access
ACC2000: Write Conflict Error When You Try to Update Records in a Linked SQL Server Table (Q280730) http://support.microsoft.com/default.aspx?scid=kb;EN-US;q280730
ACC97: Write Conflict Error When You Try to Update Records in a Linked SQL Server Table (Q278696)
http://support.microsoft.com/default.aspx?scid=kb;EN-US;q278696
April 15, 2008 at 12:02 pm
I have typically avoided using bit fields, but don't rule them out. I believe your developer is right in regard to #1. Because of teh lack of selectivity bit fields are good candidates for indexes, but a Yes/No or smallint with 3 values is not really any better in that regard.
I don't mind having Nulls in my database because I understand how to deal with them and they do really represent an Unkown value.
As far as creating problems with Access, I would not know because I avoid Access like the plague.
As for #5, I would find this hard to believe without testing.
It sounds like you did what you were asked by making recommendations and, as is their perogative, they can choose to implement your recommendations or not and in the case of bit fields it sounds like not. I would document your suggestions so that, if there are problems because they chose to ignore your recommendations, you have something to fall back on.
Of course if you are the Senior DBA, you could make them abide by your recommendations by refusing to deploy any databases not developed according to "Best Practices" or in-house standards. I'm not recommending this, just saying it's possible. I always think resolving issues by edict just causes more problems than it solves.
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
April 15, 2008 at 12:10 pm
gkruchten (4/15/2008)
I was asked to develop some data base recommendations for the development team to reference. One of my recommendations was:"Consider using BIT data type columns for Boolean values, as opposed to storing "TRUE / FALSE", "yes/no" or other character strings. SQL Server can store up to 8 columns with BIT data type in a single byte."
A recommended best practice in Baya Pavliashvili's best practice guide.
I am being chastised for making this recommendation by another team member -- his reasons are:
1. bit fields, contrary to what the above says, can be indexed (not through enterprise manager which does not let you, but through query analyzer or straight SQL, it can be indexed) but the performance is atrocious, so the mandate "they can not be indexed" is as good as true.
2. bit fields to accomodate the "don't know" value, require the use of NULLs and there are a ton of articles against the SQL problems ran into using NULLs -- the use of NULLs should be minimized wherever reasonable.
3. the use of NULL bit fields causes problems in Access, and for that matter, the use of BIT fields in access, even without NULLs causes confusion, because the values are displayed as "0" or "-1" (where "-1" is the same as the SQL bit value "1" !!) not "0" or "1". There is use of our tables occasionally by Access developers on the business side or support side. BIT fields can be confusing. Look at the below screen prints to show how confusing they are:
4. Instead of using bit fields, recommend the use of a smallint field with a constraint on to only allow the truth values you want (and it's better to use -1 for "don't know" than NULL). An index on the smallint field will provide far better performance.
5. If you disagree, I could create a table of significant size with a bit field and a smallint field and then index both fields flush the cache and show the performance hit one takes using a bit field. But I prefer not to belabor this point. It's pretty obvious. A bit field requires overhead code to get to the individual bits that is not required to just look at the field if it is a smallint, and it is very significant if that bit field is part of an index. Also, bit fields are very confusing for Microsoft Access people (how many people think of "-1" as meaning "TRUE" or "YES")?
I've being using bit data types without incident for the past 15 years... is there something I'm missing?
He references these articles:
http://www.ssw.com.au/ssw/Standards/Rules/RulesToBetterSQLServerDatabases.aspx#BitFields
http://www.ssw.com.au/ssw/KB/KB.aspx?KBID=Q482854
**Can I have nulls in Yes/No Column?
Answer:
SQL Server 7 only allows Yes or No in bit field. SQL 2000 introduced the ability to also store Null.
Often you don't want this behaviour.
Note: It causes a bug if your front end if Access
ACC2000: Write Conflict Error When You Try to Update Records in a Linked SQL Server Table (Q280730) http://support.microsoft.com/default.aspx?scid=kb;EN-US;q280730
ACC97: Write Conflict Error When You Try to Update Records in a Linked SQL Server Table (Q278696)
http://support.microsoft.com/default.aspx?scid=kb;EN-US;q278696
Well - I'd tend to agree with your recommendation.
Let's go over the points:
1. columns that can have only 2 values have rather atrocious selectivity, so indexing them buys you just about nothing....
2. Anyone with an understanding of NULLs understands how to deal with NULLS when you find them. If you DECIDE to allow your bit fields to be NULL, then you need to be ready to handle it. If you don't handle it, then strange things happen. But then again - bad code (a.k.a. code not making sure its inputs are "clean") will do that, no matter what language you write it in.
3. contrary to popular belief, Access developers aren't plankton; they're at least as smart as a trained seal, and can on occasion be made to understand simple concepts. Seeing a -1 (considered to be a signed bit notation) instead of just plain 1 (unsigned bit notation) is something you can train them to do. Really - try it - just get a big bucket of fish, and you'll be getting them to understand (and clap their fins) in no time flat.
4. refer to point #2. Selectivity is crap anyway. Big whoop.
As an aside - if you don't like having tri-state bit fields, you can do something about it: you can put a constraint of NOT NULL on the bit field, possibly with a DEFAULT value to match your business logic if need be.
I'm not saying that the smallint (why smallint - either tinyint or int, since SQL is "optimized for storing and retrieving int") is a particularly bad choice. It's way better than "true", but I don't find it that compellingly better than bit (it's also nullable after all). Oh, and speaking of confusing us trained seals, go ahead and use that -1 as your "null" replacement. Now THAT will help clear things up.....:hehe:
Take it from a long-time Access developer.... I'll go bounce that beach ball on my nose now...:)
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
April 15, 2008 at 12:20 pm
Matt Miller (4/15/2008)
gkruchten (4/15/2008)
I was asked to develop some data base recommendations for the development team to reference. One of my recommendations was:"Consider using BIT data type columns for Boolean values, as opposed to storing "TRUE / FALSE", "yes/no" or other character strings. SQL Server can store up to 8 columns with BIT data type in a single byte."
A recommended best practice in Baya Pavliashvili's best practice guide.
I am being chastised for making this recommendation by another team member -- his reasons are:
1. bit fields, contrary to what the above says, can be indexed (not through enterprise manager which does not let you, but through query analyzer or straight SQL, it can be indexed) but the performance is atrocious, so the mandate "they can not be indexed" is as good as true.
2. bit fields to accomodate the "don't know" value, require the use of NULLs and there are a ton of articles against the SQL problems ran into using NULLs -- the use of NULLs should be minimized wherever reasonable.
3. the use of NULL bit fields causes problems in Access, and for that matter, the use of BIT fields in access, even without NULLs causes confusion, because the values are displayed as "0" or "-1" (where "-1" is the same as the SQL bit value "1" !!) not "0" or "1". There is use of our tables occasionally by Access developers on the business side or support side. BIT fields can be confusing. Look at the below screen prints to show how confusing they are:
4. Instead of using bit fields, recommend the use of a smallint field with a constraint on to only allow the truth values you want (and it's better to use -1 for "don't know" than NULL). An index on the smallint field will provide far better performance.
5. If you disagree, I could create a table of significant size with a bit field and a smallint field and then index both fields flush the cache and show the performance hit one takes using a bit field. But I prefer not to belabor this point. It's pretty obvious. A bit field requires overhead code to get to the individual bits that is not required to just look at the field if it is a smallint, and it is very significant if that bit field is part of an index. Also, bit fields are very confusing for Microsoft Access people (how many people think of "-1" as meaning "TRUE" or "YES")?
I've being using bit data types without incident for the past 15 years... is there something I'm missing?
He references these articles:
http://www.ssw.com.au/ssw/Standards/Rules/RulesToBetterSQLServerDatabases.aspx#BitFields
http://www.ssw.com.au/ssw/KB/KB.aspx?KBID=Q482854
**Can I have nulls in Yes/No Column?
Answer:
SQL Server 7 only allows Yes or No in bit field. SQL 2000 introduced the ability to also store Null.
Often you don't want this behaviour.
Note: It causes a bug if your front end if Access
ACC2000: Write Conflict Error When You Try to Update Records in a Linked SQL Server Table (Q280730) http://support.microsoft.com/default.aspx?scid=kb;EN-US;q280730
ACC97: Write Conflict Error When You Try to Update Records in a Linked SQL Server Table (Q278696)
http://support.microsoft.com/default.aspx?scid=kb;EN-US;q278696
Well - I'd tend to agree with your recommendation.
Let's go over the points:
1. columns that can have only 2 values have rather atrocious selectivity, so indexing them buys you just about nothing....
2. Anyone with an understanding of NULLs understands how to deal with NULLS when you find them. If you DECIDE to allow your bit fields to be NULL, then you need to be ready to handle it. If you don't handle it, then strange things happen.
3. contrary to popular belief, Access developers aren't plankton; they're at least as smart as a trained seal, and can on occasion be made to understand simple concepts. Seeing a -1 (considered to be a signed bit notation) instead of just plain 1 (unsigned bit notation) is something you can train them to do. Really - try it - just get a big bucket of fish, and you'll be getting them to understand (and clap their fins) in no time flat.
4. refer to point #2. Selectivity is crap anyway. Big whoop.
As an aside - if you don't like having tri-state bit fields, you can do something about it: you can put a constraint of NOT NULL on the bit field, possibly with a DEFAULT value to match your business logic if need be.
I'm not saying that the smallint (why smallint - either tinyint or int, since SQL is "optimized for storing and retrieving int") is a particular bad choice. It's way better than "true", but I don't find it that compellingly better than bit. Oh, and speaking of confusing us trained seals, go ahead and use that -1 as your "null" replacement. Now THAT will help clear things up.....:hehe:
Take it from a long-time Access developer.... I'll go bounce that beach ball on my nose now...:)
#1 is really a "it depends". If the number of 0's and 1's is about even, then yes the selectivity is questionable. I used a bit field to flag records as historical or current at a previous employer. The field was indexed and significantly improved performance for 90%+ of the queries that accessed those tables. The reason being was that the majority of the queries were looking at current records, not historical, and 90 to 95% of the records were historical.
😎
April 15, 2008 at 12:41 pm
Lynn Pettis (4/15/2008)
#1 is really a "it depends". If the number of 0's and 1's is about even, then yes the selectivity is questionable. I used a bit field to flag records as historical or current at a previous employer. The field was indexed and significantly improved performance for 90%+ of the queries that accessed those tables. The reason being was that the majority of the queries were looking at current records, not historical, and 90 to 95% of the records were historical.😎
True - but that works only if your usage is tilted towards the "small" end of the bit distribution. As always - "it depends" beats an absolute 🙂
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
April 15, 2008 at 1:22 pm
Matt Miller (4/15/2008)
Lynn Pettis (4/15/2008)
#1 is really a "it depends". If the number of 0's and 1's is about even, then yes the selectivity is questionable. I used a bit field to flag records as historical or current at a previous employer. The field was indexed and significantly improved performance for 90%+ of the queries that accessed those tables. The reason being was that the majority of the queries were looking at current records, not historical, and 90 to 95% of the records were historical.😎
True - but that works only if your usage is tilted towards the "small" end of the bit distribution. As always - "it depends" beats an absolute 🙂
The other part of the "it depends" is "know your data". At my previous employer I knew the data, and that indexing the bit field would help significantly for the majority of queries. When you are generating nearly 10,000 invoices a month, you tend to know what the distribution of open (current) and paid (historical) is when you look at having about 10 years worth of data.
😎
April 15, 2008 at 1:45 pm
This looks like a fun one to pile on to
1) A bit field, all by its lonesome, can be a horrible choice for an index (although in rare circumstances as outlined above, the opposite is true) but what about when used in conjunction with one more other fields? Suddenly it works just fine.
2) So, make it NOT NULL. Varchar, char, nvarchar, nchar, all of which could be used for storing "YES/NO" fields also support NULL values. NULL does not mean unknown, it means not supplied. It's an empty set with no actual meaning or value besides not being there.
3) This one is easily worked around in views or procedures on SQL Server or in the code in Access.
4) A smallint field with three values is not going to see a very measurable improvement in speed over a bit field with two values. Sorry. Also, I thought that -1 would confuse Access and poorly educated Access users?
5) I don't understand this one "A bit field requires overhead code to get to the individual bits that is not required to just look at the field if it is a smallint, and it is very significant if that bit field is part of an index." I've never heard of anything like this, so I don't have an answer.
Short answer is, there is no hard and fast reason to not use a bit field unless you're saying that it's use is inappropriate for the business data at hand. For that matter, there is no hard and fast reason TO use a bit field unless it's use is appropriate for the business data at hand.
In other words, it depends.
Tell the other guy to lighten up Francis.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
Viewing 7 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply