SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Introduction to Bitmasking in SQL Server, Part 2


Introduction to Bitmasking in SQL Server, Part 2

Author
Message
ab5sr
ab5sr
Mr or Mrs. 500
Mr or Mrs. 500 (530 reputation)Mr or Mrs. 500 (530 reputation)Mr or Mrs. 500 (530 reputation)Mr or Mrs. 500 (530 reputation)Mr or Mrs. 500 (530 reputation)Mr or Mrs. 500 (530 reputation)Mr or Mrs. 500 (530 reputation)Mr or Mrs. 500 (530 reputation)

Group: General Forum Members
Points: 530 Visits: 701
Comments posted to this topic are about the item Introduction to Bitmasking in SQL Server, Part 2

Lee Everest


LP-181697
LP-181697
SSC-Enthusiastic
SSC-Enthusiastic (176 reputation)SSC-Enthusiastic (176 reputation)SSC-Enthusiastic (176 reputation)SSC-Enthusiastic (176 reputation)SSC-Enthusiastic (176 reputation)SSC-Enthusiastic (176 reputation)SSC-Enthusiastic (176 reputation)SSC-Enthusiastic (176 reputation)

Group: General Forum Members
Points: 176 Visits: 143
I see one problem with this solution. You can store only 64 values (bigint 2^63 ) And even with numeric data type there are limitations.
This is where you need to be careful.

Leo P.
ab5sr
ab5sr
Mr or Mrs. 500
Mr or Mrs. 500 (530 reputation)Mr or Mrs. 500 (530 reputation)Mr or Mrs. 500 (530 reputation)Mr or Mrs. 500 (530 reputation)Mr or Mrs. 500 (530 reputation)Mr or Mrs. 500 (530 reputation)Mr or Mrs. 500 (530 reputation)Mr or Mrs. 500 (530 reputation)

Group: General Forum Members
Points: 530 Visits: 701
Great point, Leo. For over 64 items you'd have to revert back to the method in part one, or make multiple masks to handle the additional options.

Thanks much

Lee Everest


Mike Dougherty
Mike Dougherty
SSC Veteran
SSC Veteran (226 reputation)SSC Veteran (226 reputation)SSC Veteran (226 reputation)SSC Veteran (226 reputation)SSC Veteran (226 reputation)SSC Veteran (226 reputation)SSC Veteran (226 reputation)SSC Veteran (226 reputation)

Group: General Forum Members
Points: 226 Visits: 952
This still feels to me like it's violating a basic principle of normalization by creating a dependency of multiple (possibly unrelated) points of data on a single field. It's analogous to implementing an object using a memo field to record properties. You have to parse that encoding every time you want to do something with it. Five fields encoded in a single integer can only be indexed one way. Five separate bit fields allows for much greater flexibility in indexing. I admire the geek elegance, but it's not something that 'average Joe' will understand. So for anyone to use this information, we would need a view to turn the values back into something readable. Seems like unnecessary overhead to introduce this abstraction. If I ever inherit a solution like this, I would probably work to replace it with something more obvious. Even if there is a slight performance advantage to bitmask-encoded data, I don't feel it's worth the extra effort for maintenance people to (re)figure it out every time they(we) have to work on it.

thanks for letting me add my 2 cents.
Bradley Deem
Bradley Deem
Right there with Babe
Right there with Babe (743 reputation)Right there with Babe (743 reputation)Right there with Babe (743 reputation)Right there with Babe (743 reputation)Right there with Babe (743 reputation)Right there with Babe (743 reputation)Right there with Babe (743 reputation)Right there with Babe (743 reputation)

Group: General Forum Members
Points: 743 Visits: 1248
Perhaps using Persisted Computed Columns could provide the best of both worlds? Define the attributes as columns accordingly and then use the computed columns to be the bit stuffed column. Then you have the best of both worlds with the denormalization being a computed column. From a record insertion stand point, you would have to put them in their individual columns or provide a stored procedure for splitting the bit mask accordingly.
Trey Staker
Trey Staker
Ten Centuries
Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)

Group: General Forum Members
Points: 1376 Visits: 2788
I missed Part 1, can you provide a link to it?

EDIT:
Never mind I found it at the top of the article:'

http://www.sqlservercentral.com/articles/Miscellaneous/2748/

---------------------------------------------------------------------
Use Full Links:
KB Article from Microsoft on how to ask a question on a Forum
LP-181697
LP-181697
SSC-Enthusiastic
SSC-Enthusiastic (176 reputation)SSC-Enthusiastic (176 reputation)SSC-Enthusiastic (176 reputation)SSC-Enthusiastic (176 reputation)SSC-Enthusiastic (176 reputation)SSC-Enthusiastic (176 reputation)SSC-Enthusiastic (176 reputation)SSC-Enthusiastic (176 reputation)

Group: General Forum Members
Points: 176 Visits: 143
Mike Dougherty (5/21/2009)
This still feels to me like it's violating a basic principle of normalization by creating a dependency of multiple (possibly unrelated) points of data on a single field. It's analogous to implementing an object using a memo field to record properties. You have to parse that encoding every time you want to do something with it. Five fields encoded in a single integer can only be indexed one way. Five separate bit fields allows for much greater flexibility in indexing. I admire the geek elegance, but it's not something that 'average Joe' will understand. So for anyone to use this information, we would need a view to turn the values back into something readable. Seems like unnecessary overhead to introduce this abstraction. If I ever inherit a solution like this, I would probably work to replace it with something more obvious. Even if there is a slight performance advantage to bitmask-encoded data, I don't feel it's worth the extra effort for maintenance people to (re)figure it out every time they(we) have to work on it.

thanks for letting me add my 2 cents.



Let me explain how we using this methodology without violation of basic principles. We using it for metadata lookup tables and each table has a primary key. There is enumeration column in each table one to one to primary key. So, when application is picking up multiple rows it get combined value and translate it on the front end. But in database all relationships are done through the primary keys.
Mitch Miller
Mitch Miller
SSC-Enthusiastic
SSC-Enthusiastic (163 reputation)SSC-Enthusiastic (163 reputation)SSC-Enthusiastic (163 reputation)SSC-Enthusiastic (163 reputation)SSC-Enthusiastic (163 reputation)SSC-Enthusiastic (163 reputation)SSC-Enthusiastic (163 reputation)SSC-Enthusiastic (163 reputation)

Group: General Forum Members
Points: 163 Visits: 46
LP (5/21/2009)
Let me explain how we using this methodology without violation of basic principles. We using it for metadata lookup tables and each table has a primary key. There is enumeration column in each table one to one to primary key. So, when application is picking up multiple rows it get combined value and translate it on the front end. But in database all relationships are done through the primary keys.


That doesn't seem to address the "basic principle" of not storing multiple attributes/values in a single column. That, too, is part of normalization.

Besides the comments that have been posted already about why bitmasking is not *generally* a good idea, I've not seen anyone mention how database engines can leverage multiple indexes with multiple criteria to quickly find a resulting rowset.

Storing multiple values into a single column will ALWAYS require a row scan to find rows that match the criteria. ALWAYS! (Although it was mentioned earlier that one might limit the domain with additional criteria such as Zip Code, which could allow at least a partial indexed search.)

Additionally, bitmasking is limited to the size of the object being used to store the actual value. In the examples the author gave, I believe 64 bits is the limit, and that's a hard limit. If you go beyond that, you need more columns and now your code has to know which data is stored where *and* how to access it.

In short, I think what I'm really tryin' to say is that bitmasking is okay in some situations, but it's not really a scalable solution (either in values stored, nor data rows), eliminates the server's ability to make statistics and use those to resolve queries, can't be indexed (or at least offers no value of being indexed) and finally, requires loads more documentation than just a single column.

-- Mitch


--Mitch
philnewell
philnewell
Forum Newbie
Forum Newbie (9 reputation)Forum Newbie (9 reputation)Forum Newbie (9 reputation)Forum Newbie (9 reputation)Forum Newbie (9 reputation)Forum Newbie (9 reputation)Forum Newbie (9 reputation)Forum Newbie (9 reputation)

Group: General Forum Members
Points: 9 Visits: 62
I am a big proponent of bitmasks but I have found that other SQL developers have a hard time coming to grips with the concept. (I am an engineer by trade, maybe that helps?)

When implementing bitmaps (or flags is another term) I suggest providing helper functions or stored procs to i) make the SQL more readable and ii) minimize errors in using bitmaps associated with using the wrong bitwise operator or bitmask.

I have written stored procs that will update (set/reset) individual bits based on the flagname (stored in a reference table). I also allow the user to pass the table and the keys/values to the dataset to be updated and dynamically generate the where clause.

Regarding performance, I was under the impression that using bitwise operators in a where clause was very efficient, but I may be completely wrong.

Is [ColumnName] & FlagNumberMask
quicker than
field1 = 1 and field2 = 1 and field3 = 1 and field4 = 0....

Would every bit field have to be indexed???
LP-181697
LP-181697
SSC-Enthusiastic
SSC-Enthusiastic (176 reputation)SSC-Enthusiastic (176 reputation)SSC-Enthusiastic (176 reputation)SSC-Enthusiastic (176 reputation)SSC-Enthusiastic (176 reputation)SSC-Enthusiastic (176 reputation)SSC-Enthusiastic (176 reputation)SSC-Enthusiastic (176 reputation)

Group: General Forum Members
Points: 176 Visits: 143
Mitch Miller (5/21/2009)
LP (5/21/2009)
Let me explain how we using this methodology without violation of basic principles. We using it for metadata lookup tables and each table has a primary key. There is enumeration column in each table one to one to primary key. So, when application is picking up multiple rows it get combined value and translate it on the front end. But in database all relationships are done through the primary keys.


That doesn't seem to address the "basic principle" of not storing multiple attributes/values in a single column. That, too, is part of normalization.

Besides the comments that have been posted already about why bitmasking is not *generally* a good idea, I've not seen anyone mention how database engines can leverage multiple indexes with multiple criteria to quickly find a resulting rowset.

Storing multiple values into a single column will ALWAYS require a row scan to find rows that match the criteria. ALWAYS! (Although it was mentioned earlier that one might limit the domain with additional criteria such as Zip Code, which could allow at least a partial indexed search.)

Additionally, bitmasking is limited to the size of the object being used to store the actual value. In the examples the author gave, I believe 64 bits is the limit, and that's a hard limit. If you go beyond that, you need more columns and now your code has to know which data is stored where *and* how to access it.

In short, I think what I'm really tryin' to say is that bitmasking is okay in some situations, but it's not really a scalable solution (either in values stored, nor data rows), eliminates the server's ability to make statistics and use those to resolve queries, can't be indexed (or at least offers no value of being indexed) and finally, requires loads more documentation than just a single column.

-- Mitch


create table test (code char(2) primary key, descr varchar(100), enumvalue bigint)

insert into test (code,descr,enumvalue)
values ('c1', 'code 1', 1)


insert into test (code,descr,enumvalue)
values ('c2', 'code 2', 2)


insert into test (code,descr,enumvalue)
values ('c3', 'code 3', 4)

Now, when you select you get result as one value. If value is 5 then you select code 1 and 4.
How it is violate the design if combination is used only in stored procedure and translation on front end.
Database perfectly storing 1 row and link one row to another table. I am not discussing why we need it.
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search