Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase 123»»»

Introduction to Bitmasking in SQL Server, Part 2 Expand / Collapse
Author
Message
Posted Tuesday, April 28, 2009 2:33 AM


SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Thursday, December 18, 2014 4:07 PM
Points: 215, Visits: 658
Comments posted to this topic are about the item Introduction to Bitmasking in SQL Server, Part 2




Lee Everest

Post #705575
Posted Thursday, May 21, 2009 6:37 AM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Monday, January 30, 2012 5:12 AM
Points: 40, Visits: 143
I see one problem with this solution. You can store only 64 values (bigint 2^63 ) And even with numeric data type there are limitations.
This is where you need to be careful.

Leo P.
Post #721271
Posted Thursday, May 21, 2009 7:18 AM


SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Thursday, December 18, 2014 4:07 PM
Points: 215, Visits: 658
Great point, Leo. For over 64 items you'd have to revert back to the method in part one, or make multiple masks to handle the additional options.

Thanks much





Lee Everest

Post #721318
Posted Thursday, May 21, 2009 7:58 AM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Thursday, July 22, 2010 8:59 AM
Points: 110, Visits: 952
This still feels to me like it's violating a basic principle of normalization by creating a dependency of multiple (possibly unrelated) points of data on a single field. It's analogous to implementing an object using a memo field to record properties. You have to parse that encoding every time you want to do something with it. Five fields encoded in a single integer can only be indexed one way. Five separate bit fields allows for much greater flexibility in indexing. I admire the geek elegance, but it's not something that 'average Joe' will understand. So for anyone to use this information, we would need a view to turn the values back into something readable. Seems like unnecessary overhead to introduce this abstraction. If I ever inherit a solution like this, I would probably work to replace it with something more obvious. Even if there is a slight performance advantage to bitmask-encoded data, I don't feel it's worth the extra effort for maintenance people to (re)figure it out every time they(we) have to work on it.

thanks for letting me add my 2 cents.
Post #721357
Posted Thursday, May 21, 2009 8:18 AM
Mr or Mrs. 500

Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500

Group: General Forum Members
Last Login: Wednesday, November 26, 2014 4:10 PM
Points: 554, Visits: 1,208
Perhaps using Persisted Computed Columns could provide the best of both worlds? Define the attributes as columns accordingly and then use the computed columns to be the bit stuffed column. Then you have the best of both worlds with the denormalization being a computed column. From a record insertion stand point, you would have to put them in their individual columns or provide a stored procedure for splitting the bit mask accordingly.
Post #721377
Posted Thursday, May 21, 2009 8:27 AM


Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Tuesday, December 2, 2014 11:00 AM
Points: 1,142, Visits: 2,693
I missed Part 1, can you provide a link to it?

EDIT:
Never mind I found it at the top of the article:'

http://www.sqlservercentral.com/articles/Miscellaneous/2748/


---------------------------------------------------------------------
Use Full Links:
KB Article from Microsoft on how to ask a question on a Forum
Post #721389
Posted Thursday, May 21, 2009 9:24 AM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Monday, January 30, 2012 5:12 AM
Points: 40, Visits: 143
Mike Dougherty (5/21/2009)
This still feels to me like it's violating a basic principle of normalization by creating a dependency of multiple (possibly unrelated) points of data on a single field. It's analogous to implementing an object using a memo field to record properties. You have to parse that encoding every time you want to do something with it. Five fields encoded in a single integer can only be indexed one way. Five separate bit fields allows for much greater flexibility in indexing. I admire the geek elegance, but it's not something that 'average Joe' will understand. So for anyone to use this information, we would need a view to turn the values back into something readable. Seems like unnecessary overhead to introduce this abstraction. If I ever inherit a solution like this, I would probably work to replace it with something more obvious. Even if there is a slight performance advantage to bitmask-encoded data, I don't feel it's worth the extra effort for maintenance people to (re)figure it out every time they(we) have to work on it.

thanks for letting me add my 2 cents.



Let me explain how we using this methodology without violation of basic principles. We using it for metadata lookup tables and each table has a primary key. There is enumeration column in each table one to one to primary key. So, when application is picking up multiple rows it get combined value and translate it on the front end. But in database all relationships are done through the primary keys.
Post #721441
Posted Thursday, May 21, 2009 10:02 AM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Monday, April 11, 2011 9:23 AM
Points: 121, Visits: 46
LP (5/21/2009)
Let me explain how we using this methodology without violation of basic principles. We using it for metadata lookup tables and each table has a primary key. There is enumeration column in each table one to one to primary key. So, when application is picking up multiple rows it get combined value and translate it on the front end. But in database all relationships are done through the primary keys.


That doesn't seem to address the "basic principle" of not storing multiple attributes/values in a single column. That, too, is part of normalization.

Besides the comments that have been posted already about why bitmasking is not *generally* a good idea, I've not seen anyone mention how database engines can leverage multiple indexes with multiple criteria to quickly find a resulting rowset.

Storing multiple values into a single column will ALWAYS require a row scan to find rows that match the criteria. ALWAYS! (Although it was mentioned earlier that one might limit the domain with additional criteria such as Zip Code, which could allow at least a partial indexed search.)

Additionally, bitmasking is limited to the size of the object being used to store the actual value. In the examples the author gave, I believe 64 bits is the limit, and that's a hard limit. If you go beyond that, you need more columns and now your code has to know which data is stored where *and* how to access it.

In short, I think what I'm really tryin' to say is that bitmasking is okay in some situations, but it's not really a scalable solution (either in values stored, nor data rows), eliminates the server's ability to make statistics and use those to resolve queries, can't be indexed (or at least offers no value of being indexed) and finally, requires loads more documentation than just a single column.

-- Mitch



--Mitch
Post #721472
Posted Thursday, May 21, 2009 11:14 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Friday, May 13, 2011 4:49 PM
Points: 3, Visits: 62
I am a big proponent of bitmasks but I have found that other SQL developers have a hard time coming to grips with the concept. (I am an engineer by trade, maybe that helps?)

When implementing bitmaps (or flags is another term) I suggest providing helper functions or stored procs to i) make the SQL more readable and ii) minimize errors in using bitmaps associated with using the wrong bitwise operator or bitmask.

I have written stored procs that will update (set/reset) individual bits based on the flagname (stored in a reference table). I also allow the user to pass the table and the keys/values to the dataset to be updated and dynamically generate the where clause.

Regarding performance, I was under the impression that using bitwise operators in a where clause was very efficient, but I may be completely wrong.

Is [ColumnName] & FlagNumberMask
quicker than
field1 = 1 and field2 = 1 and field3 = 1 and field4 = 0....

Would every bit field have to be indexed???

Post #721522
Posted Thursday, May 21, 2009 11:55 AM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Monday, January 30, 2012 5:12 AM
Points: 40, Visits: 143
Mitch Miller (5/21/2009)
LP (5/21/2009)
Let me explain how we using this methodology without violation of basic principles. We using it for metadata lookup tables and each table has a primary key. There is enumeration column in each table one to one to primary key. So, when application is picking up multiple rows it get combined value and translate it on the front end. But in database all relationships are done through the primary keys.


That doesn't seem to address the "basic principle" of not storing multiple attributes/values in a single column. That, too, is part of normalization.

Besides the comments that have been posted already about why bitmasking is not *generally* a good idea, I've not seen anyone mention how database engines can leverage multiple indexes with multiple criteria to quickly find a resulting rowset.

Storing multiple values into a single column will ALWAYS require a row scan to find rows that match the criteria. ALWAYS! (Although it was mentioned earlier that one might limit the domain with additional criteria such as Zip Code, which could allow at least a partial indexed search.)

Additionally, bitmasking is limited to the size of the object being used to store the actual value. In the examples the author gave, I believe 64 bits is the limit, and that's a hard limit. If you go beyond that, you need more columns and now your code has to know which data is stored where *and* how to access it.

In short, I think what I'm really tryin' to say is that bitmasking is okay in some situations, but it's not really a scalable solution (either in values stored, nor data rows), eliminates the server's ability to make statistics and use those to resolve queries, can't be indexed (or at least offers no value of being indexed) and finally, requires loads more documentation than just a single column.

-- Mitch


create table test (code char(2) primary key, descr varchar(100), enumvalue bigint)

insert into test (code,descr,enumvalue)
values ('c1', 'code 1', 1)


insert into test (code,descr,enumvalue)
values ('c2', 'code 2', 2)


insert into test (code,descr,enumvalue)
values ('c3', 'code 3', 4)

Now, when you select you get result as one value. If value is 5 then you select code 1 and 4.
How it is violate the design if combination is used only in stored procedure and translation on front end.
Database perfectly storing 1 row and link one row to another table. I am not discussing why we need it.
Post #721547
« Prev Topic | Next Topic »

Add to briefcase 123»»»

Permissions Expand / Collapse