Introduction to Bitmasking in SQL Server 2005

  • It gets materialised when indexed.  So you add another 8 bytes / row + PK size.

     

    But then again this may be very acceptable if the inserts can be a tiny bit slower but the selects much easier to write, and maybe more efficient (still don't know if that bunch of ors completely miss the index seek or not)).

  • I'm not sure I'm following you here Joe.  Can you explain more on what you think is the best indexing approach?  I have no experience live or theorical in that case, especially with ultra large amounts of data.

  • In 192.168.2.10 the octets would be as follows

    1. 192
    2. 168
    3. 2
    4. 10

    Would I be correct in thinking that by using the most changing octet SQL Server is most likely to think that the index is a good selective index?

  • The problem has never been selectivity... I'm still wondering how the heck the server will be able to do a seek on a search like this one :

     

    WHERE(

    Oct_1 = 72

    AND (Oct_2>108 OR (Oct_2=108 AND Oct_3>10) OR(Oct_2=108 AND Oct_3=10 AND Oct_4>=2))

    )

    OR

    Oct_1 = 73

    AND (Oct_2<90 OR (Oct_2=90 AND Oct_3<=205) OR(Oct_2=90 AND Oct_3=205 AND Oct_4<=1))

    )

  • I've got to put up a Christmas tree, swear at the lights, murder a few carols and fight with relatives, but I will give it a go after Christmas and get back to you.

  • Joe you still in the Austin area?

  • Doesn't sound like you are slowing down any. Good. Live longer!

    I'll be heading down that way tomorrow, G'town, to Mom's for Christmas. Hope to plow over as many 'horns as possible en route

  • What do you get when you cross a sheep with a bee?

    Baa Hum Bug

  •  

    Lee,

      I found your article very intriguing.  The idea of bitmasking use hex values is indeed an old technique.  Back when programmers actually cared about memory and disk space; back when you had to.  Today's programmers take this for granted and create bloated programs that cost the end user gigabytes of space and a trip to the local computer shack for more RAM.

      Your article reminds us how to maximize space by representing data in hex format.  Actually reminds me a lot of IPV6 and all the hex values now used in this protocol.  Hex allows us to represent far more data in less space.  Even the authors of IPV4 learned this the hard way.  Now we use Hex for IPV6, Ok lessoned learned.

      Again great article and as far as using a DB for this, who cares? A DB is a data repository key word being data just as the file system (e.g. Windows registry) is used as a repository, or memory (temporary, however it is), and XML.

    Fantastic!

    - M. Ross

  • Thanks Matt. I agree and have come to the conclusion that those not agreeing with the article, or those otherwise opposed to using the relational engine in this fashion, would be the first to store XML, a blob, etc. in the database, but would rather have someone else's engine mask the data to binary. I am simply replacing someone's engine for doing this. This is the what they cannot comprehend or make sense out of. And, as far as I can remember, a binary data type is ANSI compliant regardless of who fashions the binary data which goes into it - a third-party program, .Net, or a programmer/db developer.

  • ab5sr (3/23/2007)


    ... those otherwise opposed to using the relational engine in this fashion, would be the first to store XML, a blob, etc. in the database...

    I'd say just the opposite. Those who have issues with storing data in this manner would likely have the same issue with XML in the database and for the same reason. Sure it's possible to do it, but then it's not data anymore. It cannot be queried with SET logic. All the benefits of using a relational database are lost. I suppose if you're okay with that, then Godspeed. I would personally avoid it if at all possible.

    It is an interesting article nonetheless. 🙂

  • Matthew Ross (2/12/2007)


    Lee,

    ... as far as using a DB for this, who cares? A DB is a data repository key word being data just as the file system (e.g. Windows registry) is used as a repository, or memory (temporary, however it is), and XML.

    - M. Ross

    Yeah... a database is a place to store data, just like an F-15 is a place to store jet fuel. Sorry, but you don't know what you are talking about.

    /*****************

    If most people are not willing to see the difficulty, this is mainly because, consciously or unconsciously, they assume that it will be they who will settle these questions for the others, and because they are convinced of their own capacity to do this. -Friedrich August von Hayek

    *****************/

  • Whether I agree with bit-masking in SQL Server or not, I enjoyed the article. Thanks.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • good article, it will be relevant to folks working with hierarchy ids that need to maintain sibling order.

    I think I see why Lee avoided varbinary in the example. It looks like assigning a value to a varbinary, witnessing it's length as n, and then bit masking it against itself can result in it having a length longer than n, even if it (the growth in bytes) wasnt necessary.

  • I can understand why people might not like the idea of storing multiple column values in one encoded field, but it does occur to me that this would be a great technique for passing values in a web app. instead of http://blah.com?field1=1&field2=5&field3=7

    Bitmask all the values selected and pass them encoded

    http://blah.com?referringData=00bx7f

    then decode on the page to values for SQL or whatever.

    You get a lot of things here: users can't fish by inserting other values in the POST string; your POST string looks LEE7 (sorry...:-D); you have a shorter POST string.

    If you do want to store bitmasked vals in the DB, then your response data could also be de-coded on the client side using Javascript (if this is supported). The downside there is, you can see Javascript in View Source. But, seems like this technique could help speed up data-intensive web applications.

    Just my 2 centavos. Sorry it's off-topic, but hey.

Viewing 15 posts - 46 through 59 (of 59 total)

You must be logged in to reply to this topic. Login to reply