Storing Checkbox Responses as Integers

  • Comments posted to this topic are about the content posted at

    Bruce Szabo, MCSE+I, MCDBA, MCSD

  • This is also the way I have done things. We have a form with 7 check boxes for days of the week, the user can

    check any combination and this can be stored neatly and efficiently as 1 tinyint range [0-127]. The bitwise queries to retrieve

    the data are also efficient and very useful once you have taken the time to understand the maths behind the operations. The alternative

    is 7 database fields and queries with lots of OR operators - a lot more work to product some ugly queries.


    Andy Jones


  • Interesting. What about when you have a pick list of a hundred or a thousand items? Not uncommon at all.

    The approach we've taken is store a code for each answer selected. If you pick 3 items, we store 3 rows in the answer table.


  • I think it's a question of scale. If business rules dictate a small number that is unlikely to substantially grow, I like having storage in a few columns. However, if this is an open-ended solution, I prefer to use multiple rows. Makes querying easier.

    Steve Jones

  • I've used all three methods, depending on the situation. I use packing bits most of the time when I need to "score" rows based on some criteria to determine a sort order, or ranking. This often takes the form of:

    update my_table

    set score = case when criteria_1 = <somevalue> then 2^15 else 0 end +

    case when criteria_2 = <anothervalue> then 2^14 else 0 end

    This works well for me when some criteria have a higher value, and so if one fact is true about a row, that causes it to score better than all rows that do not have that value set. Of course in this case, I generally don't need to pull the individual bits back out since I could instead just query the data, and I'm really just using the calculated field to sort the table.

    Matthew Galbraith

  • Kudos for the originality and efficiency. However, very respectfully, from a design perspective, I do not share an enthusiasm for such schemes for the following reasons:

    1. Like a couple other respondents, I think there are scalability issues. There is a practical limit to the number of possible responses for any given question.

    2. I'll be interested in seeing your approach to querying, but to me it looks like you'll be using the '&' operator. Not everyone would agree that this is programmer-friendly.

    3. I have an old-fashioned bias in favor of storing data in a manner, where possible, that looks like what's being modeled. A simple schema would have a table containing a row for each question on a survey, and another table containing a row for each user/response to each question; multiple responses (e.g., question 2: 'a', 'b', and 'd') would get multiple rows. The disadvantages in space would be compensated for by the ability to include textual responses such as comment fields.

    These are just my own prejudices, and are certainly not being put forth as "better". Your method, in fact, is more sophisticated and rates a much higher "Gee whiz!" factor.

    Edited by - Lee Dise on 06/27/2002 2:27:55 PM

  • Thanks for the kind words and suggestions. I am not sure who gets more out of the articles in this format the reader or the author. When I sent my articles in I was totally caught of guard by the forum but it is really cool.

    Scalability is a major drawback with this method and I should have noted that in the article. If I had more time on the project I probably would have tried to figure out where a good cut-off point is for this type of method.

    It is one of those projects that is on my list of things to get back to. Of course, that list keeps growing and I rarely get a chance to get back to it.

    Bruce Szabo, MCSE+I, MCDBA, MCSD

    Bruce Szabo, MCSE+I, MCDBA, MCSD

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply