Character data type selection in table design

  • Let's say I have a field that will have a max value of 'yes'.

    What difference does it make if I choose VARCHAR(50) over VARCHAR(3)?  In terms of performance?

    I know the best practice is to choose the smallest data type that will accommodate the expected values, but in the case of VARCHAR I've done some basic testing and it seems to make no difference when the data is the same.

    I'd like to understand why to choose VARCHAR(3).

    Thank you!

  • Perfect, very interesting - I didn't realise any of the memory implications. Thanks for the reply, Jonathan.

  • snomadj - Tuesday, November 13, 2018 3:55 AM


    Worse than that, it should NOT be a VARCHAR(3).  VARCHAR automatically adds 2 bytes of length information for the data.  It also causes indexes to fragment quite rapidly, and such fragmentation can actually be quite severe on the log file if the column starts out as NULL or changes from a shorter value like "No" to a longer value such as "Yes".  Going from NULL to either one is even worse, especially for batch runs where the number of rows per page is high (narrow rows), both for the Clustered Index and especially for Non-Clustered Indexes.

    My recommendation for this column would be to assign it the TINYINT (rather than BIT) datatype and default it to zero.  It will never be expansive, and you can't get any smaller than that (even a single BIT takes a byte) for a Yes/No indicator.

    If the people using the data raise hell and prop it up on a stick because they don't know that "0" has been defined as meaning "No" and "1" has been defined as meaning "Yes" for this column, remind them that the words Yes and No are presentation layer stuff and should be avoided.

    If you still lose that battle, you can still win the war against totally unnecessary fragmentation.  Change the column to a CHAR(3), which will never be expansive even if a NULL is present.
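
    A minimal sketch of that recommendation - the table and column names here are illustrative, not from the thread:

    -- TINYINT flag, defaulted to 0 (No), constrained to legal values.
    CREATE TABLE dbo.SomeTable
    (
        SomeTableID INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
        IsActive    TINYINT NOT NULL
                    CONSTRAINT DF_SomeTable_IsActive DEFAULT (0)
                    CONSTRAINT CK_SomeTable_IsActive CHECK (IsActive IN (0,1))
    );

    -- The Yes/No wording stays in the presentation layer, e.g.:
    SELECT SomeTableID,
           CASE IsActive WHEN 1 THEN 'Yes' ELSE 'No' END AS IsActiveDisplay
    FROM dbo.SomeTable;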

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden - Tuesday, November 13, 2018 7:56 AM


    If you have multiple BIT columns in your table, they won't take a byte each:
    The SQL Server Database Engine optimizes storage of bit columns. If there are 8 or fewer bit columns in a table, the columns are stored as 1 byte. If there are from 9 up to 16 bit columns, the columns are stored as 2 bytes, and so on.
    https://docs.microsoft.com/en-us/sql/t-sql/data-types/bit-transact-sql?view=sql-server-2017

  • snomadj - Tuesday, November 13, 2018 3:55 AM

    I'd like to understand why to choose VARCHAR(3).

    VARCHAR(3) is a nice example of not being a nice example but here are a few key points:

    If most of the values in a VARCHAR(3) field are of length 1, and only a few are, say, of length 3, it's almost as effective as CHAR(3) in terms of storage consumed. The benefit of VARCHAR(3) would be that the data is not padded with trailing spaces (which would otherwise have to be TRIMmed away for certain operations on that data later on), which in, say, a DWH scenario could save you some CPU time.

    That would be my view on why to choose varchar(3), anyone else got some thoughts?
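
    A rough sketch of the storage comparison (SQL Server; note that DATALENGTH returns the bytes of the stored value itself - it does not show the extra 2 bytes of length metadata a VARCHAR column carries in the row):

    SELECT DATALENGTH(CAST('No'  AS VARCHAR(3))) AS varchar_no,   -- 2
           DATALENGTH(CAST('No'  AS CHAR(3)))    AS char_no,      -- 3 (padded to 'No ')
           DATALENGTH(CAST('Yes' AS VARCHAR(3))) AS varchar_yes,  -- 3
           DATALENGTH(CAST('Yes' AS CHAR(3)))    AS char_yes;     -- 3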

  • DinoRS - Tuesday, November 13, 2018 8:36 AM


    Yes.  I already posted mine. 😉

    Also... SQL Server doesn't actually care about trailing spaces when you do a join and, if the column is actually restricted to Yes/No, there would be only 1 trailing space at most.  CHAR(3) takes up 3 bytes all the time.  VARCHAR(3) takes at least 3 bytes for a single character because of the 2 bytes necessary for storing the length, and will take up 4 for "No" and 5 for "Yes".

    While it all seems completely trivial, it's not when it comes to page splits and fragmentation. 

    Again, the best thing is to stop storing data for the sake of the presentation layer.  This column should be a TinyInt with No=0 and Yes=1.


  • Jonathan AC Roberts - Tuesday, November 13, 2018 8:07 AM


    Yep.  I get that, but if you have just 1 BIT column it still takes a byte, it can't be indexed, and you can't use aggregations such as SUM to quickly determine how many Yes's you have, etc.


  • Jeff Moden - Tuesday, November 13, 2018 9:14 AM


    I just tried indexing a bit and it can be indexed and used:
    create table #myTable(
        id int identity(1,1) not null,
        myBit bit not null,
        comment varchar(20));

    insert into #myTable(myBit, comment)
    select t.n % 2, 'asdsa_' + convert(varchar(10), t.n)
    from tally t
    where t.n < 100000;

    create index IX_#myTable_1 on #myTable(myBit, comment);

    select top(20) * from #myTable order by myBit, comment; /* index used */
    select sum(convert(int, myBit)) from #myTable;
    go
    drop table #myTable;

  • What's wrong with 'Y' and 'N' as CHAR(1)? Unlike 0 and 1, there's no chance of getting them the wrong way around (I've seen this!!). You could even make it non-nullable.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Jonathan AC Roberts - Tuesday, November 13, 2018 9:36 AM


    You're absolutely correct.  My apologies.  Not sure what I was thinking, other than the bad memories of when I used such an index and accidentally brought Expedia.com to its knees for about two minutes in my younger days.  Of course, any column with such low cardinality would have done the same.

    You did prove that you can't do a SUM on a bit datatype.  You do have to convert it first.  That may or may not be an aggravation to folks.  "It Depends". 

    Another interesting (and sometimes useful, but usually not) aspect of BIT columns is that they do virtually no numeric error checking for the datatype.  The BIT datatype will cheerfully convert both negative and positive values to a 1, and even the CHECK constraint built into the table below won't stop it (borrowing heavily from your good code example).


    create table #myTable(
        id int identity(1,1) not null,
        myBit bit not null CHECK (myBit between 0 and 1),
        comment varchar(20));

    insert into #myTable(myBit, comment)
    select t.n - 50000, 'asdsa_' + convert(varchar(10), t.n - 50000)
    from dbo.tally t
    where t.n < 100000;


  • ChrisM@Work - Tuesday, November 13, 2018 9:52 AM

    What's wrong with 'Y' and 'N' as CHAR(1)? Unlike 0 and 1, there's no chance of getting them the wrong way around (I've seen this!!). You could even make it non-nullable.

    ... and, as with a TINYINT, you can also add a constraint to check for legal values.
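
    A minimal sketch of that combination - again, the table and column names are illustrative:

    CREATE TABLE dbo.SomeTable
    (
        SomeTableID INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
        IsActive    CHAR(1) NOT NULL
                    CONSTRAINT DF_SomeTable_IsActive DEFAULT ('N')
                    CONSTRAINT CK_SomeTable_IsActive CHECK (IsActive IN ('Y','N'))
    );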


  • Jeff Moden - Tuesday, November 13, 2018 5:51 PM


    Yes, and it's documented in Books Online: "Converting to bit promotes any nonzero value to 1."
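
    A quick sketch of that documented behaviour (SQL Server):

    -- Any nonzero value converts to 1; only 0 converts to 0.
    SELECT CAST(-5 AS BIT) AS neg,   -- 1
           CAST(42 AS BIT) AS pos,   -- 1
           CAST(0  AS BIT) AS zero;  -- 0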

  • Jonathan AC Roberts - Tuesday, November 13, 2018 6:31 PM


    Not sure what your point is here.  Of course it's documented in BOL.  I'm just explaining one of the reasons why I don't like the BIT datatype for most things.

