Better to store 0 or null in a column

  • Hello,

    We have a table with 10 columns each column is of datatype int and can accept nulls, when I do a save is it better to have 0 inserted in to the column or just insert null?

    May seem like a daft question but I thought I'll ask and get an answer from a DBA's point of view.

  • Hi,

    NULL takes up the size of underlying data type Plus one bit for the null flag.

    If you have a lot of NULL columns in your table then defining your column as SPARSE makes sense. it actually means that your NULL value takes zero bits however their is a trade off when storing a value in such columns as it would take an additional 4 bits.

    Hope it helps...!!!

    Regards

    Shafat Husain

    Regards,
    Shafat Husain
    🙂
    And Your Lord Never Forgets...!! (64:19 -- Quran)

  • You're speaking to someone that has minimal SQL exposure, I no the basics but when it comes to designing/learning a good database design I would require further explanation in to what you have just mentioned (pretend your speaking to a complete newbie), these particular columns join on to another tables to return the particular description linked to the ID,

  • Use 0 if the value exists and it is 0.

    Use NULL if the value does not exist or is undefined.

    If you used 0 when NULL was more appropriate, how would you distinguish between zeros that are actually zero and zeros that mean NULL instead?

    If you don't want to use NULL, why did you define the column nullable in the first place? Is the value mandatory? If so, make it non-nullable.

    -- Gianluca Sartori

  • In a perfect world, you would not store NULL values. However, in the world we live in, it's pretty normal to store NULL.

    The real issue comes down to design and business requirements. You can store 0 by default for numbers, or some other value for other data types. But, before you do that, you need to work with the business to understand if they believe that 0 is the default or if they think that instead of 0 it's just simply unknown. If it's unknown, then NULL will work better for you because that's effectively what it means.

    But, there are implications to using NULL. You need to take into account three part logic for comparisons when using NULL. Here's a good article from Microsoft on the topic.

    Most of us use and deal with NULL.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Quick thought, under normal circumstances if a value can be unknown/undefined/missing then use NULL, as there would be no gain in a replacement value, only complications.

    😎

    In simple terms the NULL is presented by the appropriate bit in the NULL Bitmap, a series of bits, one for each column in the table, one bitmap per row, which indicate whether the column is NULL or not. The bitmap is there anyway regardless of the null-ability of the column (SQL Server 2012), so from that perspective the NULL has no penalty.

    One would hardly use sparse columns for the scenario described as it wouldn't be beneficial.

  • For columns containing a foreign key enforced with a constraint, you have no choice but to not use NULL, because the primary key can't contain a NULL. Even if there is no constraint (an unconstrained foreign key), it's best make the column non-nullable. For example, in the following column, 1 means 'Part Time', 2 means 'Full Time', but what would NULL mean? You should instead specify 0 to indicate 'Terminated'.

    EMPLOYMENT_STATUS

    1

    0

    3

    However, it's ambiguous and error prone for the business and SQL developers when common values are "stubbed" into non-key column. For example, you can try explaining that '000-000-0000' means the customer has opted into a Do-Not-Call list, and '111-111-1111' means the information is missing and pending data entry, but you'll be re-explaining that a hundred times over again, and end users will even start entering codes that make sense only to themselves.

    In this scenario, it's probably best to have phone number contain NULL and then have a seperate key column indicating the reason. For example: 1 = home, 2 = cell, 0 = missing, 9 = do not call.

    CUST_ID PHONE_NUMBER PHONE_TYPE

    442334 555-291-8821 1

    726111 NULL 0

    989383 555-309-1721 2

    019222 NULL 9

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Eric M Russell (12/16/2014)


    For columns containing a foreign key enforced with a constraint, you have no choice but to not use NULL, because the primary key can't contain a NULL. Even if there is no constraint (an unconstrained foreign key), it's best make the column non-nullable. For example, in the following column, 1 means 'Part Time', 2 means 'Full Time', but what would NULL mean? You should instead specify 0 to indicate 'Terminated'.

    EMPLOYMENT_STATUS

    1

    0

    3

    However, it's ambiguous and error prone for the business and SQL developers when common values are "stubbed" into non-key column. For example, you can try explaining that '000-000-0000' means the customer has opted into a Do-Not-Call list, and '111-111-1111' means the information is missing and pending data entry, but you'll be re-explaining that a hundred times over again, and end users will even start entering codes that make sense only to themselves.

    In this scenario, it's probably best to have phone number contain NULL and then have a seperate key column indicating the reason. For example: 1 = home, 2 = cell, 0 = missing, 9 = do not call.

    CUST_ID PHONE_NUMBER PHONE_TYPE

    442334 555-291-8821 1

    726111 NULL 0

    989383 555-309-1721 2

    019222 NULL 9

    There is an alternative for this case which is the sparse column, it practically does normal foreign key constraint when the value is not null, otherwise ignores it, here is an example I posted a while back

    😎

  • Eirikur Eiriksson (12/16/2014)

    ...

    There is an alternative for this case which is the sparse column, it practically does normal foreign key constraint when the value is not null, otherwise ignores it, here is an example I posted a while back

    😎

    For a wide fact table in a data warehouse, Clustered ColumnStore can provide 10x compression of sparse or repeating columns without resorting to XML columns and odd SQL query syntax.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Eric M Russell (12/16/2014)


    Eirikur Eiriksson (12/16/2014)

    ...

    There is an alternative for this case which is the sparse column, it practically does normal foreign key constraint when the value is not null, otherwise ignores it, here is an example I posted a while back

    😎

    For a wide fact table in a data warehouse, Clustered ColumnStore can provide 10x compression of sparse or repeating columns without resorting to XML columns and odd SQL query syntax.

    I have a particular liking for "odd syntax":-D

    Although the example I mentioned is mainly on sparse columns as an EAV alternative, the point is that when the column is sparse it handles foreign key constraints in a "sparse" manner, that is when there is a value it enforces the constraints otherwise ignores it. ColumnStore might be a tiny little bit of an overkill in this situation;-)

    😎

  • Eirikur Eiriksson (12/16/2014)


    Eric M Russell (12/16/2014)


    Eirikur Eiriksson (12/16/2014)

    ...

    There is an alternative for this case which is the sparse column, it practically does normal foreign key constraint when the value is not null, otherwise ignores it, here is an example I posted a while back

    😎

    For a wide fact table in a data warehouse, Clustered ColumnStore can provide 10x compression of sparse or repeating columns without resorting to XML columns and odd SQL query syntax.

    I have a particular liking for "odd syntax":-D

    Although the example I mentioned is mainly on sparse columns as an EAV alternative, the point is that when the column is sparse it handles foreign key constraints in a "sparse" manner, that is when there is a value it enforces the constraints otherwise ignores it. ColumnStore might be a tiny little bit of an overkill in this situation;-)

    😎

    My earlier comments about NULL values in key columns and stub values in non-key columns had more to do with taking logical design into consideration.

    It would help if the OP provided sample DDL for context. The answer also depends on whether it is an OLTP database or warehouse. He did say there were 10 columns, which doesn't seem to imply the need for a Column Set.

    Starting with SQL Server 2014, the developers may want to consider Clustered ColumnStore as their default table type in a star schema or other warehouse type database, because it compresses low cardinality columns (including both mostly NULL or low selectivity) to a very high degree. Also another benefit is that each colummn is essentially it's own clustered index, so it's highly optimized for ad-hoc querying. The developer doesn't have to decide up front which columns are to be treated "sparsely" or which should be indexed. It's just a natural fit for a data warehousing case usage. However, it's not appropriate for an OLTP database for a number of reasons.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • spaghettidba (12/16/2014)


    Use 0 if the value exists and it is 0.

    Use NULL if the value does not exist or is undefined.

    If you used 0 when NULL was more appropriate, how would you distinguish between zeros that are actually zero and zeros that mean NULL instead?

    If you don't want to use NULL, why did you define the column nullable in the first place? Is the value mandatory? If so, make it non-nullable.

    +1

    Absolutely. 0 and NULL are not synonymous.

    Don Simpson



    I'm not sure about Heisenberg.

  • DonlSimpson (12/17/2014)


    spaghettidba (12/16/2014)


    Use 0 if the value exists and it is 0.

    Use NULL if the value does not exist or is undefined.

    If you used 0 when NULL was more appropriate, how would you distinguish between zeros that are actually zero and zeros that mean NULL instead?

    If you don't want to use NULL, why did you define the column nullable in the first place? Is the value mandatory? If so, make it non-nullable.

    +1

    Absolutely. 0 and NULL are not synonymous.

    As a side note, on some RDMS platforms, various stubbed values are treated literally as NULL. For example, Oracle treats an empty string as NULL.

    What this means is that ...

    '' is null : TRUE

    'Smith' > '' : FALSE

    '' = '' : FALSE

    I discovered that the hard way when working on a project that exchanged data between SQL Server and Oracle.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Eric M Russell (12/17/2014)


    DonlSimpson (12/17/2014)


    spaghettidba (12/16/2014)


    Use 0 if the value exists and it is 0.

    Use NULL if the value does not exist or is undefined.

    If you used 0 when NULL was more appropriate, how would you distinguish between zeros that are actually zero and zeros that mean NULL instead?

    If you don't want to use NULL, why did you define the column nullable in the first place? Is the value mandatory? If so, make it non-nullable.

    +1

    Absolutely. 0 and NULL are not synonymous.

    As a side note, on some RDMS platforms, various stubbed values are treated literally as NULL. For example, Oracle treats an empty string as NULL.

    What this means is that ...

    '' is null : TRUE

    'Smith' > '' : FALSE

    '' = '' : FALSE

    I discovered that the hard way when working on a project that exchanged data between SQL Server and Oracle.

    Well that's tragic.

    Don Simpson



    I'm not sure about Heisenberg.

  • 0 means 0 , NULL means unknown. They are not the same.

    One approach if you do not want to have nulls in your table is to have a "magic" value, like -1 to imply an unknown. This is usually done in a data warehouse but I wonder on the meaning of the column. If these are measurements and need to be additive, dont use a magic value, leave a NULL in there. It keeps any sum on the column accurate.

    ----------------------------------------------------

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply