TSQL Query

  • Comments posted to this topic are about the item TSQL Query

  • I prefer to use the term "NULL propogation" for this kind of behaviour. It is a little like multiplication of any non-zero operand by zero giving zero, except that NULL propogation leads to a NULL result if a NULL value is an operand in any operation, arithmetic or logical.

    The best way of avoiding NULL propogation in the case of arithmethic operations (where they generally produce unexpected or undesired effect) is to wrap each operand that may be the result of an expression which could evaluate to NULL, in a "NULL to Zero" function.

    Thus, for example, rather than writing

    gross_profit = total_sales - (cost_of_purchases + total_expenses)

    it would be better to write a NULL to zero function, N2Z(), and write instead

    gross_profit = N2Z(total_sales) - (N2Z(cost_of_puchases) + N2Z(total_expenses))

    That will ensure that in the (rare!) event that there are no exepenses (i.e. total_expenses evaluates to NULL), there will still be profit!

    You never know: reading my book: "All about your computer" might just tell you something you never knew!
    lulu.com/kaspencer

  • Thanks for the update,

    Yes it is good practice to avoid painfull bug and screaming your head later on.

  • Kas:

    although I agree with you in principle a function will cover bad design - if you mean zero then zero should be what is in the column; 😀

    I think the point here is that NULLs are a quick/dirty solution which allows DB developers to play and should not exist in production databases.

    Rule should be all non binary data columns in every row in every table should have a default value;-

    0 - for all numeric columns

    '' - (empty string) for text/character data

    31/12/1899 - or some other default date value for date fields.

    Nulls should not be allowed - its a design issue.:hehe:

    --Shaun

    Hiding under a desk from SSIS Implemenation Work :crazy:

  • Points taken, absolutely Shaun.

    However, NULLs cannot be excluded from the results of queries. Suppose for instance, looking again at my example, we have some kind of accounting database, with various kinds of transaction types in a table containing transactions. If we query that table and sum up all transactions of type "expense_transaction" to give a sum "total_expenses", what happens if there are no transactions of type "expense_transaction"? The answer is that our sum will be NULL, not zero. In that case we would have to wrap the summed total value in our N2Z function, either in the query (or procedure) generating the sum, or in any calculations which use the final value of the sum.

    Hence, it is quite possible for NULL values to be encountered even in well-designed production database applications.

    Agree?

    Regards

    Ken.

    You never know: reading my book: "All about your computer" might just tell you something you never knew!
    lulu.com/kaspencer

  • Kas:

    I do agree with you, but that is a different issue;

    My point was about table design.

    Your point is about 'query results'.

    Both are valid.

    The COALESCE function then comes into play.

    --Shaun

    Hiding under a desk from SSIS Implemenation Work :crazy:

  • kas (3/4/2008)


    Points taken, absolutely Shaun.

    However, NULLs cannot be excluded from the results of queries. Suppose for instance, looking again at my example, we have some kind of accounting database, with various kinds of transaction types in a table containing transactions. If we query that table and sum up all transactions of type "expense_transaction" to give a sum "total_expenses", what happens if there are no transactions of type "expense_transaction"? The answer is that our sum will be NULL, not zero. In that case we would have to wrap the summed total value in our N2Z function, either in the query (or procedure) generating the sum, or in any calculations which use the final value of the sum.

    Hence, it is quite possible for NULL values to be encountered even in well-designed production database applications.

    Agree?

    Regards

    Ken.

    If I agree with you, that its a design Issue, then why Microsoft is repeating it again and again??

    Any clue??

  • MI:

    MS SQL Server is a tool to manage data.

    When we talk of a design issue - we are referring to the design of the database schema being used not the design of the tool. e.g. the table objects within the schema allow nulls to be inserted into their columns - this is a design choice made by the DB developer when designing the schema.

    On exception is Primary Key columns are constrained such that NULL cannot be entered.

    NULL is a necessary 'term/value' in relational database design just as zero is in mathematics.

    Hiding under a desk from SSIS Implemenation Work :crazy:

  • At the risk of reinvigorating fanatical discussions (search for the threads associated with any NULL-related article on the site), it's not Microsoft's design flaw as NULLs are a basic part of the SQL standard that all RDMS' (good ones anyway - don't flame me.. you're on the wrong website) support.

    NULLs have their place. At least someone in this discussion has already pointed out the difference between NULLs in tables vs NULLs in query results. The former need can be debated - sometimes people avoid them like the plague in tables.

    But in query results they are absolutely necessary and people should understand how they work. Outer joins are a classic and obvious case.

    What also confuses people is how aggregate functions such as SUM, AVG, etc treat NULLs. Essentially the NULLs are ignored for the purposes of such functions. If you're calculating the average of 1, 2, 3 and NULL then you want the average to be (1 + 2 + 3) / 3 = 2 rather than (1 + 2 + 3 + 0) / 4 = 1.5.... NULL <> 0.

    For cases, such as the one given in the quiz, think of NULL as mathematical infinity. Anything *, -, /, + infinity is infinity. Also, infinity <> infinity, etc. The rules are well defined and, from a logical & mathematical point of view, make sense.

    Anyhow, I must admit that if I had employees (that'd be the day!) I wouldn't hire someone who had not tried to understand NULLs and how they're handled, nor would I hire someone who just carte blanch ruled them out as a matter of dogma, etc. But that's just me and I'm picky 😀

  • Shaun McGuile (3/4/2008)


    Kas:

    although I agree with you in principle a function will cover bad design - if you mean zero then zero should be what is in the column; 😀

    I think the point here is that NULLs are a quick/dirty solution which allows DB developers to play and should not exist in production databases.

    Rule should be all non binary data columns in every row in every table should have a default value;-

    0 - for all numeric columns

    '' - (empty string) for text/character data

    31/12/1899 - or some other default date value for date fields.

    Nulls should not be allowed - its a design issue.:hehe:

    --Shaun

    Good heavens what dangerous nonsense.

    A database doesn't represent data extracted from some platonic ideal world in which everything is known, it represents our knowledge of the real world. Sometimes there will be things we just don't know; we don't want to put in a default value, because risks making anyone looking at the data believe that we do know (with potentially catastrophic results). We don't want to split an attribute that might be unknown into two attributes (one to contain the value or a default, and another to say whether the value in the first actually means anything) because (a) that makes handling aggregates rather clumsy and (b) it increases the number of columns and (c) the results are generally extremely hard to understand. We don't want to split the X table into a large number of tables "X for which attribute A is unknown", "X for which attribute B is unknown", X for which attributes A and B are both unknown, etcetera (2 to the power N tables where there N attributes which may or may not be unknown) nor yet into a different large number of tables "X and attribute A", "X and attribute B" etcetera (N+1 tables where there are N attributes whhich may or may not be known and at least 1 non-key attribute that is always known) because this wastes space, disc head movements, and performance. In fact we don't want to do any of the null-avoidance tricks advanced by the anti-null fanatics of the world, because they all lead to schemata which are difficult to understand, code which is inelegant and over-complex, and performance that is fairly appalling. So we have a standard which includes NULL (it has probably got NULL wrong since we seem to need to distinguish between at least two different sorts of NULL and work with a 4-valued logic, not a 3-valued one), and when we encounter (in the real world, not some platonic ideal with no real existence) an attribute which may sometimes be unknown we use that standard and the NULL that it provides to cope with this real world situation in a reasonably efficient manner.

    I suspect Shaun has been reading too much Date and not enough Codd - it's salutary to see what Codd said about NULL, and to remember that when the inventor of the relational model published his rules for what properties a DBMS must have to be considered relational one of those rules said plainly and clearly that it must support NULL.

    Tom

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply