Percentage of NULL Column Rows

  • Ram Ram

    Ten Centuries

    Points: 1079

    Comments posted to this topic are about the item Percentage of NULL Column Rows

  • g.britton

    SSChampion

    Points: 13689

    Get rid of NOLOCK

    Gerald Britton, MCSE-DP, MVPToronto PASS Chapter[/url]

  • gshouse

    Say Hey Kid

    Points: 664

    I typically put an existence guard when I do a DROP TABLE: IF OBJECT_ID('tempdb..#tblHOLD') IS NOT NULL DROP TABLE #tblHOLD

    Even for temp tables, I would use meaningful names, especially with 0 comments.

    CAST(CAST(CAST(NullRows AS VARCHAR(10))+'.00' AS DEC(15,2))/CAST(CAST(TotalRows AS VARCHAR(10))+'.00' AS DEC(15,2))*100 AS DEC(5,2)) AS PercentageNull

    I'm not sure of the purpose of the above statement. If it because dividing INTS gives an INT, CAST(NullRows as REAL) / CAST(TotalRows AS REAL). Even if there is a reason for CAST as VARCHAR, why append +'.00' when you are putting it in a DEC(15,2)?

    The above statement will fail when TotalRows or NullRows > 9999999999 when you try to cast as VARCHAR(10). BIGINT can hold 9,223,372,036,854,775,807. If there is some reason for CASTing as VARCHAR and appending '.00' of which I am unaware, they should be VARCHAR(20). Similarly, DEC(15,2) is too small. A table could theoretically have as many rows as can fit in 524+ Terrabytes, so certainly > 9999999999.

    Finally, it fails if there are any empty tables, I'd replace with

    CASE WHEN TotalRows != 0

    THEN CAST(CAST(NullRows AS REAL)/CAST(TotalRows AS REAL)*100 AS DEC(5,2)) -- DEC(15,2) is purely formatting

    ELSE 0

    END AS PercentageNull

  • andyabel

    SSC Enthusiast

    Points: 135

    Can someone explain why the percentage of NULL column rows is useful/interesting?

    What problem is it trying to solve?

    Thanks

  • gshouse

    Say Hey Kid

    Points: 664

    Useful for a number of reasons. Most I can think of don't need a survey like this. However, off the top of my head, if you are deciding whether making any columns SPARSE makes sense would be one use. If you decided during modelling that you'll allow NULLS 'just in case' and wanted to see how that is working out, this script filters only for columns that are nullable and will give you an overview. If you are modelling correctly and want to get a feel for data quality and percentages of unknown attributes a survey like this is useful

  • Dave Vroman

    SSC Eights!

    Points: 821

    NOLOCK is a useful evil on high volume systems. In cases where you're looking at generics or structure this can be a very useful tool which can stop deadlocks. It also allows the query to proceed without slowing down the system too much. If the NOLOCK was not there the query would wait for any locked columns or rows. In effect this slows the whole system.

    In this case it is better to leave the NOLOCK in.

  • g.britton

    SSChampion

    Points: 13689

    First of all, NOLOCK doesn't deliver on it's promise. It *does* take locks (e.g. schema stability locks).

    Second, it's a synonym for READUNCOMMITTED, which is what I recommend if you understand the risks (dirty reads).

    Third, in a case like this, the dirty reads may not matter, but I'd still go with READUNCOMMITTED, since for me it describes the situation better.

    see: https://msdn.microsoft.com/en-us/library/ms187373.aspx

    READUNCOMMITTED

    Specifies that dirty reads are allowed. No shared locks are issued to prevent other transactions from modifying data read by the current transaction, and exclusive locks set by other transactions do not block the current transaction from reading the locked data. Allowing dirty reads can cause higher concurrency, but at the cost of reading data modifications that then are rolled back by other transactions. This may generate errors for your transaction, present users with data that was never committed, or cause users to see records twice (or not at all).

    READUNCOMMITTED and NOLOCK hints apply ***only to data locks***. All queries, including those with READUNCOMMITTED and NOLOCK hints, **acquire Sch-S (schema stability) locks during compilation and execution**. Because of this, queries are blocked when a concurrent transaction holds a Sch-M (schema modification) lock on the table. For example, a data definition language (DDL) operation acquires a Sch-M lock before it modifies the schema information of the table. Any concurrent queries, including those running with READUNCOMMITTED or NOLOCK hints, are blocked when attempting to acquire a Sch-S lock. Conversely, a query holding a Sch-S lock blocks a concurrent transaction that attempts to acquire a Sch-M lock.

    Gerald Britton, MCSE-DP, MVPToronto PASS Chapter[/url]

  • akljfhnlaflkj

    SSC Guru

    Points: 76202

    Thanks for the sql.

Viewing 8 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply