Is there a way?

  • I have many tables that have 1000+ columns, of which about 85% of them are NULL. Is there a way to determine if a column in a table is used (meaning if it's no,t EVERY ROW IS NULL) at all?

    Some of these tables have over 200+ million records....so I won't use anything like DATALENGTH() > 0 or "Column" IS NULL etc...

    Any suggestions?

    ______________________________________________________________________________Never argue with an idiot; Theyll drag you down to their level and beat you with experience

  • Can you try something like below to see if it works for you .

    ;WITH XMLNAMESPACES('http://www.w3.org/2001/XMLSchema-instance' as ns)

    select <PrimarykeyCOL>,(SELECT *

    FROM <Table>

    for xml path('row'), elements xsinil, type

    ).value('count(/row/*[@ns:nil = "true"])', 'int') as NullCount

    from <Table>

  • Not so sure the xml approach is going to be very fast with 200 million rows. What about using MAX(SomeColumn)? That would likely be faster than where Column IS NULL. You could even hit multiple columns all at once this way.

    Something like this?

    declare @Cols varchar(max) = ''

    select @Cols = 'MAX([' + name + ']), ' + @Cols

    from sys.columns

    where object_id = object_id('MyTable')

    and is_nullable = 1

    select 'select ' + left(@Cols, len(@Cols) - 2) + ' from MyTable'

    Then you just need to execute the last piece of it to get the results.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Thanks for the code Sean. I let it run for about 20 mins and killed it...

    These tables are rather large :crazy:

    Going to do some creative "googling" and see what else I can come up with!

    ______________________________________________________________________________Never argue with an idiot; Theyll drag you down to their level and beat you with experience

  • MyDoggieJessie (4/7/2014)


    Thanks for the code Sean. I let it run for about 20 mins and killed it...

    These tables are rather large :crazy:

    Going to do some creative "googling" and see what else I can come up with!

    Yeah wasn't sure if that would help or not. I don't have any tables that large to test with. I would be interested to see what you come up with.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • I haven't tried it but (no pun intended) check out CHECKSUM_AGG in Books Online. It might be just what the docter ordered.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thanks Jeff! Unfortunately like Sean's solution, it works great for smaller tables - the one I'm looking at has 127,170,878 rows of data and has a data footprint of 532,771 MB. It still has to RBAR through them all for each column value.

    I sincerely appreciate the help though!

    ______________________________________________________________________________Never argue with an idiot; Theyll drag you down to their level and beat you with experience

  • We have a table with about 80 million rows, only about 75 columns though. A couple of these columns are all NULL. I found looking at one at a time wasn't too bad, still took about 3 minutes each. I know that doesn't help when you have a lot of columns to check.

    I just ran this simple SQL to return a record as soon as it found a not NULL.

    SELECT com_mt

    FROM prod_db.dbo.prod_table

    WHERE com_mt is not null

    GROUP BY com_mt

    HAVING count(*) = 1

    There should be an easier way.

    -------------------------------------------------------------
    we travel not to escape life but for life not to escape us
    Don't fear failure, fear regret.

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply