Counting the contents of a column on a table while handling table structure changes

  • Hi,

    I receive some data regularly which I load onto a load table before inserting into the main table. As an overall data completion check I would like to count the number of non-nulls in each column on the load table before I proceed.

    At the moment I have used this simple approach:

    select count(col_1), count(col_2), count(col_3), ... , count(col_n)

    from load_table

    There are some fifty or columns on the load table, and sometimes more are added as time goes on. Added columns could be a problem as this check query would continue to work without warning if not updated to match the new table structure.

    To avoid potential silent failures I would like to rewrite the query so it will work regardless of which columns are in the table, so far I have:

    select COLUMN_NAME

    from Information_Schema.Columns

    where TABLE_NAME = 'load_table'

    order by ORDINAL_POSITION

    Which gives me a list of columns names. To this I would like to add a second column with the count of non-null values in that column, e.g.:

    Col_1, 200

    Col_2, 150

    Col_3, 0

    ...

    I was heading toward something similar to the example below, but it results in a query which seems to attempt to join based on struture and data, which I would guess is not possible:

    select COLUMN_NAME, (select count(COLUMN_NAME) from load_table t where i.COLUMN_NAME = a.COLUMN_NAME) as [Count]

    from Information_Schema.Columns i

    where TABLE_NAME = 'load_table'

    group by COLUMN_NAME

    order by ORDINAL_POSITION

    Am I thinking along the right lines or would I have to take a different approach such as building this kind of query using a loop structure, dynamic SQL and plenty of union joins. I'm hoping there is a tidy set-based approach I can use rather than that? 😀

    Any help is gratefully appreciated.

  • In SSMS, if you look at the Storage tab on the properties of a table, you can get the row count for that table. If you use Profiler to find out what's going on in the background when you do that, you can find out what system view holds the row counts and write your own query accordingly.

    John

  • I can't really figure out what benefit you gain by getting a non-null row count for each column. To do this you will have to use dynamic sql. You will have to look in sys.syscolumns like you said and build a string to execute.

    This may not be the most elegant but I think it would get you what you need.

    declare @TableName varchar(50) = 'tblEmployee'

    declare @Columns varchar(max)

    select @Columns =

    stuff(

    (

    select 'count(' + sc.name + ') as ' + sc.name + ', '

    from sys.syscolumns sc

    where id = OBJECT_ID(@TableName)

    order by colorder

    for xml path('')

    )

    , 1, 0, 'select ')

    select left(@Columns, DATALENGTH(@Columns) - 2) + ' from ' + @TableName

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

Viewing 3 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply