January 25, 2012 at 4:08 am
Hi,
I receive some data regularly which I load onto a load table before inserting into the main table. As an overall data completion check I would like to count the number of non-nulls in each column on the load table before I proceed.
At the moment I have used this simple approach:
select count(col_1), count(col_2), count(col_3), ... , count(col_n)
from load_table
There are some fifty or columns on the load table, and sometimes more are added as time goes on. Added columns could be a problem as this check query would continue to work without warning if not updated to match the new table structure.
To avoid potential silent failures I would like to rewrite the query so it will work regardless of which columns are in the table, so far I have:
select COLUMN_NAME
from Information_Schema.Columns
where TABLE_NAME = 'load_table'
order by ORDINAL_POSITION
Which gives me a list of columns names. To this I would like to add a second column with the count of non-null values in that column, e.g.:
Col_1, 200
Col_2, 150
Col_3, 0
...
I was heading toward something similar to the example below, but it results in a query which seems to attempt to join based on struture and data, which I would guess is not possible:
select COLUMN_NAME, (select count(COLUMN_NAME) from load_table t where i.COLUMN_NAME = a.COLUMN_NAME) as [Count]
from Information_Schema.Columns i
where TABLE_NAME = 'load_table'
group by COLUMN_NAME
order by ORDINAL_POSITION
Am I thinking along the right lines or would I have to take a different approach such as building this kind of query using a loop structure, dynamic SQL and plenty of union joins. I'm hoping there is a tidy set-based approach I can use rather than that? 😀
Any help is gratefully appreciated.
January 25, 2012 at 4:27 am
In SSMS, if you look at the Storage tab on the properties of a table, you can get the row count for that table. If you use Profiler to find out what's going on in the background when you do that, you can find out what system view holds the row counts and write your own query accordingly.
John
January 25, 2012 at 8:32 am
I can't really figure out what benefit you gain by getting a non-null row count for each column. To do this you will have to use dynamic sql. You will have to look in sys.syscolumns like you said and build a string to execute.
This may not be the most elegant but I think it would get you what you need.
declare @TableName varchar(50) = 'tblEmployee'
declare @Columns varchar(max)
select @Columns =
stuff(
(
select 'count(' + sc.name + ') as ' + sc.name + ', '
from sys.syscolumns sc
where id = OBJECT_ID(@TableName)
order by colorder
for xml path('')
)
, 1, 0, 'select ')
select left(@Columns, DATALENGTH(@Columns) - 2) + ' from ' + @TableName
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
Viewing 3 posts - 1 through 3 (of 3 total)
You must be logged in to reply to this topic. Login to reply