Stored Procedure to get column metrics

  • Hi everyone, I have a quick question to write a stored procedure to get column metrics. The stored procedure should take database name and table name as inputs and return column name, total distinct values in that column, number of nulls in that column, min value in the column (excluding null), max value in the column(excluding null). An example result is below.

    Database Table Column Total Distinct nullsLow Value High Value

    ReportingContract ContractID 9202 013 9587

    ReportingContract ContractName 9202 0AMPOCOH-BEXLEY-1120-1209ZERO

    ReportingContract ContractTypeID 5 01 5

    ReportingContract RateChangeDate 2524 09/10/2001 0:00 12/4/2012 0:00

    ReportingContract BudgetRequired 2 00 1

    ReportingContract TermsConditionsID855 83271 1294

    ReportingContract VoidContract 2 00 1

    ReportingContract AggregationTypeID5 86511 4

    ReportingContract FuturePrice 4 91680.49 0.67

    ReportingContract TimeStamp 9202 00x000000008788A8D2 0x0000000090878381

    ReportingContract DefaultReferralID 68 26121 485

  • No idea what the benefit of something like this would be but in order to get your desired output you are going to have to become very familiar with dynamic sql.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Thank you sean, the link was useful for me in asking a question on the forum. I did edit the data before posting, but when I posted it still changed the format.

  • I agree with Sean, you could do this using dynamic SQL. In fact I myself just wrote something very similar today.

    But there might be another, smarter way of doing it by using the SSIS data profiling task.

    Have a look here for the basic funcitons: http://consultingblogs.emc.com/jamiethomson/archive/2008/03/02/ssis-data-profiling-task-part-1-introduction.aspx

    [font="Verdana"]Markus Bohse[/font]

  • LOL the link is actually in my signature but it is certainly a good primer for new folks.

    You can get your desired output a little more clean by using IFCode shortcuts. They are over on the left side when you are posting. The code one does a decent job of formatting sql including colors.

    To attack this you should probably try to write the sql to get a single column's output. Then you will start to see how much effort this is going to take.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Hi Markus, Thanks for the reply.

    If you do not mind, can I see what you have written.

  • ravinderm2003 (3/14/2012)


    Hi Markus, Thanks for the reply.

    If you do not mind, can I see what you have written.

    As I said, it was something "similar". My need was to find a count of non-default values for a specific column in several tables all starting with the prefix 'S_'.

    So this script gave me the tablename, the number of non-default values and the total number of rows.

    DECLARE @table NVARCHAR(250), @cmd NVARCHAR(2000)

    CREATE TABLE #tmp (TableName NVARCHAR(250),

    ValCount INT,

    [ROWS] INT)

    SELECT @table= MIN(name)

    FROM sys.objects

    WHERE TYPE = 'U'

    AND name LIKE 'S[_]%'

    WHILE @table IS NOT NULL

    BEGIN

    SET @cmd = 'select ''' + @table + ''', COUNT(*), (Select rows from sysindexes where id = OBJECT_ID('''+@table+''') AND indid < 2 ) from '+ @table +' where col3 <> ''Default'''

    SET @cmd = 'INSERT INTO #tmp (TableName , ValCount, rows) '+ @cmd

    --print @cmd

    EXEC sp_executesql @cmd

    SELECT @table = MIN(name)

    FROM sys.objects

    WHERE TYPE = 'U'

    AND name > @table

    AND name LIKE 'S[_]%'

    END

    SELECT * FROM #tmp

    DROP TABLE #tmp

    Since you want info about all columns in a table you will have to loop through sys.columns as well, not just sys.objects.

    [font="Verdana"]Markus Bohse[/font]

  • Thanks a lot Markus, that is a good beginning for me.

  • This is the solution I got.

    Create procedure TableMetaData @tablename varchar(100) as

    declare @text2 as varchar(100)

    declare @text3 as varchar(100)

    declare @text4 as varchar(100)

    declare @Distinct table (value varchar(50))

    declare @Nulls table (value varchar(50))

    declare @min-2 table (value varchar(50))

    declare @max-2 table (value varchar(50))

    create table #return([Database] varchar(100),

    [Table] varchar(100),

    [column] varchar(100),

    DistinctValues varchar(100),

    NumberOfNulls varchar(100),

    MinValue varchar(100),

    MaxValue varchar(100))

    insert into #return([Database], [Table],[column])

    select db_name(),so.name, sc.name from sys.objects so

    join sys.columns sc on sc.object_id = so.object_id

    where so.type = 'u'

    and so.name = @tableName

    declare sql_cursor cursor for

    select so.name, sc.name from sys.objects so

    join sys.columns sc on sc.object_id = so.object_id

    where so.type = 'u'

    and so.name = @tableName

    open sql_cursor

    fetch next from sql_cursor into @text2, @text3

    while @@FETCH_STATUS = 0

    begin

    -----------distinct count

    set @text4 = 'select COUNT(distinct ' + @text3 +' ) from ' + @text2

    insert into @Distinct exec (@text4)

    update #return

    set DistinctValues = (select * from @Distinct)

    where [column] = @text3

    delete from @Distinct

    -----------NullCount

    set @text4 = 'select COUNT(*) from ' + @text2 + ' where ' + @text3 + ' is null'

    insert into @Nulls exec (@text4)

    update #return

    set NumberOfNulls = (select * from @Nulls)

    where [column] = @text3

    delete from @Nulls

    -----------------------MinValue

    set @text4 = 'select Min( ' + @text3 +' ) from ' + @text2

    insert into @min-2 exec (@text4)

    update #return

    set MinValue = (select * from @min-2)

    where [column] = @text3

    delete from @min-2

    --------------------MaxValue

    set @text4 = 'select Max( ' + @text3 +' ) from ' + @text2

    insert into @max-2 exec (@text4)

    update #return

    set MaxValue = (select * from @max-2)

    where [column] = @text3

    delete from @max-2

    fetch next from sql_cursor into @text2, @text3

    end

    close sql_cursor

    deallocate sql_cursor

    select * from #return

    drop table #return

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply