Find out if the whole column of data in a table is empty - dynamic sql creation

  • Morning,

    Thanks for your help in advance, very much appreciated.

    I have created some dynamic sql to check a temporary table that is created on the fly for any columns that do contain data. If they do the column name is added to a dynamic sql, if not they are excluded. This looks like:

    If (select sum(Case when [Sat] is null then 0 else 1 end) from #TABLE) >= 1 begin set @OIL_BULK = @OIL_BULK + '[Sat]' +',' END

    However, I am currently running this on over 230 columns and large tables 1.3 mil rows and it is quite slow. Does anyone have any ideas how I can dynamically create a sql script that only selects the columns in the table where there is data in a speedier manner. Unfortunately it has to be on the fly because the temporary table is created on the fly.

    Many Thanks for your help,

    Oliver

  • If you post what you are actually trying to do, someone may be able to suggest a better approach.

    The best that can be done with the current approach is a single table scan.

    Something like:

    CREATE TABLE #Oil_Bulk(NonNullList varchar(MAX));

    DECLARE @SQL varchar(MAX) = 'SELECT ';

    SELECT @SQL = @SQL + 'CASE WHEN MAX(CASE WHEN ' + name + ' IS NULL THEN 0 ELSE 1 END) = 1 THEN ''' + name + ','' ELSE '''' END +'

    FROM tempdb.sys.columns

    WHERE object_id = OBJECT_ID('tempdb..#TABLE')

    ORDER BY column_id;

    SET @SQL = LEFT(@SQL, LEN(@SQL) - 1) + ' FROM #TABLE;';

    INSERT INTO #Oil_Bulk

    EXEC (@SQL);

    DECLARE @Oil_Bulk varchar(MAX);

    SELECT @Oil_Bulk = LEFT(NonNullList, LEN(NonNullList) - 1)

    FROM #Oil_Bulk;

    SELECT @Oil_Bulk;

  • This is likely to be an expensive process whatever you do. What's the purpose?

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • I simply want to create a subset of the data as requested by the user and return them a table of the data which only contains columns that contain data. The table has over 260 columns and up to a few million rows - (this is at the request of the user not the most ideal way to work with a sparsly populated dataset I know!)

    Many Thanks,

    Oliver

  • Hi,

    Thanks for this, the query provided is in essence what my current query does except my current one stipulates each column, where as yours creates the sql dynamically. The problem is that it takes considerable time to run the CASE WHEN's because it is running through each of the 250+ columns.

    Would it be better to create a totals row to the table and use this to filter out which columns have data into a dynamic sql set?

    Many Thanks for your help,

    Oliver

  • The fundamental problem is the whole approach reeks of bad design.

    The thing that takes the time is scanning the table, not the CASE statements. The query I posted attempted to do all the columns in one scan; Your original query looked as though it was going to scan the table once for each column. A totals row will still need to be created by scanning the table.

  • Thanks, I know absolutely that at the moment it is designed badly.

    I will try to break down your query and have more of a play, I dont think the first time I looked at it I truely understood what is was doing.

    Many Thanks for your help, I will let you know how I get on.

    Thanks,

    Oliver

  • Hi,

    I have got this working now, as I understand it does it prevent scanning because it is all being run under the one select statement with case as opposed to my old attempt that was using multiple if select's.

    Many Thanks for your help, this is really useful.

    Oliver

  • Hi,

    in your script you added

    WHERE object_id = OBJECT_ID('tempdb..#TABLE')

    How do I find the temporary table name through object id of the temporary table I just created?

    select TOP(650) * INTO #table FROM dbo.V_ALLCLASSESWCOMMENTS

    DECLARE @SQL nvarchar(MAX) = '';

    SELECT @SQL = @SQL + 'CASE WHEN count(' + name + ') > 0 THEN ''' + name + ', ' + ''' ELSE '''' END + '

    FROM tempdb.sys.columns

    WHERE object_id = OBJECT_ID('tempdb..#TABLE')

    ORDER BY column_id;

    Sorry if this is really obvious, Many Thanks,

    Oliver

  • CREATE TABLE #YourTable

    (

    YourTable1 varchar(20)

    ,YourTable2 int

    ,YourTable3 datetime

    )

    SELECT *

    FROM tempdb.sys.columns

    WHERE object_id = OBJECT_ID('tempdb..#YourTable')

  • Thank you for the reply, I am incorporating this into my larger script.

    Thanks again

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply