Search for text across many columns

  • This issue has popped up a couple times over the last few weeks so I thought I would throw this out and see if anyone has a better way of doing it.  The issue is that we had a request to see if we had any business interactions with lets say 'ABadApple'.  We get in around 30 files from outside companies that we store the data in SQL tables each month.  Each table will have many columns that contain text where this 'ABadApple' may be mentioned.  And since they are different companies feeding us the data, the column names across the tables are not the same.  So if the data was small enough you could copy the data into Excel or something similar and do a simple search and it would look at every column in the table.  Of course the files are rather large(to large for Excel on most of them) and this would be tedious for the number of tables to look at.

    And instead of taking the time to go though each table and set up special SQL for each to only look at the columns from that table, I'm looking for something to search across all columns.

    So I wrote this code(modified from something else):

    DECLARE @tablename VARCHAR(100);

    SET @tablename = 'TableNameHere';

    DECLARE @SQL TABLE
    (
    RecordIdINT IDENTITY(1, 1),
    SQL_CodeVARCHAR(MAX)
    )
    ;

    DROP TABLE #temp;

    SELECT table_name, column_name
    INTO #temp
    FROM information_schema.columns
    WHERE table_name = @tablename
    ;

    INSERT INTO @SQL (SQL_Code)
    SELECT 'SELECT * '
    ;

    INSERT INTO @SQL (SQL_Code)
    SELECT ' FROM dbo.' + t.table_name
    FROM #temp AS t
    GROUP BY t.table_name
    ;

    INSERT INTO @SQL (SQL_Code)
    SELECT ' WHERE '
    ;

    INSERT INTO @SQL (SQL_Code)
    SELECT TOP 1 column_name + ' LIKE ''%ABadApple%'' '
    FROM #temp

    ;

    INSERT INTO @SQL (SQL_Code)
    SELECT ' OR ' + column_name + ' LIKE ''%ABadApple%'' '
    FROM #temp
    ;

    -- you will want to execute the results of @SQL
    SELECT SQL_Code
    FROM @SQL
    ORDER BY RecordId
    ;

    This will generate a SQL statement I can then execute for each table.   A little better but I still have to go through each table and there is a chance I miss one.

    So I then came up with this code:

    SELECT 'SELECT ''' + tbl.[name] + ''' AS Tbl_Name, ''' + cls.[name] + ''' AS Column_Name FROM dbo.' + tbl.[name] + ' WHERE ' + cls.[name] + ' LIKE ''%ABadApple%''  UNION ' 
    FROM sys.columns AS cls
    INNER JOIN sys.tables AS tbl
    ON cls.object_id = tbl.object_id
    WHERE tbl.[name] LIKE '%_ARS'
    AND cls.[name] NOT IN ('RecordId', 'BatchID', 'InsertedTs')
    ;

    This generated about 5,000 rows of code.  This seemed like a lot to try and execute at one time so I'm hoping for a better more efficient way to do this.  I added the code at the end to bypass know columns that it didn't need to look at, these were a few int and date columns each would have.

     

    Any help would be appreciated.  Thanks in advance.

    -------------------------------------------------------------
    we travel not to escape life but for life not to escape us
    Don't fear failure, fear regret.

  • You can eliminate irrelevant datatype columns using:

    and columns.system_type_id IN (35,99,167,175,231,239,241,231)

     

    hopefully you're not using sql_variant (98)

     

  • This was removed by the editor as SPAM

  • This was removed by the editor as SPAM

  • This was removed by the editor as SPAM

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply