Finding Values With Numerous Columns

  • Comments posted to this topic are about the item Finding Values With Numerous Columns

    ---
    Timothy A Wiseman
    SQL Blog: http://timothyawiseman.wordpress.com/

  • Good one ...

  • If you are happy to exclude index performance benefits or doing a like comparison you can create a computed column of the concatenation of other columns and then search the computed column.

    ALTER TABLE phonebook ADD

    phoneall AS ',' + isnull(phone1, '') + ',' +

    isnull(phone2, '') + ',' +

    isnull(workphone, '') + ',' +

    isnull(cellphone, '') + ','

    SELECT

    *

    FROM

    phonebook

    WHERE

    phoneall like '%,234-5678,%'

  • This is an interesting article. What is most important is that it explains WHY there can be a problem, for anyone who has never been in a position of trying unsuccessfully to curb developers who possess just a little knowledge of Database Design but who are completely unaware of just how little.

    As far as I can see, the solution that Timothy shows will work well in cases where the database has a lot of small spreadsheet-like tables, (Timothy seems to have suffered from this) but it will not work for monster tables unless it has some very clever indexing.

    There is a different solution that works very well, which I've had to use myself many times, though it is not a good idea for a rapidly changing table. It is, however, very fast where there is lots of text to search in several columns, which would otherwise need a '%xxxx%' wWHERE-clause. (i.e. unindexable!) This is to use an 'inversion' table.

    this technique is usually called the ‘Inverted’ or ‘Inversion’ index technique. (see http://en.wikipedia.org/wiki/Index_(search_engine)%5B/url%5D for a full discussion.

    Basically, you produce table that contains, uniquely, a list of every word in the columns you want to index. You maintain a many-to-many linking table that links the row in the denormalized table with the words in the unique table that it contains. This gives you, instantly, the rows containing the words in the 'search string' that you want to search for, even in tables that are several million rows long. You can refine your search from that subset, and Timothy's method should work fine for that. I've never tried to make this into a generic solution, as I'm not that brave, nor unfortunate enough to have more than one travesty of a denormalized table like the ones Timothy describes, within a single database!

    If anyone is interested, I'll pop the solution into a blog post, but it is too long for a forum post, I reckon..

    Best wishes,
    Phil Factor

  • ... or a nice article on SSC, Phil. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thanks for the article as this does come up, especially when dealing with others' legacy databases. I would be interested in seeing this expanded to include text fields, as so often a varchar is used to capture field data, then it overspills into a catch-all notes field.

  • Jeff Moden (12/2/2008)


    ... or a nice article on SSC, Phil. 😉

    Speaking for the vast silent majority, I'd like to read Phil's article. 😀

    Paul DB

  • ... or a nice article on SSC, Phil. [Wink]

    Speaking for the vast silent majority, I'd like to read Phil's article. [BigGrin]

    OK. It's a deal. I needed a good excuse to publish it! 🙂

    Best wishes,
    Phil Factor

  • Just want to point out that ISNUMERIC and ISDATE is not reliable

    SELECT ISNUMERIC('12d5'),ISDATE('2000')


    Madhivanan

    Failing to plan is Planning to fail

  • This doesn't work for a LIKE condition, but for equalities in a small number of fields, you can use

    WHERE '411-555-1212' IN (HomePhone, WorkPhone, CellPhone, AltPhone)

    I've found it useful when looking for a foreign key of a Person Id in several fields like fkLoanOfficerId, fkApprovingOfficerId, fkVerifiedBy etc.

    You can't let the perfect be the enemy of the good. In some cases it's more expedient to work with the errors of the past, than to attempt a total rewrite of the structure.

  • Phil Factor (12/2/2008)


    ... or a nice article on SSC, Phil. [Wink]

    Speaking for the vast silent majority, I'd like to read Phil's article. [BigGrin]

    OK. It's a deal. I needed a good excuse to publish it! 🙂

    Phil is it comming in Jan '09....

  • Madhivanan and Jason bring up some good points. Formating! Then if you like LIKE you'll love this. Try a search argument with an embedded RegEx. Search for '*e*' and stand by for a boat load of rows.

    I'm pushing to store phone numbers as BigInt. I don't have to dial the dashes why should I have to store them? Oh, and three fields too. Country code, area code, phone number. OK, I know that there are letter on the phone but the phone system could care less.

    I have customers that use ISO 8601 dates (20081202 for today). Goes in a char(8). On one hand no messy times. 🙂 A ship date is a ship date. On the other try doing calculations by week. 🙁 I plan on finding a real good reason for these folks to migrate from 2000 to 2008. The date type has me all a twitter.

    ATBCharles Kincaid

  • I guess I like the code, so I tweaked it a bit into a SELECT only, and added 'column' filters

    so you can only search in SELECTED columns (if you know the names ahead)

    It's funny we are trying to simulate Full-Text search

    AdventureWorks2008 DB

    SET @schema = 'Person'

    SET @TableName = 'Person'

    SET @Value = 'Xu%'

    SET @ColumnNames = 'FirstName,LastName'-- can be empty or * for ALL columns

    /*

    CREATE PROCEDURE [dbo].[FindValue]

    @TableName NVARCHAR(128), /* Must be a valid table or view name,

    must not be quoted or contain a schema*/

    @Value NVARCHAR(4000), /*May contain wildcards*/

    @schema NVARCHAR(128) = 'dbo' /*May be left out*/

    AS

    Sample Execution

    Exec FindValue

    @TableName = 'spt_monitor',

    @Value = '8',

    @schema = 'dbo'

    */

    /*

    If given a string it will finds all rows where any char, varchar, or

    their Unicode equivalent which contain that string in the selected

    table or view. Note that this only works on objects which have entries

    in information_schema.columns, which excludes certain system objects. If

    given a numeric value it will check those text types for a match as well

    as numeric types. If given a possible date, it will also check date type.

    The string that is being searched for may contain wildcard characters such as %.

    This will NOT search text, ntext, xml, or user defined fields. This may

    return a row more than once if the search string is found in more than one

    column in that row.

    */

    DECLARE

    @TableName NVARCHAR(128), /* Must be a valid table or view name,

    must not be quoted or contain a schema*/

    @Value NVARCHAR(4000), /*May contain wildcards*/

    @schema NVARCHAR(128) /*May be left out*/

    ,@ColumnNames NVARCHAR(4000)-- list of columns to search for, can be * for ALL

    SET @schema = 'Person'

    SET @TableName = 'Person'

    SET @Value = 'Xu%'

    SET @ColumnNames = 'FirstName,LastName'-- can be empty or * for ALL columns

    SET @ColumnNames = REPLACE(@ColumnNames, ' ', '')-- removes all space

    /**************************** Declare Variables ***********************/

    DECLARE @columns TABLE (ColumnName NVARCHAR(128))

    DECLARE @columnsFiltered TABLE (ColumnName NVARCHAR(128))

    DECLARE @sql NVARCHAR(MAX)

    /************************** Populate Table Variable *****************/

    /*Takes the names of string type columns for the selected table */

    INSERT INTO @columns

    (ColumnName)

    SELECT

    Column_name

    FROM

    INFORMATION_SCHEMA.COLUMNS

    WHERE

    Table_schema = @schema

    AND Table_name = @TableName

    AND data_type IN ('char', 'nchar', 'varchar', 'nvarchar')

    /* If it is numeric, also check the numeric fields */

    IF ISNUMERIC(@value) = 1

    INSERT INTO @columns

    (ColumnName)

    SELECT

    Column_name

    FROM

    INFORMATION_SCHEMA.COLUMNS

    WHERE

    Table_schema = @schema

    AND Table_name = @TableName

    AND data_type IN ('int', 'numeric', 'bigint', 'money',

    'smallint', 'smallmoney',

    'tinyint', 'float', 'decimal', 'real')

    IF ISDATE(@value) = 1

    INSERT INTO @columns

    (ColumnName)

    SELECT

    Column_name

    FROM

    INFORMATION_SCHEMA.COLUMNS

    WHERE

    Table_schema = @schema

    AND Table_name = @TableName

    AND data_type IN ('datetime', 'smalldatetime')

    INSERT INTO @columnsFiltered

    SELECT ColumnName

    FROM @columns

    WHERE

    (@ColumnNames IN ('*','')

    OR

    CHARINDEX(',' + ColumnName + ',', ',' + @ColumnNames + ',') > 0

    )

    /********************* Prepare dynamic SQL Statement to Execute **********/

    SELECT

    @sql =

    CASE

    WHEN @sql IS NULL

    THEN 'Select ''' + ColumnName

    + ''' as ContainingColumn, * From '

    + QUOTENAME(@Schema) + '.' + QUOTENAME(@TableName)

    + ' where ' + ColumnName + ' like ''' + @Value + ''' '

    WHEN @sql IS NOT NULL

    THEN @sql + 'UNION ALL Select ''' + ColumnName

    + ''' as ContainingColumn, * From ' + QUOTENAME(@Schema)

    + '.' + QUOTENAME(@TableName)

    + ' where ' + ColumnName + ' like ''' + @Value + ''' '

    END

    FROM

    @columnsFiltered

    /******************* Execute Statement and display results ***********/

    --print @sql /* This may be uncommented for testing purposes */

    EXEC (@sql)

    SQLServerNewbieMCITP: Database Administrator SQL Server 2005
  • Jeff Moden (12/2/2008)


    ... or a nice article on SSC, Phil. 😉

    I'll hop on the bandwagon and say I would love to see this article, Phil.

    And you are quite correct. I originally wrote the procedure to help me deal with large numbers of "spreadsheet-like" tables with swaths of repeating columns. I am not aware of it actually "failing" on a large table, but it can definitely be painfully slow on large tables

    ---
    Timothy A Wiseman
    SQL Blog: http://timothyawiseman.wordpress.com/

  • Jason Akin (12/2/2008)


    This doesn't work for a LIKE condition, but for equalities in a small number of fields, you can use

    WHERE '411-555-1212' IN (HomePhone, WorkPhone, CellPhone, AltPhone)

    I've found it useful when looking for a foreign key of a Person Id in several fields like fkLoanOfficerId, fkApprovingOfficerId, fkVerifiedBy etc.

    You can't let the perfect be the enemy of the good. In some cases it's more expedient to work with the errors of the past, than to attempt a total rewrite of the structure.

    +1. The IN list of columns is far better than a bunch of OR conditions. Likewise, if LIKEs are necessary this would be a better alternative:

    WHERE (HomePhone+'|'+WorkPhone+'|'+CellPhone+'|'+AltPhone) like '%555-1212%'

Viewing 15 posts - 1 through 15 (of 26 total)

You must be logged in to reply to this topic. Login to reply