Where Clause IN from delimited field

  • Hi all,

    I'm trying to do some matching between tables for record exclusion based on matching the rules in a table. The rules are in field in a table called Rules and stored as | delimited text. 

    The table I need to compare to the rules also has a field of specialties. What I want to be able to do, is this: 

    Select
          r.[Provider]
         ,[pro].FullName
       ,r.[Descr]
       ,r.[ColumnName]
       ,pro.Specialties
       ,r.Filtering
       ,r.[FilterType]
       ,r.[ColumnFilter]
       ,pro.*
    from
    [pdf].[newProvider_cb] [Pro]
    LEFT Outer JOIN [PDF].[Rules] r on API.RemoveNonAlphaCharacters(r.Provider) = API.RemoveNonAlphaCharacters([pro].FullName)
    Where
    pro.Specialties NOT In (ISNULL(Replace(Replace(r.Filtering,' ',''),'|',','),''))

    Ignore for the moment that I'm joining tables on text fields rather than IDs. that will be changed. 

    My issue is the NOT IN (ISNULL(Replace(Replace(r.Filtering,' ',''),'|',','),'')) doesn't work if there are multiple values in the field. If there's only one value in on each side it works. 

    I can get the individual specialties from the provider table but that failed also.

    Any thoughts on how to accomplish this?

  • craig.bobchin - Thursday, February 9, 2017 2:31 PM

    Hi all,

    I'm trying to do some matching between tables for record exclusion based on matching the rules in a table. The rules are in field in a table called Rules and stored as | delimited text. 

    The table I need to compare to the rules also has a field of specialties. What I want to be able to do, is this: 

    Select
          r.[Provider]
         ,[pro].FullName
       ,r.[Descr]
       ,r.[ColumnName]
       ,pro.Specialties
       ,r.Filtering
       ,r.[FilterType]
       ,r.[ColumnFilter]
       ,pro.*
    from
    [pdf].[newProvider_cb] [Pro]
    LEFT Outer JOIN [PDF].[Rules] r on API.RemoveNonAlphaCharacters(r.Provider) = API.RemoveNonAlphaCharacters([pro].FullName)
    Where
    pro.Specialties NOT In (ISNULL(Replace(Replace(r.Filtering,' ',''),'|',','),''))

    Ignore for the moment that I'm joining tables on text fields rather than IDs. that will be changed. 

    My issue is the NOT IN (ISNULL(Replace(Replace(r.Filtering,' ',''),'|',','),'')) doesn't work if there are multiple values in the field. If there's only one value in on each side it works. 

    I can get the individual specialties from the provider table but that failed also.

    Any thoughts on how to accomplish this?

    That isn't how a where clause works. You would need a string splitter to do this. You can find an excellent one of those in my signature. What is more concerning to me though is your join. You appear to be joining on value that are exposed through a scalar function. Scalar functions are horribly inefficient and the way you are using it will force the query engine to evaluate the result of the query for every single row in both tables. This means a full table scan on both sides. There is no indexing that can help here.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Sean Lange - Thursday, February 9, 2017 2:36 PM

    craig.bobchin - Thursday, February 9, 2017 2:31 PM

    Hi all,

    I'm trying to do some matching between tables for record exclusion based on matching the rules in a table. The rules are in field in a table called Rules and stored as | delimited text. 

    The table I need to compare to the rules also has a field of specialties. What I want to be able to do, is this: 

    Select
          r.[Provider]
         ,[pro].FullName
       ,r.[Descr]
       ,r.[ColumnName]
       ,pro.Specialties
       ,r.Filtering
       ,r.[FilterType]
       ,r.[ColumnFilter]
       ,pro.*
    from
    [pdf].[newProvider_cb] [Pro]
    LEFT Outer JOIN [PDF].[Rules] r on API.RemoveNonAlphaCharacters(r.Provider) = API.RemoveNonAlphaCharacters([pro].FullName)
    Where
    pro.Specialties NOT In (ISNULL(Replace(Replace(r.Filtering,' ',''),'|',','),''))

    Ignore for the moment that I'm joining tables on text fields rather than IDs. that will be changed. 

    My issue is the NOT IN (ISNULL(Replace(Replace(r.Filtering,' ',''),'|',','),'')) doesn't work if there are multiple values in the field. If there's only one value in on each side it works. 

    I can get the individual specialties from the provider table but that failed also.

    Any thoughts on how to accomplish this?

    That isn't how a where clause works. You would need a string splitter to do this. You can find an excellent one of those in my signature. What is more concerning to me though is your join. You appear to be joining on value that are exposed through a scalar function. Scalar functions are horribly inefficient and the way you are using it will force the query engine to evaluate the result of the query for every single row in both tables. This means a full table scan on both sides. There is no indexing that can help here.

    I'll give your string splitter a try. I know the join is horrible. this more of a POC. The business gave us an excel file with the rules and provider names no IDs. we are meeting with the business to  talk about the rules file and other things I've come across.. I'm also throwing a lookup in the SSIS package that loads the rules to return the IDs.

  • Provided that you prevent SQL Injection, I believe that you'll find that dynamic SQL using a WHERE IN will be way more efficient than any splitter (and I'm the guy that wrote the one being cited... thank you, BTW). 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thanks all for the suggestions. I was able to convince the business we need to change the format of the input file to include the keys and no have each item on it's own separate line so I no longer need to parse anything out or create a delimited IN() clause.

    we can close the thread.

Viewing 5 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply