Home Forums SQL Server 2005 T-SQL (SS2K5) How to expand this variable for the query to work RE: How to expand this variable for the query to work

  • Two points:

    1. Splitting a list in a While loop is more expensive than splitting one using a Numbers table. (See http://www.simple-talk.com/sql/t-sql-programming/the-helper-table-workbench/ for a lot of good data on this subject.)

    2. The "Like" solution fails the moment you start using multi-digit IDs. For example, try it out with "12,13,175", and you'll get IDs 1, 2, 3, 7, 5, 12, 13, 175. Becase "%1%" is like "12", and like "13" and like "175".

    The function I use for parsing is:

    ALTER function [dbo].[StringParser]

    (@String_in varchar(max),

    @Delimiter_in char(1))

    returns table

    as

    return(

    SELECT top 100 percent

    SUBSTRING(@String_in+@Delimiter_in, number,

    CHARINDEX(@Delimiter_in, @String_in+@Delimiter_in, number) - number) as Parsed

    FROM dbo.numbers

    WHERE number <= LEN(@String_in)

    AND SUBSTRING(@Delimiter_in + @String_in, number, 1) = @Delimiter_in

    ORDER BY number

    )

    (I have this in my Common database, along with a Numbers table that goes from 0 to 9999.)

    I picked a medium sized table, dbo.Names, with 149,000 rows of data, and ran:

    select *

    from dbo.names

    where '12,13,175' like '%' + cast(nameid as varchar(10)) + '%'

    select *

    from dbo.names

    inner join common.dbo.stringparser('12,13,175', ',')

    on names.nameid = stringparser.parsed

    The first one returned values I didn't really want (IDs 1, 2, 3) as well as the IDs I want, and had an execution cost of .838. The second one returned the exact values I want, and an execution cost of .799.

    The cost is only slightly different, but the first one included a Clustered Index Scan, while the second had a Clustered Index Seek. On a larger table, the difference in cost would matter more.

    So the difference in cost is an illusion, caused by using a While loop to take apart the string instead of using a more efficient solution, and it can result in wrong data.

    I'd take a very close look at this before including it in "best practices".

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon