Home Forums SQL Server 2008 T-SQL (SS2K8) Parse field with Numeric ranges and comma deliminated values RE: Parse field with Numeric ranges and comma deliminated values

  • Lowell (3/25/2010)


    nice breakthrough Wayne; I couldn't get my mind around the multiple ranges;

    performance on this solution sux, took my decent dev machine 1:09 to return the 151 rows, we might look at optimizing it , but i'm just happy to add a solution like that to my toolbox.

    Yes, the performance isn't very good. Several things contribute to this:

    1. using a virtual (aka in-line or dynamic) tally table vs. a physical tally table with a good clustered index

    2. the tally table is being called twice

    3. we're looking at numbers that require building it into the million-row range

    If you replace the virtual tally table with a physical one, things perform much better. On my system, it goes from 50+ seconds to <1 second.

    To create the permanent tally table with a clustered index:

    ;WITH

    Tens (N) AS (SELECT 0 UNION ALL SELECT 0 UNION ALL SELECT 0 UNION ALL SELECT 0 UNION ALL SELECT 0 UNION ALL

    SELECT 0 UNION ALL SELECT 0 UNION ALL SELECT 0 UNION ALL SELECT 0 UNION ALL SELECT 0),

    Thousands(N) AS (SELECT 1 FROM Tens t1 CROSS JOIN Tens t2 CROSS JOIN Tens t3),

    Millions (N) AS (SELECT 1 FROM Thousands t1 CROSS JOIN Thousands t2),

    Tally (N) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 0)) FROM Millions)

    SELECT N

    INTO dbo.TALLY

    FROM Tally

    ORDER BY N

    ALTER TABLE dbo.TALLY ALTER COLUMN N int NOT NULL

    ALTER TABLE dbo.TALLY ADD CONSTRAINT [PK_TALLY] PRIMARY KEY CLUSTERED (N)

    Then, change the previously posted code for this solution to:

    -- See how this starts off by creating a table

    -- and inserting representative test data into it?

    -- If you do this, it makes it a LOT easier for all

    -- of the volunteers on this site to just copy/paste

    -- this into a query window and start working on it.

    DECLARE @test-2 TABLE (MixedData varchar(50))

    INSERT INTO @test-2

    SELECT '44100-44110, 44150' UNION ALL

    SELECT '44115,44125,44157' UNION ALL

    SELECT ' 44250-44350, 44128, 44524-44555, 44112, 52144'

    ;WITH

    -- first, need to break down each mixed data row into separate items,

    -- either a single item or a range. See Jeff Moden's article

    -- The "Numbers" or "Tally" Table: What it is and how it replaces a loop.

    -- at http://www.sqlservercentral.com/articles/T-SQL/62867/ for how

    -- a tally table can split strings apart.

    Elements (Items) AS (

    SELECT RTRIM(LTRIM(SUBSTRING(',' + MixedData + ',',N+1,CHARINDEX(',',',' + MixedData + ',',N+1)-N-1)))

    FROM dbo.Tally, @test-2

    WHERE N < LEN(',' + MixedData + ',')

    AND SUBSTRING(',' + MixedData + ',',N,1) = ','),

    -- Now, for each item, get the range start/end. A range will have a "-".

    -- If just a single item, use the item for both the range start and end.

    Elements2 (RangeStart, RangeEnd) AS (

    SELECT CONVERT(INT, CASE WHEN CharIndex('-', Items) > 0 THEN LEFT(Items, CharIndex('-', Items)-1)

    ELSE Items

    END),

    CONVERT(INT, CASE WHEN CharIndex('-', Items) > 0 THEN SUBSTRING(Items, CharIndex('-', Items)+1, len(Items))

    ELSE Items

    END)

    FROM Elements)

    -- Finally, return all the individual items

    SELECT N

    FROM dbo.Tally, Elements2

    WHERE N BETWEEN Elements2.RangeStart and Elements2.RangeEnd

    Lowell and Jeff, I'm interested in how this improves the performance for you.

    My results:

    Table '#07020F21'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'TALLY'. Scan count 13, logical reads 73, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 15 ms, elapsed time = 165 ms.

    Execution plan is attached.

    Edit: added performance results and execution plan

    Edit2: corrected name misspelling

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2