Tally OH! An Improved SQL 8K “CSV Splitter” Function

  • WayneS (5/4/2011)


    Nadrek, I was thinking about this on my commute, and am wondering if you would test this out (I'm not going to have the time today like I had yesterday)

    (It creates a zero-based tally table from a one-based tally table):

    CREATE FUNCTION [dbo].[YourFunction0Based](@pString [varchar](8000), @pDelimiter [char](1))

    RETURNS TABLE

    RETURN

    WITH

    cteTally(N) AS (--==== This limits the number of rows right up front

    -- for both a performance gain and prevention of accidental "overruns"

    -- changed "+1" to "+2", made N be "N-1" to create a zero-based virtual tally table

    SELECT TOP (DATALENGTH(ISNULL(@pString,1))+2) N-1

    FROM YourDB.dbo.YourTallyTable0Based WITH (nolock)

    WHERE N >= 1

    -- add where clause to NOT get a zero if a zero based tally table.

    ),

    cteStart(N1) AS (--==== This returns N+1 (starting position of each "element" just once for each delimiter)

    SELECT t.N+1

    FROM cteTally t

    WHERE (SUBSTRING(@pString,t.N,1) = @pDelimiter OR t.N = 0)

    )

    --===== Do the actual split. The ISNULL/NULLIF combo handles the length for the final element when no delimiter is found.

    SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY s.N1), -- not including StringNumber actually increases duration and CPU, but uses slightly fewer writes

    Item = SUBSTRING(@pString,s.N1,ISNULL(NULLIF(CHARINDEX(@pDelimiter,@pString,s.N1),0)-s.N1,8000))

    FROM cteStart s WITH (nolock)

    ;

    Results below. Wayne, for the very largest case, yours looks only very slightly faster than the article's code.

    1,SplitRCte,1000,1,1,10,0.02000,1,5,10

    2,SplitXML,1000,1,1,10,0.14600,1,5,10

    3,DelimitedSplit8K,1000,1,1,10,0.00600,1,5,10

    4,DelimitedSplit8KPerm0Based,1000,1,1,10,0.00600,1,5,10

    5,DelimitedSplit8KPermUnion,1000,1,1,10,0.00600,1,5,10

    6,Wayne1Based,1000,1,1,10,0.01300,1,5,10

    7,SplitRCte,1000,2,1,10,0.04300,3,12,21

    8,SplitXML,1000,2,1,10,0.18300,3,12,21

    9,DelimitedSplit8K,1000,2,1,10,0.02000,3,12,21

    10,DelimitedSplit8KPerm0Based,1000,2,1,10,0.01600,3,12,21

    11,DelimitedSplit8KPermUnion,1000,2,1,10,0.01600,3,12,21

    12,Wayne1Based,1000,2,1,10,0.02000,3,12,21

    13,SplitRCte,1000,4,1,10,0.08000,8,25,42

    14,SplitXML,1000,4,1,10,0.21600,8,25,42

    15,DelimitedSplit8K,1000,4,1,10,0.06300,8,25,42

    16,DelimitedSplit8KPerm0Based,1000,4,1,10,0.02300,8,25,42

    17,DelimitedSplit8KPermUnion,1000,4,1,10,0.06300,8,25,42

    18,Wayne1Based,1000,4,1,10,0.02300,8,25,42

    19,SplitRCte,1000,8,1,10,0.14000,29,51,75

    20,SplitXML,1000,8,1,10,0.27300,29,51,75

    21,DelimitedSplit8K,1000,8,1,10,0.11600,29,51,75

    22,DelimitedSplit8KPerm0Based,1000,8,1,10,0.10000,29,51,75

    23,DelimitedSplit8KPermUnion,1000,8,1,10,0.13000,29,51,75

    24,Wayne1Based,1000,8,1,10,0.10600,29,51,75

    25,SplitRCte,1000,16,1,10,0.27300,59,103,141

    26,SplitXML,1000,16,1,10,0.37600,59,103,141

    27,DelimitedSplit8K,1000,16,1,10,0.07000,59,103,141

    28,DelimitedSplit8KPerm0Based,1000,16,1,10,0.08600,59,103,141

    29,DelimitedSplit8KPermUnion,1000,16,1,10,0.06600,59,103,141

    30,Wayne1Based,1000,16,1,10,0.09000,59,103,141

    31,SplitRCte,1000,32,1,10,0.57300,155,206,253

    32,SplitXML,1000,32,1,10,0.58300,155,206,253

    33,DelimitedSplit8K,1000,32,1,10,0.12300,155,206,253

    34,DelimitedSplit8KPerm0Based,1000,32,1,10,0.12600,155,206,253

    35,DelimitedSplit8KPermUnion,1000,32,1,10,0.12000,155,206,253

    36,Wayne1Based,1000,32,1,10,0.13600,155,206,253

    37,SplitRCte,1000,64,1,10,1.03600,342,415,480

    38,SplitXML,1000,64,1,10,0.99000,342,415,480

    39,DelimitedSplit8K,1000,64,1,10,0.23000,342,415,480

    40,DelimitedSplit8KPerm0Based,1000,64,1,10,0.22600,342,415,480

    41,DelimitedSplit8KPermUnion,1000,64,1,10,0.22000,342,415,480

    42,Wayne1Based,1000,64,1,10,0.24000,342,415,480

    43,SplitRCte,1000,128,1,10,2.06300,747,830,920

    44,SplitXML,1000,128,1,10,1.81300,747,830,920

    45,DelimitedSplit8K,1000,128,1,10,0.44600,747,830,920

    46,DelimitedSplit8KPerm0Based,1000,128,1,10,0.42300,747,830,920

    47,DelimitedSplit8KPermUnion,1000,128,1,10,0.43600,747,830,920

    48,Wayne1Based,1000,128,1,10,0.44600,747,830,920

    49,SplitRCte,1000,256,1,10,4.10300,1522,1664,1833

    50,SplitXML,1000,256,1,10,3.45000,1522,1664,1833

    51,DelimitedSplit8K,1000,256,1,10,0.88300,1522,1664,1833

    52,DelimitedSplit8KPerm0Based,1000,256,1,10,0.83000,1522,1664,1833

    53,DelimitedSplit8KPermUnion,1000,256,1,10,0.84000,1522,1664,1833

    54,Wayne1Based,1000,256,1,10,0.88300,1522,1664,1833

    55,SplitRCte,1000,512,1,10,8.21600,3122,3327,3531

    56,SplitXML,1000,512,1,10,6.87300,3122,3327,3531

    57,DelimitedSplit8K,1000,512,1,10,1.74000,3122,3327,3531

    58,DelimitedSplit8KPerm0Based,1000,512,1,10,1.60000,3122,3327,3531

    59,DelimitedSplit8KPermUnion,1000,512,1,10,1.66600,3122,3327,3531

    60,Wayne1Based,1000,512,1,10,1.72600,3122,3327,3531

    61,SplitRCte,1000,1150,1,10,18.47300,7175,7470,7860

    62,SplitXML,1000,1150,1,10,15.52300,7175,7470,7860

    63,DelimitedSplit8K,1000,1150,1,10,3.87600,7175,7470,7860

    64,DelimitedSplit8KPerm0Based,1000,1150,1,10,3.58000,7175,7470,7860

    65,DelimitedSplit8KPermUnion,1000,1150,1,10,3.80300,7175,7470,7860

    66,Wayne1Based,1000,1150,1,10,3.80600,7175,7470,7860

    67,SplitRCte,1000,1,10,20,0.02600,10,15,20

    68,SplitXML,1000,1,10,20,0.15000,10,15,20

    69,DelimitedSplit8K,1000,1,10,20,0.01300,10,15,20

    70,DelimitedSplit8KPerm0Based,1000,1,10,20,0.01300,10,15,20

    71,DelimitedSplit8KPermUnion,1000,1,10,20,0.01300,10,15,20

    72,Wayne1Based,1000,1,10,20,0.01300,10,15,20

    73,SplitRCte,1000,2,10,20,0.04600,21,30,41

    74,SplitXML,1000,2,10,20,0.18600,21,30,41

    75,DelimitedSplit8K,1000,2,10,20,0.02300,21,30,41

    76,DelimitedSplit8KPerm0Based,1000,2,10,20,0.01600,21,30,41

    77,DelimitedSplit8KPermUnion,1000,2,10,20,0.02000,21,30,41

    78,Wayne1Based,1000,2,10,20,0.02600,21,30,41

    79,SplitRCte,1000,4,10,20,0.08000,45,62,82

    80,SplitXML,1000,4,10,20,0.22000,45,62,82

    81,DelimitedSplit8K,1000,4,10,20,0.04000,45,62,82

    82,DelimitedSplit8KPerm0Based,1000,4,10,20,0.05000,45,62,82

    83,DelimitedSplit8KPermUnion,1000,4,10,20,0.08000,45,62,82

    84,Wayne1Based,1000,4,10,20,0.05000,45,62,82

    85,SplitRCte,1000,8,10,20,0.15000,100,126,154

    86,SplitXML,1000,8,10,20,0.27600,100,126,154

    87,DelimitedSplit8K,1000,8,10,20,0.06600,100,126,154

    88,DelimitedSplit8KPerm0Based,1000,8,10,20,0.07000,100,126,154

    89,DelimitedSplit8KPermUnion,1000,8,10,20,0.06300,100,126,154

    90,Wayne1Based,1000,8,10,20,0.07600,100,126,154

    91,SplitRCte,1000,16,10,20,0.27600,219,254,299

    92,SplitXML,1000,16,10,20,0.38300,219,254,299

    93,DelimitedSplit8K,1000,16,10,20,0.12000,219,254,299

    94,DelimitedSplit8KPerm0Based,1000,16,10,20,0.11600,219,254,299

    95,DelimitedSplit8KPermUnion,1000,16,10,20,0.11000,219,254,299

    96,Wayne1Based,1000,16,10,20,0.12600,219,254,299

    97,SplitRCte,1000,32,10,20,0.54000,461,510,564

    98,SplitXML,1000,32,10,20,0.60000,461,510,564

    99,DelimitedSplit8K,1000,32,10,20,0.21600,461,510,564

    100,DelimitedSplit8KPerm0Based,1000,32,10,20,0.20600,461,510,564

    101,DelimitedSplit8KPermUnion,1000,32,10,20,0.20600,461,510,564

    102,Wayne1Based,1000,32,10,20,0.23300,461,510,564

    103,SplitRCte,1000,64,10,20,1.07000,932,1022,1110

    104,SplitXML,1000,64,10,20,1.02000,932,1022,1110

    105,DelimitedSplit8K,1000,64,10,20,0.42600,932,1022,1110

    106,DelimitedSplit8KPerm0Based,1000,64,10,20,0.39000,932,1022,1110

    107,DelimitedSplit8KPermUnion,1000,64,10,20,0.39600,932,1022,1110

    108,Wayne1Based,1000,64,10,20,0.43000,932,1022,1110

    109,SplitRCte,1000,128,10,20,2.08300,1938,2046,2156

    110,SplitXML,1000,128,10,20,1.87600,1938,2046,2156

    111,DelimitedSplit8K,1000,128,10,20,0.84000,1938,2046,2156

    112,DelimitedSplit8KPerm0Based,1000,128,10,20,0.75600,1938,2046,2156

    113,DelimitedSplit8KPermUnion,1000,128,10,20,0.77600,1938,2046,2156

    114,Wayne1Based,1000,128,10,20,0.83600,1938,2046,2156

    115,SplitRCte,1000,256,10,20,4.19600,3952,4096,4242

    116,SplitXML,1000,256,10,20,3.65000,3952,4096,4242

    117,DelimitedSplit8K,1000,256,10,20,1.65300,3952,4096,4242

    118,DelimitedSplit8KPerm0Based,1000,256,10,20,1.48000,3952,4096,4242

    119,DelimitedSplit8KPermUnion,1000,256,10,20,1.57000,3952,4096,4242

    120,Wayne1Based,1000,256,10,20,1.63000,3952,4096,4242

    121,SplitRCte,1000,480,10,20,7.79600,7450,7678,7948

    122,SplitXML,1000,480,10,20,6.73600,7450,7678,7948

    123,DelimitedSplit8K,1000,480,10,20,3.08600,7450,7678,7948

    124,DelimitedSplit8KPerm0Based,1000,480,10,20,2.77600,7450,7678,7948

    125,DelimitedSplit8KPermUnion,1000,480,10,20,2.89300,7450,7678,7948

    126,Wayne1Based,1000,480,10,20,3.04000,7450,7678,7948

    127,SplitRCte,1000,1,20,30,0.02600,20,25,30

    128,SplitXML,1000,1,20,30,0.15300,20,25,30

    129,DelimitedSplit8K,1000,1,20,30,0.01600,20,25,30

    130,DelimitedSplit8KPerm0Based,1000,1,20,30,0.01300,20,25,30

    131,DelimitedSplit8KPermUnion,1000,1,20,30,0.01300,20,25,30

    132,Wayne1Based,1000,1,20,30,0.01300,20,25,30

    133,SplitRCte,1000,2,20,30,0.04600,41,51,61

    134,SplitXML,1000,2,20,30,0.19000,41,51,61

    135,DelimitedSplit8K,1000,2,20,30,0.02300,41,51,61

    136,DelimitedSplit8KPerm0Based,1000,2,20,30,0.02000,41,51,61

    137,DelimitedSplit8KPermUnion,1000,2,20,30,0.02300,41,51,61

    138,Wayne1Based,1000,2,20,30,0.02300,41,51,61

    139,SplitRCte,1000,4,20,30,0.08300,86,103,121

    140,SplitXML,1000,4,20,30,0.22300,86,103,121

    141,DelimitedSplit8K,1000,4,20,30,0.05300,86,103,121

    142,DelimitedSplit8KPerm0Based,1000,4,20,30,0.06000,86,103,121

    143,DelimitedSplit8KPermUnion,1000,4,20,30,0.05000,86,103,121

    144,Wayne1Based,1000,4,20,30,0.06600,86,103,121

    145,SplitRCte,1000,8,20,30,0.14600,180,207,236

    146,SplitXML,1000,8,20,30,0.28000,180,207,236

    147,DelimitedSplit8K,1000,8,20,30,0.09000,180,207,236

    148,DelimitedSplit8KPerm0Based,1000,8,20,30,0.09300,180,207,236

    149,DelimitedSplit8KPermUnion,1000,8,20,30,0.08300,180,207,236

    150,Wayne1Based,1000,8,20,30,0.10300,180,207,236

    151,SplitRCte,1000,16,20,30,0.28000,375,414,451

    152,SplitXML,1000,16,20,30,0.39000,375,414,451

    153,DelimitedSplit8K,1000,16,20,30,0.16600,375,414,451

    154,DelimitedSplit8KPerm0Based,1000,16,20,30,0.16000,375,414,451

    155,DelimitedSplit8KPermUnion,1000,16,20,30,0.15300,375,414,451

    156,Wayne1Based,1000,16,20,30,0.17600,375,414,451

    157,SplitRCte,1000,32,20,30,0.54000,783,830,896

    158,SplitXML,1000,32,20,30,0.71000,783,830,896

    159,DelimitedSplit8K,1000,32,20,30,0.32300,783,830,896

    160,DelimitedSplit8KPerm0Based,1000,32,20,30,0.29300,783,830,896

    161,DelimitedSplit8KPermUnion,1000,32,20,30,0.30000,783,830,896

    162,Wayne1Based,1000,32,20,30,0.33000,783,830,896

    163,SplitRCte,1000,64,20,30,1.06600,1584,1663,1755

    164,SplitXML,1000,64,20,30,1.04600,1584,1663,1755

    165,DelimitedSplit8K,1000,64,20,30,0.62600,1584,1663,1755

    166,DelimitedSplit8KPerm0Based,1000,64,20,30,0.57000,1584,1663,1755

    167,DelimitedSplit8KPermUnion,1000,64,20,30,0.57300,1584,1663,1755

    168,Wayne1Based,1000,64,20,30,0.63600,1584,1663,1755

    169,SplitRCte,1000,128,20,30,2.11000,3211,3326,3431

    170,SplitXML,1000,128,20,30,1.93300,3211,3326,3431

    171,DelimitedSplit8K,1000,128,20,30,1.24000,3211,3326,3431

    172,DelimitedSplit8KPerm0Based,1000,128,20,30,1.10000,3211,3326,3431

    173,DelimitedSplit8KPermUnion,1000,128,20,30,1.14000,3211,3326,3431

    174,Wayne1Based,1000,128,20,30,1.24000,3211,3326,3431

    175,SplitRCte,1000,256,20,30,4.24000,6468,6657,6831

    176,SplitXML,1000,256,20,30,3.77600,6468,6657,6831

    177,DelimitedSplit8K,1000,256,20,30,2.48300,6468,6657,6831

    178,DelimitedSplit8KPerm0Based,1000,256,20,30,2.17300,6468,6657,6831

    179,DelimitedSplit8KPermUnion,1000,256,20,30,2.27300,6468,6657,6831

    180,Wayne1Based,1000,256,20,30,2.43300,6468,6657,6831

    181,SplitRCte,1000,290,20,30,4.81600,7357,7539,7695

    182,SplitXML,1000,290,20,30,4.25600,7357,7539,7695

    183,DelimitedSplit8K,1000,290,20,30,2.78600,7357,7539,7695

    184,DelimitedSplit8KPerm0Based,1000,290,20,30,2.45600,7357,7539,7695

    185,DelimitedSplit8KPermUnion,1000,290,20,30,2.57600,7357,7539,7695

    186,Wayne1Based,1000,290,20,30,2.75300,7357,7539,7695

    187,SplitRCte,1000,1,30,40,0.02600,30,35,40

    188,SplitXML,1000,1,30,40,0.15600,30,35,40

    189,DelimitedSplit8K,1000,1,30,40,0.01600,30,35,40

    190,DelimitedSplit8KPerm0Based,1000,1,30,40,0.01300,30,35,40

    191,DelimitedSplit8KPermUnion,1000,1,30,40,0.01600,30,35,40

    192,Wayne1Based,1000,1,30,40,0.01600,30,35,40

    193,SplitRCte,1000,2,30,40,0.04600,61,71,81

    194,SplitXML,1000,2,30,40,0.19000,61,71,81

    195,DelimitedSplit8K,1000,2,30,40,0.02600,61,71,81

    196,DelimitedSplit8KPerm0Based,1000,2,30,40,0.02000,61,71,81

    197,DelimitedSplit8KPermUnion,1000,2,30,40,0.02600,61,71,81

    198,Wayne1Based,1000,2,30,40,0.02300,61,71,81

    199,SplitRCte,1000,4,30,40,0.08600,126,143,160

    200,SplitXML,1000,4,30,40,0.22300,126,143,160

    201,DelimitedSplit8K,1000,4,30,40,0.06600,126,143,160

    202,DelimitedSplit8KPerm0Based,1000,4,30,40,0.07000,126,143,160

    203,DelimitedSplit8KPermUnion,1000,4,30,40,0.06000,126,143,160

    204,Wayne1Based,1000,4,30,40,0.07600,126,143,160

    205,SplitRCte,1000,8,30,40,0.15000,259,287,315

    206,SplitXML,1000,8,30,40,0.28300,259,287,315

    207,DelimitedSplit8K,1000,8,30,40,0.11600,259,287,315

    208,DelimitedSplit8KPerm0Based,1000,8,30,40,0.11300,259,287,315

    209,DelimitedSplit8KPermUnion,1000,8,30,40,0.10600,259,287,315

    210,Wayne1Based,1000,8,30,40,0.13300,259,287,315

    211,SplitRCte,1000,16,30,40,0.28300,538,575,627

    212,SplitXML,1000,16,30,40,0.39600,538,575,627

    213,DelimitedSplit8K,1000,16,30,40,0.22000,538,575,627

    214,DelimitedSplit8KPerm0Based,1000,16,30,40,0.20300,538,575,627

    215,DelimitedSplit8KPermUnion,1000,16,30,40,0.19600,538,575,627

    216,Wayne1Based,1000,16,30,40,0.23000,538,575,627

    217,SplitRCte,1000,32,30,40,0.55000,1093,1150,1208

    218,SplitXML,1000,32,30,40,0.62000,1093,1150,1208

    219,DelimitedSplit8K,1000,32,30,40,0.42300,1093,1150,1208

    220,DelimitedSplit8KPerm0Based,1000,32,30,40,0.38000,1093,1150,1208

    221,DelimitedSplit8KPermUnion,1000,32,30,40,0.38600,1093,1150,1208

    222,Wayne1Based,1000,32,30,40,0.42300,1093,1150,1208

    223,SplitRCte,1000,64,30,40,1.07300,2224,2302,2376

    224,SplitXML,1000,64,30,40,1.07000,2224,2302,2376

    225,DelimitedSplit8K,1000,64,30,40,0.82600,2224,2302,2376

    226,DelimitedSplit8KPerm0Based,1000,64,30,40,0.74000,2224,2302,2376

    227,DelimitedSplit8KPermUnion,1000,64,30,40,0.75600,2224,2302,2376

    228,Wayne1Based,1000,64,30,40,0.84000,2224,2302,2376

    229,SplitRCte,1000,128,30,40,2.19000,4481,4606,4733

    230,SplitXML,1000,128,30,40,2.00300,4481,4606,4733

    231,DelimitedSplit8K,1000,128,30,40,1.64600,4481,4606,4733

    232,DelimitedSplit8KPerm0Based,1000,128,30,40,1.44000,4481,4606,4733

    233,DelimitedSplit8KPermUnion,1000,128,30,40,1.50000,4481,4606,4733

    234,Wayne1Based,1000,128,30,40,1.61600,4481,4606,4733

    235,SplitRCte,1000,210,30,40,3.53000,7427,7561,7699

    236,SplitXML,1000,210,30,40,3.22000,7427,7561,7699

    237,DelimitedSplit8K,1000,210,30,40,2.70600,7427,7561,7699

    238,DelimitedSplit8KPerm0Based,1000,210,30,40,2.38000,7427,7561,7699

    239,DelimitedSplit8KPermUnion,1000,210,30,40,2.45600,7427,7561,7699

    240,Wayne1Based,1000,210,30,40,2.67000,7427,7561,7699

    241,SplitRCte,1000,1,40,50,0.02600,40,44,50

    242,SplitXML,1000,1,40,50,0.15600,40,44,50

    243,DelimitedSplit8K,1000,1,40,50,0.02600,40,44,50

    244,DelimitedSplit8KPerm0Based,1000,1,40,50,0.01000,40,44,50

    245,DelimitedSplit8KPermUnion,1000,1,40,50,0.02000,40,44,50

    246,Wayne1Based,1000,1,40,50,0.01300,40,44,50

    247,SplitRCte,1000,2,40,50,0.04600,81,90,101

    248,SplitXML,1000,2,40,50,0.19300,81,90,101

    249,DelimitedSplit8K,1000,2,40,50,0.03000,81,90,101

    250,DelimitedSplit8KPerm0Based,1000,2,40,50,0.02300,81,90,101

    251,DelimitedSplit8KPermUnion,1000,2,40,50,0.02600,81,90,101

    252,Wayne1Based,1000,2,40,50,0.02600,81,90,101

    253,SplitRCte,1000,4,40,50,0.08300,165,182,201

    254,SplitXML,1000,4,40,50,0.23000,165,182,201

    255,DelimitedSplit8K,1000,4,40,50,0.07600,165,182,201

    256,DelimitedSplit8KPerm0Based,1000,4,40,50,0.08000,165,182,201

    257,DelimitedSplit8KPermUnion,1000,4,40,50,0.07000,165,182,201

    258,Wayne1Based,1000,4,40,50,0.09000,165,182,201

    259,SplitRCte,1000,8,40,50,0.15300,342,367,396

    260,SplitXML,1000,8,40,50,0.28600,342,367,396

    261,DelimitedSplit8K,1000,8,40,50,0.14300,342,367,396

    262,DelimitedSplit8KPerm0Based,1000,8,40,50,0.13300,342,367,396

    263,DelimitedSplit8KPermUnion,1000,8,40,50,0.13000,342,367,396

    264,Wayne1Based,1000,8,40,50,0.15000,342,367,396

    265,SplitRCte,1000,16,40,50,0.28600,690,734,775

    266,SplitXML,1000,16,40,50,0.40300,690,734,775

    267,DelimitedSplit8K,1000,16,40,50,0.27000,690,734,775

    268,DelimitedSplit8KPerm0Based,1000,16,40,50,0.24600,690,734,775

    269,DelimitedSplit8KPermUnion,1000,16,40,50,0.24300,690,734,775

    270,Wayne1Based,1000,16,40,50,0.27600,690,734,775

    271,SplitRCte,1000,32,40,50,0.55600,1408,1471,1527

    272,SplitXML,1000,32,40,50,0.63300,1408,1471,1527

    273,DelimitedSplit8K,1000,32,40,50,0.52600,1408,1471,1527

    274,DelimitedSplit8KPerm0Based,1000,32,40,50,0.46600,1408,1471,1527

    275,DelimitedSplit8KPermUnion,1000,32,40,50,0.47600,1408,1471,1527

    276,Wayne1Based,1000,32,40,50,0.52600,1408,1471,1527

    277,SplitRCte,1000,64,40,50,1.10300,2858,2943,3023

    278,SplitXML,1000,64,40,50,1.09600,2858,2943,3023

    279,DelimitedSplit8K,1000,64,40,50,1.03300,2858,2943,3023

    280,DelimitedSplit8KPerm0Based,1000,64,40,50,0.91600,2858,2943,3023

    281,DelimitedSplit8KPermUnion,1000,64,40,50,0.93300,2858,2943,3023

    282,Wayne1Based,1000,64,40,50,1.03000,2858,2943,3023

    283,SplitRCte,1000,128,40,50,2.20000,5781,5886,5999

    284,SplitXML,1000,128,40,50,2.05300,5781,5886,5999

    285,DelimitedSplit8K,1000,128,40,50,2.05600,5781,5886,5999

    286,DelimitedSplit8KPerm0Based,1000,128,40,50,1.79300,5781,5886,5999

    287,DelimitedSplit8KPermUnion,1000,128,40,50,1.85600,5781,5886,5999

    288,Wayne1Based,1000,128,40,50,2.04000,5781,5886,5999

    289,SplitRCte,1000,165,40,50,2.81300,7479,7589,7724

    290,SplitXML,1000,165,40,50,2.63300,7479,7589,7724

    291,DelimitedSplit8K,1000,165,40,50,2.64600,7479,7589,7724

    292,DelimitedSplit8KPerm0Based,1000,165,40,50,2.32600,7479,7589,7724

    293,DelimitedSplit8KPermUnion,1000,165,40,50,2.39000,7479,7589,7724

    294,Wayne1Based,1000,165,40,50,2.61000,7479,7589,7724

  • I'm late here .... Jeff, I can see the brilliant stuff in this great article! -

    Come and be a twitter guy (@JeffModen), I would like to see the SQL folks discussing live in the Twitter...!

    ============================================================
    SELECT YOUR PROBLEM FROM SSC.com WHERE PROBLEM DESCRIPTION =
    http://www.sqlservercentral.com/articles/Best+Practices/61537/[/url]

  • Nadrek (5/4/2011)


    Results below. Wayne, for the very largest case, yours looks only very slightly faster than the article's code.

    Well, it was worth a shot. At least we know now that it's not as good.

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • WayneS (5/3/2011)


    Nadrek (5/3/2011)


    WayneS (5/3/2011)


    Why don't you take your functions, integrate them into the test script included in the References section of the article, run the tests, and post the results for all of us to enjoy?

    All right, but I'm not much with the fancy graphs, so here's the output exported as a csv (I don't know what the forum would do to tabs)

    Here's your data converted to "the fancy graphs".

    10 to 20 elements

    20 to 30 elements

    Would you mind generating a new graph, purely with less than 50 elements for these? That should show the crossover point when the 0 based permanent table starts beating the union, and whether it's a linear distribution or not in that area.

  • Jeff Moden (5/4/2011)


    SQLkiwi (5/4/2011)


    This is the CLR code I wrote for Jeff.

    Thanks for stopping by, Paul. Now I don't have to send you an email to ask you to "stop by" with your code. I didn't want to presume to post the code you gave me without asking. 🙂

    No worries.

  • KDM8943 (5/4/2011)


    Jeff, great article!... I just read through it a few minutes ago, but as I was walking through you code explanation for handling the start and end positions I thought your epiphany was going to be putting a 'delim' at the beginning and the end of the input string.... bypassing having to handle no delims at these positions... it would be a one-time hit modifying the string, but the 'delim/data' pattern would be consistent w/o having to create extra code to account for the lack of delim's... just a thought...

    Thanks for the effort!

    KDM

    Thanks for the feedback, KDM.

    As I said in the article, the original function concatenated delimiters to the beginning and the end. The problem was that it either had to be done within the main select or as a subquery in order to preserve the nature of an Inline Table Valued Function instead of resorting to just a Multiline Table Valued fFunction which can be as slow as a scalar function. In fact, the method I used got rid of a fair amount of code in the form of calculations.

    Some folks have posted their functions, as well. I've not had the chance to examine them closely nor test tem but ome of them look like they may be faster than even what I submitted in the article according to the testing they've done.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Sorry... duplicated post removed.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Dugi (5/4/2011)


    I'm late here .... Jeff, I can see the brilliant stuff in this great article! -

    Come and be a twitter guy (@JeffModen), I would like to see the SQL folks discussing live in the Twitter...!

    Hi Dugi! Thanks for stopping by and for the compliment. No... no twitter for me. I can hardly keep up with all the posts on this discussion. 😛

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Nadrek (5/4/2011)


    WayneS (5/3/2011)


    Nadrek (5/3/2011)


    WayneS (5/3/2011)


    Why don't you take your functions, integrate them into the test script included in the References section of the article, run the tests, and post the results for all of us to enjoy?

    All right, but I'm not much with the fancy graphs, so here's the output exported as a csv (I don't know what the forum would do to tabs)

    Here's your data converted to "the fancy graphs".

    10 to 20 elements

    20 to 30 elements

    Would you mind generating a new graph, purely with less than 50 elements for these? That should show the crossover point when the 0 based permanent table starts beating the union, and whether it's a linear distribution or not in that area.

    I've got this one. I'm running tests right now and I'll post the charts using a Log/Log scale so we can see the details on the low end.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • gary.rumble (5/3/2011)


    Jeff Moden (5/3/2011)


    gary.rumble (5/3/2011)


    Jeff Moden (5/2/2011)


    gary.rumble (5/2/2011)


    Well, so far I got:

    ...

    I think your code is too much for my server. 😉

    Yowch. The code is even split into batches. What you may have to do is run the code a section at a time up to where the test loop begins and the let the test loop rip.

    Thanks for trying, Gary. If for some reason, your server just won't take it, let me know and we'll test your code for you.

    I set up a new SS instance on my laptop and got the tests to run. Looks like my code tracks your results fairly closely, but of course it doesn't create the numbers table on the fly so it probably has an advantage there.

    I emailed you the results. I couldn't generate your pretty graphs, though.

    Thanks for the article.

    No problem, Gary. You obviously had to convert your code to a function to run it through my test harness. Could you post your function, please. Thanks.

    It is thus:

    IF OBJECT_ID('dbo.GRDelimitedSplit') IS NOT NULL

    DROP FUNCTION dbo.GRDelimitedSplit;

    GO

    CREATE FUNCTION [dbo].[GRDelimitedSplit]

    (@text VARCHAR(max), @delimiter CHAR(1))

    RETURNS @Return TABLE (ItemNumber SMALLINT, Item VARCHAR(max))

    WITH SCHEMABINDING AS

    begin

    declare @len int

    set @len = len(@text) + 1

    ;with cte1 as (

    select 0 as number, 0 as row

    union

    select number, row_number() over (order by number) as row from dbo.Numbers

    where number <= @len and substring(@text, number, 1) = @delimiter

    )

    insert into @Return

    select

    ROW_NUMBER() OVER(ORDER BY c1.row),

    substring(@text, c1.number + 1, coalesce(c2.number - 1, @len) - c1.number) from cte1 c1

    left join cte1 c2 on c1.row = c2.row - 1

    order by c1.row

    return;

    end

    go

    Gary... sorry, my friend. I'm not sure how you got the results you sent me because I had to stop your code because it only made it to the 1-10 element size with only 512 elements on each row. Your code will be in the attachment when I publish my race results for Nadrek's code but the actual test is commented out. You'll need to send me your actual test and setup for the Tally Table so I can see if I'm doing something wrong.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (5/3/2011)


    What I find really strange is why do people wait until I write an article to bring these wonders forward? 😛

    To go along with this... where were they the past year that we've been posting the "old" DelimitedSplit8K code for solutions? Jeez, it sure would have been nice to have seen this back then!

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • WayneS (5/4/2011)


    Jeff Moden (5/3/2011)


    What I find really strange is why do people wait until I write an article to bring these wonders forward? 😛

    To go along with this... where were they the past year that we've been posting the "old" DelimitedSplit8K code for solutions? Jeez, it sure would have been nice to have seen this back then!

    I guess I'm going blind or slowly insane. I didn't look that closely at Nadrek's functions before. The only mods he made were on the Tally Table source... he left the dust bunny code the way it was. I don't feel so bad now. 😀 Wellll... lookie there in the corner. Now the dust bunnies are wrapped up in binkies, twiddling their hair, and suckin on ... WAIT A MINUTE!!!! THOSE ARE MY BEER POPSICLES!!!!:hehe:

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Here are the runs using random elements for 1 to 10 characters in size with the first chart showing just 1 to 50 elements per row and the second chart showing the full range. Nicely done, Nadrek! The cool part about all of this is if you run into a DBA that won't allow auxiliary tables such as a Tally Table, the cteTally method isn't that far off. The third chart is for 40 to 50 characters per element for the full range up to almost 8 kbytes just to show that Nadrek's code doesn't have the old performance problem, either.

    I'll attach the code I did the testing with in just a minute or two.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (5/4/2011)


    Wellll... lookie there in the corner. Now the dust bunnies are wrapped up in binkies, twiddling their hair, and suckin on ... WAIT A MINUTE!!!! THOSE ARE MY BEER POPSICLES!!!!:hehe:

    Looks like Jeff is going to war with the dust bunnies over his popsicles... 😀

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • Jeff Moden (5/4/2011)


    Nicely done, Nadrek!

    Yes indeed, very nice!

    The cool part about all of this is if you run into a DBA that won't allow auxiliary tables such as a Tally Table, the cteTally method isn't that far off.

    I think that any of these methods would be very good ones to have.

    How did Peso's splitter compare? He usually has some wickedly fast routines...

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

Viewing 15 posts - 151 through 165 (of 990 total)

You must be logged in to reply to this topic. Login to reply