Since looping is bad, how should I accomplish what I am trying to do?

  • I have a table with records each having a unique Identifier, a start date, and two semi-colon delimited strings which I need to break out into a value for each date with the first date being the start date value for each record.

    Currently, I am doing all sorts of bad things, in that I am using a While loop to go through each record in the table (16K records+) and I am using a split function that uses a While loop. The split function is based on Oskar Austegard's dbo.fnSplit function. I've been reading about how using a loop to split is bad for performance, but these delimited strings have at most 100 Items.

    I am beginning to try to learn about CROSS APPLY and I have been using CTEs for several SPs I've created, but I am not sure if using a CTE here would work or how it would work. I especially am not sure on CROSS APPLY or any other APPLY.

    I will gladly post my query and the adulterated split function if someone is willing to assist.

    Thank you in advance.

    Rob

  • If you want a better splitter, then I can't recommend strongly enough that you use Jeff Moden's DelimitedSplit8K, found here:
        http://www.sqlservercentral.com/articles/72993/
    It performs quite well, and only CLR goes faster.   The code can be found at the end of the article, and the article itself is a great read.  Anyway, one WHILE loop on top of another is going to be a problem for you, if not now, then perhaps just around the corner, as volume increases will eventually turn into a severe performance penalty.   You can cetainly use CROSS APPLY together with this function, but you might not need to, depending on the order of appearance of the values you are looking for within the given string.   If you can provide some sample data and expected results, in the form of CREATE TABLE and INSERT statements, and the SQL for your existing query, we should be able to re-factor it to use the better splitter and give it the kick in the performance a$$ it may well need.

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • If I understand your request properly, I THINK that this will do exactly what you are asking, no:
    http://sqljason.com/2010/05/converting-single-comma-separated-row.html
    It uses cross apply and you don't need to worry about the split function.

    If that isn't correct, could you post your table creation script, an INSERT script with sample data, the queries you have tried, and your desired output?

    EDIT - sorry, didn't see Steve already suggested getting the scripts.
    On another note - is this required to be done inside SQL?  If not, could you split the string at a different level such as SSRS, SSIS, or some home built application?
    Or could you split the string at table insert time in the application and do multiple inserts instead of inserting a comma separated list?
    I ask because based on my experience, SQL will very rarely out-perform any other application in terms of string splitting.  Fixing the data now I think is good, but you probably also want to fix it at the data insertion level as well to reduce the use of splitting things inside SQL.

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • @bmg02, What I am trying to do is similar to what anil ch discusses  in the comments on that post.
    I have two columns that need to be split.  I like that it doesn't need a loop or function, but can I split both columns and include the two static fields as well as the incremented dates?
    I am working on getting the table creation, insert and queries together for you and @sgmunson.

  • rmcfaden - Friday, September 29, 2017 11:12 AM

    @bmg02, What I am trying to do is similar to what anil ch discusses  in the comments on that post.
    I have two columns that need to be split.  I like that it doesn't need a loop or function, but can I split both columns and include the two static fields as well as the incremented dates?
    I am working on getting the table creation, insert and queries together for you and @sgmunson.

    CROSS APPLY can help you do that.  It will end up being something like this:

    SELECT S1.Item AS FirstDateField, S2.Item AS SecondDateField
    FROM YOUR_TABLE AS YT
        CROSS APPLY dbo.DelimitedSplit8K(YT.Field1ToSplit, ';') AS S1
        CROSS APPLY dbo.DelimitedSplt8K(YT.Field2ToSplit, ';') AS S2
    WHERE S1.ItemNumber = 1
        AND S2.ItemNumber = 1

    You'll need to adjust the values in the WHERE clause based on the positional location of the string to be extracted from each field that needs splitting.

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • With the link I provided, I think you'd just need to run things through twice.  Using the code in the link and assuming the [city] column is there twice, something like this should work:
    IF OBJECT_ID(N'#temp1') IS NOT NULL
        DROP TABLE #temp1
    GO
    SELECT A.[State],
        Split.a.value('.','VARCHAR(100)') as String1,
        A.[city]
    INTO #temp1
    FROM (SELECT [state], CAST('<M>' + REPLACE([City],',','</M><M>')+'</M>' AS XML) AS String
        FROM TableA) AS A CROSS APPLY String.nodes('/M') AS Split(a);
    SELECT A.[State],
        A.String1,
        Split.a.value('.','VARCHAR(100)') as String2
    FROM (SELECT [state],[String1],CAST('<M>' + REPLCAE([City],',','</M><M>')+'</M>' AS XML) AS String
        FROM #temp1) AS A CROSS APPLY String.nodes('/M') AS Split(a);

    Basically, you split it once and then split it a second time.  I use a temp table as it should perform faster than a nested select for this I believe, but feel free to play around with it.  You may find it faster to use a nested select.

    Although I still think that doing this outside of SQL is likely going to offer better performance than inside of SQL...

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • bmg002 - Friday, September 29, 2017 10:55 AM

    If I understand your request properly, I THINK that this will do exactly what you are asking, no:
    http://sqljason.com/2010/05/converting-single-comma-separated-row.html
    It uses cross apply and you don't need to worry about the split function.

    If that isn't correct, could you post your table creation script, an INSERT script with sample data, the queries you have tried, and your desired output?

    EDIT - sorry, didn't see Steve already suggested getting the scripts.
    On another note - is this required to be done inside SQL?  If not, could you split the string at a different level such as SSRS, SSIS, or some home built application?
    Or could you split the string at table insert time in the application and do multiple inserts instead of inserting a comma separated list?
    I ask because based on my experience, SQL will very rarely out-perform any other application in terms of string splitting.  Fixing the data now I think is good, but you probably also want to fix it at the data insertion level as well to reduce the use of splitting things inside SQL.

    We were going to do just that and make each piece of the delimited string its own record, but the company let my Developer go, so I am a bit stuck for the moment.

  • sgmunson - Friday, September 29, 2017 11:18 AM

    rmcfaden - Friday, September 29, 2017 11:12 AM

    @bmg02, What I am trying to do is similar to what anil ch discusses  in the comments on that post.
    I have two columns that need to be split.  I like that it doesn't need a loop or function, but can I split both columns and include the two static fields as well as the incremented dates?
    I am working on getting the table creation, insert and queries together for you and @sgmunson.

    CROSS APPLY can help you do that.  It will end up being something like this:

    SELECT S1.Item AS FirstDateField, S2.Item AS SecondDateField
    FROM YOUR_TABLE AS YT
        CROSS APPLY dbo.DelimitedSplit8K(YT.Field1ToSplit, ';') AS S1
        CROSS APPLY dbo.DelimitedSplt8K(YT.Field2ToSplit, ';') AS S2
    WHERE S1.ItemNumber = 1
        AND S2.ItemNumber = 1

    You'll need to adjust the values in the WHERE clause based on the positional location of the string to be extracted from each field that needs splitting.

    OK. Tried that, I get what it's doing, but how do I get all of the strings that are to be extracted from each split field.
    What I tried earlier today gave me too much data in that it seemed to apply each item multiple times.

    My test data:

    USE TEMPDB
    GO

    IF OBJECT_ID('dbo.TABLEA','U') IS NOT NULL DROP TABLE dbo.TABLEA;

    CREATE TABLE TABLEA (
        [Job_Prod]        varchar(100),
        [Job_Id]        varchar(16),
        Actual_Start    datetime,
        actual_Track    varchar(1000),
        pieces_track    varchar(1000)
    )

    INSERT INTO TABLEA VALUES
    ('VirginiaA', 'Virginia', '2017-09-24 08:00:00.000', '5;4;3;2;1','1;2;3;4;5;6')
    , ('VirginiaB', 'Virginia', '2017-09-25 08:00:00.000', '1;2;3;4;5;6','5;4;3;2;1')
    , ('CaliforniaA', 'California', '2017-09-24 08:00:00.000', '0.1;0.2;0.3;0.4','4;3;2;1')
    , ('CaliforniaB', 'California', '2017-09-25 08:00:00.000', '1.4;2.3;3.2;4.1','1;2;3;4')

    SELECT Job_Prod, Job_ID, CONVERT(DATE,Actual_Start,101) AS TrDate, S1.Item AS FirstDateField, S2.Item AS SecondDateField
    FROM TABLEA AS TA
      CROSS APPLY dbo.DelimitedSplitN4K(TA.actual_Track, ';') AS S1
      CROSS APPLY dbo.DelimitedSplitN4K(TA.pieces_track, ';') AS S2
    WHERE S1.ItemNumber = 4
      AND S2.ItemNumber = 4
    ORDER BY Job_Prod

    My Desired Output:

  • Job_ProdJob_IdTrDateFirstDateFieldSecondDateField
    CaliforniaACalifornia2017-09-240.14
    CaliforniaACalifornia2017-09-250.23
    CaliforniaACalifornia2017-09-260.32
    CaliforniaACalifornia2017-09-270.41
    CaliforniaBCalifornia2017-09-251.41
    CaliforniaBCalifornia2017-09-262.32
    CaliforniaBCalifornia2017-09-273.23
    CaliforniaBCalifornia2017-09-284.14
    VirginiaAVirginia2017-09-2451
    VirginiaAVirginia2017-09-2542
    VirginiaAVirginia2017-09-2633
    VirginiaAVirginia2017-09-2724
    VirginiaAVirginia2017-09-2815
    VirginiaAVirginia2017-09-2906
    VirginiaBVirginia2017-09-2515
    VirginiaBVirginia2017-09-2624
    VirginiaBVirginia2017-09-2733
    VirginiaBVirginia2017-09-2842
    VirginiaBVirginia2017-09-2951
    VirginiaBVirginia2017-09-3050

  • So I need to increment the date starting at the actual_start and if there are more Items in one of the two lists, I need to add a 0 for each missing Item

  • Given this sql code:

    USE TEMPDB
    GO

    IF OBJECT_ID('dbo.TABLEA','U') IS NOT NULL DROP TABLE dbo.TABLEA;

    CREATE TABLE TABLEA (
        [Job_Prod]        varchar(100),
        [Job_Id]        varchar(16),
        Actual_Start    datetime,
        actual_Track    varchar(1000),
        pieces_track    varchar(1000)
    )

    INSERT INTO TABLEA VALUES
    ('VirginiaA', 'Virginia', '2017-09-24 08:00:00.000', '5;4;3;2;1','1;2;3;4;5;6')
    , ('VirginiaB', 'Virginia', '2017-09-25 08:00:00.000', '1;2;3;4;5;6','5;4;3;2;1')
    , ('CaliforniaA', 'California', '2017-09-24 08:00:00.000', '0.1;0.2;0.3;0.4','4;3;2;1')
    , ('CaliforniaB', 'California', '2017-09-25 08:00:00.000', '1.4;2.3;3.2;4.1','1;2;3;4')


    Both methods:

    IF OBJECT_ID(N'#temp1') IS NOT NULL
      DROP TABLE #temp1
    GO
    SELECT A.Job_Prod, A.[Job_Id],
      Split.a.value('.','VARCHAR(100)') as String1,
      A.pieces_track
    INTO #temp1
    FROM (SELECT Job_Prod, [Job_Id], CAST('<M>' + REPLACE(actual_track,';','</M><M>')+'</M>' AS XML) AS String, Pieces_Track
      FROM TableA) AS A CROSS APPLY String.nodes('/M') AS Split(a);
        select * from #temp1
    SELECT A.Job_Prod, A.[Job_Id],
      A.String1,
      Split.a.value('.','VARCHAR(100)') as String2
    FROM (SELECT Job_Prod, [Job_Id],[String1],CAST('<M>' + REPLACE(Pieces_Track,';','</M><M>')+'</M>' AS XML) AS String
      FROM #temp1) AS A CROSS APPLY String.nodes('/M') AS Split(a);


    and

    SELECT Job_Prod, Job_ID, CONVERT(DATE,Actual_Start,101) AS TrDate, S1.Item AS FirstDateField, S2.Item AS SecondDateField
    FROM TABLEA AS TA
      CROSS APPLY dbo.DelimitedSplitN4K(TA.actual_Track, ';') AS S1
      CROSS APPLY dbo.DelimitedSplitN4K(TA.pieces_track, ';') AS S2
    --WHERE S1.ItemNumber = 6
    --  AND S2.ItemNumber = 6
    ORDER BY Job_Prod

    yield 92 records while there should be only 20

  • Welcome to the world of "Why storing comma-separated lists is bad".  Take a look at Bill Karwin's accepted answer to this post.

    That said, let's see what we can do with what's available.

    Looking at the source data Virginia has unbalanced entries in the lists
      ('VirginiaA', 'Virginia', '2017-09-24 08:00:00.000', '5;4;3;2;1','1;2;3;4;5;6')
    , ('VirginiaB', 'Virginia', '2017-09-25 08:00:00.000', '1;2;3;4;5;6','5;4;3;2;1')
    , ('CaliforniaA', 'California', '2017-09-24 08:00:00.000', '0.1;0.2;0.3;0.4','4;3;2;1')
    , ('CaliforniaB', 'California', '2017-09-25 08:00:00.000', '1.4;2.3;3.2;4.1','1;2;3;4')

    What is the logic to determine which items to match up.
    VirginiaA has 5 items int actual_Track and 6 items in pieces_track.
    VirginiaB has 6 items int actual_Track and 5 items in pieces_track.
    So this code, brings back *MOST* of the data
    SELECT Job_Prod, Job_ID, CONVERT(DATE,Actual_Start,101) AS TrDate, S1.Item AS FirstDateField, S2.Item AS SecondDateField
    FROM TABLEA AS TA
    OUTER APPLY dbo.DelimitedSplitN4K(TA.actual_Track, ';') AS S1
    OUTER APPLY dbo.DelimitedSplitN4K(TA.pieces_track, ';') AS S2
    WHERE S1.ItemNumber = S2.ItemNumber
    ORDER BY Job_Prod

    In order to make the above code, you need to add an extra ";" into the appropriate data for Virginia, to ensure that you have the same number of items in each list.  Note that the "empty" item can be anywhere in the CSV list
    INSERT INTO TABLEA VALUES
    ('VirginiaA', 'Virginia', '2017-09-24 08:00:00.000', '5;4;3;2;1;','1;2;3;4;5;6')
    , ('VirginiaB', 'Virginia', '2017-09-25 08:00:00.000', '1;2;3;4;5;6','5;;4;3;2;1')
    --, ('CaliforniaA', 'California', '2017-09-24 08:00:00.000', '0.1;0.2;0.3;0.4','4;3;2;1')
    --, ('CaliforniaB', 'California', '2017-09-25 08:00:00.000', '1.4;2.3;3.2;4.1','1;2;3;4')

    SELECT Job_Prod, Job_ID, CONVERT(DATE,Actual_Start,101) AS TrDate, S1.Item AS FirstDateField, S2.Item AS SecondDateField
    FROM TABLEA AS TA
    OUTER APPLY dbo.DelimitedSplitN4K(TA.actual_Track, ';') AS S1
    OUTER APPLY dbo.DelimitedSplitN4K(TA.pieces_track, ';') AS S2
    WHERE S1.ItemNumber = S2.ItemNumber
    ORDER BY Job_Prod

  • This should do the trick.
    -- Test data ...
    IF OBJECT_ID('tempdb..#TABLEA ', 'U') IS NOT NULL
      DROP TABLE #TABLEA;

    CREATE TABLE #TABLEA (
        PK_ID INT NOT NULL IDENTITY(1,1), -- added a pk row for something to sort an and tie everything back together after the split.
      Job_Prod VARCHAR(100),                -- if there is no pk in the real table use ROW_NUMER() with a cte to accompolish the same.
      Job_Id VARCHAR(16),
      Actual_Start DATETIME,
      actual_Track VARCHAR(1000),
      pieces_track VARCHAR(1000)
      );
    INSERT
      #TABLEA (Job_Prod, Job_Id, Actual_Start, actual_Track, pieces_track) VALUES
      ('VirginiaA', 'Virginia', '2017-09-24 08:00:00.000', '5;4;3;2;1', '1;2;3;4;5;6'),
      ('VirginiaB', 'Virginia', '2017-09-25 08:00:00.000', '1;2;3;4;5;6', '5;4;3;2;1'),
      ('CaliforniaA', 'California', '2017-09-24 08:00:00.000', '0.1;0.2;0.3;0.4', '4;3;2;1'),
      ('CaliforniaB', 'California', '2017-09-25 08:00:00.000', '1.4;2.3;3.2;4.1', '1;2;3;4');

    --============================================================================

    -- The solution...
    SELECT
        t.PK_ID,
        Job_Prod = MAX(t.Job_Prod),
        Job_Id = MAX(t.Job_Id),
        Actual_Start = MAX(t.Actual_Start),
        actual_Track = MAX(CASE WHEN trk.tType = 'A' THEN sc.Item ELSE '0' END),
        pieces_track = MAX(CASE WHEN trk.tType = 'P' THEN sc.Item ELSE '0' END)
    FROM
        #TABLEA t
        CROSS APPLY ( VALUES ('A', t.actual_Track), ('P', t.pieces_track) ) trk (tType, tValue) -- unpivot actual_Track & pieces_track so they are in a single colum. 'A' & 'P' are added so we do loose track of their original columns.
        CROSS APPLY dbo.SplitCSVToTable8K(trk.tValue, ';') sc    -- Just DelimitedSplit8K, just with a different name.
    GROUP BY
        t.PK_ID,
        sc.ItemNumber
    ORDER BY
        t.PK_ID,
        sc.ItemNumber;

    The output results...
    PK_ID   Job_Prod  Job_Id   Actual_Start    actual_Track pieces_track
    ----------- ------------- ------------ ----------------------- ------------- -------------
    1    VirginiaA  Virginia  2017-09-24 08:00:00.000 5     1
    1    VirginiaA  Virginia  2017-09-24 08:00:00.000 4     2
    1    VirginiaA  Virginia  2017-09-24 08:00:00.000 3     3
    1    VirginiaA  Virginia  2017-09-24 08:00:00.000 2     4
    1    VirginiaA  Virginia  2017-09-24 08:00:00.000 1     5
    1    VirginiaA  Virginia  2017-09-24 08:00:00.000 0     6
    2    VirginiaB  Virginia  2017-09-25 08:00:00.000 1     5
    2    VirginiaB  Virginia  2017-09-25 08:00:00.000 2     4
    2    VirginiaB  Virginia  2017-09-25 08:00:00.000 3     3
    2    VirginiaB  Virginia  2017-09-25 08:00:00.000 4     2
    2    VirginiaB  Virginia  2017-09-25 08:00:00.000 5     1
    2    VirginiaB  Virginia  2017-09-25 08:00:00.000 6     0
    3    CaliforniaA California 2017-09-24 08:00:00.000 0.1    4
    3    CaliforniaA California 2017-09-24 08:00:00.000 0.2    3
    3    CaliforniaA California 2017-09-24 08:00:00.000 0.3    2
    3    CaliforniaA California 2017-09-24 08:00:00.000 0.4    1
    4    CaliforniaB California 2017-09-25 08:00:00.000 1.4    1
    4    CaliforniaB California 2017-09-25 08:00:00.000 2.3    2
    4    CaliforniaB California 2017-09-25 08:00:00.000 3.2    3
    4    CaliforniaB California 2017-09-25 08:00:00.000 4.1    4

    Sorry about the results formatting... This forum butchers the "results to text" format...
    I'm sure you get the idea though... 😀
    Hope this helps,
    Jason

  • DesNorton - Saturday, September 30, 2017 12:56 AM

    Welcome to the world of "Why storing comma-separated lists is bad".  Take a look at Bill Karwin's accepted answer to this post.

    That said, let's see what we can do with what's available.

    Looking at the source data Virginia has unbalanced entries in the lists
      ('VirginiaA', 'Virginia', '2017-09-24 08:00:00.000', '5;4;3;2;1','1;2;3;4;5;6')
    , ('VirginiaB', 'Virginia', '2017-09-25 08:00:00.000', '1;2;3;4;5;6','5;4;3;2;1')
    , ('CaliforniaA', 'California', '2017-09-24 08:00:00.000', '0.1;0.2;0.3;0.4','4;3;2;1')
    , ('CaliforniaB', 'California', '2017-09-25 08:00:00.000', '1.4;2.3;3.2;4.1','1;2;3;4')

    What is the logic to determine which items to match up.
    VirginiaA has 5 items int actual_Track and 6 items in pieces_track.
    VirginiaB has 6 items int actual_Track and 5 items in pieces_track.
    So this code, brings back *MOST* of the data
    SELECT Job_Prod, Job_ID, CONVERT(DATE,Actual_Start,101) AS TrDate, S1.Item AS FirstDateField, S2.Item AS SecondDateField
    FROM TABLEA AS TA
    OUTER APPLY dbo.DelimitedSplitN4K(TA.actual_Track, ';') AS S1
    OUTER APPLY dbo.DelimitedSplitN4K(TA.pieces_track, ';') AS S2
    WHERE S1.ItemNumber = S2.ItemNumber
    ORDER BY Job_Prod

    In order to make the above code, you need to add an extra ";" into the appropriate data for Virginia, to ensure that you have the same number of items in each list.  Note that the "empty" item can be anywhere in the CSV list
    INSERT INTO TABLEA VALUES
    ('VirginiaA', 'Virginia', '2017-09-24 08:00:00.000', '5;4;3;2;1;','1;2;3;4;5;6')
    , ('VirginiaB', 'Virginia', '2017-09-25 08:00:00.000', '1;2;3;4;5;6','5;;4;3;2;1')
    --, ('CaliforniaA', 'California', '2017-09-24 08:00:00.000', '0.1;0.2;0.3;0.4','4;3;2;1')
    --, ('CaliforniaB', 'California', '2017-09-25 08:00:00.000', '1.4;2.3;3.2;4.1','1;2;3;4')

    SELECT Job_Prod, Job_ID, CONVERT(DATE,Actual_Start,101) AS TrDate, S1.Item AS FirstDateField, S2.Item AS SecondDateField
    FROM TABLEA AS TA
    OUTER APPLY dbo.DelimitedSplitN4K(TA.actual_Track, ';') AS S1
    OUTER APPLY dbo.DelimitedSplitN4K(TA.pieces_track, ';') AS S2
    WHERE S1.ItemNumber = S2.ItemNumber
    ORDER BY Job_Prod

    First let me say that I have no way to update how the data is stored in the database.  I am merely pulling the data to compare with other data.  
    That said, the idea is that the Actual Start is the date of each of the first items in the actual_track and pieces_track. If there is nothing marked off since a particular day in one of the track strings, but there is something marked off for the other track string, the number of Items or elements in the delimited list will differ.  But, each item in the list is associated with a date, so if 9/25 is the actual_start and there are three items in the list the last item will be associated with 9/27.  If the other list has 6 items, its last item would be associated with 9/30.

  • Jason A. Long - Saturday, September 30, 2017 1:08 AM

    This should do the trick.
    -- Test data ...
    IF OBJECT_ID('tempdb..#TABLEA ', 'U') IS NOT NULL
      DROP TABLE #TABLEA;

    CREATE TABLE #TABLEA (
        PK_ID INT NOT NULL IDENTITY(1,1), -- added a pk row for something to sort an and tie everything back together after the split.
      Job_Prod VARCHAR(100),                -- if there is no pk in the real table use ROW_NUMER() with a cte to accompolish the same.
      Job_Id VARCHAR(16),
      Actual_Start DATETIME,
      actual_Track VARCHAR(1000),
      pieces_track VARCHAR(1000)
      );
    INSERT
      #TABLEA (Job_Prod, Job_Id, Actual_Start, actual_Track, pieces_track) VALUES
      ('VirginiaA', 'Virginia', '2017-09-24 08:00:00.000', '5;4;3;2;1', '1;2;3;4;5;6'),
      ('VirginiaB', 'Virginia', '2017-09-25 08:00:00.000', '1;2;3;4;5;6', '5;4;3;2;1'),
      ('CaliforniaA', 'California', '2017-09-24 08:00:00.000', '0.1;0.2;0.3;0.4', '4;3;2;1'),
      ('CaliforniaB', 'California', '2017-09-25 08:00:00.000', '1.4;2.3;3.2;4.1', '1;2;3;4');

    --============================================================================

    -- The solution...
    SELECT
        t.PK_ID,
        Job_Prod = MAX(t.Job_Prod),
        Job_Id = MAX(t.Job_Id),
        Actual_Start = MAX(t.Actual_Start),
        actual_Track = MAX(CASE WHEN trk.tType = 'A' THEN sc.Item ELSE '0' END),
        pieces_track = MAX(CASE WHEN trk.tType = 'P' THEN sc.Item ELSE '0' END)
    FROM
        #TABLEA t
        CROSS APPLY ( VALUES ('A', t.actual_Track), ('P', t.pieces_track) ) trk (tType, tValue) -- unpivot actual_Track & pieces_track so they are in a single colum. 'A' & 'P' are added so we do loose track of their original columns.
        CROSS APPLY dbo.SplitCSVToTable8K(trk.tValue, ';') sc    -- Just DelimitedSplit8K, just with a different name.
    GROUP BY
        t.PK_ID,
        sc.ItemNumber
    ORDER BY
        t.PK_ID,
        sc.ItemNumber;

    The output results...
    PK_ID   Job_Prod  Job_Id   Actual_Start    actual_Track pieces_track
    ----------- ------------- ------------ ----------------------- ------------- -------------
    1    VirginiaA  Virginia  2017-09-24 08:00:00.000 5     1
    1    VirginiaA  Virginia  2017-09-24 08:00:00.000 4     2
    1    VirginiaA  Virginia  2017-09-24 08:00:00.000 3     3
    1    VirginiaA  Virginia  2017-09-24 08:00:00.000 2     4
    1    VirginiaA  Virginia  2017-09-24 08:00:00.000 1     5
    1    VirginiaA  Virginia  2017-09-24 08:00:00.000 0     6
    2    VirginiaB  Virginia  2017-09-25 08:00:00.000 1     5
    2    VirginiaB  Virginia  2017-09-25 08:00:00.000 2     4
    2    VirginiaB  Virginia  2017-09-25 08:00:00.000 3     3
    2    VirginiaB  Virginia  2017-09-25 08:00:00.000 4     2
    2    VirginiaB  Virginia  2017-09-25 08:00:00.000 5     1
    2    VirginiaB  Virginia  2017-09-25 08:00:00.000 6     0
    3    CaliforniaA California 2017-09-24 08:00:00.000 0.1    4
    3    CaliforniaA California 2017-09-24 08:00:00.000 0.2    3
    3    CaliforniaA California 2017-09-24 08:00:00.000 0.3    2
    3    CaliforniaA California 2017-09-24 08:00:00.000 0.4    1
    4    CaliforniaB California 2017-09-25 08:00:00.000 1.4    1
    4    CaliforniaB California 2017-09-25 08:00:00.000 2.3    2
    4    CaliforniaB California 2017-09-25 08:00:00.000 3.2    3
    4    CaliforniaB California 2017-09-25 08:00:00.000 4.1    4

    Sorry about the results formatting... This forum butchers the "results to text" format...
    I'm sure you get the idea though... 😀
    Hope this helps,
    Jason

    That's great and almost what I need, except I have to increment that date.  As it is 4 AM, I will look at trying to accomplish that part on another day.

  • rmcfaden - Saturday, September 30, 2017 2:23 AM

    That's great and almost what I need, except I have to increment that date.  As it is 4 AM, I will look at trying to accomplish that part on another day.

    I'm thinking this should do the trick... 🙂
    SELECT
        t.PK_ID,
        Job_Prod = MAX(t.Job_Prod),
        Job_Id = MAX(t.Job_Id),
        Actual_Start = DATEADD(DAY, ROW_NUMBER() OVER (PARTITION BY t.PK_ID ORDER BY t.PK_ID) -1, MAX(t.Actual_Start)),
        actual_Track = MAX(CASE WHEN trk.tType = 'A' THEN sc.Item ELSE '0' END),
        pieces_track = MAX(CASE WHEN trk.tType = 'P' THEN sc.Item ELSE '0' END)
    FROM
        #TABLEA t
        CROSS APPLY ( VALUES ('A', t.actual_Track), ('P', t.pieces_track) ) trk (tType, tValue)
        CROSS APPLY dbo.SplitCSVToTable8K(trk.tValue, ';') sc    
    GROUP BY
        t.PK_ID,
        sc.ItemNumber
    ORDER BY
        t.PK_ID,
        sc.ItemNumber;

    New results...
    PK_ID   Job_Prod   Job_Id     Actual_Start    actual_Track pieces_track
    ----------- -------------- ----------------  ----------------------- ------------- ------------
    1    VirginiaA   Virginia    2017-09-24 08:00:00.000 5     1
    1    VirginiaA   Virginia    2017-09-25 08:00:00.000 4     2
    1    VirginiaA   Virginia    2017-09-26 08:00:00.000 3     3
    1    VirginiaA   Virginia    2017-09-27 08:00:00.000 2     4
    1    VirginiaA   Virginia    2017-09-28 08:00:00.000 1     5
    1    VirginiaA   Virginia    2017-09-29 08:00:00.000 0     6
    2    VirginiaB   Virginia    2017-09-25 08:00:00.000 1     5
    2    VirginiaB   Virginia    2017-09-26 08:00:00.000 2     4
    2    VirginiaB   Virginia    2017-09-27 08:00:00.000 3     3
    2    VirginiaB   Virginia    2017-09-28 08:00:00.000 4     2
    2    VirginiaB   Virginia    2017-09-29 08:00:00.000 5     1
    2    VirginiaB   Virginia    2017-09-30 08:00:00.000 6     0
    3    CaliforniaA  California   2017-09-24 08:00:00.000 0.1    4
    3    CaliforniaA  California   2017-09-25 08:00:00.000 0.2    3
    3    CaliforniaA  California   2017-09-26 08:00:00.000 0.3    2
    3    CaliforniaA  California   2017-09-27 08:00:00.000 0.4    1
    4    CaliforniaB  California   2017-09-25 08:00:00.000 1.4    1
    4    CaliforniaB  California   2017-09-26 08:00:00.000 2.3    2
    4    CaliforniaB  California   2017-09-27 08:00:00.000 3.2    3
    4    CaliforniaB  California   2017-09-28 08:00:00.000 4.1    4

  • Jason,
    OMG! 300K+ records output in 6 seconds versus what I was doing which took between 1 and 2 minutes.
    I've checked a few records so far and they match up to outputting and transposing in Excel.

    One oddity though, I ran it against a small sample first, similar to the data above, but got confusing results:

    USE TEMPDB
    GO

    IF OBJECT_ID('dbo.TABLEA','U') IS NOT NULL DROP TABLE dbo.TABLEA;

    CREATE TABLE TABLEA (
        [Job_Prod]        varchar(100),
        [Job_No]        varchar(16),
        Act_Start        datetime,
        actual_Track    varchar(1000),
        pieces_track    varchar(1000)
    );

    IF OBJECT_ID('TempDb..#TABLEA','U') IS NOT NULL DROP TABLE #TABLEA;

    CREATE TABLE #TABLEA (
        PK_ID            INT NOT NULL IDENTITY(1,1),
        [Job_Prod]        varchar(100),
        [Job_No]        varchar(16),
        Act_Start        datetime,
        actual_Track    varchar(1000),
        pieces_track    varchar(1000)
    );

    INSERT INTO TABLEA VALUES ('Virginia', 'VirginiaA', '2017-09-24 08:00:00.000','5;4;3;2;1','1;2;3;4;5')
    INSERT INTO TABLEA VALUES ('Virginia', 'VirginiaB', '2017-09-25 08:00:00.000', '1;2;3;4;5','5;4;3;2;1')
    INSERT INTO TABLEA VALUES ('California', 'CaliforniaA', '2017-09-24 08:00:00.000', '.1;.2;.3;.4','.4;.3;.2;.1')
    INSERT INTO TABLEA VALUES ('California', 'CaliforniaB', '2017-09-25 08:00:00.000', '.4;.3;.2;.1','.1;.2;.3;.4')

    INSERT INTO #TABLEA SELECT     [Job_No], [Job_Prod], Act_Start, actual_Track, pieces_track
    FROM TABLEA

    --SELECT A.[Job_No], A.Job_Prod,
    --  SplitA.a.value('.', 'VARCHAR(100)') AS StringA,
        -- SplitB.b.value('.', 'VARCHAR(100)') AS StringB
    --FROM (SELECT [Job_No], Job_Prod,
    --   CAST ('<M>' + REPLACE([actual_Track], ';', '</M><M>') + '</M>' AS XML) AS StringA,
    --   CAST ('<M>' + REPLACE([pieces_Track], ';', '</M><M>') + '</M>' AS XML) AS StringB
    --  FROM TableA) AS A CROSS APPLY StringA.nodes ('/M') AS SplitA(a)
        -- CROSS APPLY StringB.nodes ('/M') AS SplitB(b);

    --IF OBJECT_ID(N'#temp1') IS NOT NULL
    --  DROP TABLE #temp1
    --GO
    --SELECT A.[Job_No],
    --  Split.a.value('.','VARCHAR(100)') as String1,
    --  A.pieces_track
    --INTO #temp1
    --FROM (SELECT [state], CAST('<M>' + REPLACE([City],',','</M><M>')+'</M>' AS XML) AS String
    --  FROM TableA) AS A CROSS APPLY String.nodes('/M') AS Split(a);
    --SELECT A.[State],
    --  A.String1,
    --  Split.a.value('.','VARCHAR(100)') as String2
    --FROM (SELECT [state],[String1],CAST('<M>' + REPLCAE([City],',','</M><M>')+'</M>' AS XML) AS String
    --  FROM #temp1) AS A CROSS APPLY String.nodes('/M') AS Split(a);

    SELECT
      t.PK_ID,
      Job_Prod = MAX(t.Job_Prod),
      Job_No = MAX(t.Job_No),
      Act_Start = DATEADD(DAY, ROW_NUMBER() OVER (PARTITION BY t.PK_ID ORDER BY t.PK_ID) -1, MAX(t.Act_Start)),
      actual_Track = MAX(CASE WHEN trk.tType = 'A' THEN sc.Item ELSE '0' END),
      pieces_track = MAX(CASE WHEN trk.tType = 'P' THEN sc.Item ELSE '0' END)
    FROM
      #TABLEA t
      CROSS APPLY ( VALUES ('A', t.actual_Track), ('P', t.pieces_track) ) trk (tType, tValue)
      CROSS APPLY dbo.DelimitedSplit8K(trk.tValue, ';') sc 
    GROUP BY
      t.PK_ID,
      sc.ItemNumber
    ORDER BY
      t.PK_ID,
      sc.ItemNumber;
     
    PK_ID    Job_Prod    Job_No    Act_Start    actual_Track    pieces_track
    1    VirginiaA    Virginia    2017-09-24 08:00:00.000    5    1
    1    VirginiaA    Virginia    2017-09-25 08:00:00.000    4    2
    1    VirginiaA    Virginia    2017-09-26 08:00:00.000    3    3
    1    VirginiaA    Virginia    2017-09-27 08:00:00.000    2    4
    1    VirginiaA    Virginia    2017-09-28 08:00:00.000    1    5
    2    VirginiaB    Virginia    2017-09-25 08:00:00.000    1    5
    2    VirginiaB    Virginia    2017-09-26 08:00:00.000    2    4
    2    VirginiaB    Virginia    2017-09-27 08:00:00.000    3    3
    2    VirginiaB    Virginia    2017-09-28 08:00:00.000    4    2
    2    VirginiaB    Virginia    2017-09-29 08:00:00.000    5    1
    3    CaliforniaA    California    2017-09-24 08:00:00.000    0    0
    3    CaliforniaA    California    2017-09-25 08:00:00.000    0    0
    3    CaliforniaA    California    2017-09-26 08:00:00.000    0    0
    3    CaliforniaA    California    2017-09-27 08:00:00.000    0    0
    4    CaliforniaB    California    2017-09-25 08:00:00.000    0    0
    4    CaliforniaB    California    2017-09-26 08:00:00.000    0    0
    4    CaliforniaB    California    2017-09-27 08:00:00.000    0    0
    4    CaliforniaB    California    2017-09-28 08:00:00.000    0    0

  • Viewing 15 posts - 1 through 15 (of 46 total)

    You must be logged in to reply to this topic. Login to reply