Split out data from single field that is encoded with new lines

  • I have a table with one field that merges all notes entered from an application into a single varchar(MAX) field.  By dragging this field into a Crystal Report I can deduce that it's using the new line char(10.  What I'd like to do is use this new line character as a delimiter then put them into separate rows of a view.  I'm assuming it's char(10) but could be combo of 10 and 13 but not sure.  

    Table - Notes is the varchar(MAX) containing the char(10) separated values.
    Original - using commas just to show column separation
    Co,Project,Phase,Notes
    1,1234,X,10/1/2017 This is my note  10/15/2017 This is my second note 
    1,1234,Y,10/1/2017 This is another note  10/15/2017 This is another second note 

    What I need is

    Co,Project,Phase,Notes
    1,1234,X,10/1/2017 This is my note
    1,1234,X,10/15/2017 This is my second note 
    1,1234,Y,10/1/2017 This is another note
    1,1234,Y,10/15/2017 This is another second note 

  • First, above all else... People who create (n)varchar(max) "Notes" columns need a high-five. In the face. With a chair. But I digress. 

    I don't have a full solution yet but this should be pretty simple. Moving forward, I suggest including easily consumable sample data/DDL. I post what I started with in case anyone wants to run with this. 

    DECLARE @originalTable table (co int, project int, phase char(1), notes varchar(max));
    INSERT @originalTable (Co,Project,Phase,Notes) values
    (1,1234,'X','10/1/2017 This is my note 10/15/2017 This is my second note'),
    (1,1234,'Y','10/1/2017 This is another note 10/15/2017 This is another second note');

    The rub here is that you have free-style text dates which complicates things. Here's my partial solution which should get you started. I'll look at this again a little later tonight. 

    SELECT co, project, phase, notes = substring(notes, 1, v2.x+6)
    FROM @originalTable
    CROSS APPLY (VALUES (patindex('%[0-9][0-9]/[0-9]/2[0-9][0-9][0-9]%', notes))) v(x)
    CROSS APPLY (VALUES (patindex('%[0-9][0-9]/[0-9][0-9]/2[0-9][0-9][0-9]%', substring(notes, v.x+8, 8000)))) v2(x);

    Cheers

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • So let's deal with the elephant in the room.  You are storing different notes generated at different time of the date to a single column in a single row, and THEN you want to parse them?

    Why not just store them the way you want them the first time out?  It would actually more efficient than continually writing and rewriting and re-rewriting an ever-growing varchar(max) column.

    Alan's solution should get you started, but still - you seem to be pursuing what feels like a strange design choice.  It would help to know why.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Matt and Alan, I totally get it, however, the application/code was written by a third party software vendor and I cannot modify unfortunately. The only reason I need it parsed out is for one report, which will only be used on a monthly basis.  I'm stuck with what I got.

  • Here is the actual data for a few rows:

    101 16032.    01910.      NULL
    101 16032.  01910.   01 NULL
    101 16032.    01955.      NULL
    101 16032.    02200.      10/17/2017 11:36:37 AM by User2  This is a note I entered on Phase 02200 for batch 11 8/17 date 8/15/2017    10/17/2017 11:45:11 AM by User2  This is a note I entered on PHase 02200 for batch 6 9/17 date 9/16/2017
    101 16011.    00500.      10/13/2017 4:28:44 PM by User3  TEST NOTE
    101 16011.    00550.      NULL
    101 16011.    02400.      NULL
    101 16011.    02450.      7/11/2017 12:58:09 PM by User2  Estimating that the Foundation will be over budget due to blah blah
    101 16011.    02500.      NULL

    There is a new line after the first note which ends at 8/15/2017 (the 8/15/17 was inputted by me in the note field to see how it appeared).  The field is updated with a date/time/user then the note.  What I would love to be able to do is split out the Date, Time, UserID, Note into 4 separate columns, which is should have been done in the first place, but I digress.

  • Inneresting discovery.  I replaced a CHAR(13) with ^ and CHAR(10) with ~ and get this for one field.  So if I replace them first, my delimiter could be "^~^~" to deduce when an actual new record is.  Once I've separated them I can then work on stripping out the date, time, User ID and note.  Yikes.

    10/17/2017 11:36:37 AM by User2^~This is a note I entered on Phase 02200 for batch 11 8/17 date 8/15/2017^~^~10/17/2017 11:45:11 AM by User2^~This is a note I entered on Phase 02200 for batch 6 9/17 date 9/16/2017

  • I'm getting there with something like this.  I'm getting "Msg 9448, Level 16, State 1, Line 12
    XML parsing: line 2, character 89, well formed check: undeclared entity"
    in the T1 select.  I don't know XML but still searching.

     SELECT T1.JCCo, T1.Job, T1.Phase,T2.my_Splits AS ProjectNotes
    FROM
     (
      SELECT *,
      CAST('<X>'+replace(REPLACE(T.ProjNotes,CHAR(10),'~'),'~ ~','</X><X>')+'</X>' as XML) as my_Xml
      FROM JCCH T
      where T.ProjNotes is not null
     ) T1
     CROSS APPLY
     (
     SELECT my_Data.D.value('.','varchar(100)') as my_Splits
     FROM T1.my_Xml.nodes('X') as my_Data(D)
     ) T2

  • Tried this as well:

     SELECT JCCo,Job,Phase,
    LTRIM(RTRIM(m.n.value('.[1]','varchar(8000)')))
    FROM
    (
    SELECT JCCo,Job,Phase,CAST('<XMLRoot><RowData>' + REPLACE(REPLACE(ProjNotes,CHAR(10),'~'),'~ ~','</RowData><RowData>') + '</RowData></XMLRoot>' AS XML) AS x
    FROM   JCCH
    )t
    CROSS APPLY x.nodes('/XMLRoot/RowData')m(n)

    Msg 9448, Level 16, State 1, Line 24
    XML parsing: line 2, character 89, well formed check: undeclared entity

  • I'm so close it's killing me.  Went the function route and found this:
    CREATE FUNCTION dbo.SplitStrings
    (
        @List       NVARCHAR(MAX),
        @Delimiter  NVARCHAR(255)
    )
    RETURNS TABLE
    AS
        RETURN (SELECT Number = ROW_NUMBER() OVER (ORDER BY Number),
            Item FROM (SELECT Number, Item = LTRIM(RTRIM(SUBSTRING(@List, Number,
            CHARINDEX(@Delimiter, @List + @Delimiter, Number) - Number)))
        FROM (SELECT ROW_NUMBER() OVER (ORDER BY s1.[object_id])
            FROM sys.all_objects AS s1 CROSS APPLY sys.all_objects) AS n(Number)
        WHERE Number <= CONVERT(INT, LEN(@List))
            AND SUBSTRING(@Delimiter + @List, Number, 1) = @Delimiter
        ) AS y);
    GO

    Now I can actually get the date out HOWEVER when there is only 1 note, there is no lines/carriage returns after the note so function won't split the ones with a single note.  Getting there.

  • When you call your function, append the new line character to the end of the list parameter.  That way there will always be a delimiter for the function to find.

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply