T-SQL parsing query

  • Hi Guys,

    I need help with the SQL query. I need help with parsing one field to 4 different fields. Here is my sample data

    DECLARE @table_parsing TABLE (
    idint IDENTITY(1,1),
    Field1varchar(4000)
    );

    INSERT INTO @table_parsing
    (
    --id - column value is auto-generated
    Field1
    )
    SELECT 'test_stp_commer_ca_mm updated. 37345 records added. 0 records updated. 0 records deleted.'
    UNION ALL
    SELECT 'test_stp_commer_ca_da updated. 679 records added. 0 records updated. 0 records deleted.'
    UNION ALL
    SELECT 'test_stp_commer_ca updated. 768 records added. 0 records updated. 0 records deleted.'
    UNION ALL
    SELECT 'test_stp_commer_apps updated. 38263 records added. 0 records updated. 0 records deleted.'


    SELECT * FROM @table_parsing tp

    Here is the result that I want.

    Parse1|Parse2 |Parse3 |Parse4

    test_stp_commer_ca_mm updated| 37345 records added |0 records updated |0 records deleted.

    test_stp_commer_ca_da updated.| 679 records added |0 records updated |0 records deleted.

    test_stp_commer_ca updated. |768 records added |0 records updated |0 records deleted.

    test_stp_commer_apps updated. |38263 records added |0 records updated |0 records deleted.

    Any help would be much appreciated.

    Tnx.

  • I had thought to use PARSENAME here, however, they all return NULL. You can achieve this with a few CHARINDEXs though:

    SELECT tp.id,
    LEFT(tp.Field1,V1.CI),
    SUBSTRING(tp.Field1,V1.CI+1,V2.CI - V1.CI),
    SUBSTRING(tp.Field1,V2.CI+1,V3.CI - V2.CI),
    RIGHT(tp.Field1,LEN(tp.Field1) - V3.CI)
    FROM @table_parsing tp
    CROSS APPLY (VALUES(CHARINDEX('.',tp.Field1)))V1(CI)
    CROSS APPLY (VALUES(CHARINDEX('.',tp.Field1,V1.CI+1)))V2(CI)
    CROSS APPLY (VALUES(CHARINDEX('.',tp.Field1,V2.CI+1)))V3(CI);

    Thom~

    Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
    Larnu.uk

  • https://www.sqlservercentral.com/articles/tally-oh-an-improved-sql-8k-%e2%80%9ccsv-splitter%e2%80%9d-function

     

    might seem like overkill for the question you posed, but trust me, you'll use it many times, it's worth borrowing from Jeff et al.

    -------------------------------------------------------------------------------------------------------------------------------------
    Please follow Best Practices For Posting On Forums to receive quicker and higher quality responses

  • Thank you, Thom,

    That is precisely what I need as per my sample data.

    Now I have one more further question. As you know, the data will not always clean or the way we wanted. If we don't have "." in the data, I believe the Left/Right or Substring function will error out. Any advice to make sure no error will not occur?

    I appreciate your advice/help!

     

  • Add in WHERE tp.Field1 LIKE '%.%' and you won't have any errors. Won't parse anything either, but that wouldn't happen anyway.

    -------------------------------------------------------------------------------------------------------------------------------------
    Please follow Best Practices For Posting On Forums to receive quicker and higher quality responses

  • That was my first approach to use in the where clause. However, I am still getting the error when I ran against my real data, and the error is "Invalid Length parameter passed to the LEFT or Substring function."

     

  • You're going to have to find the data. If you're running this on a regular basis to move stuff in, check this out:

    https://www.sqlservercentral.com/articles/conditional-set-based-processing-moving-towards-a-best-practice-for-etl

    uses Try/Catch blocks, you can attempt everything and if it fails, rollback the transaction and process in smaller sections, perhaps line by line. Trap when errors occur and throw them into an output destination so you can review and figure out what went wrong. At least you get loaded what works, and move forward.

    Somebody has to work that fallout though, if you just ignore it you'll never know what your data is supposed to look like.

    -------------------------------------------------------------------------------------------------------------------------------------
    Please follow Best Practices For Posting On Forums to receive quicker and higher quality responses

  • rocky_498 wrote:

    Thank you, Thom,

    That is precisely what I need as per my sample data.

    Now I have one more further question. As you know, the data will not always clean or the way we wanted. If we don't have "." in the data, I believe the Left/Right or Substring function will error out. Any advice to make sure no error will not occur?

    I appreciate your advice/help!

    I suspect a few NULLIFs will work. As all your sample data was clean, however, you have "unclean" data then showing us samples of that and what you expect helps us and we can cater for additional scenarios.

    Thom~

    Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
    Larnu.uk

  • You can avoid errors where there are missing periods by making sure you always have 4 periods in the field.  Here is one approach:

     Select *
    , Parse1 = substring(v.Field1, 1, p1.pos - 1)
    , Parse2 = substring(v.Field1, p1.pos + 1, p2.pos - p1.pos - 1)
    , Parse3 = substring(v.Field1, p2.pos + 1, p3.pos - p2.pos - 1)
    , Parse4 = substring(v.Field1, p3.pos + 1, p4.pos - p3.pos - 1)
    From @table_parsing tp
    Cross Apply (Values (concat(tp.Field1, '....'))) v(Field1)
    Cross Apply (Values (charindex('.', v.Field1, 1))) p1(pos)
    Cross Apply (Values (charindex('.', v.Field1, p1.pos + 1))) p2(pos)
    Cross Apply (Values (charindex('.', v.Field1, p2.pos + 1))) p3(pos)
    Cross Apply (Values (charindex('.', v.Field1, p3.pos + 1))) p4(pos);

    This also standardizes the 'last' column calculation since it is now the same as all of the other calculations.  This also removes the periods from the results - which I assume is the expected results.  If you want to include the periods in the results then you can adjust the substring - but you will also need to include a cleanup of those extra fields where there is no data as they would return just a period.

     

     

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply