Help with XML Splitter De-entitization, please.

  • Matt Miller (#4) (1/3/2017)


    Hi Jeff -

    I went into hibernation for a bit, only to come out and find you dabbling in XML..... Small wonders never cease I guess 🙂

    I guess without spilling the beans too much - what was the purpose for trying to use the split string functions against the raw XML? I am just curious if were was a better way to get at what you're looking for.

    Are you trying to infer the structure perhaps? I might have some tidbits to do so if that's the case.

    It's for a small series of articles I'm working on called "The Devils in the Data". It's about some caveats that a lot of people don't expect for accuracy and performance testing. I'm using splitters to demonstrate because, as my good friend Ed Wagner would say, "It's a target rich environment". 😀

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (1/3/2017)


    It's for a small series of articles I'm working on called "The Devils in the Data". It's about some caveats that a lot of people don't expect for accuracy and performance testing. I'm using splitters to demonstrate because, as my good friend Ed Wagner would say, "It's a target rich environment". 😀

    Ahh - Gotcha. I could see a few places in there, so yup!

    Will keep an eye out for it 😀

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Just for fun, here is a JSON based splitter, nasty fast but not fully tested, SQL Server 2016 and later.

    😎

    CREATE FUNCTION dbo.ITVFN_SPLIT_STRING_JSON

    (

    @INPUT_STRING NVARCHAR(MAX)

    ,@DELIMITER NVARCHAR(250)

    )

    RETURNS TABLE

    WITH SCHEMABINDING

    AS

    RETURN

    (

    SELECT

    Y AS Item

    from OPENJSON

    (

    N'[{"Y":"'

    + REPLACE(

    REPLACE(

    REPLACE(

    REPLACE(

    @INPUT_STRING,N'\',N'\\')

    ,N'/',N'\/')

    ,N'"',N'\"')

    ,@DELIMITER,N'"},{"Y":"') + N'"}]'

    )

    WITH (Y NVARCHAR(MAX) '$.Y')

    );

  • Mikael Eriksson SE - Tuesday, January 3, 2017 12:25 AM

    That's an interesting techinque, i've never actually seen that processing-instruction command.
    I use this approach by throwing it throw a dummy lag function (SQL-2012+) that seems to work as well and remains crazy fast.


    create function dbo.xml_split_with_lag(@input varchar(max),@d char(1))
    returns table
    with schemabinding as
    return select item = x.i.value('./text()[1]','varchar(8000)')
    from(select _= lag(convert(xml,'<r>'+replace((select @input for xml path('')), @d,'</r><r>')+'</r>'),0) over (order by (select 1))) a
    cross apply _.nodes('./r') x(i);

  • Matt Miller (4) - Tuesday, January 3, 2017 7:45 PM

    Jeff Moden (1/3/2017)


    It's for a small series of articles I'm working on called "The Devils in the Data". It's about some caveats that a lot of people don't expect for accuracy and performance testing. I'm using splitters to demonstrate because, as my good friend Ed Wagner would say, "It's a target rich environment". 😀

    Ahh - Gotcha. I could see a few places in there, so yup!Will keep an eye out for it 😀

    Heh... after 8 months, I've obviously gotten behind on that endeavor.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 5 posts - 16 through 19 (of 19 total)

You must be logged in to reply to this topic. Login to reply