Efficient way to split a string

  • waxb18

    Ten Centuries

    Points: 1060

    Hi All,

    I have a table that is approx 65k rows long. WIthin the table is a varchar(max) column which contains a string as per below

     

    [ { "Code": 123, "Label": "Animal", "Value": "Cow", "RecordYear": "2017-03-09T11:19:58.94" }, { "Code": 456, "Label": "Type", "Value": "Mammal", "RecordYear": "2017-03-09T11:20:26.697" }, { "Code": 789, "Label": "GoodFor", "Value": "Milk","RecordYear": "2017-03-09T11:20:44.65" }, { "Code": 101, "Label": "Annoys", "Value": "Vegans", "RecordYear": "2017-03-09T11:23:40.54" }]

    however can be a lot longer,

    I have a solution that uses the 8k split string function below but it is extremely slow.

    Can anyone help me out.

    SELECT   ID
    ,Case WHen Item like ',%' then

    Substring((LTRIM(RTRIM(REplace(Replace(ITEM, '{', ''), '}', '')))) ,4,len(LTRIM(RTRIM(REplace(Replace(ITEM, '{', ''), '}', '')))))

    Else
    (LTRIM(RTRIM(REplace(Replace(ITEM, '{', ''), '}', ''))))
    END AS ValueItem
    ,ItemNumber

    INTO #B

    FROM #A
    CROSS APPLY dbo.[DelimitedSplit8K](Replace(SUBSTRING(Form, 3, len(form) - 3), '"', ''), '}') B

    --====================================================================================================================================================

    Select

    ID ,
    Replace(SUBSTRING(ValueItem, CHARINDEX('Code: ',ValueItem),CHARINDEX(', Label: ',ValueItem)-1),'Code: ','') as Code,
    Replace(SUBSTRING(ValueItem, CHARINDEX('Label: ',ValueItem),CHARINDEX('Value: ',ValueItem)-(CHARINDEX('Label: ',ValueItem)+2)),'Label: ','') as Label,
    Replace(Case
    When ValueItem like '%RecordYear%'
    then SUBSTRING(ValueItem, CHARINDEX('Value: ',ValueItem), CHARINDEX('RecordYear: ',ValueItem)-(CHARINDEX('Value: ',ValueItem)+2))
    Else SUBSTRING(ValueItem, CHARINDEX('Value: ',ValueItem),Len(ValueItem))
    END ,'Value: ','') as Value,
    Case
    When ValueItem like '%RecordYear%'
    then REPLACE(SUBSTRING(ValueItem, CHARINDEX('RecordYear: ',ValueItem), Len(ValueItem)) ,'RecordYear: ','')
    Else NULL
    END as RecordYear


    FROM #B
  • Michael L John

    One Orange Chip

    Points: 25796

    Based upon the example of the data you provided, what should that look like when it's finished?

    Michael L John
    If you assassinate a DBA, would you pull a trigger?
    To properly post on a forum:
    http://www.sqlservercentral.com/articles/61537/

  • Jeff Moden

    SSC Guru

    Points: 995109

    I might be missing something but that looks an awful lot like JSON.  Is it?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
    "If you think its expensive to hire a professional to do the job, wait until you hire an amateur."--Red Adair
    "Change is inevitable... change for the better is not."
    When you put the right degree of spin on it, the number 3|8 is also a glyph that describes the nature of a DBAs job. 😉

    Helpful Links:
    How to post code problems
    Create a Tally Function (fnTally)

  • ScottPletcher

    SSC Guru

    Points: 98206

    Limited sample data, but I think this should perform better, since it avoids all the CHARINDEX costs:

    SELECT   A.ID, B.ItemNumber,
    MAX(CASE WHEN D.Item LIKE 'Code:%' THEN LTRIM(SUBSTRING(D.Item, 6, 200)) END) AS Code,
    MAX(CASE WHEN D.Item LIKE 'Label:%' THEN LTRIM(SUBSTRING(D.Item, 7, 200)) END) AS Label,
    MAX(CASE WHEN D.Item LIKE 'Value:%' THEN LTRIM(SUBSTRING(D.Item, 7, 200)) END) AS Value,
    MAX(CASE WHEN D.Item LIKE 'RecordYear:%' THEN LTRIM(SUBSTRING(D.Item, 12, 200)) END) AS RecordYear
    --*/
    FROM #A A /*(
    SELECT 1 AS ID, '[{"Code": 123, "Label": "Animal", "Value": "Cow", "RecordYear": "2017-03-09T11:19:58.94" }, { "Code": 456, "Label": "Type", "Value": "Mammal", "RecordYear": "2017-03-09T11:20:26.697" }, { "Code": 789, "Label": "GoodFor", "Value": "Milk","RecordYear": "2017-03-09T11:20:44.65" }, { "Code": 101, "Label": "Annoys", "Value": "Vegans", "RecordYear": "2017-03-09T11:23:40.54" }]' AS Form
    ) AS A--*/
    CROSS APPLY dbo.[DelimitedSplit8K](Replace(SUBSTRING(Form, 3, len(form) - 3), '"', ''), '}') B
    CROSS APPLY dbo.[DelimitedSplit8K](Ltrim(Replace(Replace(Item, '{', ''), '}', '')), ',') C
    CROSS APPLY ( SELECT LTRIM(RTRIM(C.Item)) AS Item ) AS D
    WHERE D.Item > ''
    GROUP BY A.ID, B.ItemNumber

    SQL DBA,SQL Server MVP(07, 08, 09) Prosecutor James Blackburn, in closing argument in the Fatal Vision murders trial: If in the future, you should cry a tear, cry one for them [the murder victims]. If in the future, you should say a prayer, say one for them. And if in the future, you should light a candle, light one for them.

  • drew.allen

    SSC Guru

    Points: 76658

    Jeff Moden wrote:

    I might be missing something but that looks an awful lot like JSON.  Is it?

    I'm with Jeff.  This looks like a JSON document.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • ScottPletcher

    SSC Guru

    Points: 98206

    It seems like it, but SQL 2008 doesn't directly read JSON, afaik.  It might be easier to handle it with T-SQL rather than try to create some external function/process.

    SQL DBA,SQL Server MVP(07, 08, 09) Prosecutor James Blackburn, in closing argument in the Fatal Vision murders trial: If in the future, you should cry a tear, cry one for them [the murder victims]. If in the future, you should say a prayer, say one for them. And if in the future, you should light a candle, light one for them.

  • Jeff Moden

    SSC Guru

    Points: 995109

    ScottPletcher wrote:

    It seems like it, but SQL 2008 doesn't directly read JSON, afaik.  It might be easier to handle it with T-SQL rather than try to create some external function/process.

    I have to admit that I didn't look at which forum we are in on this thread.  You're correct.  JSON functionality isn't available in 2008.  The code you wrote using DelimitedSplit8K will do the trick.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
    "If you think its expensive to hire a professional to do the job, wait until you hire an amateur."--Red Adair
    "Change is inevitable... change for the better is not."
    When you put the right degree of spin on it, the number 3|8 is also a glyph that describes the nature of a DBAs job. 😉

    Helpful Links:
    How to post code problems
    Create a Tally Function (fnTally)

Viewing 7 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply