Splitting a delimited field

  • Hi all

    I'm trying to use Jeff's Delimited8K function to split a string into several rows.

    The string is in this format:-

    Drug1|Drug2|Drug3..... etc, etc

    There are other pieces of information I need to add in as well (they will be the unique patient identifiers, etc.) but I can't get it to work.

    I've tried calling the function as part of the select but it's giving me an error message:-

    Msg 512, Level 16, State 1, Line 1

    Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, <= , >, >= or when the subquery is used as an expression.

    This is my query so far :-

    select top 1000

    aq1.SourceID

    ,aq1.VisitID

    ,(select item from [DelimitedSplit8K] (MedicalInduction,'|'))

    from

    [MeditechDR01-M1].[livefocdb_daily].[dbo].[RegAcctQuery_Result] aq1

    outer apply (select

    max(case when aq1.[Query_MisQryID]='MAT.MEDIND' then

    replace(replace([ValueInfo],'}',''),'{','') end) as MedicalInduction

    from

    [MeditechDR01-M1].[livefocdb_daily].[dbo].[RegAcctQuery_Result] aq1

    where

    aq1.[Query_MisQryID] in ('MAT.MEDIND')

    group by

    aq1.SourceID

    ,aq1.VisitID

    )q2

    where

    aq1.[Query_MisQryID] in ('MAT.MEDIND')

    Any help on this would be greatly appreciated.

    ::edit::

    The above code is part of a bigger query which is why I've used an OUTER APPLY

  • Treat it as any other tablesource, fashionably APPLY.

    Are you really running this query against a linked server?

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Hi Chris

    Thanks for that.

    So I'd just just APPLY/LEFT JOIN as normal?

    As for the linked server, I have to for now.

    We're upgrading some servers to SQL2012 and then going to use replication from the main source to several other servers so can get rid of the link references.

  • richardmgreen1 (6/30/2016)


    Hi Chris

    Thanks for that.

    So I'd just just APPLY/LEFT JOIN as normal?

    As for the linked server, I have to for now.

    We're upgrading some servers to SQL2012 and then going to use replication from the main source to several other servers so can get rid of the link references.

    CROSS APPLY / INNER JOIN, OUTER APPLY / LEFT JOIN usually.

    If your entire query references tables exclusively from the remote server, then it's best executed there. You can do this in a variety of ways, maybe best explored when your query is returning expected results.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Thanks Chris

    I've now got it running as expected.

  • richardmgreen1 (6/30/2016)


    Thanks Chris

    I've now got it running as expected.

    Cool, well done.

    There are issues with the fragment of query you posted - do you want help with resolving them?

    The first is the use of the same table alias for a table referenced inside and outside an APPLY block. You should use different aliases, e.g. if the outer reference is "aq1" then use something like "aq2".

    The second is that the query in the APPLY block isn't correlated with the outer query so it will be executed for every row in the outer query. Rather than using GROUP BY, just correlate it to the outer query on SourceID and VisitID (which is where the different table aliases come in). You could also use MAX(OVER) to calculate MedicalInduction.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Hi Chris

    That was just a code snippet, here's the complete query:-

    set dateformat dmy

    declare

    @start datetime

    ,@end datetime

    set @start=cast(DATEADD(month,-2,getdate())-DATEPART(d,getdate())+1 as date)

    set @end=cast(DATEADD(month,-1,getdate())-DATEPART(d,getdate()) as date);

    INSERT INTO [MDS_Maternity].[dbo].[MAT401Induction] (

    [LocalPatientIdMother]

    ,[LabourOnsetDateTime]

    ,[LabourInductionMethod]

    ,[OxytocinAdministeredDateTime]

    ,[ReportingMonth]

    )

    SELECT distinct

    LocalPatientIdMother = ad.UnitNumber

    ,[LabourOnsetDateTime] = cast(

    dateadd(minute,cast(right(q2.TimeofOnsetofLabour,2) as int),

    dateadd(hour,cast(left(q2.TimeofOnsetofLabour,2) as int),

    cast(right(q2.DateofOnsetofLabour,2) + '-' +

    substring(q2.DateofOnsetofLabour,5,2) + '-' + left(q2.DateofOnsetofLabour,4) as datetime)

    )

    ) as datetime)

    ,[LabourInductionMethod] = case

    when q3.MedicalInduction like '%Mifepristone%' then '01'

    when q3.MedicalInduction like '%Misoprostol%' then '02'

    when q3.MedicalInduction like '%Prostaglandin%' then '03'

    when replace(q2.OxytocinUsed,'|','')='Y' then '04'

    when q3.MedicalInduction is null and replace(q2.OxytocinUsed,'|','')<>'Y' then '05'

    end

    ,[OxytocinAdministeredDateTime] = cast(

    dateadd(minute,cast(right(q2.TimeOxytocinUsed,2) as int),

    dateadd(hour,cast(left(q2.TimeOxytocinUsed,2) as int),

    cast(right(q2.DateOxytocinUsed,2) + '-' +

    substring(q2.DateOxytocinUsed,5,2) + '-' + left(q2.DateOxytocinUsed,4) as datetime)

    )

    ) as datetime)

    ,ReportingMonth = @start

    FROM

    [MeditechDR01-M1].livedb_daily.dbo.AbstractData ad

    inner join [MeditechDR01-M1].livedb_daily.dbo.AbsQueries aq

    on ad.SourceID=ad.SourceID

    and aq.AbstractID=ad.AbstractID

    and aq.QueryID='GEN.PREG'

    outer apply (select

    max(case when aq1.[Query_MisQryID]='MAT.ONSETDA' then

    replace(replace([ValueInfo],'}',''),'{','') end) as DateofOnsetofLabour

    ,max(case when aq1.[Query_MisQryID]='MAT.ONSETT' then

    replace(replace([ValueInfo],'}',''),'{','') end) as TimeofOnsetofLabour

    ,max(case when aq1.[Query_MisQryID]='MAT.OXYT' then

    replace(replace([ValueInfo],'}',''),'{','') end) as OxytocinUsed

    ,max(case when aq1.[Query_MisQryID]='MAT.OXYTD' then

    replace(replace([ValueInfo],'}',''),'{','') end) as DateOxytocinUsed

    ,max(case when aq1.[Query_MisQryID]='MAT.OXYTT' then

    replace(replace([ValueInfo],'}',''),'{','') end) as TimeOxytocinUsed

    from

    [MeditechDR01-M1].[livefocdb_daily].[dbo].[RegAcctQuery_Result] aq1

    where

    aq1.[Query_MisQryID] in ('MAT.ONSETDA','MAT.ONSETT','MAT.OXYT','MAT.OXYTD','MAT.OXYTT')

    and aq1.SourceID=ad.SourceID

    and aq1.VisitID=ad.VisitID

    group by

    aq1.SourceID

    ,aq1.VisitID

    )q2

    outer apply (select MedicalInduction = Item from DelimitedSplit8K ((select

    max(replace(replace([ValueInfo],'}',''),'{',''))

    from

    [MeditechDR01-M1].[livefocdb_daily].[dbo].[RegAcctQuery_Result] aq1

    where

    aq1.[Query_MisQryID] in ('MAT.MEDIND')

    and aq1.SourceID=ad.SourceID

    and aq1.VisitID=ad.VisitID

    group by

    aq1.SourceID

    ,aq1.VisitID),'|')

    )q3

    where

    cast(cast(

    dateadd(minute,cast(right(q2.TimeofOnsetofLabour,2) as int),

    dateadd(hour,cast(left(q2.TimeofOnsetofLabour,2) as int),

    cast(right(q2.DateofOnsetofLabour,2) + '-' +

    substring(q2.DateofOnsetofLabour,5,2) + '-' + left(q2.DateofOnsetofLabour,4) as datetime)

    )

    ) as datetime) as date) between @start and @end

    and (ad.Name NOT LIKE 'XX%TEST%'

    or ad.Name NOT LIKE 'YY%TEST%'

    OR ad.Name IS NULL)

    I don't think the same issues apply here, but I'd be grateful if you'd give it a once-over.

    I've got quite a few queries written like this for different pieces of info so I'll take all the improvements I can get my hands on.

    ::edit::

    The dates and times come in as just numbers (and in different fields) which is why I've had to use that fairly complicated piece of code to get them to something usable.

  • Two changes worth trying. Firstly, introducing the splitter:

    OUTER APPLY (

    SELECT MedicalInduction = ds.Item

    FROM [MeditechDR01-M1].[livefocdb_daily].[dbo].[RegAcctQuery_Result] aq1

    CROSS APPLY DelimitedSplit8K (replace(replace(aq1.[ValueInfo],'}',''),'{',''),'|') ds

    WHERE aq1.[Query_MisQryID] IN ('MAT.MEDIND')

    AND aq1.SourceID = ad.SourceID

    AND aq1.VisitID = ad.VisitID

    ) q3

    Secondly, the complex processing around the date selection:

    cast(cast(

    dateadd(minute,cast(right(q2.TimeofOnsetofLabour,2) as int),

    dateadd(hour,cast(left(q2.TimeofOnsetofLabour,2) as int),

    cast(right(q2.DateofOnsetofLabour,2) + '-' +

    substring(q2.DateofOnsetofLabour,5,2) + '-' + left(q2.DateofOnsetofLabour,4) as datetime)

    )

    ) as datetime) as date) between @start and @end

    Can you change this so that instead of performing calculations on the column, you're performing calculations on the parameters? The way it's written, it can't use an index (if there is one) on q2.TimeofOnsetofLabour. Also, that calculation has to be performed on every qualifying row at that point in the plan - whereas if you squish the parameters to match the column, you only do it once.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Thanks Chris

    I'll have a look at the updated Splitter code and let you know the outcome.

    As for the second part, I'm storing what is effectively a number (e.g.20160630 for today) and a time(e.g.1359) into a datetime field.

    I'll admit to pure laziness on my part as I was using a different field in my WHERE clause and then swapped it for the <insert own phrase here> in the SELECT clause as I reaslised it was pulling the wrong dates.

    Once it was returning the correct data, I've added an INSERT so the data is permanently stored in a different table.

  • Hi Chris

    I've had a look at the updated Splitter code and it actually runs a few seconds slower than my original.

    Not entirely sure why as I would have thought it would be slightly quicker being an inner join.

    Thanks for all the help.

  • richardmgreen1 (6/30/2016)


    Hi Chris

    I've had a look at the updated Splitter code and it actually runs a few seconds slower than my original.

    Not entirely sure why as I would have thought it would be slightly quicker being an inner join.

    Thanks for all the help.

    Are the results the same?

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Hi Chris

    Absolutely identical on every row.

  • When you say it's slower, is it 4s vs 2s or 104s vs 102s?

    This is more important: what datatype are q2.DateofOnsetofLabour and q2.TimeofOnsetofLabour?

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • We're talking 46 seconds v 38 seconds, not a huge amount at all. Both of the timings are averaged out from several runs.

    They are both being stored in a varchar(1250) field.

    It's in a table that's a sort of catch-all from a 3rd-party application with some data-entry screens that we've designed.

    The vendor knows and is happy for us to do this but dumps all the data that gets input into this table with various markers so we know were the value cam from.

  • Can you provide a few examples of *exactly* what they look like?

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply