Conditional Aggregation

  • I have a table with records which are flagged as passing or failing an initial criterion test.

    The records look like this:

    For any given well I need to test each record with an OreFlag of N that is bounded by records with an OreFlag of Y, and aggregate the three records if:

    - The length of the N record is less than 1 (meter). This threshold may be a variable.

    - The length of the N record is less than the smaller of the two Y records.

    - The length-weighted average of the MBit column would be greater than 0.06. (This is the initial, variable criterion which determined whether a record was initially set to Y or N.)

    If you look at records 10 through 12 of the screenshot you will see that these should be aggregated. This, of course, leads to the next issue: recursion. Conditions exist (see wellNum4) where, once the first aggregation is completed, it creates new records which should then also be aggregated until, at some point, no more aggregation can be done.

    Aggregation involves taking the minimum TopDepth, maximum BaseDepth and the length weighted average of MBit, PHIE, Vsh, SwE (if you're a geologist you'll know what those attributes are ;-)), and the sum of length.

    The data set I'm working with has 367,000 rows. If you're interested in helping me solve this problem and having the data set, I can upload it (it's about 4 MB zipped).

  • The picture is a broken link. Please provide the table structure, a small set of sample records, and an example of your expected results.


  • Can you also provide a create table script?


  • Ah. I should just mention that in the assembled intervals PNG the 2nd example is a case where there would be recursion. The first three records would be aggregated (shown as the row in red), and then that newly aggregated row should be combined with the two above it to produce the row shown in blue.

  • For "bound by" can we assume the sequence number is +1 and -1 for bound records?


  • The table I'm working with is actually a temporary table I've created from the source data, which has been heavily processed to get it to this point.

    The table definition below will give you a table which matches the data I posted and has the appropriate PK.

    The column called Seq is one I've added for convenience of processing (it makes previous-row and next-row processing possible).

    CREATE TABLE [EUB\CB238].[TEMP](

    [HOLEID] [varchar](13) NOT NULL,

    [EvalNum] [int] NOT NULL,

    [Seq] [int] NOT NULL,

    [TopDepth] [numeric](6, 2) NOT NULL,

    [BaseDepth] [numeric](6, 2) NOT NULL,

    [MBit] NUMERIC(6,2) NULL,

    [PHIE] NUMERIC(6,2) NULL,

    [SwE] NUMERIC(6,2) NULL,

    [Vsh] NUMERIC(6,2) NULL,

    [Length] [numeric](6, 2) NULL,

    [OreFlag] [varchar](1)

    )

    CREATE UNIQUE CLUSTERED INDEX TEMP_PK ON TEMP(HoleId, EvalNum, TopDepth)

  • I'm not sure what you mean by "bound by". If you mean that I can join the table to itself like this:

    FROM ##TEMP CUR

    LEFT JOIN ##TEMP PREV

    ON PREV.HoleId = CUR.HoleId

    AND PREV.EvalNum = CUR.EvalNum

    AND PREV.Seq = CUR.Seq - 1

    LEFT JOIN ##TEMP NEX

    ON NEX.HoleId = CUR.HoleId

    AND NEX.EvalNum = CUR.EvalNum

    AND NEX.Seq = CUR.Seq + 1

    to get the previous and next rows - then yes.

    Again - Thanks!

  • Here is what I have so far to gather the records that need to be aggregated. How do I determine the length-weighted average of the MBit column?

    select a.*,b.seq prec,c.seq nrec from wells a

    join wells b on a.evalnum = b.evalnum and a.holeid=b.holeid and a.seq = b.seq -1 and b.oreflag = 'y'

    join wells c on a.evalnum = c.evalnum and a.holeid=c.holeid and a.seq = c.seq +1 and c.oreflag = 'y'

    where a.oreflag = 'n' and a.length < 1 and a.length < b.length and a.length < c.length

    union

    select d.*,0,0 from wells d

    join

    (select a.*,b.seq prec,c.seq nrec from wells a

    join wells b on a.evalnum = b.evalnum and a.holeid=b.holeid and a.seq = b.seq -1 and b.oreflag = 'y'

    join wells c on a.evalnum = c.evalnum and a.holeid=c.holeid and a.seq = c.seq +1 and c.oreflag = 'y'

    where a.oreflag = 'n' and a.length < 1 and a.length < b.length and a.length < c.length

    ) set1 on d.seq=set1.prec

    union

    select d.*,0,0 from wells d

    join

    (select a.*,b.seq prec,c.seq nrec from wells a

    join wells b on a.evalnum = b.evalnum and a.holeid=b.holeid and a.seq = b.seq -1 and b.oreflag = 'y'

    join wells c on a.evalnum = c.evalnum and a.holeid=c.holeid and a.seq = c.seq +1 and c.oreflag = 'y'

    where a.oreflag = 'n' and a.length < 1 and a.length < b.length and a.length < c.length

    ) set1 on d.seq=set1.nrec


  • Length (it should actually be [interval] thickness rather than length - My bad on the naming) weighted average is calculated like this;

    ((Prev.MBit * Prev.Length)

    + (Cur.MBit * Cur.Length)

    + (Nex.MBit * Nex.Length))

    /(Prev.Length + Cur.Length + Nex.Length)
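As a quick check of the arithmetic, here is a minimal Python sketch of the formula above. The function name `length_weighted_average` and the sample MBit and thickness values are hypothetical, chosen only for illustration; they are not taken from the posted data.

```python
def length_weighted_average(values, lengths):
    """Return sum(value * length) / sum(length) for paired sequences."""
    total_length = sum(lengths)
    if total_length == 0:
        raise ValueError("total length must be non-zero")
    return sum(v * l for v, l in zip(values, lengths)) / total_length


# Hypothetical MBit values and thicknesses for the previous, current and next rows.
mbit = [0.08, 0.02, 0.10]
thickness = [2.0, 0.5, 1.5]

average = length_weighted_average(mbit, thickness)
print(round(average, 4))
```

The same function would apply unchanged to PHIE, Vsh and SwE, since each is weighted by the same interval thickness.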

  • Let's try this

    select

    a.holeid,a.evalnum,c.seq

    /*a.seq,a.topdepth,a.basedepth,a.mbit,a.phie,a.swe,a.vsh,a.length,a.oreflag,b.seq prec,c.seq nrec,a.mbit*a.length wambit,a.phie*a.length waphie,a.vsh*a.length wavsh,a.swe*a.length waswe*/

    ,case when case when a.basedepth > b.basedepth then a.basedepth else b.basedepth end > c.basedepth then case when a.basedepth > b.basedepth then a.basedepth else b.basedepth end else c.basedepth end maxbasedepth

    ,case when case when a.topdepth < b.topdepth then a.topdepth else b.topdepth end < c.topdepth then case when a.topdepth < b.topdepth then a.topdepth else b.topdepth end else c.topdepth end mintopdepth

    ,((a.mbit*a.length) +(b.mbit*b.length) + (c.mbit*c.length)) /(a.length+b.length + c.length) lwambit

    ,((a.phie*a.length) +(b.phie*b.length) + (c.phie*c.length)) /(a.length+b.length + c.length) lwaphie

    ,((a.swe*a.length) +(b.swe*b.length) + (c.swe*c.length)) /(a.length+b.length + c.length) lwaswe

    ,((a.vsh*a.length) +(b.vsh*b.length) + (c.vsh*c.length)) /(a.length+b.length + c.length) lwavsh

    , (a.length+b.length + c.length) sumlength

    ,case when ((a.mbit*a.length) +(b.mbit*b.length) + (c.mbit*c.length)) /(a.length+b.length + c.length) > .06 then 'Y' else 'N' end oreflag

    from wells a

    join wells b on a.evalnum = b.evalnum and a.holeid=b.holeid and a.seq = b.seq -1 and b.oreflag = 'y'

    join wells c on a.evalnum = c.evalnum and a.holeid=c.holeid and a.seq = c.seq +1 and c.oreflag = 'y'

    where a.oreflag = 'n' and a.length < 1 and a.length < b.length and a.length < c.length


  • This adds the new aggregated records into the other records.

    select * from (

    select

    a.holeid,a.evalnum,c.seq

    /*a.seq,a.topdepth,a.basedepth,a.mbit,a.phie,a.swe,a.vsh,a.length,a.oreflag,b.seq prec,c.seq nrec,a.mbit*a.length wambit,a.phie*a.length waphie,a.vsh*a.length wavsh,a.swe*a.length waswe*/

    ,case when case when a.basedepth > b.basedepth then a.basedepth else b.basedepth end > c.basedepth then case when a.basedepth > b.basedepth then a.basedepth else b.basedepth end else c.basedepth end maxbasedepth

    ,case when case when a.topdepth < b.topdepth then a.topdepth else b.topdepth end < c.topdepth then case when a.topdepth < b.topdepth then a.topdepth else b.topdepth end else c.topdepth end mintopdepth

    ,((a.mbit*a.length) +(b.mbit*b.length) + (c.mbit*c.length)) /(a.length+b.length + c.length) lwambit

    ,((a.phie*a.length) +(b.phie*b.length) + (c.phie*c.length)) /(a.length+b.length + c.length) lwaphie

    ,((a.swe*a.length) +(b.swe*b.length) + (c.swe*c.length)) /(a.length+b.length + c.length) lwaswe

    ,((a.vsh*a.length) +(b.vsh*b.length) + (c.vsh*c.length)) /(a.length+b.length + c.length) lwavsh

    , (a.length+b.length + c.length) sumlength

    ,case when ((a.mbit*a.length) +(b.mbit*b.length) + (c.mbit*c.length)) /(a.length+b.length + c.length) > .06 then 'Y' else 'N' end oreflag

    from wells a

    join wells b on a.evalnum = b.evalnum and a.holeid=b.holeid and a.seq = b.seq -1 and b.oreflag = 'y'

    join wells c on a.evalnum = c.evalnum and a.holeid=c.holeid and a.seq = c.seq +1 and c.oreflag = 'y'

    where a.oreflag = 'n' and a.length < 1 and a.length < b.length and a.length < c.length

    union

    select

    holeid,evalnum,wells.seq,basedepth,topdepth,mbit,phie,swe,vsh,length,oreflag

    from wells

    left join

    (select a.seq seq1,b.seq pseq,c.seq nseq

    from wells a

    join wells b on a.evalnum = b.evalnum and a.holeid=b.holeid and a.seq = b.seq -1 and b.oreflag = 'y'

    join wells c on a.evalnum = c.evalnum and a.holeid=c.holeid and a.seq = c.seq +1 and c.oreflag = 'y'

    where a.oreflag = 'n' and a.length < 1 and a.length < b.length and a.length < c.length

    ) aggs on wells.seq = aggs.seq1 or wells.seq = aggs.pseq or wells.seq = aggs.nseq

    where aggs.seq1 is null

    ) n

    order by seq

    How many times do you think these would need to be aggregated?


  • Hi Old Hand;

    Thanks for your time on this.

    I don't know how many times the aggregation would have to be iterated. It sort of depends on what the initial criteria value of MBit is. If it's set low (like 0.03) then most of the intervals get aggregated right off the bat and there's likely not more than one iteration. If it's set high (0.09 or more) then there could be many iterations.

    How long do you think the script you posted above should take? I stopped it after 20 minutes.

  • It only took 1 second on the test data you gave me, and that was without any indexes on the table. To run multiple iterations I think we will need to insert the data from each iteration into a new table to build a new sequence, because after each iteration the sequence has gaps in it. My thought is to have 2 staging tables and flip back and forth until the number of rows matches between the 2. If you want to post your full data set I can see how long it takes to run.
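The "repeat until the row counts match" idea can be sketched outside the database. The Python below is only an illustration under stated assumptions, not the script from this thread: the `Interval` class, the function names and the sample rows are hypothetical, only MBit is averaged (PHIE, Vsh and SwE would be handled the same way), rows are assumed to belong to one well and be sorted by depth, and each pass merges non-overlapping Y-N-Y triples from the top down.

```python
from dataclasses import dataclass


@dataclass
class Interval:
    top: float
    base: float
    length: float
    mbit: float
    flag: str  # 'Y' or 'N'


def try_merge(prev, cur, nxt, max_len=1.0, cutoff=0.06):
    """Return the merged interval if the middle N row qualifies, else None."""
    if (prev.flag, cur.flag, nxt.flag) != ("Y", "N", "Y"):
        return None
    if cur.length >= max_len or cur.length >= min(prev.length, nxt.length):
        return None
    total = prev.length + cur.length + nxt.length
    avg = (prev.mbit * prev.length + cur.mbit * cur.length
           + nxt.mbit * nxt.length) / total
    if avg <= cutoff:
        return None
    return Interval(min(prev.top, cur.top, nxt.top),
                    max(prev.base, cur.base, nxt.base), total, avg, "Y")


def one_pass(rows):
    """Scan once, replacing each qualifying triple with its merged row."""
    out, i = [], 0
    while i < len(rows):
        merged = try_merge(*rows[i:i + 3]) if i + 2 < len(rows) else None
        if merged is not None:
            out.append(merged)
            i += 3  # skip the three rows that were just combined
        else:
            out.append(rows[i])
            i += 1
    return out


def aggregate(rows):
    """Repeat passes until a pass leaves the row count unchanged."""
    while True:
        new_rows = one_pass(rows)
        if len(new_rows) == len(rows):
            return new_rows
        rows = new_rows


# Hypothetical sample rows for a single well, ordered by depth.
well = [
    Interval(10.0, 12.0, 2.0, 0.10, "Y"),
    Interval(12.0, 12.5, 0.5, 0.02, "N"),
    Interval(12.5, 14.0, 1.5, 0.09, "Y"),
    Interval(14.0, 14.8, 0.8, 0.01, "N"),
    Interval(14.8, 18.0, 3.2, 0.12, "Y"),
]
result = aggregate(well)
print(len(result), result[0].top, result[0].base)
```

In this sample the first pass merges the top three rows, and the second pass merges that new row with the two below it, which is the recursive case described earlier in the thread. Rebuilding the list on every pass plays the role of re-sequencing into a fresh staging table.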


  • I've been trying to upload the full data file (4 MB zipped) and there seems to be something wrong with the upload utility right now.

    I'll try again in a couple of hours.

    I have to confess that I'm having a little bit of trouble following what the script is actually doing.

    Could you please explain the principles of the approach that you've put together?

    Thanks!

  • I just found an error in my code where I reversed the basedepth and topdepth in the union. I edited the post above to correct this.

