Solving FIFO Queues Using Windowed Functions

Question

Post reply

Solving FIFO Queues Using Windowed Functions

drew.allen

SSC Guru

Points: 77012
More actions
October 5, 2016 at 10:37 pm

#400839

Comments posted to this topic are about the item Solving FIFO Queues Using Windowed Functions
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply

Eirikur Eiriksson SSC Guru Points: 182939 More actions · Answer 1

Nice write-up Drew, well done!

😎

One thing I missed seeing in the article was the mentioning of a POC index which is absolutely essential when working with the window functions, especially when more than 90% of the execution cost are the sort operators.

akljfhnlaflkj SSC Guru Points: 76202 More actions · Answer 2

akljfhnlaflkj

SSC Guru

Points: 76202

October 6, 2016 at 6:53 am

#1905149

Interesting logic flow, thanks.

Ken SSCoach Points: 17191 More actions · Answer 3

Ken

SSCoach

Points: 17191

October 6, 2016 at 8:29 am

#1905186

Excellent write up!

TheSQLGuru SSC Guru Points: 134015 More actions · Answer 4

Very nicely done Drew!

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

Alan Burstein SSC Guru Points: 61152 More actions · Answer 5

Great, great work Drew. Very well done.

"I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

-- Itzik Ben-Gan 2001

wtren SSC Eights! Points: 806 More actions · Answer 6

beautiful work, I check the execution plan, the main cost is on sort operator, in fact we do not need that much sort, we just need sort the credit operation by transid in every custid group to make a FIFO, get a correct runningtotalofcredit order, calculate the final sum debits for every custid, and then join the two result on custid , compare two sum number to get the result.

here is the some simple code for that:

set statistics time on

select c.custid, transtype,

case when runningtotalofc + amountd >=0 then 0

when runningtotalofc + amountd < amount or amountd is null then -amount

else -runningtotalofc - amountd

end as remain

from

(

(select custid,transtype,amount,sum(amount) over (partition by custid order by transid) as runningtotalofC

from dbo.transactions where transtype = 'C') as c

full outer join

(select custid,sum(amount) amountd

from dbo.transactions where transtype = 'D' group by custid) as d

on c.custid = d.custid)

order by custid

set statistics time off

please point out if there any problems, thank you.

drew.allen SSC Guru Points: 77012 More actions · Answer 7

wtren (10/8/2016)
beautiful work, I check the execution plan, the main cost is on sort operator, in fact we do not need that much sort, we just need sort the credit operation by transid in every custid group to make a FIFO, get a correct runningtotalofcredit order, calculate the final sum debits for every custid, and then join the two result on custid , compare two sum number to get the result.
here is the some simple code for that:
set statistics time on
select c.custid, transtype,
case when runningtotalofc + amountd >=0 then 0
when runningtotalofc + amountd < amount or amountd is null then -amount
else -runningtotalofc - amountd
end as remain
from
(
(select custid,transtype,amount,sum(amount) over (partition by custid order by transid) as runningtotalofC
from dbo.transactions where transtype = 'C') as c
full outer join
(select custid,sum(amount) amountd
from dbo.transactions where transtype = 'D' group by custid) as d
on c.custid = d.custid)
order by custid
set statistics time off
please point out if there any problems, thank you.

One of the requirements of the original question was to display when a credit was depleted, which requires knowing the date of the debit that did so. Since your subquery only contains a grand total of debits for a customer, it cannot return that information.

Drew

J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA

J Livingston SQL SSC Guru Points: 51550 More actions · Answer 8

I did some performance testing on the two pieces of code. They don't produce the same results, and I haven't worked through where the issues are, so these results may not stand. I've also attached a portion of the data. I couldn't upload the whole set of data due to size restrictions.

Drew...I think that if you remove

WHERE rn = 1

AND Results.debitdate IS NOT NULL

then the number of rows will be the same

iirc, I put this in to meet OP requirement ?

________________________________________________________________
you can lead a user to data....but you cannot make them think
and remember....every day is a school day

louisjeorge75 Grasshopper Points: 17 More actions · Answer 9

hi drew ,

that is wonderful work , i need a help to write this in SQL DB2 , how can i do this.

drew.allen SSC Guru Points: 77012 More actions · Answer 10

louisjeorge75 - Wednesday, December 20, 2017 5:02 AM
hi drew ,
that is wonderful work , i need a help to write this in SQL DB2 , how can i do this.

Thanks. I don't know DB2, so I can't help you with the rewrite.

Drew

J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA

autoexcrement SSCertifiable Points: 5956 More actions · Answer 11

ISNULL(LAG([to], 1) OVER(PARTITION BY CustID ORDER BY c.TransDate, c.Amount), 0)
Am I misunderstanding something here? The ISNULL check would never return NULL, and thus 0, because you've already provided a default value (1) in your LAG statement in case of NULLs. No?

Edit: Nevermind. It's the offset that's been provided, not the default value. But still, wouldn't this make more sense?
LAG([to], 1, 0) OVER(PARTITION BY CustID ORDER BY c.TransDate, c.Amount)

"If I had been drinking out of that toilet, I might have been killed." -Ace Ventura

autoexcrement SSCertifiable Points: 5956 More actions · Answer 12

Question #2...I am not entirely following this bit:

When SUM() is used with an OVER clause that contains an ORDER BY, then a window frame is "required." If you do not specify a window frame, then it will use the default RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. The problem with this is that RANGE will always write to disk, but ROWS will not if there are fewer than 10,000 records per partition. When used with SUM(), there are only ever TWO rows per partition, because of the way that SQL calculates the running totals.

What is the significance of "writing to disk", and what is the 10k vs 2 rows per partition issue? I'm sorry for my ignorance, trying to understand what you mean here. Thanks!

"If I had been drinking out of that toilet, I might have been killed." -Ace Ventura

TheSQLGuru SSC Guru Points: 134015 More actions · Answer 13

autoexcrement - Thursday, November 29, 2018 10:59 PM
ISNULL(LAG([to], 1) OVER(PARTITION BY CustID ORDER BY c.TransDate, c.Amount), 0)
Am I misunderstanding something here? The ISNULL check would never return NULL, and thus 0, because you've already provided a default value (1) in your LAG statement in case of NULLs. No?
Edit: Nevermind. It's the offset that's been provided, not the default value. But still, wouldn't this make more sense?
LAG([to], 1, 0) OVER(PARTITION BY CustID ORDER BY c.TransDate, c.Amount)

That should be more efficient, because the replacement of missing values is coded into the same branch that does the LAG itself, as opposed to being a separate process after the LAG is computed. Looks like a nice find.

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

TheSQLGuru SSC Guru Points: 134015 More actions · Answer 14

TheSQLGuru - Friday, November 30, 2018 7:36 AM
autoexcrement - Thursday, November 29, 2018 10:59 PM
ISNULL(LAG([to], 1) OVER(PARTITION BY CustID ORDER BY c.TransDate, c.Amount), 0)
Am I misunderstanding something here? The ISNULL check would never return NULL, and thus 0, because you've already provided a default value (1) in your LAG statement in case of NULLs. No?
Edit: Nevermind. It's the offset that's been provided, not the default value. But still, wouldn't this make more sense?
LAG([to], 1, 0) OVER(PARTITION BY CustID ORDER BY c.TransDate, c.Amount)
That should be more efficient, because the replacement of missing values is coded into the same branch that does the LAG itself, as opposed to being a separate process after the LAG is computed. Looks like a nice find.

But looking at the lasted code, posted just today, it seems he fixed this:

LAG(tot, 1, 0) OVER(PARTITION BY CustID ORDER BY tot, TransType) AS PrevTot

That is the only LAG in his query now.

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service