Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase

Restart runningtotal when predetermined value is reached Expand / Collapse
Author
Message
Posted Monday, September 24, 2012 10:00 PM
SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Monday, November 10, 2014 5:31 PM
Points: 238, Visits: 2,221
Hi, Im trying to create a running total based on FieldToSum that will restart everytime runningTotal <= 25 and count the number of times this happens in runningCount.
This needs to be done for both forwards and backwards.
This is currently what i have inherited (made some changes based on articles here already which has helped).
There are approximately 40 million records this needs to be done for.
If you run the following code then that will give the answer im after albeit quite slowly.
Im after any sort of help/advice with this. Also this is my first post so if im missing any information please let me know.

Current code
IF OBJECT_ID('TempDB..#runTotal','U') IS NOT NULL
DROP TABLE #runTotal

CREATE TABLE #runTotal
(
runTotalID INT IDENTITY(1,1),
tableID int,
FieldToSum FLOAT,
runningTotal float,
runningCount int
)

IF OBJECT_ID('TempDB..#table','U') IS NOT NULL
DROP TABLE #table

CREATE TABLE #table
(
tableID int NOT NULL,
runningCount int not null,
orderCol int,
FieldToSum float
)

insert #table
select 1, 0, 1, 4.733415944 union all
select 2, 0, 8, 4.733415944 union all
select 3, 0, 6, 4.733415944 union all
select 4, 0, 19, 4.733415944 union all
select 5, 0, 14, 6.950871733 union all
select 6, 0, 3, 6.942781406 union all
select 7, 0, 2, 6.956403557 union all
select 8, 0, 20, 6.945977457 union all
select 9, 0, 11, 6.956403557 union all
select 10, 0, 13, 6.956403557 union all
select 11, 0, 4, 6.945977457 union all
select 12, 0, 10, 6.956403557 union all
select 13, 0, 17, 6.945977457 union all
select 14, 0, 12, 6.956403557 union all
select 15, 0, 15, 6.968313001 union all
select 16, 0, 5, 6.968313001 union all
select 17, 0, 7, 7.024537923 union all
select 18, 0, 18, 7.024537923 union all
select 19, 0, 16, 6.968313001 union all
select 20, 0, 9, 6.968313001

CREATE UNIQUE CLUSTERED INDEX [Idx1] ON #runTotal(runTotalID )

Declare @flag AS int=0
DECLARE @Counter As INT =1
declare @runningTotal as float
declare @anchor int
WHILE @Flag=0 BEGIN
IF (select COUNT(1) from #table where runningCount=0)<>0
BEGIN
TRUNCATE Table #runTotal
INSERT INTO #runTotal
SELECT tableID, FieldToSum, 0, 0 FROM #table where runningCount=0 ORDER BY orderCol ASC
set @runningTotal = 0
update #runTotal
set @runningTotal = runningTotal = @runningtotal + FieldToSum,
@anchor = runtotalid
from #runTotal WITH (TABLOCKX) OPTION (MAXDOP 1)
UPDATE #table
SET runningCount=@Counter from #table a inner join #runTotal b on a.tableID = b.tableID where b.runningTotal <= 25
END
ELSE
SET @FLAG=1
IF (select COUNT(1) from #table where runningCount=0)<>0
BEGIN
TRUNCATE Table #runTotal
INSERT INTO #runTotal
SELECT tableID, FieldToSum, 0, 0 FROM #table where runningCount=0 ORDER BY orderCol DESC
set @runningTotal = 0
update #runTotal
set @runningTotal = runningTotal = @runningtotal + FieldToSum,
@anchor = runtotalid
from #runTotal WITH (TABLOCKX) OPTION (MAXDOP 1)
UPDATE #table
SET runningCount=@Counter from #table a inner join #runTotal b on a.tableID = b.tableID where b.runningTotal <= 25
END
ELSE
SET @FLAG=1

SET @Counter=@Counter+1
END


Post #1363770
Posted Tuesday, September 25, 2012 4:45 AM


Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Yesterday @ 8:05 AM
Points: 320, Visits: 1,095
This process uses the 'Quirky Update' without the essential requirements being in place. As such it is unreliable & will almost certainly give incorrect results in the future (or may have already done so).

Quirky Update is where variable & column are updated together:
update #runTotal 
set @runningTotal = runningTotal = @runningtotal + FieldToSum

See this:
http://www.sqlservercentral.com/articles/T-SQL/68467/
for more details - the list of requirements for Quirky Update are towards the end.

It may seem a bit complicated - but unfortunately it is!
Post #1363937
Posted Tuesday, September 25, 2012 7:00 PM
SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Monday, November 10, 2014 5:31 PM
Points: 238, Visits: 2,221
I appreciate the help with my broken quirky update. I believed ive fixed that and edited my first post to reflect the new code.
Is there anything else anyone can recommend that could help ?
Post #1364392
Posted Tuesday, September 25, 2012 10:07 PM
Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Wednesday, December 10, 2014 12:17 PM
Points: 1,095, Visits: 3,182
You could try the following

I'm sure that this can be improved though and I don't know if it will scale very well as it is recursive.

IF OBJECT_ID('TempDB..#table','U') IS NOT NULL
DROP TABLE #table

CREATE TABLE #table
(
tableID int NOT NULL,
runningCount int not null,
orderCol int,
FieldToSum float
)

IF OBJECT_ID('TempDB..#table2','U') IS NOT NULL
DROP TABLE #table2

CREATE TABLE #table2
(
tableID int NOT NULL,
runningCount int not null,
orderCol int,
FieldToSum float
)

insert #table
select 1, 0, 1, 4.733415944 union all
select 2, 0, 8, 4.733415944 union all
select 3, 0, 6, 4.733415944 union all
select 4, 0, 19, 4.733415944 union all
select 5, 0, 14, 6.950871733 union all
select 6, 0, 3, 6.942781406 union all
select 7, 0, 2, 6.956403557 union all
select 8, 0, 20, 6.945977457 union all
select 9, 0, 11, 6.956403557 union all
select 10, 0, 13, 6.956403557 union all
select 11, 0, 4, 6.945977457 union all
select 12, 0, 10, 6.956403557 union all
select 13, 0, 17, 6.945977457 union all
select 14, 0, 12, 6.956403557 union all
select 15, 0, 15, 6.968313001 union all
select 16, 0, 5, 6.968313001 union all
select 17, 0, 7, 7.024537923 union all
select 18, 0, 18, 7.024537923 union all
select 19, 0, 16, 6.968313001 union all
select 20, 0, 9, 6.968313001

CREATE UNIQUE INDEX [Idx1] ON #table(orderCol)

;WITH
maxordercol as (SELECT MAX(orderCol) endorder FROM #table),
a AS (
SELECT TableID, orderCol, FieldToSum,
FieldToSum RunTotal,
1 runningCount,
'A' direction
FROM #table
WHERE orderCol = 1
UNION ALL
SELECT t.TableID, t.orderCol, t.FieldToSum,
case when t.FieldToSum + a.runTotal <= 25 then
t.FieldToSum + a.runTotal
else
t.FieldToSum
end RunTotal,
case when t.FieldToSum + a.runTotal <= 25 then
a.runningCount
else
a.runningCount + 1
end runningCount,
'A' direction
FROM a
inner join #table t on t.orderCol = a.orderCol + 1
),
b as (
SELECT TableID, orderCol, FieldToSum,
FieldToSum RunTotal,
1 runningCount,
'D' direction
FROM #table
WHERE orderCol = (select endorder from maxordercol)
UNION ALL
SELECT t.TableID, t.orderCol, t.FieldToSum,
case when t.FieldToSum + b.runTotal <= 25 then
t.FieldToSum + b.runTotal
else
t.FieldToSum
end RunTotal,
case when t.FieldToSum + b.runTotal <= 25 then
b.runningCount
else
b.runningCount + 1
end runningCount,
'D' direction
FROM b
inner join #table t on t.orderCol = b.orderCol - 1
),
c AS (
select tableid, runningCount, ordercol, fieldtosum
from (
select *, row_number() OVER (partition by tableid order by runningCount) seq
from (select * from a union all select * from b) x
) y
where y.seq = 1
)
INSERT INTO #table2
SELECT *
FROM c
OPTION (MAXRECURSION 0)

UPDATE #table
SET runningCount = b.runningCount
FROM #table a INNER JOIN #table2 b ON a.tableid = b.tableid

I tried it over 1200 records as well and the performance was 40ms vs 4200ms
Post #1364420
Posted Tuesday, September 25, 2012 10:16 PM
SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Monday, November 10, 2014 5:31 PM
Points: 238, Visits: 2,221
Thanks !!
I will give it a go and let you know how it performs over our dataset.
Post #1364423
Posted Wednesday, September 26, 2012 2:08 AM


SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Yesterday @ 12:12 PM
Points: 1,945, Visits: 3,180
Hi, Im trying to create a running total based on FieldToSum that will restart every time something_runtot <= 25 and count the number of times this happens in something_runcnt.


Did you mean “something_runtot >= 25”

This needs to be done for both forwards and backwards.


Tables are NOT magnetic tapes than can be run forward and backwards; what dose that mean?

This is currently what I have inherited.


It is NOT legacy code; it is family curse code Can you keep changing it? For example, the rounding errors in FLOAT mean that the order in which you do the addition will change the totals. Then did you really need the range of a FLOAT? Today, we have DECIMAL(s,p) and prefer to use that instead. But if you do not care about the right answers and extra overhead, we can keep going with FLOAT.

There are approximately 40 million records {sic: rows are NOT records; that is magnetic tape again} this needs to be done for.


We do not use IDENTITY in RDBMS. We do not keep the summary values in the base table being summarized. Do you have a true sequence number for the something_values you want to total? It also looks like you are using flag coupling with that 'A' and 'D' stuff in the IF-THEN-ELSE non-declarative (ugh!) code you have now. My guess is that when the columns at the wrong level of aggregation are removed and you get a valid key, the base table is simply:

CREATE TABLE Foobar
(foo_seq INTEGER NOT NULL PRIMARY KEY,
something_value FLOAT NOT NULL);

We have the ANSI/ISO Standard OVER() sub-clause, so you can easily get the running totals with the sequence numbers as the ordering.

SELECT foo_seq, something_value,
SUM(something_value)
OVER (ORDER BY foo_seq ASC
ROWS UNBOUNDED PRECEDING AND CURRENT ROW)
AS up_something_value runtot,
SUM(something_value)
OVER (ORDER BY foo_seq DESC
ROWS UNBOUNDED PRECEDING AND CURRENT ROW)
AS dn_something_value runtot
FROM Foobar;

Check this out first, then we can put it in a CTE or VIEW and do some MOD 25 math on it when we have a better spec.



Books in Celko Series for Morgan-Kaufmann Publishing
Analytics and OLAP in SQL
Data and Databases: Concepts in Practice
Data, Measurements and Standards in SQL
SQL for Smarties
SQL Programming Style
SQL Puzzles and Answers
Thinking in Sets
Trees and Hierarchies in SQL
Post #1364495
Posted Wednesday, September 26, 2012 3:30 AM
SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Monday, November 10, 2014 5:31 PM
Points: 238, Visits: 2,221
I will try and explain a little better. Im trying to mark each row based on having a something_runtot up to or equal to 25 - it cant go over this number and i am required to use the marked rows elsewhere (this is where runningCount comes in).

Tables are NOT magnetic tapes than can be run forward and backwards; what dose that mean? 

My terminology was a little off in places :)
It means I need something_runtot up to or equal to 50 and the current business logic requires half from ascending and half from descending. So i need to get a running total up to or equal to 25, order by orderCol ascending then mark the records and i need to do the same for order by orderCol descending.
So in my current code the first time through the loop it will give the value of 1 to all rows with a runningTotal up to or equal to 25 ordered by orderCol asc and a value of 1 to all rows with a runningTotal up to or equal to 25 ordered by orderCol desc. This needs to be repeated until no more rows are left and no rows are allowed to be used more than once.

Do you have a true sequence number for the something_values you want to total?

Yes i do but when i insert the values i use the identity column to hold the ordering. (Im also not too sure why IDENTITY shouldnt be used).

Can you keep changing it?

I can change it as much as necessary to get better performance, however id prefer not to change from a float currently as thats what the base table is using.


I will give your code a go and see what happens - im in a totally different timezone so it will be a while before i can test it.

As always if my explanations dont make sense please let me know and i will try to rewrite it.
Thanks for the help.
Post #1364522
Posted Wednesday, September 26, 2012 3:47 AM


SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Yesterday @ 2:58 PM
Points: 1,683, Visits: 19,607
matak (9/26/2012)
I will try and explain a little better. Im trying to mark each row based on having a something_runtot up to or equal to 25 - it cant go over this number and i am required to use the marked rows elsewhere (this is where runningCount comes in).

Tables are NOT magnetic tapes than can be run forward and backwards; what dose that mean? 

My terminology was a little off in places :)
It means I need something_runtot up to or equal to 50 and the current business logic requires half from ascending and half from descending. So i need to get a running total up to or equal to 25, order by orderCol ascending then mark the records and i need to do the same for order by orderCol descending.
So in my current code the first time through the loop it will give the value of 1 to all rows with a runningTotal up to or equal to 25 ordered by orderCol asc and a value of 1 to all rows with a runningTotal up to or equal to 25 ordered by orderCol desc. This needs to be repeated until no more rows are left and no rows are allowed to be used more than once.

Do you have a true sequence number for the something_values you want to total?

Yes i do but when i insert the values i use the identity column to hold the ordering. (Im also not too sure why IDENTITY shouldnt be used).

Can you keep changing it?

I can change it as much as necessary to get better performance, however id prefer not to change from a float currently as thats what the base table is using.


I will give your code a go and see what happens - im in a totally different timezone so it will be a while before i can test it.

As always if my explanations dont make sense please let me know and i will try to rewrite it.
Thanks for the help.


Be aware that the SUM.. OVER(ORDER BY ...) sub-clause requires SQL Server 2012


____________________________________________________

How to get the best help on a forum

http://www.sqlservercentral.com/articles/Best+Practices/61537

Never approach a goat from the front, a horse from the rear, or a fool from any direction.
Post #1364535
« Prev Topic | Next Topic »

Add to briefcase

Permissions Expand / Collapse