Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase

Multiple Recursive References inside CTE Expand / Collapse
Author
Message
Posted Saturday, July 7, 2012 8:31 AM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Wednesday, June 18, 2014 3:57 AM
Points: 141, Visits: 842
Hi Guys,

Hope you're all doing well!

I have a problem and will provide an abstraction (extremely simplified) for it, in the hopes someone could maybe assist.
Please let me know if I could provide any more code to assist you in helping me.

Consider:
;WITH Abstraction(pYear, pMonth, pValue, Problem) AS
(
SELECT 2000
,1
,RAND() * 1000
,'?'
UNION ALL
SELECT CASE WHEN pMonth = 12 THEN pYear + 1 ELSE pYear END
,CASE WHEN pMonth = 12 THEN 1 ELSE pMonth + 1 END
,RAND(CAST(CAST(NEWID()AS VARBINARY) AS INT)) * 1000
,'?'
FROM Abstraction
WHERE CASE WHEN pMonth = 12 THEN pYear + 1 ELSE pYear END <= 2005
)
SELECT *
FROM Abstraction

In the column labelled "Problem", I would like to be have:
If we look at the output and we are for example in January 2003, the sum of the pValue column for every previous January INCLUSIVE - (January 2000, January 2001, January, 2002, January 2003)
This can be calc'd after the CTE has run, sure - but I require this figure inside the CTE as other columns will be dependent on it.
The inter-dependence is such, that "chained" CTEs will also not work.

I have attempted SUM() OVER (PARTITION BY...) and have tried joining to the CTE in the recursive block - which seems to be strangely illegal...

As I said, this is a simplified abstraction of the problem, so I hope I can answer your questions within the realm of this example.

Kind thanks in advance!

PS.

Not that I am certain that the extended syntax for window functions in SQL Server 2012 would help - I need to have this solved with syntax available to SQL Server 2008
Post #1326432
Posted Saturday, July 7, 2012 12:45 PM
SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Sunday, June 23, 2013 10:39 PM
Points: 231, Visits: 506
Try below..Modify it as per your need.



;WITH Abstraction(pYear, pMonth, pValue, jan,feb,mar,april,may,june,july,aug,sep,oct,nov,decb) AS
(
SELECT 1999
,12
,cast (0 as float)
,cast (0 as float)
,cast (null as float)
,cast (null as float)
,cast (null as float)
,cast (null as float)
,cast (null as float)
,cast (null as float)
,cast (null as float)
,cast (null as float)
,cast (null as float)
,cast (null as float)
,cast (null as float)

UNION ALL
SELECT CASE WHEN pMonth = 12 THEN pYear + 1 ELSE pYear END
,CASE WHEN pMonth = 12 THEN 1 ELSE pMonth + 1 END
,pv.pvalue
,case when pmonth = 12 then isnull(jan,0.0) + pv.pvalue else ISNULL(jan,0.0) end as jan --this value is for jan if the pmonth is 12 and simialrly below
,case when pmonth = 1 then isnull(feb,0.0) + pv.pvalue else isnull(feb,0.0) end as feb
,case when pmonth = 2 then isnull(mar,0.0) + pv.pvalue else isnull(mar,0.0) end as mar
,case when pmonth = 3 then isnull(april,0.0) + pv.pvalue else isnull(april,0.0) end as april
,case when pmonth = 4 then isnull(may,0.0) + pv.pvalue else isnull(may,0.0) end as may
,case when pmonth = 5 then isnull(june,0.0) + pv.pvalue else isnull(june,0.0) end as june
,case when pmonth = 6 then isnull(july,0.0) + pv.pvalue else isnull(july,0.0) end as july
,case when pmonth = 7 then isnull(aug,0.0) + pv.pvalue else isnull(aug,0.0) end as aug
,case when pmonth = 8 then isnull(sep,0.0) + pv.pvalue else isnull(sep,0.0) end as sep
,case when pmonth = 9 then isnull(oct,0.0) + pv.pvalue else isnull(oct,0.0) end as oct
,case when pmonth = 10 then isnull(nov,0.0) + pv.pvalue else isnull(nov,0.0) end as nov
,case when pmonth = 11 then isnull(decb,0.0) + pv.pvalue else isnull(decb,0.0) end as decb

FROM Abstraction a
cross apply(select cast (RAND(CAST(CAST(NEWID()AS VARBINARY) AS int)) * 1000 as float) pvalue) pv
WHERE CASE WHEN pMonth = 12 THEN pYear + 1 ELSE pYear END <= 2005
)
SELECT
pYear,pMonth,pValue,
case
when pmonth = 1 then jan
when pmonth = 2 then feb
when pmonth = 3 then mar
when pmonth = 4 then april
when pmonth = 5 then may
when pmonth = 6 then june
when pmonth = 7 then july
when pmonth = 8 then aug
when pmonth = 9 then sep
when pmonth = 10 then oct
when pmonth = 11 then nov
when pmonth = 12 then decb
end as problem
FROM Abstraction
where pYear>= 2000




GulliMeel

Finding top n Worst Performing queries
Improve the performance of Merge Join(special case)
How to Post Performance Problem -Gail Shaw
Post #1326472
Posted Saturday, July 7, 2012 12:59 PM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Wednesday, June 18, 2014 3:57 AM
Points: 141, Visits: 842
Thank you :)
Very cool approach!

Unfortunately, time was a worry, so I have re-written my code in to a while loop, and am inserting row-by-row.
While inside the loop, I can query the table I am inserting in to (To stay with the abstraction, I can find the sum of all January values as they are readily query-able in the table I am inserting in to). Not my first choice.

Hope I can find the time to fit your approach to the solution in to my code and measure the performance.

Sadly, the actual problem is a great deal more complex than the abstracted problem.

Thanks again :)
Post #1326476
Posted Saturday, July 7, 2012 1:02 PM
SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Sunday, June 23, 2013 10:39 PM
Points: 231, Visits: 506
rCTE is a kind of loop.If this loop works with set of data and very few loops(recursion) results are good.but if it does row by row it wont be so great..In you case it is row by row any way..but just 72 rows or recursion so shouldnt be big issue..But 72 recursion wont be great iwith big table..

Yes..Try both approaches and see which one is better..


GulliMeel

Finding top n Worst Performing queries
Improve the performance of Merge Join(special case)
How to Post Performance Problem -Gail Shaw
Post #1326477
Posted Sunday, July 8, 2012 2:36 PM


SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Sunday, July 20, 2014 11:55 AM
Points: 1,945, Visits: 2,860

I think your real problem is a bad data model. This is COBOL written in SQL; you even put the comma in the front of the punch card so you can re-use or re-arrange the deck. After a few decades of fixing code, I look for those symptoms.

Another sign of COBOL-mindset is splitting temporal data into fields, the ways they are declared in the DATA DIVISION. In RDBMS, first normal form would put a temporal value into a scalar value in a column.

A useful idiom is a report period calendar. It gives a name to a range of dates.

CREATE TABLE Report_Periods
(report_name CHAR(10) NOT NULL PRIMARY KEY
CHECK(period_name LIKE '[12][0-9][0-9][0-9]-[01][0-9]-00'),
report_start_date DATE NOT NULL,
report_end_date DATE NOT NULL,
CONSTRAINT date_ordering
CHECK (report_start_date <= report_end_date),
etc);

These report periods can overlap; a fiscal quarter will be contained in the range of its fiscal year. There can be gaps between them; we have to wait a year between each “Annual Going out Of Business Sale!” and there might be long stretches of time without any special sales. But we want to know their ranges so that the table is fairly constant once it is created.

I like the MySQL convention of using double zeroes for months and years, That is 'yyyy-mm-00' for a month within a year and 'yyyy-00-00' for the whole year. The advantage is that it will sort with the ISO-8601 data format required by Standard SQL. Let's re-do yo0ur table and assume it is loaded with the right data types, etc.

CREATE TABLE Foobars
(foobar_month CHAR(10) NOT NULL PRIMARY KEY
REFERENCES Report_Periods (report_name),
foobar_value INTEGER NOT NULL);

>>If we look at the output and we are for example in January 2003, the sum of the foobar_random_value column for every previous January INCLUSIVE - (January 2000, January 2001, January, 2002, January 2003)<<


SELECT foobar_month, foobar_value,
SUM (CASE WHEN foobar_month
LIKE '____' + SUBSTRING (foobar_month, 5, 4) +'00'
THEN foobar_value ELSE 0 END)
OVER (ORDER BY foobar_month ASC
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
AS foobar_month_cum_tot
FROM Foobars
WHERE foobar_month >= '2000-00-00';

See why I like that convention? This assume that you have SQL Server 2012, but we can do this with a self-join in older releases. Q: Why did you store the computation in a column? A: That is how COBOL would do it because it has to materialize all values. SQL has virtual tables and columns. Look at the nesting of non-procedural code in the SUM() as opposed to your sequential processing model of chained CTEs.

Think table look up, not computations; JOINs and not loops; nested code and not sequential processes


Books in Celko Series for Morgan-Kaufmann Publishing
Analytics and OLAP in SQL
Data and Databases: Concepts in Practice
Data, Measurements and Standards in SQL
SQL for Smarties
SQL Programming Style
SQL Puzzles and Answers
Thinking in Sets
Trees and Hierarchies in SQL
Post #1326620
Posted Sunday, July 8, 2012 6:11 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 6:43 PM
Points: 36,752, Visits: 31,208
CELKO (7/8/2012)
I like the MySQL convention of using double zeroes for months and years, That is 'yyyy-mm-00' for a month within a year and 'yyyy-00-00' for the whole year. The advantage is that it will sort with the ISO-8601 data format required by Standard SQL. Let's re-do yo0ur table and assume it is loaded with the right data types, etc.


Since that would require storing temporal data as character based data, I have to ask why would you do that? And, no... not trying to be the smart guy here. I'd really like to know, Joe.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1326631
Posted Monday, July 9, 2012 8:28 AM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Wednesday, June 18, 2014 3:57 AM
Points: 141, Visits: 842
I may be a little presumptuous in saying that I think that the problem stems from trying to build a SQL solution to a very complex problem that is currently solved in Excel.
Post #1326887
Posted Monday, July 9, 2012 8:39 PM


SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Sunday, July 20, 2014 11:55 AM
Points: 1,945, Visits: 2,860
Since that would require storing temporal data as character based data, I have to ask why would you do that?


I am naming a time period because T-SQL does not have an INTERVAL data type. No temporal math is done with it. It is the key to INTERVALs in the Calendar and report periods table.

This convention happens to have some major advantages:
language free
fixed length & short
easy regular expressions in DDL and DML
sorts with ISO-8601 display format
It is fast in a query.

Think about looking for all the Januaries in English versus "report_month LIKE '____-01-00' instead. Anyone got a better way to name a time period?



Books in Celko Series for Morgan-Kaufmann Publishing
Analytics and OLAP in SQL
Data and Databases: Concepts in Practice
Data, Measurements and Standards in SQL
SQL for Smarties
SQL Programming Style
SQL Puzzles and Answers
Thinking in Sets
Trees and Hierarchies in SQL
Post #1327237
Posted Wednesday, August 8, 2012 7:24 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 6:43 PM
Points: 36,752, Visits: 31,208
CELKO (7/9/2012)
Since that would require storing temporal data as character based data, I have to ask why would you do that?


I am naming a time period because T-SQL does not have an INTERVAL data type. No temporal math is done with it. It is the key to INTERVALs in the Calendar and report periods table.

This convention happens to have some major advantages:
language free
fixed length & short
easy regular expressions in DDL and DML
sorts with ISO-8601 display format
It is fast in a query.

Think about looking for all the Januaries in English versus "report_month LIKE '____-01-00' instead. Anyone got a better way to name a time period?



Maybe but the real key here is that I wouldn't name a time period, Joe. Why not use the number of months since "Date zero"?


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1342330
« Prev Topic | Next Topic »

Add to briefcase

Permissions Expand / Collapse