Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase 12»»

Tell me if I'm crazy - Recursive CTE + View (Materialized, indexed?) Expand / Collapse
Author
Message
Posted Wednesday, July 17, 2013 9:09 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Sunday, July 21, 2013 9:55 AM
Points: 6, Visits: 13
First off, I'm a .NET developer, and not a SQL wonk. Please be gentle. And thanks for your help.

So I have a lattice (a multiparent hierarchy) of objects representing sets of people sliced along some axis. Currently these sets exist as documents in RavenDB. Example:

Set
{
Id: 'some-guid',
Name: 'RegionName',
Dimension: 'Location',
ImmediateParents: ['containing-set-id', 'containing-set-id'],
ImmediateChildren: ['contained-set-id', 'contained-set-id'],
ContainedBy: ['denormalized', 'list', 'of', 'all', 'sets', 'that', 'contain', 'me']
}

Person
{
...
ImmediateParents: ['sets', 'i', 'am', 'assigned', 'to', 'directly'],
ContainedBy: ['denormalized', 'list', 'of', 'all', 'sets', 'that', 'contain', 'me']
}

The bulk of our queries are either direct reads from one of those lists, or "give me all the people contained by this set" query. Obviously our current implementation makes those queries really fast, but keeping all this updated is a delicate dance.

It occurred to me that having a normalized representation in SQL, and then building a view on top of it to denormalize the "contained by" might get me off the dancefloor. And that if I could materialize (usage?) that view and index it, I could get roughly the same performance profile I have now, with the workload heavily weighted toward write time.

Then tonight I learned of Recursive CTEs and the whole thing gelled for me. But I'm betting on some pretty big assumptions about features of Views that I don't understand. Am I sniffing up a good tree? Thanks for any insight.
Post #1474869
Posted Thursday, July 18, 2013 12:15 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Sunday, July 21, 2013 9:55 AM
Points: 6, Visits: 13
bump. Any thoughts at all? Is my question clear?
Post #1475166
Posted Thursday, July 18, 2013 12:52 PM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Yesterday @ 3:12 PM
Points: 13,441, Visits: 12,303
It sounds like you are thinking about having 3 columns that each have a delimited list of IDs or something like that? That is certainly not a normalized version. Or maybe I am misunderstanding what you are trying to do.

_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Moden's splitter.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs
Understanding and Using APPLY (Part 1)
Understanding and Using APPLY (Part 2)
Post #1475176
Posted Thursday, July 18, 2013 1:10 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Sunday, July 21, 2013 9:55 AM
Points: 6, Visits: 13
I'm thinking these tables:

Set(
Id,
Name,
Dimension,
...)

Person (
Id,
Name,
...)

SetRelationships (
ParentSetId,
ChildSetId
)

PersonSets (
PersonId,
SetId
)

And then these tables give me a lot of my queries directly. The ones I want to denormalize would be

PersonContainedBy(
PersonId
SetId
)

SetContainedBy(
SetId,
AncestorId
)

Which would be a flattening of the hierarchy described in the tables.

I.E. if A contains B, and B contains C per the tables, Then C would have records for A and B in the ContainedBy view.

I guess my question is if I build a materialized (or indexed) view over a Recursive CTE, how performant will that be? If I change the underlying data, will it have to recalculate the whole view, or just the changed records?
Post #1475182
Posted Thursday, July 18, 2013 1:32 PM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Yesterday @ 3:12 PM
Points: 13,441, Visits: 12,303
jace.bennett (7/18/2013)
I'm thinking these tables:

Set(
Id,
Name,
Dimension,
...)

Person (
Id,
Name,
...)

SetRelationships (
ParentSetId,
ChildSetId
)

PersonSets (
PersonId,
SetId
)

And then these tables give me a lot of my queries directly. The ones I want to denormalize would be

PersonContainedBy(
PersonId
SetId
)

SetContainedBy(
SetId,
AncestorId
)

Which would be a flattening of the hierarchy described in the tables.

I.E. if A contains B, and B contains C per the tables, Then C would have records for A and B in the ContainedBy view.

I guess my question is if I build a materialized (or indexed) view over a Recursive CTE, how performant will that be? If I change the underlying data, will it have to recalculate the whole view, or just the changed records?


I guess I am missing why you think you need a recursive cte for this. Your data is denormalized so it would seem to eliminate the need for recursion.

Your definition of the problem is so vague that I am having a hard time envisioning what you are doing here.


_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Moden's splitter.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs
Understanding and Using APPLY (Part 1)
Understanding and Using APPLY (Part 2)
Post #1475192
Posted Thursday, July 18, 2013 1:48 PM
SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Sunday, September 29, 2013 1:24 AM
Points: 429, Visits: 1,721
Maybe this link will help.

http://www.sqlservercentral.com/articles/Hierarchy/94040/

 
Post #1475204
Posted Thursday, July 18, 2013 1:56 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Sunday, July 21, 2013 9:55 AM
Points: 6, Visits: 13
First, thanks for your time. Sorry If I'm communicating poorly.

Imagine the sets describe locations: North America, USA, Georgia, Fulton County, Atlanta.

I might describe this relationship in a relationship table (I'm excluding the actual person and set tables and using names instead of Ids for clarity):

Parent Set    | Child Set
============================
North America | USA
USA | Georgia
Georgia | Fulton County
Georgia | Dekalb County
Fulton County | Atlanta
Dekalb County | Decatur

Person | Set
================
Joe | Atlanta
Jane | Decatur

Now, I have an extremely frequent and latency-sensitive query that says "Give me all the people (or sets) under Georgia". My thought is to make this query really fast by projecting it into this shape:

Person | Contained By
==============
Joe | Atlanta
Joe | Fulton County
Joe | Georgia
Joe | USA
Joe | North America
Jane | Decatur
Jane | Dekalb County
Jane | Georgia
Jane | USA
Jane | North America

And then another one for SetsContainedBy.

Sadly, Using actual snowflake schema has two distinct disadvantages in my case, the first is that sets (and dimensions) are really dynamic, and that I can have sets be child to multiple other sets (for example, Atlanta actually lies in two counties, Fulton and Dekalb).

I hope that clarifies my goal, and why the recursion seems like an obvious win to a non-dba. Thanks again.
Post #1475207
Posted Thursday, July 18, 2013 6:36 PM


Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: Yesterday @ 8:29 PM
Points: 3,648, Visits: 5,322
Are you looking for something like this?

DECLARE @AdjacencyList TABLE
(
Parent VARCHAR(20)
,Child VARCHAR(20)
);

DECLARE @PersonsLocations TABLE
(
Person VARCHAR(20)
,Location VARCHAR(20)
);

INSERT INTO @AdjacencyList
SELECT 'North America','USA'
UNION ALL SELECT 'USA','Georgia'
UNION ALL SELECT 'Georgia','Fulton County'
UNION ALL SELECT 'Georgia','Dekalb County'
UNION ALL SELECT 'Fulton County','Atlanta'
UNION ALL SELECT 'Dekalb County','Decatur';

INSERT INTO @PersonsLocations
SELECT 'Joe','Atlanta'
UNION ALL SELECT 'Jane','Decatur';

WITH ExpandHierarchy AS (
SELECT n=1, Person, Location, Parent, Child
FROM @PersonsLocations a
INNER JOIN @AdjacencyList b ON b.Child = a.Location
UNION ALL
SELECT n+1, a.Person, a.Location, b.Parent, b.Child
FROM ExpandHierarchy a
INNER JOIN @AdjacencyList b ON a.Parent = b.Child
)
SELECT Person, Location
FROM (
SELECT n, Person, Location=Parent
FROM ExpandHierarchy
UNION ALL
SELECT 0, Person, Location
FROM @PersonsLocations) a
ORDER BY Person, n;





My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.


Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
Splitting strings based on patterns can be fast!
Post #1475242
Posted Thursday, July 18, 2013 8:35 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Sunday, July 21, 2013 9:55 AM
Points: 6, Visits: 13
Yup that just about does it I think. Thanks so much for that.

Particulars of the projection aside, my original question was about how this style query plays with views. I want the results of that query indexed, and kept largely in sync with the adjacency list (learned a new term, thanks, guys). What is my best option? Will a View over that query be performant in the face of the occasional update? Or will it have to rerun the entirety of the query and recreate the index? Should I just run the projection on a schedule or try to work out something with triggers?
Post #1475254
Posted Thursday, July 18, 2013 8:38 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Sunday, July 21, 2013 9:55 AM
Points: 6, Visits: 13
Steven Willis (7/18/2013)
Maybe this link will help.

http://www.sqlservercentral.com/articles/Hierarchy/94040/

 


Thanks! I found that link very helpful. The Nested Sets structure is not appropriate for me case, as I don't have a strict hierarchy, but a lattice. The solution presented in Part 2 seems promising, but I'm still trying to digest the primer materials.
Post #1475255
« Prev Topic | Next Topic »

Add to briefcase 12»»

Permissions Expand / Collapse