Hierarchies in SQL

  • danielk1 (1/28/2009)


    I just feel, to tie myself to a DB structure type, like you mentioned using the hierarchcical ID, might limit my application's portability - say, if I want to take it from MS SQL server to MySQL or Oracle or DB2.

    Doing a small amount of coding in order to remain with ISO standard fields, is in my opinion, making for a far better application.

    My 2cents.

    Just double checking... You've never used and will never use CONNECT BY until it becomes part of the ISO Standard. And you never used DATEDIFF until it became part of the standard. And you've never used GETDATE() because it's not part of the standard.

    Here we go again... I can understand that some GUI code might be made to follow ISO standards (heh, depending on which release you're talking about), but since none of the RDBMS vendors are 100% compliant with ISO standards and the very standards themselves tend to lag what's available as powerful extensions to ANSI SQL, I'll just go ahead and start a war by saying that true code portability in the batch world is a myth.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Just double checking... You've never used and will never use CONNECT BY until it becomes part of the ISO Standard. And you never used DATEDIFF until it became part of the standard. And you've never used GETDATE() because it's not part of the standard.

    Actually, no. I leave this to the code to handle.

    The DBAs in all the companies I work for, demand us to follow all ISO standards without deviation.

    Moreover, the applications I have written, both GUI and Web, are 100% portable to many of the top databases available including: MS SQL, MySQL, DB2, Oracle, and others (tested and implemented).

    I will not debate the issue as I am not saying I am right nor am I saying I am wrong - it is just one way of solving an issue. The article and others like it are of course another way. In IT, there are many ways to solve a given issue, some better than others.

    With this said, if performance using a non-ISO standard was significantly improved, the DBAs and other Stake Holders would have to evaluate their thinking and perhaps make an exception, however, as of now, performance is virtually instant and there are no areas to improve, so we will remain with the ISO standards.

    Why have standards if they are not to be followed and adhered? Sure, they may become outdated. No one is saying they are perfect and even if they were, in six months there will likely be something else better, different, or faster, requiring the standards to be revisited, invoking a new standard.

    Ahhh - I love IT - it is ever changing and evolving.

    Take care all.

  • Ah... so you're not a batch programmer. And that's the area of the code I was talking about. GUI programmers have all sorts of tools like Hibernate, etc, to write "portable" code... well, at least DB "agnostic" code so far as the GUI is concerned.

    It would be really interesting to see if any of your DBA's use GETDATE() or any of the other "proprietary" code offerings of SQL Server. I'd have to say that the first time they reference Master.sys.sysColumns, they're probably in violation. 🙂

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I have to admit, my code is usually about as portable as the Rocky Mountains. In other words, you could move it to another database engine in geologic time spans, but not much faster than that.

    I also have to admit that I've seen projects that were "completely standard and portable", where someone actually did try to move it to another RDBMS, and spent more time debugging because of partially documented differences, undocumented differences, variations in standards compliance in the RDBMS itself, etc., than they would have spent building the thing over from scratch.

    I have yet to see a case where code portability actually solved anything worthwhile for anyone. Doesn't mean they don't exist, just means I haven't seem them, in my limited experience.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • I have yet to see a case where code portability actually solved anything worthwhile for anyone. Doesn't mean they don't exist, just means I haven't seem them, in my limited experience.

    ++1


    * Noel

  • This article caught my attention because I deal with a related problem. I oversee the field structure for a company. The difference is that the structure is time-based; and a single Area can roll up to multiple parents based on the org type in use. This is primarily so I have the ability to re-state sales based on the structure that was in place on any particular date. The second item is that I am on SQL2K, with a near-term upgrade to 2005 coming, so I could not use any of the new CTE functionality.

    To do this, I had to use two table, and the "Stack" table requires a rebuild any time the hierarchy changes. (luckily not often)

    My tables are as follows:

    create table GeoParent (

    GeoParentId int not null identity(1,1),

    AreaId int not null references dbo.Geo(ID),

    ParentId int not null references dbo.Geo(ID),

    DateStart datetime not null,

    DateEnd datetime not null,

    OrgTypeId int not null references dbo.GeoOrg(ID)

    )

    create table GeoStack (

    GeoStackId int not null identity(1,1),

    AreaId int not null references dbo.Geo(ID),

    Lft int not null,

    Rgt int not null,

    Depth int not null,

    DateStart datetime not null,

    DateEnd datetime not null,

    OrgTypeId int not null

    )

    The "Stack" table is built by walking through changes to the Parent table and building the hierachy tree in a temp table, then writing it to the table after each iteration, which is currently performed in an external visual basic DLL.

    Once in place, I use code similar to danielk1 to get the hierachies:

    SELECT S2.AreaId

    FROM GeoStack S1, GeoStack S2

    WHERE S2.Lft BETWEEN S1.Lft AND S1.Rgt

    AND @date BETWEEN S1.DateStart AND S1.DateEnd

    AND @date BETWEEN S2.DateStart AND S1.DateEnd

    AND S1.AreaId = @tid

    AND S1.OrgTypeID=@org

    AND S2.OrgTypeID=@org

    Any thoughts?

    Thanks,

    Strom

  • Jeff Strom: That's basically the same as what I did, you just have it in two tables where I had it in one, and you have a time-sensitivity that I didn't have. Same concept though. What you've got should be pretty good for what you need.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • nice

  • Very interesting!

    Surfing the net is my hobby and posting blogs is my expression. I write different blog topics to various blog sites.
    Online Bingo Reviews
    Best Online Casino Games[/url]
    Best Bingo Sites

  • Hey Gus,

    Late to the party, I know, but you referenced this in another thread and I had a curiousity. Could you discuss the 'lazy updater' component a bit more. I'm not necessarily sure I understand how the 'temp range' vs. the 'real range' helps with update issues, since you're still locking and unlocking rows/pages/table using either column.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • Fascinating article! I have a couple of naive questions about the following bit of G code:

    Adjacency (NodeID, ParentID) as -- Adjacency Query

    (select 0, ID, ParentID

    from dbo.HierarchyHybrid

    where ID = @NodeID_in

    andexists

    (select*

    from dbo.HierarchyHybrid h2

    where h2.TopParentID = HierarchyHybrid.TopParentID

    and RangeStart is null)

    1. Is it supposed to have the "select 0, " or is that just a typo? It looks like the surrounding code is expecting 2 values there, not 3. If not a typo, I don't understand the "0" part.

    2. Why doesn't this work?

    Adjacency (NodeID, ParentID) as -- Adjacency Query

    (select 0, ID, ParentID

    from dbo.HierarchyHybrid

    where ID = @NodeID_in

    and RangeStart is null)

    Thanks, regardless of whether you have time to reply.


    "If I had been drinking out of that toilet, I might have been killed." -Ace Ventura

  • A very good solution and something that we've implemented in one of our solutions, the only difference is that we also have a RelationshipType table hung off the HeirarchyRelationship table.

    The code snippet below, is an example of how we set this up to cater for differnt Geographic needs with an organisation.

    CREATE TABLE [Common].[Geography] (

    [GeographyId] INT IDENTITY (1, 1) NOT NULL,

    [GeographyCode] NVARCHAR (50) NOT NULL,

    [GeographyName] NVARCHAR (150) NOT NULL,

    [Level] NVARCHAR (50) NULL

    );

    GO

    CREATE TABLE [Common].[GeographyRelationship] (

    [GeographyRelationshipId] INT IDENTITY (1, 1) NOT NULL,

    [GeographyId] INT NOT NULL,

    [ParentGeographyId] INT NOT NULL,

    [RelationshipTypeId] SMALLINT NOT NULL

    );

    GO

    CREATE TABLE [Common].[RelationshipType] (

    [RelationshipTypeId] SMALLINT IDENTITY (1, 1) NOT NULL,

    [RelationshipTypeName] NVARCHAR (20) NOT NULL,

    [RelationshipTypeDescription] NVARCHAR (50) NOT NULL

    );

    GO

    This allows us to create many different variations from a single list of nodes and at a variety of grains, some down only as far as sub-regions (State/County), others as deep as City Street.

    Using a CTE the SQL is very simular to the Article, with only the addition of a join to the RelationshipType table and filtered on the required Type of Hierarchy.

    _________________________________________________________________________
    SSC Guide to Posting and Best Practices

  • Interesting article, but I have a question, which hopefully will save me a lot of hours:

    In Oracle 10 I do it in one select statement, using START WITH and CONNECT BY.

    Now, migrating one of my customers to SQL 2008, is there anything close to it in SQL Server?

    Thanks in advance

  • Thanks for the article Gus. I'm glad I caught it this time around.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • Craig Farrell (11/19/2010)


    Hey Gus,

    Late to the party, I know, but you referenced this in another thread and I had a curiousity. Could you discuss the 'lazy updater' component a bit more. I'm not necessarily sure I understand how the 'temp range' vs. the 'real range' helps with update issues, since you're still locking and unlocking rows/pages/table using either column.

    The idea is to use Snapshot Isolation. That will allow the asynchronous update to complete without wrecking performance while it runs (which was a flaw in my original implementation). This could also be used in a purely nested sets hierarchy, so that infrequent updates don't interfere with concurrent reads. That's what I was refering to in that other thread.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

Viewing 15 posts - 16 through 30 (of 45 total)

You must be logged in to reply to this topic. Login to reply