Intricate Database Design for a Family Tree--Help?

  • Does anyone know of an article of how to design a database for a wiki application. As I work on my application I come to find out connections between topics grow exponentially and it is hard for me to figure out how to build tables to satisfy all the relationships.

    I'm building a family tree app. Let's say a sister is related to four brothers. On her page, there are links to each brother. When you click on a brother, it takes you to his page, and on his page it has a link to three brothers and one sister.

    The problem I'm having is the reciprocity back to the sister/brothers. How can I design a database where there are connections between sister and brothers without it getting too messy. I'm having a hell of a time figuring it out. The solution would require a sort of spider-web configuration between individual data as opposed to tables. It's hard trying to do it with tables. Or maybe I'm just not smart enough, ha-ha.

    I've created a table called siblings. In it I have sibling a, b, c and d. The only way I could make connections between all of them using tables is to connect each one with three of the rest and put them in the table. See below:

    Sibling 1 Sibling 2

    a b

    a c

    a d

    b a

    b c

    b d

    c a

    c b

    c d

    d a

    d b

    d c

    This looks okay if you only did this family, but what about adding bigger families. This table will be astronomical, particularly if there were say 11 kids! Mathematically it would be 11 * 10=110 entries just for that one family of 11 siblings. And what if another family wanted to add their kids to the database, I can see this table exploding into the billions for just a few thousand families. I don't think this is the best way to show relationships between siblings in a table. Does anyone have any simpler ideas or strategies?

    Any help will be appreciated. In the mean time, I'll keep plugging along.

  • The way I see it, a family tree is a tree. And in trees, you should only keep the parent's id in the record...Why would you want to specifiy the siblings explicitly? All siblings have same FatherID or MotherID (I prefer both)

    If you have a person record, and you have an ID primary key, you could save the parent ID in ParentID column for each of the siblings. This way, all implied relations are implied and not hard coded (sibling have same father/mother (not necessarily the same for all kids)), uncle has FatherID same as for one of my parents, etc.

    However, you should really consider what your goal is, as you will pretty soon bump into complicated queries (divorce = marital state is relevant between date1 and date2, etc.)

  • As usual, I jumped into the database design issue, overlooking your initial request:

    Does anyone know of an article of how to design a database for a wiki application

    Maybe yuo'll find some helpful info here:

    http://www.math.clemson.edu/~simms/genealogy/ll/gedcom55.pdf

  • Mark (9/12/2009)


    The problem I'm having is the reciprocity back to the sister/brothers. How can I design a database where there are connections between sister and brothers without it getting too messy.

    No reason to get messy.

    Lets go to the basics...

    How many entities do you have? two? correct! they are people and relationships.

    So... how do you implement that ER model?... two tables? perfect!

    People_Table

    One row per person including an unique identifier - lets call it PersonId

    Columns: PersonId [PK], LastName, FirstName, ... , etc, etc.

    Relationships_Table

    One row per each relationship a person in People_Table has.

    Columns: (PersonId_1, PersonId_2) [PK], Relationship

    Tables have a 1-n relationship like:

    People_Table > Relationships_Table

    Isn't it a beauty? 😎

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
  • There are many ways to implement this, each with its strengths and weaknesses.

    It's probably best to have an understanding of the issues you may face further down the track before you start.

    This is a pretty good summary of the options, and it is very user-friendly:

    http://blogs.msdn.com/anthonybloesch/archive/2006/02/15/Hierarchies-in-SQL-Server-2005.aspx

  • I'm in agreement with Paul W that

    There are many ways to implement this

    but... only one way would be elegant and efficient at the same time 😉 That's why data modeling still is an art, not a science.

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
  • I'd recommend setting up some look-up tables for normalization purposes.

    for example a relationships table

    1 - brother

    2 - sister

    3 - 2nd cousin twice removed

    etc.

    Then in your base table where your wiring things together you can keep things more efficient with rows that use those tiny numbers instead of the whole plain text description of the relationship

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply