The Daily Lookup

  • I'm with Chris Smith, leave this guy alone! Without the rest of the story you don't know what the purpose of the design is. I've seen date lookup tables used in BI/OLAP applications with great success. In fact when you start dealing with writing reports for accountants and dealing with holidays and other "wierd" date related requests this is often the only solution that is mantainable. My opinion is, "yes it looks funny to OLTP guys but it makes sense to BI/OLAP guys"

  • Let's have a big "What he said!" for the people who understand that this design looks reasonable for a data warehouse structure (Chris, Ben et alia). Any good OLAP endeavor will eventually approach the Fifth Normal Form which frightens the pants off of an unenlightened relational dba-type. The Daily WTF seems to be populated largely with dinos who couldn't spell MDX if their lives depended on it.

     

    They act like they've never seen a Time dimension before. They aught to look at some best practices design samples before they go calling the kettle black. Looked at AdventureWorks DW recently, anyone? Project REAL?

     

    In the multidimensional world there is no case for "strategic denormalization" (for performance reasons or any other). In order to provide maximal, efficient slicing/dicing on every dimension, the absolute minimum number of non-integer attributes should be included in the fact design. Strings and date/times are eeevil. In fact, all non-numeric data types should be verboten in a data warehouse! All keys must be integer (or I'll rap your knuckles with my sliderule). I might let you have one or two non-numeric attributes, but you better justify them with lots of freaking measures! (OK, now I'm the one who's being bombastic. Heh.)

     

    Some of the attribute names in the WTF design could be improved (like DateDate - duh!) and an object suffix like _tb (for table, I presume) is just silly, but that's all semantics.

     

    The developer in the story, the dba in the story and nearly every poster @ the WTF need to go back to BI school. Talk about the blind leading the blind... Repeat after me: A Star Schema is a Good Schema. Now your penance is to go write it 1,000 times. In binary.

  • Chris's point is dead on. In a data warehouse environment, this is the proper way to handle dates. However, usually you store the actual date as well as the date key. This allows the user to constrain on a date in the usual fashion without requiring another join, but provides the structure to answer business questions involving the nature of the dates (how many reservations were made on a holiday weekend) if needed.

  • While this conceptual design does look silly on its face, I wouldn't conclude that it was inappropriate without knowing the full background of the problem space it is modeling, and the solution(s) it is supporting.

    I can imagine a scenario where lots of instances of lots of different types of events need to be linked to dates, and relating to the date objects rather than storing date values is a better overall implementation. Sure, its unlikely, but it is possible.

    In other words, we shouldn't condemn the solution without really understanding the problem.

    Jett

  • Please, people

    It was clearly stated on the dailyWTF forum that this WAS NOT AN OLAP APPLICATION, so this leads us to the conclusion that this DB design sucks.

    The workaround in this case was to upload 4 year's worth of dates so the customers can make reservations or buy tickets. This is also stated in the original post on the dailyWTF.

    This screenshot is supposed to make you think of bad design and maybe throw a couple of jokes at the forum, that's how the dailyWFT is supposed to work. Do you guys ever have a sense of humour?

    So, don't be surprised if all of you end up in a post on the dailyWTF: "Look, these people are trying to defend an utterly bad design we posted here.... hehehe, snickers"

    Mike Rod

  • To be honest, I've seen worse, and actually the system isn't as bad as it first appears.

    I don't know if you have ever dealt with date issues, globalization, time offsets, etc, all simultaneously, but it's not a pretty site. As example test of your globalization skills, try taking a report of events happening with GetDate(), then roll up by hour, based on the client's timezone, crossing over a daylight savings change, and output in the client's date format. You'll soon see what I mean. I'd love to see your SQL on that one. Wait maybe I wouldn't.

    Also, currently, I'm doing development for an accounting system, and quite honestly, a date table would actually make my life a factor easier rather than harder, considering they use 2 fiscal weeks per fiscal pay period, 26 per year, oh wait, some years there are 27 pay periods. Considering the calculation for this is rather complex, it seems an additional join against a properly indexed date table would be cheaper. Just be sure not to mistakenly cartesian join.

    Now another issue I'd like to point out is a lot of times you do something for brevity's sake and later it gets used a lot more than you ever thought. Not many people have enough budget to think every aspect out 30 years ahead, and scale to the moon. You make it work for what you would guess would be a few years, and if you get performance problems, you deal with them as they arise, not to over design.

    Just my two cents,

    John

  • I want to bypass the "Is theis a WTF or not" part of the argument and address another thing mentioned.

    It's an attitude pervasive to this site and, indeed, to the DBA community as a whole. This sentence helps to sum it up:

    Most of you who have read this for awhile know I like to poke fun at developers who dabble in SQL.

    There is a pervasive problem in the application programming and database programming community regarding this attitude.

    First, DBAs seem to believe that they are responsible for the data INSIDE of the database. And far too many companies misunderstand the role of DBAs and force them to do so.

    Second, far too many programmers (I hate the word "developer") believe that they need to know whatever language is hip and feel that they can rely on the DBAs to help with the "database stuff." This is akin to working as a mechanic in a big rig repair shop and not knowing how to work on diesel engines ("But I know how to work on the chassis and structure..."). Any person who works as a database programmer must be introduced and LEARN RELATIONAL THEORY. And not just "SQL". That's akin to learning the language without understanding the underlying theory and math involved (Also known as "getting one's certification").

    Third, it seems almost everybody believes the division of labor is binary, when in fact it is ternary:

    a. DBA.

    b. programmer.

    C. DATA MODELER.

    And the data modeler must come from the programming side to be any good. (S)he must understand performance issues and perhaps some experience as a DBA would help this greatly, but a programmer who thoroughly understand relational theory should be the primary path to data modeling.

    Sorry to seem so confrontational, but the data modeler/data architect role is one that is routinely ingored or forgotten or glossed over in the IT world. Everybody feels that they can just slime into the position without real education and working experience.

  • "Everybody feels that they can just slime into the position without real education and working experience."

    That's because when management doesn't really understand what the job entails, and they just know they don't want to do it, they just recruit a warm body. It's a mixed blessing. On one hand, you have people screwing up as they "learn the ropes". On the other hand, you potentially get a job/contract to fix it.

    Another reason why it's typically ignored is because most shops grow from small shops (read one guy), which means basically one guy started doing everything, then it got split up. Often the responsibilities cannot be divided so cleanly like union work for the movie studios. That's not my job doesn't fly too well in small environments.

  • Everyone seems to be looking at this database as though it was created all at once. Maybe it was done a part at a time. Sometimes what works for a small database doesn't work as well when you add more tables, but you are stuck with the existing design.

    -SQLBill

  • Yeah, I know management is probably one of the biggest offenders. I'm with you on that one; it's frustrating.

    As to the second point, I don't buy it so much. But I see what you're saying. I work in a very small company now (10 IT, 35 total). It's wonderful and I have a lot of latitude, respect, and RESPONSIBILITY (can't hide in the system anymore). The original systems are a NIGHTMARE. I'm tasked with modeling what was intended, fixing what might break, maintaining what needs to be generated, and working on new projects, creating new systems. So much fun. But it doesn't excuse writing garbage. If you're a data company, hire data people. And small companies aren't good for people just starting out in their careers (seems obvious to me, now).

    My point is that I think it's systemic, and we modeler/architects need to get the good news out (so to speak)....

  • "But it doesn't excuse writing garbage"

    It's been said before, but without more details, I can't say I agree this design is garbage. I might not have come up with it, but that doesn't make it garbage. However, I do think an automated precaution should be put in place to prevent the need to add date records by hand.

    Depending on the number of records expected to be returned, and proper indexing/db stats, you could likely get away with a design like this, and still have stellar performance. A properly indexed schema with 10 joins returning one record is still only 10 index lookups. It just depends on how high you need the system to scale. I've worked on systems that have gotten 1 million hits/day, and systems with over 12 million records in single tables.

    So, in summation, without further info, I can't say the WTF is the schema, just why the date records required manual additions.

    Just my two cents,

    John

  • "I don't know if you have ever dealt with date issues, globalization, time offsets, etc, all simultaneously, but it's not a pretty site. As example test of your globalization skills, try taking a report of events happening with GetDate(), then roll up by hour, based on the client's timezone, crossing over a daylight savings change, and output in the client's date format. You'll soon see what I mean. I'd love to see your SQL on that one. Wait maybe I wouldn't."

    You wouldn't do that in SQL but in a reporting framework like Crystal Enterprise or Reporing Services, wouldn't you?

    -- J.T.

    "I may not always know what I'm talking about, and you may not either."

  • Raise your hand if you have never inherited a system, program, database from someone else, taken a look at it, said one or more WTFs, and then found out later that wasn't as WTF as your thought and actually made sense once you understood the thinking and reasoning behind the design.

    Anyone?

    Right.

    The point, I guess, is that there is always someone who thinks the way you do something isn't the best way and can give you ten reasons why. Most likely, all of us are one of those people too.

    Additionally, a lot of things in IT don't have the luxury of growing up mature. That is to say that a lot of work in IT gets done in an expedited manner because deadlines sometimes aren't flexible and it just has to work. Not many applications wind up being built exactly the bext way they could be.

    The WTF we are looking at may very well be a good example. The date table and all those FKs were added to solve a particular problem at a particular point and no one had the time or inclination to go back and fix it...or they couldn't because it had become part of the system and to change it would require more resources than was available.

    I can give you dozens of examples. Like the database I dealt with where the designer had used a float(14,2) as the ID column in some tables and a int in others.

    I think we've proven that one person's WTF is another's good practice.

    -- J.T.

    "I may not always know what I'm talking about, and you may not either."

  • "You wouldn't do that in SQL but in a reporting framework like Crystal Enterprise or Reporing Services, wouldn't you?"

    I've never seen a reporting framework flexible enough to deal with those issues properly, and roll-up the dates and times based on a timezone from the requested web-client. This is not to mention in some countries, daylight savings is determined by government decree, not a formula. If you know otherwise, let me know.

    John

  • The latest version of Crystal Enterprise could probably do that. Something to look into.

    -- J.T.

    "I may not always know what I'm talking about, and you may not either."

Viewing 15 posts - 16 through 30 (of 35 total)

You must be logged in to reply to this topic. Login to reply