Agile and Data Modeling

  • Agile is fast approaching our company, and although I can really see some of the benefits of it, I am having a hard time applying to some of what my team does.

    The DBA team is also a database developer team, so not only do we do the typical DBA work, we also handle the data modeling, schema creation, capturing and creating the business logic in the stored procedures,etc.

    I feel like the agile method and data modeling/design are kind of at odds. Is it possible to design a well performaning database without the whole picture or a lot of refactoring down the road? And once the database is in use that refactoring can become quite painful with all the data.

    Right? wrong?

    Thanks

  • Agile done right works just fine with database modeling. Agile done wrong digs a deep dark pit of pain into which data modeling is thrown and suffers until someone realizes they're doing Agile wrong or they decide that they can "fix" the problem by introducing an Object to Relational Mapping tool and tossing the data model entirely (which digs a different deep dark pit of pain into which database performance and data integrity is thrown).

    Agile, in and of itself is not an issue. But, to really do Agile correctly requires quite a bit of discipline. I've been on projects where, because they had no discipline at all, we would go through multiple phases of "REFACTOR ALL THE THINGS" requiring complete database design overhauls. So, first thing, you have to ensure that the people managing the project, really manage it and instill the discipline necessary to make agile work. After that, you, as a DBA, need to adopt, as much as you can, everywhere you can, development methodologies as your own. I do seminars on why this is important, so I won't be able to type it all here, just believe that, you need to work like the developers do, as much as you can.

    Some things that will help. Scott Ambler has some good stuff[/url] (although he goes off the rails occasionally). More here. This is also an interesting white paper.

    Overall, it doesn't have to be a bad experience. It can even be a good one. But if you start seeing "REFACTOR ALL THE THINGS" and "all we need to do is let the ORM tool build the database" types of stuff happening, I promise you, the world will be a horrible place.

    In order to get your database into source control (assuming ORM isn't running things) you're going to want some tooling. Visual Studio Data Tools can do it for you, but you'll have to work within Visual Studio. If you want to stay in SQL Server Management Studio, I'd suggest looking at Red Gate SQL Source Control (disclaimer, I work for Red Gate).

    If you need help, get in touch. I've done this before, a few times.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Grant Fritchey (12/12/2013)


    . So, first thing, you have to ensure that the people managing the project, really manage it and instill the discipline necessary to make agile work.

    So I'm completely *bleeped* :unsure:

    Thanks for the info I will read it over.

  • joe.eager (12/12/2013)


    Grant Fritchey (12/12/2013)


    . So, first thing, you have to ensure that the people managing the project, really manage it and instill the discipline necessary to make agile work.

    So I'm completely *bleeped* :unsure:

    Thanks for the info I will read it over.

    Probably.

    The worst sort of management types just love the concept of agile. "Oh cool. We don't have to plan anything any more. We'll be AGILE and everything will be unicorns and rainbows"

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • The only thing worse than data modelling and Agile is data modelling without Agile.

    Strangely I have been researching this all of today and I'm going to be a bit Off topic but I will come back... This guy has some interesting things to say about the Data Vault model and Agile

    http://www.youtube.com/watch?v=e0cHFdeGEeE

    The Data Vault model theory which to me seems to sit quite nicely between 3NF OLTP systems and Dimensional Star schemas is a hyper normalised 3NF model.

    Entity keys become HUBS

    Entity relationships become LINKS

    Entity attributes that are not part of the key become one or more SATELLITES.

    SATELLITES can be attached to HUBS or LINKS (but not other SATELLITES)

    you can have more than one SATELLITES per HUB/LINK - either along differing source system data for the same HUB type or along speed of change (e.g. Order status and value go into a fast changing SATELLITE, Order date and payment method go into a different SATELLITE; this way you don't have to track all the fields in SCD)

    If organised properly you can load all of your HUBs in parallel as there are no dependencies between them. You can then load all your LINKs in parallel because all your HUBs are up-to-date.

    Never clean or reject data on the way into the Data Vault: it should represent the state of the data from the source systems. Federate and reject the data from the DV to the star schema

    The other killer APP seems to be BiML (Business Intelligence Markup Language) with it's ASP-like BiMLSript. I just watched a webinar where they guy wrote about 20 lines of code and created data transfer packages for all the tables in AdventureWorks. Refactoring the script to change where the ASP FOR-EACH loop was, changed that to one package with N number of Data Flow Tasks and rebuilt the dtsx in about 30 seconds - WOW!

    BiML lets you concentrate on the data patterns rather than putting the jigsaw together hundreds of times. It looks like it would be particularly useful where you need templates as you can include external scripts and then regenerate the packages so if you have a problem with (lets say) the onError handler in a package you can modify it once in the BiMLScript and all packages that use that script get rebuilt.

    Although not Agile in themselves, these two concepts together look to me to be key requirements for rapid DW development.

  • I just came back from Scrum training and since my company relies on what i would call a "robust" layer of data design and I had some of the same thoughts that you do.

    I think your instinct is right in that in an Agile environment you don't model and build an entire data layer as a distinct phase in your development (that would be closer to waterfall). In Scrum the goal is to always be building thin vertical slices of potentially shippable software, so you wouldn't dedicate an entire sprint or sprints to working out the data layer design. Instead, if you were building, say a CRM, you might spend Sprint 1 just building the data layer required to store a simple Customer Object; later, maybe in Sprint 3, you might work on a story involving storing multiple addresses per customer so then you'd model/build a CustomerAddress story.

    What's probably going to happen is that by Sprint 8 you'll find you're throwing out or refactoring a chunk of the work that you did in Sprint 1 b/c the stories will have evolved, but that's supposed to be okay - the work wasn't "wasted" since it helped to lead to a more robust product.

    I'm just starting to figure out how to apply this framework in my environment, so I can't vouch for its success but my engineers seem willing to try it.

    I guess the approach is that in each Sprint you should keep the big picture in the side view (don't model with blinders on) but your deep focus should be on just developing the stories in that Sprint backlog and not doing work that doesn't apply to those stories.

  • I agree, on doing small slices of the data model, I guess I'm just worried about the re-factoring chunks near the end. But as was mentioned earlier, those stories will take longer since they involve re-factoring and moving data into the new model.

  • joe.eager (12/13/2013)


    I agree, on doing small slices of the data model, I guess I'm just worried about the re-factoring chunks near the end. But as was mentioned earlier, those stories will take longer since they involve re-factoring and moving data into the new model.

    To be honest that's the part that sticks in my craw a little, too. It's a bit of a leap of faith that a lot of late sprint refactoring will add enough business value that it's worth it. The idea is that the final design will be more efficient/stable, etc. b/c it wasn't set in stone early in the project (which is the stage when you know the least about what the final product will look like).

  • Steve Thompson-454462 (12/13/2013)


    joe.eager (12/13/2013)


    I agree, on doing small slices of the data model, I guess I'm just worried about the re-factoring chunks near the end. But as was mentioned earlier, those stories will take longer since they involve re-factoring and moving data into the new model.

    To be honest that's the part that sticks in my craw a little, too. It's a bit of a leap of faith that a lot of late sprint refactoring will add enough business value that it's worth it. The idea is that the final design will be more efficient/stable, etc. b/c it wasn't set in stone early in the project (which is the stage when you know the least about what the final product will look like).

    It really can work, and work well. I've seen it done right. Unfortunately, I've seen it done wrong more often. It's all about buying into the concepts expressed by the DevOps people and getting behind the idea that development is just a part of the IT life cycle, not a separate concept as has so often been the norm. But boy does it take a lot of discipline to do it well.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • My company has our own flavor of Agile. When we initially started brainstorming about adopting Agile last year, one thing we all agreed is that the first initial sprints will be design sprints. We try to come up with conceptual/logical design (alteast the core entities and relationships) in the design sprints. As we start getting into the development sprints, we flush out the detail design.

    We completed four projects so far. Seemed to be working fine so far. It is not that scary as I initially thought.

    Good luck

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply