Toilets and Your Systems

  • I am sort of famous for being the "standards Uber Alles" guy in the SQL community. I'm very proud of it and it served me very well. I've written 10 books on SQL which adhere as strictly as I could to those standards, collected the royalties on those books and gotten free drinks at bars.

    However, I'm the first guy to admit that not all standards cover all things. A lot of basic definitions are not standardized. For example, what is the median? If I have the set {1,2,2,3,3,3}, how to find the median? One way says that it has to be in the set, so the only thing you can consider is {2,3}, but we don't allow a set result for a median, so pick one. The other way is to do an average so you use ((2+3)/2.) = 2.5 instead. However, I prefer to do a weighted median because I feel it's more representative of a central tendency to give us SUM(2,2,3,3,3)/5 = 13/5 = 2.6 instead.

    So which one should be a standard? For those of you are old enough to remember when Pascal was the "language du jour", they had a whole debate on how to define modular arithmetic. The problem is that when it goes to negative numbers in the two parameters, it's not well defined.

    Now going back to my defense standards. The one thing that they do at least is the terms which we use to describe things are standardized. We know that an expression is not the same as a function, that a variable is not the same as a constant, and so forth. We might disagree on how they interact or work in a language (I'm trying to get an article on this assignment statement and all the variations that it has, even though it looks like it's one of the most fundamental in any language).

    Damn, I guess this programming thing is trickier than it looked. 🙁

    Please post DDL and follow ANSI/ISO standards when asking for help. 

  • Awesome points. Thanks for sharing Joe.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • One of the practises that contribute to agile development (not Agile TM) is constant refactoring of team processes, code..etc to design out failure points.

    I do a lot of refactoring work on legacy code bases for data pipelines.  This is largely fixing the U-Boat bog and trying to remain civil with the person who last used it.  In most cases U-Boat Bog systems are poorly documented and the designers are long gone.

    Beyond the code itself I  capture how I think a system works in Confluence and/or GitHub Markdown files.  I then go through this with anyone who has been involved in the system that is still in the organisation.  This is to verify my understanding so I can correct and/or elaborate where necessary.  Even if it is just diagrams and bullet points what I have got is an agreed and shared understanding captured in a shared and commonly accessed location.

    I capture the processes team members have to do to make the system work.  As an outsider I find that some of the questions I ask reveal gaps, duplications or even zombie processes.  What I mean by a zombie process is one that you do, you don't know why and when you dig deeper you find it is to feed a system that is no-longer there.

    At this point you find things that can simply cease to be.

    I can refactor code to break it into simpler smaller, testable pieces.  In some cases the steps to take would be obvious without all the analysis I did earlier.  In others I'm looking at WTF code and really need to understand its intent.  Sometimes I just have to accept that the code is what it is.  Whoever wrote it produced a stream of unconciousness that defies analysis.

    If successful I will have a (largely) documented system that is a damn sight more maintainable and with mechanical tests that prove that the captured understanding of the system is what is happening.

    The tricky bit is to make sure that when I leave the client's team keep up the practises of testing, documenting and refactoring.  I was called in to help with a system I had dealt with previously to find that the team had disabled several of the tests.  When I asked why they said that the tests were disabled because they broke the build pipeline.  It turns out those failing tests were failing for a very good reason and that in disabling them they had allowed a whole series of bugs to re-infest their system.  They'd done the equivalent of pre-installing dry-rot in their basement.

     

     

  • The beauty of code, over toilets, is that we can, if we choose, improve the code. Tougher to do with toilets.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • So, on a submarine, you can't design the toilets to flush into the ballast tanks or maybe a designated septic tank that periodically gets evacuated?

     

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Eric M Russell wrote:

    So, on a submarine, you can't design the toilets to flush into the ballast tanks or maybe a designated septic tank that periodically gets evacuated?

    Heh... sounds like you already know the latter is the way they are designed.  And that bring us to what David Poole wrote above...

    "One of the practises that contribute to agile development (not Agile TM) is constant refactoring of team processes, code..etc to design out failure points."

    You have to be really careful here.  Constant "refactoring" of anything is a really bad idea.  It just like when one of the idiots onboard our sub suggested cross joining the potabable water system to the sanitary tank to "flush it out once in a while".  It's a REALLY stupid idea.  And, constant refactoring frequently results in constant changes and that ends up being a problem in itself.  To continue with the analogy, why would you continuously veer from a known safe course in narrow waters?  It's not an improvement and that's how some people end up with an automobile bridge as a new super structure.

    Find what works and continue to use it.  Don't continuously "refactor".  Make sure that you have good change control, which also includes some really good testing.  Otherwise, you've gone back to a cowboy environment.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden wrote:

    You have to be really careful here.  Constant "refactoring" of anything is a really bad idea. 

    I take your point though I  think you are mistaking refactoring for something more dramatic and dangerous like a wide scale or significant rewrite.  What I mean is more akin to spotting friction points, say a RBAR operation that should have been a set based operation.

    In our case all data pipeline code has to be covered with tests so you know if a change will break something.  A whole set of disciplines has to be in place  and rigorously enforced for it to be viable.

  • David.Poole wrote:

    Jeff Moden wrote:

    You have to be really careful here.  Constant "refactoring" of anything is a really bad idea. 

    I take your point though I  think you are mistaking refactoring for something more dramatic and dangerous like a wide scale or significant rewrite.  What I mean is more akin to spotting friction points, say a RBAR operation that should have been a set based operation.

    In our case all data pipeline code has to be covered with tests so you know if a change will break something.  A whole set of disciplines has to be in place  and rigorously enforced for it to be viable.

    Yep... but to continue my bit of objection to what is being said, a "whole set of disciplines has to be in place and rigorously enforced for it to be viable" IS the right way to go and has nothing to do with "constant refactoring".  If you have to constantly refactor a process, you're probably doing something wrong in designing the process and need to stop "refactoring" until you figure out what you're doing wrong.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 8 posts - 16 through 22 (of 22 total)

You must be logged in to reply to this topic. Login to reply