Refactoring SQL Code

  • Comments posted to this topic are about the item Refactoring SQL Code, which is is not currently available on the site.

  • For me, the factors that stop me from refactoring are

    • I am not 100% sure of the blast radius of my change
    • It requires commitment from others, too
    • It detracts from what we have been committed to deliver
    • The system is going to be deprecated in the next 12 months anyway.

    If you get challenged on why a commitment wasn't delivered and say, I spent time refactoring 'X', the stakeholder is going to say, "Where did I prioritise or mention X?"

    Where I know the blast radius is within my control, then yes, I will refactor.

    I work for a company where refactoring has the blessing from on high.  It is seen as an efficiency and cost reduction activity.  Part of my role is to work out how to make refactoring easier.  As we deliver our solutions to clients in the cloud, this is actually a selling point.

  • I agree with David Pool except on his last point about the system being deprecated - I've heard that too many times and then seen the system still up and running 5-10 years later and still hearing "only 12 more months and it'll be gone". If the refactor makes sense, do it even if the system is supposed to only be alive for 12 more months unless you are going to miss deadlines. Plus run it past the project lead/your manager to make sure they are OK with you working on that.

    The other point I'd like to add about refactoring is the "why?". What is the benefit to the refactoring that you are doing? is it for performance and if so, can you quantify it to justify your refactor? If it is for readability, why wasn't that done initially and how did it pass code review if there are readability problems? Is it to correct security concerns or compatibility issues (like new .NET versions)? If so, may be worth the effort or may not. If the app is specifically designed for a specific OS version, then compatibility issues may not be worth fixing and if the app is designed to be run on an internet disabled computer, security concerns may not be as important. And so on... I need to justify the refactor first to myself then to the team as to why I'm doing this. If it is a "make work" project, then it's not worth the time.

    And when I say quantify the performance I mean if the SP runs in 0.5 seconds and my refactor is going to save 10%, the 10% sounds good on paper but doesn't make sense in that scenario. If it takes a minute to complete and my refactor will get it down to 30 seconds, that's a pretty good savings. If it is an ETL load that takes an hour AND runs during company downtime and I can get it down to 5 minutes, yes it is a HUGE performance increase, but nobody will notice because it runs during company downtime so no business impact to it being slow.

    My thought is that you have to make sure you can justify the code refactor before you start or people will think you are just not busy and need more work which may not be the case or that you are no longer required at the company. My current role doesn't involve coding but my previous role did and some refactoring I could justify some I could not...

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • Great article and well written post, Steve! I've been a software engineer for many years. Yes, I have refactored code but not always. As others have said, sometimes I've been afraid of the ramifications of changes, especially to reports. Another reason I will hesitate is if the developer who wrote the code originally is still around. If I see a problem, I'll contact that other person and suggest they need to make changes. Some people can be very touchy about you messing around in their code. My previous supervisor had a bad attitude towards people refactoring or rewriting code. It was his opinion that all software developers hate all other software developers and all we want to do is rewrite everyone else's code. 10 years ago, he assigned the worst codebase I've ever seen, to me. It was written by developers who left before I was hired. It was extremely hard to maintain because the original developers didn't document anything, they didn't save the code in any source control, and they didn't leave the code in one place. It took me a month to pull it together, so I could work on it. There were programs that would call other programs only to call a method in the other program of a third program, etc. But my boss ordered me to leave it alone. Eventually, due to upgrades to Windows and system routines this poorly written app called, it no longer can compile, but he still refused to let me update it. He retired last year, and I've finally gotten permission to rewrite it, thank God.

    I've also seen the "not invented here" attitude used to both not adopt an existing solution or not move forward with a better solution, because of NIH. The NIH attitude is, in my opinion, an example of what I've been struggling with to get people to adopt using Git, and that's culture. At the end of the day, the thing that holds anything back is most often culture, not technology. In many ways this is the thing that hurts the most and really detracts from working in such environments.

    Rod

  • I have to say that when it comes to refactoring code, it's usually my own code that I refactor. This is typically because I've learned more since I originally wrote the code, so there are better or more performant ways to write it. When I refactor someone else's code, it's usually because there's a performance problem with it, so I'll update it to run faster. When I do that, I also tend to update it if it also violates standards, provided that it doesn't impact how it would function or interface with other areas. In either case, I test the crap out of it to make sure that it still performs as expected and that my "enhancements" are worthwhile. There are a very few times when I have re-written code assuming it would run better, but if it made no difference (or, heaven help me, ran worse), then I'll throw out my changes and leave it as is. In those cases, I keep a copy to study so I can learn from it.

  • Mr. Brian Gale wrote:

    If it is an ETL load that takes an hour AND runs during company downtime and I can get it down to 5 minutes, yes it is a HUGE performance increase, but nobody will notice because it runs during company downtime so no business impact to it being slow.

    I must admit that in this case, I would still make the change. Nobody else may notice, but if I know about it and I can do something to improve it, it will bother me to no end until I can fix it. That may be a personal issue, but it's my cross to bear.

  • All the time, refactoring is pretty central to how I work because I spend a lot of time with inherited code bases. A great principle is never to try to boil the ocean - a small refactor that fits in time you can make available without derailing your goal and takes you closer to a good state will make often make it possible to fit another, perhaps larger one in with the next requirement-driven change. This is particularly true of sprawling analytical databases that have grown arms, legs and tentacles over time.

    A few scenarios worth picking out:

    • Not refactoring as such, but when I inherit some sprawling query spammed across the screen like some sort of text art, I always reformat with consistent indentation so that I can see the structure and start to reason about it.
    • Early career, I had to wade through a lot of PL/SQL, so did a lot of converting ANSI-89 joins to ANSI-92 ones - again, so that I could see the structure and reason about it.
    • Converting subqueries to CTEs, again because it makes the structure easier to see. Also window and apply (or cross join lateral for PostgreSQL and Snowflake people) are your friends. I'm a great fan of factoring out repeated expressions with apply/lateral.
    • The analytical system I mostly work on at the moment makes refactoring particularly safe and easy because it contains its own regression test - the final step copies materialized views to tables for external access, so to test, I can just run a partial rebuild and compare the layers.
  • Mr. Brian Gale wrote:

    I agree with David Poole except on his last point about the system being deprecated - I've heard that too many times and then seen the system still up and running 5-10 years later and still hearing "only 12 more months and it'll be gone".

    Brian, this is horribly true.  I know of a system that used to run in SQL Server 7 compatibility mode and was eventually ported to SQL Server 2014, where it lived past 2020.  It was one of those mission-critical systems that provided the finance department with the lifeblood of their spreadsheets.

    Anything that threatens the existence of a Finance spreadsheet will be fought to the bitter end.  Resistance is futile my A*&*

  • I tend to adopt Aaron's view a lot. I am refactoring my code and trying to make it better, but also help me clean things up that are an investment in the future. Especially if this code will be touched in the future, and I can improve either the performance, or the ability to context switch and understand it quicker, it's a good idea.

    I am very cognizant that there can be an impact on other code, so I want changes to not cause issues. I can't rename columns, parameters, etc. but if I can change the code to be cleaner (no subqueries, use an OVER(), etc.) without a visible change to callers, then great. I might rename PKs or other constraints to provide information.

    I look to do this in the margins, not as a project that impacts other deadlines. I've never had a job where I didn't need a break or a switch of work, even for a few minutes. Working on a refactor during those times provides some satisfaction, as well as a healther codebase for others. That's worth a few minutes and it's a nice change of pace.

  • I am currently in the midst of a refactoring project primarilly intended to "quiet" the log files.  Non-catestrophic errors and warnings are polluting the log files causing unacceptable growth.

    One challenge has been ensuring the client facing response is effectively one-to-one with the original code; even when that code is "wrong" or "broken."  I have caught myself and must err on the side of caution to not "fix" things I find along the way.  Although fixing edge cases may seem like the right thing to do, I've been burned when a downstream block expects the "broken" portion to continue to act in its present state.

    Anything severely out of line must incurr its own changeorder task with adequate tracking and regression to be addressed outsided the scope of the "refactoring only" project.

  • Very good point, @andrew.ssc.

    Rod

  • David.Poole wrote:

    Mr. Brian Gale wrote:

    I agree with David Poole except on his last point about the system being deprecated - I've heard that too many times and then seen the system still up and running 5-10 years later and still hearing "only 12 more months and it'll be gone".

    Brian, this is horribly true.  I know of a system that used to run in SQL Server 7 compatibility mode and was eventually ported to SQL Server 2014, where it lived past 2020.  It was one of those mission-critical systems that provided the finance department with the lifeblood of their spreadsheets.

    Anything that threatens the existence of a Finance spreadsheet will be fought to the bitter end.  Resistance is futile my A*&*

    Wish  I knew your tricks! I have a SQL 2000 instance that I've been told for YEARS is only going to be around for 6 more months and it never dies. Worst part is that the stupid software that is tied to it breaks if we upgrade and the team that supports it refuses to update the code because the system is going to be retired "soon".

    I really hate the temporary-permanent solutions as they never go away and slowly become impossible to support... BUT at least it is in SQL and not Excel or Access even if it is an old SQL

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • No company can afford a highly skilled coder to go around refactoring their code base and shouldn't use an inexperienced coder for the task.  When does it make sense to refactor?  I can think of a couple of opportunities for refactoring that may make sense.  When a piece of code is causing a bottle neck and needs to be fixed or when a feature is being added to existing code since you are going to be modifying the code, anyway.  In both cases it is necessary to understand the code and an evaluation needs to be made to determine how much refactoring can be justified. Using your editor to reformat the code to company standards or default standards of the editor is a quick refactor that makes understanding the code easier.  If documentation is sparse, the person doing the work needs to document how the current code works as comments in the code because it takes less time and is readily available to the next coder.  Then the logic needs to be understood and dependencies identified and documented for the next time modifications are needed.  Next the bottle neck needs to be identified and corrected or the new code added.  Finally, depending on the skill level of the coder, any obvious bad coding should be identified and evaluated if the time it takes to make the changes are a good investment now or if the refactoring should be delayed to the future.

  • doysheaf wrote:

    I am currently in the midst of a refactoring project primarilly intended to "quiet" the log files.  Non-catestrophic errors and warnings are polluting the log files causing unacceptable growth.

    One challenge has been ensuring the client facing response is effectively one-to-one with the original code; even when that code is "wrong" or "broken."  I have caught myself and must err on the side of caution to not "fix" things I find along the way.  Although fixing edge cases may seem like the right thing to do, I've been burned when a downstream block expects the "broken" portion to continue to act in its present state.

    Anything severely out of line must incurr its own changeorder task with adequate tracking and regression to be addressed outsided the scope of the "refactoring only" project.

    Those are the tough projects. usually I aim to go quicker, even if it's less accomplished, and get something accomplished under lesser scope. At least I get some part of this done until dependent systems can adjust.

  • hmbacon wrote:

    No company can afford a highly skilled coder to go around refactoring their code base and shouldn't use an inexperienced coder for the task.  When does it make sense to refactor?  I can think of a couple of opportunities for refactoring that may make sense.  When a piece of code is causing a bottle neck and needs to be fixed or when a feature is being added to existing code since you are going to be modifying the code, anyway.  In both cases it is necessary to understand the code and an evaluation needs to be made to determine how much refactoring can be justified. Using your editor to reformat the code to company standards or default standards of the editor is a quick refactor that makes understanding the code easier.  If documentation is sparse, the person doing the work needs to document how the current code works as comments in the code because it takes less time and is readily available to the next coder.  Then the logic needs to be understood and dependencies identified and documented for the next time modifications are needed.  Next the bottle neck needs to be identified and corrected or the new code added.  Finally, depending on the skill level of the coder, any obvious bad coding should be identified and evaluated if the time it takes to make the changes are a good investment now or if the refactoring should be delayed to the future.

    I think it always makes sense to refactor along the way. The main reason is that Steve from today knows more and has a better handle on the system than Steve that wrote the code last year. I wouldn't refactor the code I wrote last week (usually), but the code I encounter from last year that I know could be structured, formatted, etc. in a better way.

    Your team should be growing and learning new techniques. You might get some new items for free in an upgrade, but you have to rewrite code to use them. However, you have to decide to rewrite code.

    This isn't something I'd say is a big project. It's as I'm working with some object/module/etc., if I see something bad, then if:

    • I know a better way
    • have decent test coverage
    • and its' not a time sink

    I'd fix it.

    This might be adding aliases that we've decided are easier to read. It might be just reformatting code that's always a pain to work with. It might be cleaning up a know bad structure (subquery to join) that performs better.

    Maybe I change a SELECT * to a limited list.

    Those are all refactorings

Viewing 15 posts - 1 through 15 (of 15 total)

You must be logged in to reply to this topic. Login to reply