Code smells versus transgressions

  • Comments posted to this topic are about the item Code smells versus transgressions

    Best wishes,
    Phil Factor

  • From the article:


    ...and I've never found a good use for SQLVARIANT in a production table column. Am I wrong?

    Not much but, yes... kind of and only if you believe in "column based audit tables" that are in the form of EAVs.

    We have some ridiculously wide tables, all "designed" by the previous regime of self-appointed "experts". Some of these table are necessarily audited and they decided to use "column based triggers" to populate an EAV table as an audit table. They had originally made the "FromValue" and "ToValue" columns each a VARCHAR(8000) which, of course, required explicit conversion "on the way in" and raised all sorts of hell when trying to find things like dates and numeric values. Of course, they also did really stupid things like add commas to some of the values to "make reporting easier".

    Along with that, they made the mistake of using a "self programming CLR trigger" on each of these tables, which required full copies of the INSERTED and DELETED rows from the trigger because they were out of scope for the comparisons and conversions in the CLR. It was taking 4 minutes to audit just 4 columns on just 10,000 rows.

    I wrote code to generate hard-coded triggers (if someone changes the base table, they just need to rerun the code to regenerate the trigger) and a system to "collect" all the audit rows in a staging table with no indexes and insert them into the audit table en masse once a minute instead of constantly hammering the audit table, which had become a large source of blocking with the CLR triggers.

    A large part of the speed increases and simplification for reporting was due to us changing the "FromValue" and "ToValue" columns to SQLVariant, which not only negated the requirement to do conversions but also allowed indexes to work better because they no longer did implicit conversions which slowed things down a lot due to full index scans.

    And, yes... I've found a strong correlation between the thoughtfulness of the format and general structure of the code and the skill of the person writing the code.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Shifting gears to your other question, yes, there are a whole lot of other "code smells" but one of the biggest code smells is the formatting of the code itself and another is the absence of meaningful comments especially in the form of a thoughtful header that contains the purpose of the code, usage examples that actually work, and revision history.

    As a bit of a side bar, usage examples that actually work are also form of "here's how to test the code". It first demonstrates that the author of the code actually did do unit testing, offers a quick check method when troubleshooting or making modifications, and gives QA some inkling as to what their test cases should be (properly written embedded comments also help a whole lot there especially in non-TDD environments like the one I work in).

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • should be banned from any respectable SQL Server database?

    It's an interesting psychological distinction; that between "is not ideal" and "should be banned". "Should be banned" means the person who wants it, actually has to justify why they want it, and their chances of success are slim at best. "Is not ideal" can mean "I'm in a hurry. The CEO is leaning on me. I want to look good for him/her. Someone else can clean it up later." And then bad practise just morphs into common practise because the CEO is ALWAYS leaning on you and you NEVER want look unresponsive. Be very careful when you're asked to make your boss a hinge.

    On the face of it, it is a good idea to get people to justify what they're doing if it is ordinarily a bad practise. The problem with having to justify is that in too many organisations you'll end up having your case adjudicated by people who have NO IDEA whether what you're suggesting is a good or bad thing. The decision will be made on the basis that "Mr Ilovenolock seems like a nice chap. Ms Nomoreloops is an argumentative doomsayer. I suggest we stay with Mr Ilovenolock's method." Having said all of that, there are things that should be banned. A lot of them are attitudes:

    :exclamationmark: The policy of "We won't upgrade the version of SQL we're running until we're absolutely forced to. Just leave it the hell alone!"

    :exclamationmark: The policy of "Training is expensive. Do some googling on your weekend. It's free. Anyway, SQL is easy."

    :exclamationmark: The policy of "We don't have the resources to waste on reviews. There's no RoI. Besides we're too busy fixing old mistakes."

    :exclamationmark: The policy of "The inheritors of your code should be able to work it out. You don't need to waste time with commenting."

    :exclamationmark: The policy of "I'm their manager. I'm too busy to know what they're doing... oh and have you see this neat little app I'm working on for the CEO? Foreign keys? Foreign wh...?"

    :exclamationmark: The policy of "We can't afford to get an independent external opinion from professional troubleshooters. Are you implying that there could be a better way? Cost threshold for parallelism? Bah. Witchcraft."

    :exclamationmark: The policy of "It's a small simple change. Just do it on the production database... and be really careful. What? You want me to put it in writing? Now you're being obstructive. We need team players here."

    :exclamationmark: Using incrementing integers to cause uniqueness, instead of using an understanding of the business and constraints to enforce uniqueness.

    :exclamationmark: Storing your data in your code. I knew of a high flyer once (who got promoted to bigger and better things because he seemed like a nice chap) who lovingly curated a 700 line scalar function of which 680 lines were a massive CASE expression. The scalar function called another scalar function which was 1400 lines long (another massive CASE). Every time the business changed directions on something, he'd go into the scalar functions and add " ...AND @this_date BETWEEN '1 Jul 2004' AND '31 Dec 2009 23:59:59.999'..." in various places (his syntax , not mine). By the time it was discovered, he'd used it in maybe twenty dependencies. To change it to a couple of tables and rewrite all the code was a big job, and the schmucks who fixed it were the ones who looked unproductive!

    :exclamationmark: Storing datetimes as varchars.

    :exclamationmark: The policy of "I should be sysadmin... I'm a manager! I know what I'm doing, see... look at my 680 line CASE expression!"

    :exclamationmark: The automatic right click on the green writing, > [Missing index details...] > Uncomment > F5. Followed by "Hey I fixed the performance bottleneck! High fives, guys!"

    :exclamationmark: Storing data in mutli-valued, delimited fields because... Jeez do I have to write another CREATE TABLE statement? Sorry, multivalues are not relational. That's what Pick databases are for.

    ...One of the symptoms of an approaching nervous breakdown is the belief that ones work is terribly important.... Bertrand Russell

  • I've used SQLVariant as an EAV when required to "Have the ability to add or remove columns from data objects without changing the data structure, and make sure the data types are correct".

    Would I do it the same way now?

    Possibly, but I might use XML instead. It worked, but the way the ORM attacked this was clumsy and there were significant issues. Yes, we had the ability to retrieve only one (of possibly many) Attribute Value without reading and parsing a large XML column.

  • Can anyone think of a use case for a scalar function that simply selects data from other tables?

    ...One of the symptoms of an approaching nervous breakdown is the belief that ones work is terribly important.... Bertrand Russell

  • Surely the most obvious one is:

    SELECT * FROM ...

    Which is just plain wrong unless justified.

    I am a huge fan of coding standards but one that stipulates the way to do certain things and the things to not do AND stipulates that exceptions are allowable but MUST be justified (a comment in the code is best where appropriate) e.g.

    -- [Justification for using the below. I am struggling thinking of a scenario but I bet

    -- one exists.]

    SELECT * FROM ...

    Gaz

    -- Stop your grinnin' and drop your linen...they're everywhere!!!

  • @gary

    If exists (select * from myTable) begin ... End

    ... is surely legitimate since they fixed the optimiser (was it SQL2000?)

    Best wishes,
    Phil Factor

  • Phil Factor (4/4/2016)


    @Gary

    If exists (select * from myTable) begin ... End

    ... is surely legitimate since they fixed the optimiser (was it SQL2000?)

    Doh!!! Yes. It is even something I do myself. Good evidence that a simple rule may not capture what is intended. It may be that either the rule would have to cater for this or not exist at all.

    For a lot of "code smells" these sort of situations can exist i.e. never do XXX but of course in the context of YYY it is fine!!!

    Gaz

    -- Stop your grinnin' and drop your linen...they're everywhere!!!

  • GPO (4/2/2016)


    :exclamationmark: Storing data in mutli-valued, delimited fields because... Jeez do I have to write another CREATE TABLE statement? Sorry, multivalues are not relational. That's what Pick databases are for.

    It's a funny thing that so many people agree that's absolutely the wrong thing to do in a database but will then allow storage of XML, JSON, or some other non-relational atrocity to settle in because "every does it".

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • :exclamationmark: Using incrementing integers to cause uniqueness, instead of using an understanding of the business and constraints to enforce uniqueness.

    In all fairness there are perfectly valid reasons to do this, for example order numbers. But as a general rule putting an identifier as the primary on every table should be a no no though. Although I've also seen databases that use an identifier as the primary key because it's easier to join on but also have a unique constraint defined by business rules which works well.

  • there are perfectly valid reasons to do this, for example order numbers

    Well... kinda. I don't think I explained myself very well. My apologies. Incrementing integers CAUSE uniqueness. They don't enforce it. There are good reasons for including an incrementing integer in your table design, but you always want uniqueness to relate back to something in the real world. Incrementing integers are an entirely artificial construct.

    In your orders example, if your only idea of uniqueness is the incrementing integer, and your app starts creating duplicate orders, you won't see it. If a particular combination of columns other than the incrementing integer represent a unique order and you create a unique constraint based on those columns, then you will prevent duplicate orders getting in.

    That's not to say that you shouldn't have the incrementing integer at all. It can serve a purpose as an efficient key for your unique clustered index. It's narrow, always increasing, non-null and... well... unique. This has flow-on economic benefits for your non-clustered indexes as well.

    ...One of the symptoms of an approaching nervous breakdown is the belief that ones work is terribly important.... Bertrand Russell

  • GPO (4/4/2016)


    there are perfectly valid reasons to do this, for example order numbers

    Well... kinda. I don't think I explained myself very well. My apologies. Incrementing integers CAUSE uniqueness. They don't enforce it. There are good reasons for including an incrementing integer in your table design, but you always want uniqueness to relate back to something in the real world. Incrementing integers are an entirely artificial construct.

    In your orders example, if your only idea of uniqueness is the incrementing integer, and your app starts creating duplicate orders, you won't see it. If a particular combination of columns other than the incrementing integer represent a unique order and you create a unique constraint based on those columns, then you will prevent duplicate orders getting in.

    That's not to say that you shouldn't have the incrementing integer at all. It can serve a purpose as an efficient key for your unique clustered index. It's narrow, always increasing, non-null and... well... unique. This has flow-on economic benefits for your non-clustered indexes as well.

    On the other hand, particularly in the case of an order number, the order number may be the only thing which can guarantee uniqueness. While two orders from the same customer at the same time for the same goods is certainly a good reason to launch an inquiry, unless your system is designed to forbid that very condition, it isn't impossible for it to occur.

    - Les

  • GPO (4/4/2016)


    there are perfectly valid reasons to do this, for example order numbers

    Well... kinda. I don't think I explained myself very well. My apologies. Incrementing integers CAUSE uniqueness. They don't enforce it. There are good reasons for including an incrementing integer in your table design, but you always want uniqueness to relate back to something in the real world. Incrementing integers are an entirely artificial construct.

    In your orders example, if your only idea of uniqueness is the incrementing integer, and your app starts creating duplicate orders, you won't see it. If a particular combination of columns other than the incrementing integer represent a unique order and you create a unique constraint based on those columns, then you will prevent duplicate orders getting in.

    That's not to say that you shouldn't have the incrementing integer at all. It can serve a purpose as an efficient key for your unique clustered index. It's narrow, always increasing, non-null and... well... unique. This has flow-on economic benefits for your non-clustered indexes as well.

    Well this comes down to what your system of record is, in the case of orders if SQL(or the application using it) is not the system of record then auto increment or auto generated keys can create artificial uniqueness. On the other hand if SQL is the system of record they do not and are a very handy tool to use.

  • On the other hand if SQL is the system of record they do not and are a very handy tool to use.

    We're talking about two different kinds of unique. If your application generates a transaction that creates an order which is given a unique incrementing integer (say 1636462) and a millisecond later the app sends the same thing through again in error (causing SQL Server to generate a new unique incrementing integer, say 1636463) then from a business perspective a duplicate order has been generated. The only sense in which they are unique, is that the incrementing integer has CAUSED the uniqueness. The table hasn't enforced uniqueness.

    ...One of the symptoms of an approaching nervous breakdown is the belief that ones work is terribly important.... Bertrand Russell

Viewing 15 posts - 1 through 15 (of 44 total)

You must be logged in to reply to this topic. Login to reply