Reusing Tools for New Purposes

Question

Reusing Tools for New Purposes

Steve Jones - SSC Editor

SSC Guru

Points: 734449
More actions
January 22, 2022 at 12:00 am

#3975884

Comments posted to this topic are about the item Reusing Tools for New Purposes

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply

Jeff Moden SSC Guru Points: 1003863 More actions · Answer 1

Such workarounds are absolutely stupid. Not because the workaround itself is stupid but, rather, because Microsoft has made such a workaround NECESSARY.

They release broken, only partially functional, and performance challenged code and then they showcase the ability to track the number of and location of cars and people at an airport and expect us to applaud but they do things like making online rebuilds that would corrupt your data. Remember that?

They released regressive code in 2012 and it took them until SP3 to almost get things back to normal. They destroyed SSIS installations with 2014 SP1.

They released jokes known as PIVOT (the one in ACCESS is much better), the partially functional STRING_SPLIT() function, and the extremely performance challenged FORMAT() function.

They killed a whole lot of functionality when they released the "newer" temporal datatypes, realized their mistake, and released DATEDIFF_BIG() to try to make up for it. Do you think that they'd make a DATEADD_BIG to go along with it? Oh hell, no.

They released EOMONTH(). Using end of month dates is actually a WORST PRACTICE! Do you think they'd release an FOMONTH() function to go along with it? Oh hell no! We have to continue doing temporal math workarounds for that and because of how they crippled the "newer" temporal datatypes, we have to use workarounds in our workarounds to calculate the first of the month for those unless we've made the mistake of actually using the "newer" temporal datatypes.

And how amount the "System Version" audit tables affectionately referred to as "Temporal Tables". Those were almost beautiful but there's no chance of having them identify WHO or WHAT modified the rows and so back to the drawing board on audit/history tables for us!

And then they don't understand the utility of things like having a built-in Tally function/sequence generator. Just bloody brilliant!

And then there are the abominations known as REBUILD, REORGANIZE, and SHRINKFILE. Hey, Microsoft! It's been decades since those horrible mistakes have been released! Ever hear of a fellow by the name of Peter Norton??? Lordy my!

And then there are the horrible throwbacks they created when they began releasing SSMS as a separate product. We can't even copy'n'paste the custom colors we setup anymore like we used to be able to do. And a block-select backtab in the "Replace tabs with spaces" mode absolutely slaughters the indentation of the code, now. It didn't use to! And the search functionality has become a joke. I'd tell them about the other dozen or more problems but they won't listen, anyway.

And did you know that having a scalar function in a computed column auto-magically guarantees queries to be single-threaded even if the column isn't used in the query? Good lord! THEY NEED TO FIX IT!

And then they make it impossible to do minimal logging if SET IDENTITY INSERT is on and it blows out TempDB doing a sort. That bug has existed since it came out but that's not a high value fix, is it? No marketing joy in fixing stuff that's broken, right?

And how difficult would it actually be to make true BEFORE triggers like Oracle has? And how about making it so we don't actually need dynamic SQL to do things like "Catch-All" queries or create variable column and table names or use dynamic OPENROWSETs? That would virtually wipe out the cause of SQL Injection which is still one of the leading attack vectors but, oh, there's no marketing value in that, right?

And do you think they could actually make a T-SQL command that would generate a table script and another to generate an index script and store them in a MAX variable? We used to have a workaround to do that but they even took that away! I know, I know. What's the marketing value in that, right?

And then to not make a decent scheduler for one of their flagship products and people have to use the likes of ADF as a workaround? WOO-HOO! WAY TO GO, MICROSOFT! They're living up to everything that they've taught me to expect of them!

Ah, maybe I'm wrong there. After all, it only took them two and a half decades to finally realize that there's these CSV and TSV files that need to be imported using BULK INSERT or BCP. That's real progress, right? And it only took them 6 years to realize they needed to fix the STRING_SPLIT() function. That's real progress, right? Maybe they'll actually release it to on-premise versions in the next decade or so. And, if they keep going, maybe they'll come out with a BULK EXPORT command sometime in the next couple of decades.

Heh... "Reusing Tools for New Purposes". That should be re-titled as "Workarounds to Make Up for Microsoft's Incompetence and DILIGAF Attitude Except When It Comes to Marketing".

I could go on but they won't listen. They never have and that's why I'm ranting here instead of on their latest "feedback/suggestion" abortion site. They're making things like Postgres start to look awfully good! But there's no marketing value in that, either. Right?

Whup! There goes a pedestrian at the airport! Track him! Aw... there you go. Aren't they special? Bless their little hearts.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Rod at work SSC-Dedicated Points: 33900 More actions · Answer 2

Steve, why is it that code reuse in T-SQL, in particular refactoring, results in performance penalties? I'm ignorant as to why this is so. Just trying to learn more, thank you.

Kindest Regards, Rod Connect with me on LinkedIn.

Jeff Moden SSC Guru Points: 1003863 More actions · Answer 3

Rod at work wrote:

Steve, why is it that code reuse in T-SQL, in particular refactoring, results in performance penalties? I'm ignorant as to why this is so. Just trying to learn more, thank you.

I'm not answering for Steve... IMHO, though, code reuse in SQL Server isn't the problem. The problem is with HOW the code reuse is done by many that aren't experienced in SQL Server.

For example, the use of functions in imperative/procedural code is a time honored method to prevent redeveloping the wheel. The problem with that in SQL Server (as you already know), is that scalar and mTVF functions DO have performance issues. Even the presence of a computed column that contains a scalar function will prevent parallelism in queries that don't even use the column. Brent Ozar published a very well written post on the subject (and an alternative) and I've verified his findings using his code and insitu on my servers at work. Here are the links to a couple of Brent's articles on the subject:

https://www.brentozar.com/archive/2016/01/another-reason-why-scalar-functions-in-computed-columns-is-a-bad-idea/

https://www.brentozar.com/archive/2020/10/using-triggers-to-replace-scalar-udfs-on-computed-columns/

In front end and other imperative code, you never have to worry about how an optimizer is going to use your code. In SQL, you do. People will write views of views (for example) an not realize that they're calling the same table many times and it could be in a many to many fashion. They also don't realize that if they're creating aggregates in a "base" view and then use other views that "filter" on the aggregates from the first view, it can cause the first view to fully materialize for every row in the outer view. The problem is that folks don't understand the nature of the beast beyond what they learned for successful imperative programming.

And then there's the "killer". People will write stored procedures to handle the OLTP code, which has usually been written to handle just one "record" and then try to reuse that same code to process large batches. It's almost as bad as using nested scalar functions. I've also seen that taken to an extreme of being nested 8 levels deep.

A very recent post of that nature can be found at the following link. Fortunately, the OP listened to us.

https://www.sqlservercentral.com/forums/topic/how-can-i-execute-oledb-command-without-result-set#post-3975899

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

m60freeman Old Hand Points: 370 More actions · Answer 4

Steve,

Using Azure Data Factory as a replacement for SQL Agent can be a good idea. Although SQL Agent is free if you need a full SQL instance anyway, if you don't you would need to pay at least $300 per month for an Azure VM with SQL Server Standard Edition (in a US region) just to run Job Agent. Using ADF to just run stored procedures on a scheduled basis is practically free.

We generally use Azure Automation Runbooks though, because we have many processes that need to run across many databases across multiple subscriptions, each requiring different credentials, and we have a PowerShell genius on our Infrastructure team who handles all that coding (including having Ola's stored procedures start at a different off-work-hours time for each database in an Elastic Pool to minimize overlaps in the load, based on the number of databases in the pool). We were using Runbooks for many non-SQL related processes already anyway, so using them for SQL tasks was the obvious choice for us. But for people with simpler use cases, ADF is a great solution.

below86 SSChampion Points: 11513 More actions · Answer 5

Jeff Moden wrote:

... Using end of month dates is actually a WORST PRACTICE! ...

I agree with a lot of what you said Jeff, but I can't agree here. I've worked at various insurance companies over my 30+ year career and everyone uses the end of month date. I don't understand why so many(I've heard this from others) think this is bad. It's not a hard concept to code for. With insurance comes auditors and if you are trying to explain your book of business for all of January 2022. Having a date of 01/01/2022 doesn't sit well with them, now if I say it's 01/31/2022 it's easier to accept. Also in insurance comes the need to calculate how much premium is earned each month. If I carry the 01/01/2022 as my date I need to calculate the end of month date each time. When we use the actual end of month date of 01/31/2022 we can easily show and calculate that amount.

I know this thread is about other things, and your rant was about a lot more, and again I can agree with everything else. Why couldn't we have a special date column, but that column only carry the year and month, YYYYMM, now that would be something to end the argument over first of month vs. end of month.

-------------------------------------------------------------
we travel not to escape life but for life not to escape us
Don't fear failure, fear regret.

qbrt SSCrazy Points: 2737 More actions · Answer 6

qbrt

SSCrazy

Points: 2737

January 25, 2022 at 4:18 pm

#3977043

qbrt SSCrazy Points: 2737 More actions · Answer 7

Jeff Moden wrote:

Such workarounds are absolutely stupid. Not because the workaround itself is stupid but, rather, because Microsoft has made such a workaround NECESSARY. .......

F@#$%NG EPIC!

Jeff Moden SSC Guru Points: 1003863 More actions · Answer 8

below86 wrote:

Jeff Moden wrote:
... Using end of month dates is actually a WORST PRACTICE! ...
I agree with a lot of what you said Jeff, but I can't agree here. I've worked at various insurance companies over my 30+ year career and everyone uses the end of month date. I don't understand why so many(I've heard this from others) think this is bad. It's not a hard concept to code for. With insurance comes auditors and if you are trying to explain your book of business for all of January 2022. Having a date of 01/01/2022 doesn't sit well with them, now if I say it's 01/31/2022 it's easier to accept. Also in insurance comes the need to calculate how much premium is earned each month. If I carry the 01/01/2022 as my date I need to calculate the end of month date each time. When we use the actual end of month date of 01/31/2022 we can easily show and calculate that amount.
I know this thread is about other things, and your rant was about a lot more, and again I can agree with everything else. Why couldn't we have a special date column, but that column only carry the year and month, YYYYMM, now that would be something to end the argument over first of month vs. end of month.

The answer is simple... because it's not bullet proof and it actually takes less effort to make it bullet proof. I've seen too many times where people have changed the underlying datatype for a table to start recording times and no one remembers the code won't handle it and they go for months not realizing that they missing the entire day at the end of whatever period they're basing stuff on.

To each their own but using closed/closed temporal ranges will always be a worst practice to me and I won't allow it to go to prod that way.

EDIT... and to be sure on the following subject...

below86 wrote:

Why couldn't we have a special date column, but that column only carry the year and month, YYYYMM, now that would be something to end the argument over first of month vs. end of month.

I absolutely agree. For reporting purposes, that's the way to go.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1003863 More actions · Answer 9

qbrt wrote:

Jeff Moden wrote:

Such workarounds are absolutely stupid. Not because the workaround itself is stupid but, rather, because Microsoft has made such a workaround NECESSARY. .......
F@#$%NG EPIC!

BWAAAAA-HAAAA!!! At least I know that one person "got it". 😀 Thank you for the feedback and the very much needed laugh! 😀

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

below86 SSChampion Points: 11513 More actions · Answer 10

Jeff said

"The answer is simple... because it's not bullet proof and it actually takes less effort to make it bullet proof. I've seen too many times where people have changed the underlying datatype for a table to start recording times and no one remembers the code won't handle it and they go for months not realizing that they missing the entire day at the end of whatever period they're basing stuff on."

Based on that thought process a first of month is no more bullet proof than end of month. If it's a column that is first of month with a 01/01/2022 date, and is changed to a datetime, and someone allows time to be added like 01/01/2022 01:12:34, any SQL that is expecting it to be just a date of 01/01/2022 will miss it. The same as you say for end of month. I'm not talking about any date values you would use in a range to pick up data within a time period. I'm talking dates used to define a months worth of data within your data warehouse. This would be different than any transactional datetime fields you may have. example: transaction processed at 01/01/2022 00:01:00 would get an EOM date of 01/31/2022, transaction on 01/15/2022 12:45:23 would get 01/31/202 EOM date. Transaction at 01/31/2022 11:59:59 would also get 01/31/2022 as the EOM date. transaction on 02/01/2022 00:01:00 would get 02/28/2022 EOM date. And if it was a leap year it would get 02/29/YYYY.

As you like to say Jeff 'it depends'. I just didn't want anyone reading your comment to think that using EOM date is wrong. Do what is best for your company.

-------------------------------------------------------------
we travel not to escape life but for life not to escape us
Don't fear failure, fear regret.

Jeff Moden SSC Guru Points: 1003863 More actions · Answer 11

If you're using SomeDateCol BETWEEN FirstOfMonth and EOMonth, then you have closed/closed and it's not bullet proof because SomeDateCol could have a time in it.

Using the basic criteria of SomeDateCol >= FirstOfMonth AND SomeDateCol < FirstofNEXTMonth is bullet proof. I don't want anyone reading your comment to thing otherwise. 😀

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)