June 20, 2025 at 12:00 am
Comments posted to this topic are about the item Multiple Monitoring Tools
June 20, 2025 at 6:21 am
I don't think SQL Server emits OpenTelemetry metrics yet, but this strikes me as where OpenTelemetry should come in. As I understand it, the parts you need are as follows: -
As SQL Server is not an OTEL emitter, you need a collector instead.
There is nothing that says all these components need to be in a single tool. If impact on source systems is a concern, then choose the collector to suit your needs and concerns.
The receiver is the glue component that gives centralised observability.
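For what it's worth, here is a minimal sketch of the collector idea in Python. It is not the OpenTelemetry SDK and not any vendor's agent. The `MetricPoint` shape, the `collect_once` function, the metric name, and the `sql01` host are all made up for illustration. The point is only that a collector polls the source on its behalf and stamps every reading with the same time and attributes before handing it on.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class MetricPoint:
    # Hypothetical record shape, loosely modelled on a metric data point.
    name: str
    value: float
    unix_time: float
    attributes: Dict[str, str]


def collect_once(
    sources: Dict[str, Callable[[], float]],
    attributes: Dict[str, str],
    clock: Callable[[], float] = time.time,
) -> List[MetricPoint]:
    """Poll every source once and tag each reading with shared attributes."""
    now = clock()  # one timestamp for the whole polling pass
    return [
        MetricPoint(name, float(read()), now, dict(attributes))
        for name, read in sources.items()
    ]


# Assumption: a stand-in reader. A real collector would query SQL Server here.
def fake_batch_requests() -> float:
    return 1250.0


points = collect_once(
    {"sqlserver.batch_requests_per_sec": fake_batch_requests},
    {"host": "sql01"},
)
```

How often `collect_once` runs and how many sources it polls is where the impact on the source system gets decided, so that is the knob to choose carefully.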
June 20, 2025 at 1:39 pm
I keep thinking about OTEL as well. I think xEvents do a lot of what OTEL would want, though perhaps not without some overhead from the fields being collected.
I do think that not all the data needs to be in one tool, but then how do you navigate from one to the next? Times can help, but coordination across tools might be cumbersome. One of the goals of observability is to quickly and easily query across many dimensions, which seems to imply you need all the data accessible from one tool.
June 20, 2025 at 10:05 pm
Multiple tools mean more systems to administer and more tools to patch, which also means more man-hours to keep your environment up to date.
I much prefer to get all of my information in 1 dashboard and (where possible) a weekly email report as well as immediate alerts where possible.
Plus, like you said, each tool eats up some resources from the server. Once you start looking at virtualizing, you don't want that extra overhead if you can avoid it. The less overhead load I have on my systems, the happier I am. BUT I also want SOME monitoring. None of my systems are super critical (nobody will die if it goes down for a day), but are important (some will stop the business if they go down). So monitoring is important to me as long as it doesn't hurt the performance of the system too badly.
I also hate having multiple tabs open to see what I need. If I have multiple tools for monitoring, then I am going to either buy or build a tool to consolidate their data into something I can use to see my systems at a glance. When things hit the fan, I want to flip through as few tools as possible to figure out the problem and correct it.
The above is all just my opinion on what you should do.
As with all advice you find on a random internet forum - you shouldn't blindly follow it. Always test on a test server to see if there are negative side effects before making changes to live!
I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.
June 23, 2025 at 8:38 am
I tend to agree. I was surprised that some DBAs didn't want to compromise to 1 tool and then ad hoc queries as needed. The overhead worries me, especially if I'm worried about a system. I want to minimize the impact.
June 23, 2025 at 7:04 pm
With the ad hoc queries as needed, a good monitoring tool should be able to capture some of those too. RedGate Monitor (formerly SQL Monitor) allows you to capture custom metrics that return a numeric value AND configure alerts for them. At least it did last time I checked. I've been away from work for a little bit, so I'm still catching up on what's new in RedGate Monitor apart from the name. I have a few custom metrics added to that, but you have to be careful with custom metrics - it is easy to write a bad query, and that 2-5% hit to the system jumps up to 20-50%. No fault to RedGate Monitor (or any other monitoring tool) mind you - that's entirely the fault of the DBA. And I've seen some DBAs write some horrible code (myself included). I saw a DBA recommend "RECOMPILE" on a stored procedure when "OPTIMIZE FOR UNKNOWN" fixed the problem and had a lot less impact. I've seen DBAs recommend "NOLOCK" because a query was often causing blocking (that one was me, but I learned the error of my ways and have since corrected it with isolation levels and better written queries). But I could see someone putting a CURSOR in a RedGate Monitor custom metric and then complaining that the tool is slow to alert on their custom metric LOL.
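To illustrate the "be careful with custom metrics" point, here is a rough Python sketch of a guard around a metric query. It is not RedGate Monitor's API. `run_custom_metric`, the `jobs` table, and the one-second budget are hypothetical, and SQLite stands in for SQL Server only so the example runs anywhere. It checks that the query returns a single numeric value and reports how long it took, so an expensive metric gets noticed before it becomes a 20-50% hit.

```python
import sqlite3
import time


def run_custom_metric(conn, sql, budget_seconds=1.0):
    """Run a metric query; return (value, elapsed_seconds, over_budget)."""
    start = time.perf_counter()
    rows = conn.execute(sql).fetchall()
    elapsed = time.perf_counter() - start
    # A custom metric is expected to produce one row with one column.
    if len(rows) != 1 or len(rows[0]) != 1:
        raise ValueError("a custom metric must return exactly one value")
    value = rows[0][0]
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise ValueError("a custom metric must return a numeric value")
    return float(value), elapsed, elapsed > budget_seconds


# Assumption: an in-memory SQLite table standing in for a real server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (id INTEGER, failed INTEGER)")
conn.executemany("INSERT INTO jobs VALUES (?, ?)", [(1, 0), (2, 1), (3, 1)])

value, elapsed, over_budget = run_custom_metric(
    conn, "SELECT COUNT(*) FROM jobs WHERE failed = 1"
)
```

Running a new metric through something like this on a test server first gives you a number for its cost instead of a guess.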