Database administrators tend to perform a lot of routine checks, referred to as sanity checks, on a daily, weekly, and monthly basis. Rob Gravelle shows you how to write some PowerShell scripts to automate a couple of fairly common sanity checks.
One of the more important roles that SQL Agent plays is to be a job scheduler for anything SQL Server. There are several tools out there that allow you to glean information from the jobs, but if you want to roll your own, you’ll likely run into the less than useful way that SQL Agent stores its duration values.
Many people have created and published solutions to this problem. However, most involve some sort of string manipulation (split, concatenation, etc.) to get to the value that they want. In the end, this is just a number, so let’s explore how we can use some math rules and .Net objects to get to a value that we can use without converting it to a string.
What’s the Problem?
In general, SQL Agent defines a duration in the form of HHMMSS for job activities. You can find this in sysjobservers, sysjobhistory, and sysjobsteps as well as the stored procedures built on top of these tables. The problem is that when someone sees a value of 125 or 14053 in one of these “duration” column, the assumption is that it returns seconds. While I’m not going take this opportunity to lay out what assuming does to you and me – it turns out that is a bad assumption.
It may be easier to visualize this value as a string – 00:01:25 and 01:40:53 respectively. A value of 125 really means 1 minute 25 seconds. A value of 14053 really means 1 hour 40 minutes and 53 seconds. This has created heartache in the lives of DBAs for years and I suspect will continue to do so. In addition to the inherent issue of breaking apart the value, there’s also an insidious problem lurking for jobs that run over about 4 days long. SQL Agent does not add a “Day” value, it just keeps adding hours. So, a process that would take 5 days would be stored as 1200000 representing 120 hours, 00 minutes, and 00 seconds.
Let’s look at a job that has a duration value of 23456. As I said before, this can be logically broken up into 02:34:56. This is where most people (including myself at first) take the number, convert it to a string, and split out the sections to have 2, 34, and 56 respectively. I have always assumed that numbers hate being turned into strings. I envision all their number friends pointing at them and saying “Dirty!” or “UnClean!”, but I digress…
This process does work, you take each character, convert it back to a number (unfortunately, it still feels dirty), then multiply the first (the hour) by 3600 – the number of seconds in an hour, add the second number (the minutes) multiplied by 60 and then add the last number to the mix. You now have the total seconds this value represents – in this case 9296 seconds. Once you have this value, you can use various forms of date arithmetic to get the value that you may want to display.
The string method works fine, but if it is not written properly, there are several methods out there that will fail when the duration exceeds 999999 – which I have unfortunately seen quite a few of. Almost all of the cases that exceeded a few days were issues that needed resolved, but if the reporting code cannot parse this larger number, or worse, only parse the last 6 characters, action could be taken on incorrect or incomplete information.
Slightly Less than Basic Math
There are a couple of number tricks that can be used with these whacky values to extract a real “duration” out of this madness.
The first thing we’ll be using is a basic concept, but wanted to touch on it briefly. If we have a number 23456 and divide by 100, we end up moving the decimal place left by 2 places to create 234.56. If we divide the same number by 1000, moves the decimal 3 places to create 23.456. Essentially, you move the decimal to the left once for each zero in the divisor.
The second thing we’ll be using is the modulo operation. Using modulo (represented as % in several programming languages, including PowerShell and T-SQL) provides the remainder of a division operation. For instance, if we have 9/4 we would end up with 2.25. If we are to take the modulo of these two numbers (9%4) we end up with the number 1. If you reach way back to when we were writing down division problems, this makes sense as we end up taking 9/4 and write down 2R1. To get a decimal representation, we would take the remainder over the divisor getting 1/4 or .25 and add that to the quotient of 2. 2.25, 2 1/4, and 2R1 are the same number in this case.
With PowerShell, I’ll also be utilizing a method of the .Net Math class called Truncate. When I wrote this script on automatically determining the base and description of a data size, I used the Floor method. An ensuing Twitter discussion pointed out a flaw in this decision. Floor and Truncate are essentially the same with values >= zero. Truncate will just lop off the numbers past the decimal place whereas Floor will find the largest whole integer less than the current value. With numbers < 0 (negatives), both methods still do the exact same thing, however, the results are not the same. Floor finds the largest whole integer less than the current value, which would be a higher negative value. So, Truncate(-1.5) returns -1 whereas Floor(-1.5) returns -2. It would not matter for this post which to use, but I wanted to be certain to get that out there for anyone who may not be aware of this issue.
Applying the Math
I’m going to use the same duration value as we used previously, 23456. First, we can simply use the powers of ten and Truncate to get the hours component.
Truncate(23456 / 10000 = 2.3456 ) = 2
We have two hours. Now, for the middle and last values. We still use the powers of 10 and Truncate, but we add in the Modulo.
Truncate((23456 / 100 = 234.56) % 100 = 34.56) = 34
In this example, 234.56 / 100 has a remainder of 34.56. We lop off the numbers after the decimal place, and we now have 34. That’s how many minutes we have.
Armed with the Sword of Modulo, we can now finish up and get the last value
23456 % 100 = 56
An added benefit here is that, because we are sticking to numbers and math, regardless the size of integer, the value provided will be technically accurate (up to the bounds of the integer data type). All that is left is to add these three numbers together.
Theoretical to Practical
Doing math in PowerShell is fairly straight forward. I’ve included a function here named ConvertFrom-AgentDuration. This function applies the principles discussed previously to convert the stored duration values into something useful.
[Math]::Truncate($AgentDuration/10000) * 3600 + `
[Math]::Truncate($AgentDuration/100%100) * 60 + `
Within this function, I put this value into a variable of $DurationSeconds. With this value locally, you can specify a datetime parameter (StartDate) to invoke code that will return the ending date of a specific job.
Alternatively, if you specify the TimeSpan switch, the function will return a TimeSpan object which will allow you to get all sorts of different values of duration, including Minutes, milliseconds, etc.
One of the best parts of a TimeSpan object is the ToString() method. This method will return the value in HH:MM:SS. This can very quickly be sent into a SQL function as a 2012 Time parameter for further manipulation – provided it doesn’t extend beyond a day execution.
With all this PowerShell, it seems that T-SQL was getting lonely. Fortunately, all of these calculations that we are doing can be done in T-SQL as well
FLOOR(run_duration/10000) * 3600 +
FLOOR(run_duration/100%100) * 60 +
The FLOOR function in T-SQL is identical to the .Net Floor method. There is not an equivalent to Truncate in T-SQL. However, with positive numbers, both return the same result. Technically, the FLOOR is unnecessary in the T-SQL code because Microsoft implemented truncate internally when switching from a decimal to an integer (see chart about half way down here). I included the function for consistency and to avoid any issue of attempting to take this query to a different platform.
In addition to being able to quickly parse the correct value, another benefit of having such a concise formula is that you can use it inline in a query. Trying to encapsulate logic to deal with these values in a scalar or a cross apply UDF will limit your ability to parallelize the queries. FLOOR, +, *, and % are all parallelizable in SQL Server.
Monitoring SQL Agent jobs is generally an important aspect to a daily routine. Hopefully, these tips and the PowerShell script below will help you to streamline that process and continue to act on accurate information.