
Brooke Philpott of sqlSentry

By Andy Warren

I had a chance to meet Brooke Philpott, the lead developer of sqlSentry, at TechEd 2005 in Orlando this year and discuss a few technical points about the product. It's always interesting to hear about the development of a product, so we decided to continue the discussion via email and with Brooke's permission, the result has been written up in an interview format.

AW: Brooke, let's start with some background about you, and then move into some technical questions. First question - how did you get into programming?

BP: I began programming at 8, when my parents bought my brothers and me an Apple IIc. I began writing simple programs in Apple Basic. I liked games like Zork so I started writing text based adventures. I was always fascinated with games and my parents bought me a C game programming book at that time, but I remember telling them I couldn’t use it because I didn’t have that compiler. They had no idea what I was talking about. After that I didn’t really program until I was a sophomore in college.

I majored in Mathematics and I started learning how to program in Mathematica. I really wanted to do a Mandelbrot set generator because I’d studied fractals in high school and I found them to be really interesting (and great looking). At that time 24-bit color machines were finally readily available, so I set out to do a Mandelbrot set generator and finally finished my first prototype. It ran dog slow but it rendered the set in true color. I continued to refine it and my third iteration was 100 times faster than the first. That really got me excited. After that I took another programming course for math and learned C++/Java. I decided to write a Mandelbrot set generator (see a pattern here?) in C++ for my Macintosh PowerPC. That was much harder than using Mathematica because I had to use a lot of Pascal-based libraries like QuickDraw and I was pretty much teaching myself at that point (the teacher was teaching me how to write for loops and cin/cout, not 24-bit rendering/QuickDraw usage). Eventually I got something done but I almost tore my hair out in the process.

AW: Did you see yourself as a programmer when you were young?

BP: I guess, since I was programming at 8. I didn’t know whether I’d end up a programmer, but I knew I was going to be doing “computer stuff”, I just wasn’t sure if it would be hardware or software as I was also very into electronics at that time.

AW: Where did you go to college and what was your major?

BP: I went to Davidson College and majored in Mathematics.

AW: How long have you been with sqlSentry?

BP: I’ve been working on sqlSentry since the product's inception about two and a half years ago. I've been with InterCerve, the company behind sqlSentry, for over 5 years. Initially I was charged with coming up with a visual renderer for jobs to help get a handle on schedule collisions occurring on our own SQL Servers, so the very first thing I worked on was the concept code for the calendar. That work began as a side project during time away from other projects, and quickly expanded from there.

AW: A big coding question with most developers I've worked with: lights on or lights off?

BP: When I’m at work I code with the lights on because I’ve got other people around me. When I’m working from home though I keep them off with no music or other distractions. I tend to focus better that way.

AW: How big is the development team?

BP: The development team is five. We have two core developers, myself and Seth Dingwell, who have ownership of all the C# code, plus Greg Gonzalez (the PM and President of the company) writing some of the remote stored procedures like block detection, remote queue failsafe code, DTS and SQL Agent Log readers, etc. Greg also provides a lot of expertise in the areas of database performance tuning/indexing, so we can get the product running as fast as it can on the SQL Server platform. We have two other developers who work on targeted areas like the licensing process on more of an as-needed basis, and they work on other InterCerve development projects as well.

AW: Working in such a small team usually is pretty exciting. Do you enjoy it or would you like to have more people working on the development of the product?

BP: Working on a small team has its advantages. I like the tight feedback loop. When we want to get something done we just sit down, talk about it, scope it out, and do it. Things are very efficient. On the other hand, it can be difficult to get such a large project completed in such a small time with such a small team. It takes a lot of hard work and discipline. The product being the size it is could benefit from additional developers (I think this is true of most projects) but with that comes other challenges. Source control gets trickier. It’s more important to make sure everyone has a clear focus on what they need to be doing. And, of course, you have to pay more money to keep all the developers on the payroll.

AW: Let's move into some more product specific questions now. What language are you using and why did you choose it?

BP: We chose C# for a number of reasons. We wanted to move to .NET as we all came from VB backgrounds and wanted rapid development, as well as the increased power and flexibility offered by the framework. At the time that sqlSentry was started, version 1.1 of the framework was out and we had already started using C# as our primary programming language. Personally I was attracted to the language because it was very clean and concise. I always felt that VB.NET got syntax extensions to handle things it wasn’t initially designed for, and hence things that should be relatively straightforward, like casting, become garbled using functions like CType. I also prefer case sensitivity in a language. Interop using P/Invoke also flows better because I can take C/C++ samples and modify them relatively easily if I need to.

AW: How many lines of code is sqlSentry?

BP: sqlSentry v2.0 is roughly 250 thousand lines of code, including reusable libraries written during the process (the calendar, thread management, and general utilities). This is up from about 80 thousand lines in our v1.2 product.

AW: Do you use SQL-DMO for managing the jobs?

BP: We started using SQL-DMO but had problems with reliability. SQL-DMO is very stateful, and we may need to pass around a “job” object to multiple parts of the app on multiple threads, plus save/load attributes of that to and from the database. We ended up getting a lot of exceptions because we’d need that job to be readable and couldn’t always count on it, because if the connection it was attached to was disconnected for any reason reading the properties wouldn’t work. It’s also a lot thicker because it’s COM, so we wanted to move away from that. Right now we are using SQL-DMO in a few specialized places, like creating a job script and reading SQL Server/Group registrations from the registry.

AW: I understand the application also makes use of SQL Name Space - where/why do you use and how has it worked out?

BP: SQL-NS allowed us to quickly tie into forms that may have otherwise taken a long time to recreate, at the same time providing a familiar UI for the DBA. We knew when somebody clicked properties on a job we wanted to show the job properties as they look in Enterprise Manager, but we didn’t want to reinvent the wheel. SQL-NS made it a snap to do so. However, bringing up the SQL-NS forms via COM-Interop is easier than getting rid of them. We were plagued early on with some very strange errors, including ones that would just make the application crash outright. We were able to resolve every known one by just being really aggressive about getting rid of SQL-NS when done by explicitly calling Marshal.ReleaseComObject and CoFreeUnusedLibraries() via P/Invoke. Without these we ran into issues like heap corruption.

AW: Using SQL-NS seems to have made a lot of sense for you and your users, what do you think about Microsoft's decision to not implement something similar in 2005?

BP: While I would have liked to see SQL-NS-like functionality in 2005, there just wasn’t enough demand for it. Our long-term plan was always to replace these screens with our own versions anyway, so it won't really impact us too much other than expediting that process a bit. SQL Server 2005 is a huge product release and I think in order to make any sort of deadline Microsoft had to decide which features would make the cut. When we spoke to them, they were surprised we were using it, as only a small number of developers were using the feature in 2000. It was always unsupported anyway, so I can understand why it didn’t make the cut for 2005. Ultimately you have to provide the features that benefit the customer most and I wouldn’t want them to hold up the 2005 release for another 3 months just for us and some other development shops using SQL-NS.

AW: What has been the most complicated feature to implement and why?

BP: There are two. The first was the tiling algorithm for the calendar on busy/complex schedules. Getting the calendar to render single events was easy, but getting it to organize events when there are a lot of events was a challenge. I think I went through 5 algorithms before I finally got the one that works 100% of the time, which is the one you see in the product today. It was finalized before 1.0 but every time I thought I had it nailed some weird case would pop up and it wouldn’t lay out the events correctly. The other really hard piece was the Job Monitor, which actively looks at running jobs and sends notifications when they start, exceed a run threshold, or are missed. It was tough because we are not scheduling these. We have to go in after the fact and figure out when they are going to run and pretty much handle any transition SQL Agent throws at us. For instance, if you know a job is scheduled to start at 10:00 AM and it runs for 2 hours, how do you know when it’s started? You can’t read the log file because SQL Agent doesn’t write to the log until it’s done. If you rely on the fact that it’s executing you may miss it if it’s a short running job. How do you know when it’s complete? You can’t use the execution status because it may start again before you check that. All in all it’s pretty complicated and you have to have a lot of different checks and balances to make sure you get everything.
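
The layout problem described here, placing overlapping events side by side on a calendar, resembles classic interval partitioning. The interview does not describe the algorithm sqlSentry actually uses, so the following is only a minimal Python sketch of one common greedy approach. The function name `assign_columns` and the `(name, start, end)` tuple format are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical event format: (name, start_minute, end_minute)
Event = Tuple[str, float, float]


def assign_columns(events: List[Event]) -> Dict[str, int]:
    """Greedy interval partitioning: put each event in the first
    column whose previous event has already ended."""
    column_ends: List[float] = []  # end time of the last event in each column
    layout: Dict[str, int] = {}
    for name, start, end in sorted(events, key=lambda e: (e[1], e[2])):
        for col, col_end in enumerate(column_ends):
            if col_end <= start:       # this column is free again
                column_ends[col] = end
                layout[name] = col
                break
        else:                          # every column is busy: open a new one
            column_ends.append(end)
            layout[name] = len(column_ends) - 1
    return layout


jobs = [("backup", 0, 120), ("index", 30, 60), ("etl", 60, 90), ("report", 45, 100)]
layout = assign_columns(jobs)
```

With the sample data, "etl" reuses the column that "index" vacated, so four events need only three columns. A production renderer would also have to decide column widths and handle the odd cases Brooke mentions, which this sketch ignores.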

AW: Do you have an install where we could get a screen shot of a very complicated schedule?

BP: Here is one of a busy global view showing long running jobs and failures across all servers.

[Screenshot: busy global view showing long running jobs and failures across all servers]

AW: What's your favorite feature?

BP: My favorite feature is the notification system. I’m really proud of that part. It’s very scalable and is very flexible at the same time. I can say that certain conditions are only allowed to be assigned to certain object types (job started can only be assigned to jobs) and certain actions can only be assigned to those (kill job can’t be assigned to job completed). This allows a rich user experience without really any hard coding since these relationships are stored internally. The engine handles the mappings and the inheritance of conditions (global to server to job levels). The whole system can be easily extended as well and can apply to more than just jobs/SQL Server. You could base any monitoring/notification system around it. You just create the conditions and actions, the relationships between them, then feed messages into the notification pipeline when things occur and it takes care of the rest.
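
The idea of storing the condition/action relationships as data rather than hard coding them can be sketched as follows. This is an illustrative Python sketch, not sqlSentry's implementation: the table contents, the function names, and the "most specific level wins" inheritance rule are all assumptions, since the interview only says that conditions inherit from global to server to job levels.

```python
# Hypothetical relationship tables; in a real product these would be stored data.
CONDITIONS_FOR_TYPE = {
    "job": {"job_started", "job_completed", "runtime_exceeded"},
    "server": {"server_offline"},
}
ACTIONS_FOR_CONDITION = {
    "job_started": {"send_email", "kill_job"},
    "runtime_exceeded": {"send_email", "kill_job"},
    "job_completed": {"send_email"},   # a finished job cannot be killed
    "server_offline": {"send_email"},
}


def is_valid(object_type, condition, action):
    """True if the condition fits the object type and the action fits the condition."""
    return (condition in CONDITIONS_FOR_TYPE.get(object_type, set())
            and action in ACTIONS_FOR_CONDITION.get(condition, set()))


def resolve_actions(condition, job_rules, server_rules, global_rules):
    """Assumed inheritance: the most specific level that defines the condition wins."""
    for rules in (job_rules, server_rules, global_rules):
        if condition in rules:
            return rules[condition]
    return []
```

For example, `is_valid("job", "job_completed", "kill_job")` is rejected purely by table lookup, which mirrors the "kill job can't be assigned to job completed" rule from the answer above.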

AW: What do you believe is the most overlooked benefit of using sqlSentry?

BP: I think it's probably queuing. Queuing is complicated and a lot of people don’t fully get it (it took me a while to understand), but it can be extremely effective in helping level your schedules across your server, because it provides a way for sqlSentry to dynamically reschedule jobs as they are about to run based on the load of the current system. It’s very cool but it takes a little while to figure out the intricacies.

AW: What's a scenario where queuing makes sense?

BP: Queuing is great for cases where you want a particular job to be able to use whatever resources it needs (disk, cpu, memory, network, etc.) and not have to compete with other jobs, but you either can't or don't want to define explicit dependencies using our chaining feature. Greg Gonzalez's (sqlSentry product manager) recent article in SQL Server Standard illustrated how even a small amount of schedule contention can lead to significant performance problems and prevent jobs from ever reaching their optimal runtimes...which can lead to the dreaded maintenance window overrun. For example, if I have a backup job that is being slowed down every night by several recurring jobs that run continuously, I can simply right-click the backup job on the calendar and set it to queue up to 5 other jobs for a specified time, say 30 minutes. Next time the backup job runs sqlSentry will effectively put up to 5 other job schedules "on hold" until either the backup job completes, or runs past 30 minutes, allowing the backup job to use whatever resources it needs during that time. There are several other options to give you precise control over exactly what will be queued and for how long, as well as whether a queued job auto-starts or resumes its next scheduled run upon leaving the queue. The image below is an example of a backup job that queues several recurring jobs, and how queuing helped reduce its runtime by 7.5%. Orange represents schedule collisions.

[Screenshot: a backup job queuing several recurring jobs]
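
The hold rule in the backup example (queue up to 5 other jobs until the queuing job completes or 30 minutes pass) can be written down as a small Python sketch. The function and parameter names are hypothetical, and the sketch only models the decision logic from the example, not how sqlSentry actually manipulates SQL Agent schedules.

```python
from datetime import timedelta


def select_jobs_to_hold(candidates, max_queued=5):
    """Hold at most max_queued of the jobs that are about to run."""
    return candidates[:max_queued]


def should_release(queuing_job_running, elapsed, max_hold=timedelta(minutes=30)):
    """Release held jobs once the queuing job finishes or the time limit is reached."""
    return (not queuing_job_running) or elapsed >= max_hold
```

Keeping the job limit and the time limit as separate parameters matches the two settings mentioned in the example.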

AW: Sounds like you've put a lot of time into it - any planned enhancements on the horizon?

BP: Probably the best new queuing feature in v2.0 is the ability to set an "auto-start threshold" for any queuing job. One risk with queuing has always been that a queuing job might queue some critical job that only runs once a day or less frequently, say a nightly backup job. If you didn't remember to set the backup job to "never be queued" or to auto-start when popping off the queue, you might miss a scheduled run. In our v1.0 product you had to remember to do this for any critical jobs, which could be quite tedious.

In v2.0, the auto-start threshold defaults to 4 hours, which means that any time a job is queued, when it pops off the queue, if its next scheduled run is more than 4 hours in the future it will auto-start. If it's within 4 hours it will resume its next scheduled run. This does two things: it prevents those critical non-recurring jobs from ever missing a run, and it helps even the load when a queuing job completes by only auto-starting the jobs that really need to be started, thus minimizing contention from a bunch of jobs being auto-started at once. The best part is you don't have to touch any jobs other than the queuing job for this to happen.
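
The threshold rule reduces to a single comparison. A minimal Python sketch, assuming hypothetical names (`on_leave_queue` and the two return labels); only the 4 hour default comes from the interview:

```python
from datetime import datetime, timedelta

AUTO_START_THRESHOLD = timedelta(hours=4)  # default mentioned in the interview


def on_leave_queue(now, next_scheduled_run, threshold=AUTO_START_THRESHOLD):
    """Decide what a job does when it pops off the queue."""
    if next_scheduled_run - now > threshold:
        return "auto-start"        # waiting for the next run would take too long
    return "resume-schedule"       # the next run is soon enough


now = datetime(2005, 6, 1, 22, 0)
nightly = on_leave_queue(now, datetime(2005, 6, 2, 21, 0))    # next run in 23 hours
recurring = on_leave_queue(now, datetime(2005, 6, 1, 22, 15)) # next run in 15 minutes
```

Here the nightly job is started immediately, while the job that recurs every few minutes simply waits for its next scheduled run, which is the load-leveling effect described above.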

AW: Can you tell us a bit about your internal beta program that you conduct before any public releases?

BP: For maintenance releases, the developers test the code first after it’s written and before anything is checked in. We consider that Phase 1. We do a build when we are ready that’s internal. After that everything is labeled, our issue tracking system is updated that items are ready for test, and the build is marked. We’re fortunate that one of the business units of our parent company, InterCerve, is a Microsoft-focused hosting operation, so we have a tremendous test bed internally to help flush out issues early before we push anything out to external beta testers or the general public.

So in Phase 2 it moves to our testers and DBAs here internally, who verify every change and also start using it day to day, running it 24/7 against over 100 servers (SQL Server and Task Scheduler). I call this the “bake” period because it’s in the oven. Generally we let this set for a while until we are comfortable that the fixes and features were implemented correctly and no regressions were introduced. Once we reach that point we release it to the public.

New to sqlSentry v2.0 is an automated version checker, so users are notified right away whenever a new build is available. After a major release we average one maintenance release every 1 - 2 weeks, and this feature has proven invaluable in helping ensure customers have the latest and greatest bits. For major releases it’s a similar process but we obviously have a larger dev/testing window, and before public release we also have a Phase 3 where it’s sent to a targeted set of private beta sites that tend to have large, complex environments since we really want to stress the app. Our integrated exception reporting system is key during all beta phases as it enables testers to report issues with minimal effort. It’s also critical after release to the public since if anything happens to slip through beta we know about it right away.

AW: What about exception reporting? Is that done via email or internet connection?

BP: It's done via Internet connection to a secure web service if the "submit" button is clicked on the exception dialogue box. If connectivity isn't available the user can just as easily copy and paste the exception details into an email and send it to us. Most of the exceptions that come in are via the web service. From there we have an exception management system which aggregates submitted exceptions by build #, total unique users affected, times submitted, etc., which is great for helping us prioritize the associated fixes.
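
Aggregating submissions by build, unique users affected, and times submitted could look roughly like this in Python. The report tuple format, the `signature` field used to group identical exceptions, and the sample values are assumptions for illustration; the interview does not describe the real system's schema.

```python
from collections import defaultdict


def aggregate(reports):
    """reports: iterable of (build, signature, user) tuples (hypothetical format).
    Returns rows of ((build, signature), unique_users, times_submitted),
    most widespread problems first."""
    stats = defaultdict(lambda: {"times_submitted": 0, "users": set()})
    for build, signature, user in reports:
        entry = stats[(build, signature)]
        entry["times_submitted"] += 1
        entry["users"].add(user)
    rows = [(key, len(e["users"]), e["times_submitted"]) for key, e in stats.items()]
    # Prioritize by number of distinct users, then by raw submission count.
    return sorted(rows, key=lambda row: (row[1], row[2]), reverse=True)


reports = [
    ("2.0.1", "NullReference", "alice"),
    ("2.0.1", "NullReference", "bob"),
    ("2.0.1", "NullReference", "alice"),
    ("2.0.2", "Timeout", "carol"),
]
summary = aggregate(reports)
```

Counting distinct users separately from total submissions keeps one user who hits the same error repeatedly from outweighing a problem that affects many installations.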

AW: How many support requests do you get in a week/month?

BP: On a light week we may have about 5 requests. On a busy week we may have 20 to 40. The monthly average is probably around 100. We’ve recently introduced forums and a KB on our site to give folks another means to find answers without having to contact support directly. We’re also in the process of rolling out a customer portal so that users can log in and submit bugs and feature requests as well as check the status of their open issues.

AW: You mentioned that the application checks for updates automatically. Many servers are firewalled with no access to the internet. Can I assume it fails gracefully in such situations? And what are the alternatives to learn about updates?

BP: This is true, many servers are firewalled and can't use it. However, what we've found is that many DBA workstations do have Internet access, and since the update checker runs only from the sqlSentry Console which is typically installed on the workstation, it is able to connect successfully. It's on by default and checks only when the console is first opened, and since it runs on a different thread it won't block other console activity while it's trying to connect. If it can't connect it will respond gracefully with an error message, and can easily be disabled permanently by checking a box. If the user isn't running the update checker they can always go directly to our download page to get the latest build (http://www.sqlsentry.net/bp). We do email users whenever a major version or "milestone" incremental version is released, but we don't typically email users for every minor incremental release. That is unless we are working with someone on a particular issue that affects them, in which case we'll let them know directly as soon as a new build is available with a fix.
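
The pattern described here, checking for updates on a separate thread and failing gracefully when there is no connectivity, can be sketched in Python as follows. The function names and the tuple build numbers are hypothetical, and `fetch_latest` stands in for whatever network call a real checker would make.

```python
import threading


def check_for_update(fetch_latest, current_build, on_result):
    """Run the update check on a background thread so the caller is never blocked.
    fetch_latest: callable returning the latest build as a tuple, e.g. (2, 0, 3).
    on_result: callback that receives a human-readable status message."""
    def worker():
        try:
            latest = fetch_latest()
        except OSError as exc:             # includes connection failures
            on_result(f"Update check failed: {exc}")
            return
        if latest > current_build:
            on_result("Build " + ".".join(map(str, latest)) + " is available")
        else:
            on_result("Up to date")

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread
```

The caller gets the thread back immediately and can carry on with other work; a firewalled machine only ever sees the "failed" message rather than a hang or a crash.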

AW: How will the changes in SQL Server 2005 affect the application? Will it require a different version?

BP: There are some pretty significant changes in SQL Server 2005. Some are small, like the fact that jobs and schedules are separate entities with a many-to-many relationship versus a one-to-many. Others are more significant, for example, the transition from DTS to SSIS. And some features are gone altogether (SQL-NS does not exist in 2005). There will be a new version required to support 2005 due to these changes, sqlSentry v2.5, which we just announced at Tech-Ed. (link: http://www.sqlsentry.net)

Vendor Update: sqlSentry v2.5 was released on November 30, 2005

AW: Brooke, I think that wraps up the technical questions. Let's conclude with a final question about you - what do you do to relax and have fun? And do you have a photo to share so we can make you famous?

BP: I try to keep in shape so I work out about 5 times a week. I’m a big gaming fan so I play online games like World of Warcraft and Battlefield 2 (my current favorite). I like to travel as well when I have the time. I’m also trying to get back into making music but it’s a time consuming process.

