• RandomStream - Wednesday, January 3, 2018 8:28 PM

    I will leave the booking writing to others. Just being bored at work and wonder how other DBA's fill their day. 

    Of course, I would like to know how other DBA's manage large number of instances, whether it is possible to have a standard approach that we can establish and share...

    In that case, here's a typical day for me...

    First, I refuse to be bored.  That makes for a not-so-fun job.  To be honest, work is rather fun because of the attitude that I've developed.  I don't have a large number of instances to work on because we've rather consolidated a lot of things onto just several servers and we don't have a huge number of people connecting (usually just a little over 400 connections to the "big server" at one time) so I can't help you there although some of the things I do could be applied to larger server farms.

    I'm a bit of a night owl but I have to be up for the daily scrum at 9AM.  A little after 8AM, during morning coffee, I login remotely and check the morning reports (which I wrote myself) for jobs that may have failed overnight.  If any have failed (and it's a very rare thing), I check the error messages and any logs the jobs may have produced and send emails out to the owners of the jobs (all in IT) with what the failure was and any suggestions I may have due to esoteric knowledge  even though they receive the same email.  It's a color coded report and it usually takes just several seconds for the 6 main servers.  Most importantly, I check to make sure that the backups had no failures (2 lines on each report) and that the nightly test-restore for the "big boy" worked correctly.  If a job does appear as failed, there's a lot of data for each job on the report that helps with troubleshooting but I still have to check the logs.  Like I said, it's very rare that there's a failure.  If there is a failure that I can help with, then I know what a part of my morning will look like.

    Then I check the "Enterprise Disk Status" report (also wrote that).  It checks 285 "servers", most of which are not SQL Servers, for anything that looks out of the ordinary insofar as free-space or dirty disks or being able to access any given box (is the box up?).  Again, if something goes haywire here, I know what a part of my morning will look like.

    Then I check the big server a bit more closely.  I check PerfMon to see how it's running before "the crowd" starts hitting it for real at 9 and I check the email folder that has any alerts for blocking and high CPU usage just to make sure that nothing went crazy during the major nightly runs that we do.  Part of the checks that start with Perfmon is that I have baselines that I've "learned" and copied and if something doesn't look right, I check to see "whats runnining" (another proc I wrote) to see if someone may have had to do a rerun from a system that does a lot of runs using a different scheduling system (which I hate because I have no visibility of that). 

    Then I do a check of what the "worst queries" (another proc I wrote) running overnight were to see if anything changed.  I may grab the included execution plan or two from that (it's checks by CPU, Reads, Writes, Duration, and Counts) if something seems a bit odd, which would also define a part of my morning if something did look odd.

    Since the system handles most of the GUI stuff AND the big nightly and daily batch runs, I can tell by "feel", whether or not I need to "nuke the system" with a DBCC FREEPROCCACHE to get over any possible bad form of parameter sniffing (sometimes happens because it some huge runs over night and those runs aren't necessarily "cache-conducive" to what the GUI expects.  Contrary to popular belief, it's not actually a huge evolution to do a DBCC FRRPROCCACHE on a system (it recovers VERY quickly) but your mileage may vary.

    Then, at 9AM, I dial in to participate in the morning scrum.  I always go last (let's other people describe any problems that I might be able to help with or have knowledge of) and describe the expected "here's what I did yesterday, here's what I'm doing today" and,. if necessary, "here's what I need help with (which is pretty rare) where I also provide a verbal reminder of the emails I may have sent out about jobs.

    Then I take a shower and go to work (I've got a 12 minute commute time).  After that, just about anything goes.

    I do a shedload of peer reviews for the front end Developers (great bunch of folks that "get it", especially when it comes to SQL Server performance and resource usage).  The peer reviews used to take a long time because I used peer reviews as a time to mentor folks in SQL Server.  We don't have a high turnover rate so I've needed to do less mentoring in the last year but still do so when necessary.  They know it's "not optional" when it happens and are happy to learn something new or different because they get to brag about performance fixes they may have made on their scrum and I brag about them, as well.

    I work on the "DBA Task List" that the Dev Manager puts here requests for peer reviews and deployments on an tackle those usually in the morning.  Of course, there's walkup questions that I help the Developers and others (like the Infrastructure Team, the Data Architect Team, etc, etc) with throughout the day and I almost never push them off.  Because of the mentoring I've done, they don't usually have many questions anymore but, when they do, it's a good one and I'm happy to help them and they're happy to have the help.  It sometimes keeps them from having to do hours of research and possibly from coming to a bad conclusion.

    Then I turn to my projects and there's a ton of long term stuff (that I've determined need to be worked on).  For example, I've found a way to determine which statistics on some of the Terra-Byte large databases are actually being used and the long term experiment is to make sure I'm right and so check my runs on that, do any tweaking, etc.  I've also determined that indexes usually don't need any kind of maintenance (and that most people are actually doing it wrong) but have found some exceptions to the rule and am developing a new system for detection and handling of those. I've got several such projects that I'm working on.  I may have also been given some assignments by my boss, the Enterprise Architect, such as helping with concepts and code on a major rewrite the rest of his team is doing, creating "Loan Migrations" for bank mergers or system changes, or writing code to replace or optimize poorly performing code/indexes that I may have found during my daily checks.  I also work with his on-site contractors.

    Not having a shedload of systems (due to some serious consolidation) to monitor is actually a double edged sword... although I have CPU, Blocking, and other alerts setup, I keep a copy of PerfMon running on an RDC to each production system (not the load on the system that most think and is critical that I already have a session established if something goes haywire) and will occasion flip over to check them especially if I get more than 1 alert from a system.  Of course, if I see something odd, I'll check it out using the stored procedures that I wrote to ("whats running", etc).

    Depending on the day, there may be a lot of other ad hoc stuff that I have to do.

    In the evening, I'll do any deployments that couldn't be done due to contention during the day and that's when I can bear down on some of my stuff that also needs to be done after hours.  Of course, sometimes, it makes for a long night.  And, of course, since working with SQL Server is kind of a hobby and not just a job, I'll cruise this site for ideas and help where I can.  A lot of times, I'll end up writing something to answer a question (or seeing someone else's answer) that actually has some application at work.

    There are some times that I have a half hour before a meeting and it makes no sense to ramp up on working on one of my or other projects and so I'll spend a little time reading Books Online about something that I'd seen the previous evening on this site or something that I'm just curious about.  I'm never bored with the job.  In the very rare cases where there's little to nothing to day, I'll make the time to learn or try something new because, as long as I've been doing all of this, there's always something new to learn or some "Eureka" moment to be had, especially in the area of performance of code.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)