To setup an alert for balancing sql cluster nodes in the event of a failover

  • Here's what I am really looking for...

    We have a 2 node cluster setup in our environment, and in the recent times we had to face quite a lot of trouble as all the resources on suppose node a fails over and sits there on node b, so for e.g: if I am having 5 instances running one node a, and 5 on node b, if for some reason (may be a n/w hiccup), quorum is lost and the node gets evicted (in this case node a), and hence all the resources/instances from node a flips on to node b, and remains there causing excessive load on the server.

    Ok now to address our main concern. The problem here is we don't want to overburden the server with all the resources junked up on one node, but instead we want the nodes to be balanced for e.g: 5 instances on node a & 5 instances on node b. It doesn't matter who is the owner of these instances, it could be node a or node b, but at least the nodes should be balanced.

    Now, my concern is on setting up an alert on this to mail me, only when certain criteria is met (for e.g: if there are more than 5 instances on node b, I should get an alert to re balance the nodes, and vice versa). So, basically what I meant is that I need to setup an alert on SQL Cluster which will send me an alert if & only if I meet a certain condition wherein any of the node seems to be out of balance.

    I have a vague idea for this...something to start of with.

    The condition probably should be

    nSQLNode > round(nSQL/nNode)

    where nNode = total # of nodes in a windows cluster (in our case = 2)

    nSQL = total # of sql instances in the cluster (depends)

    nSQLNode = # of SQL instances on any of the node (for e.g: node a/node b)

    round() is a math function to round the decimal number.

    I have laid the foundation to build this, but I would really appreciate if you guys can put around your coding skills, and get this thing working. It would be really helpful for all of us. I hope to see either a script that you might already been using in your environment or either pointing out to something from where we can build things further (like some kind of a blog or generalized script which can be customized to suit our requirements). Any kind of help and assistance would be highly appreciated.

    You can always mail me on my e-mail id faisalfarouqi@gmail.com, if you found any useful scripts for setting this up, it can be t-sql (priority on top), powershell or any other thing but it should work to get the alert going...if and only if the condition is met.

    Thanks,

    Faisal

  • This would be one for powershell, query the Get-WMIObject Win32_Process cmdlet see how many sqlserver.exe processes are running and if > 5 send you a mail.

    Could then set it up as a scheduled task using windows schedule, or as a Agent job on each instance, but I would go with scheduled task as that way if a node fails you dont get between 5 - 10 emails saying the same thing.

    Edit

    This is a starting block

    Get-WmiObject win32_process | where-Object {$_.ProcessName -eq "sqlservr.exe"} | select name

    Will get the number of processes running, would just need to aggregate them and find a send mail function of the web, then build in the logic.

  • Hi anthony,

    Thanks. for the reply I can get a send mail function, but I just need a complete structure in order to get this working. I believe setting up a scheduled task will be more useful over here, suppose incase if the instance fails to come up on node b then there won't be any possibility of an e-mail being triggered.

    I believe at least we got started over here, but I need some if logic implemented in there which will generate out an alert only if the above condition is met, and some kind of a retry interval to check the status of these nodes.

    I would appreciate if some one could build up a complete generalized logic around this, as I am not that good with powershell, but an honest comment around the code could work wonders, and help others as well in figuring out what the code does.

    Regards,

    Faisal

  • function sendmail{

    Write-Host "Sending Email"

    #SMTP server name

    $smtpServer = ""

    #Creating a Mail object

    $msg = new-object Net.Mail.MailMessage

    #Creating SMTP server object

    $smtp = new-object Net.Mail.SmtpClient($smtpServer)

    #Email structure

    $msg.From = ""

    $msg.To.Add("")

    $msg.subject = "To many instances on $env:COMPUTERNAME"

    $msg.body = ""

    #Sending email

    $smtp.Send($msg)

    }

    #Get the number of sqlservr.exe processes running and assign it to the param

    $param = Get-WmiObject win32_process | where-Object {$_.ProcessName -eq "sqlservr.exe"} | select name

    #If the number of sqlservr.exe process > 5 then execute the sendmail function

    IF ($param.Count -gt 5) {sendmail}

    Or am I missing something?

    Create it as a PS1 file, setup a windows OS task on each node for the same PS1 file, run it every X minutes

  • Thanks. a lot anthony. I really appreciate your help. I'll still test this thing up, in my environment and will get back to the topic in a few days. One more thing Anthony, pls. if possible could you edit your post, and comment out the code part for a brief description of what the code is actually doing, it would certainly help all of us in understanding it's logic, and the execution structure.

    Guys...I'll still keep this topic open ended, as I would like to see people coming up with different strategies to get this working. I would be glad to see something in t-sql as well.

    Thanks & Regards,

    Faisal

  • Hi Anthony,

    I tested this powershell script, and it's working amazingly well. The only problem that I am facing now, is that my boss is looking for something which needs to be more generic rather than a hard coded solution. For example, if we have 15 instances on a 2 node windows cluster, then we don't have to change the value in this line IF ($param.Count -gt 5) {sendmail} (like changing the value from 5 to 6 etc). Now, I need to take this through parameters, not manually stating the value. For e.g: this could be something like this

    if (nSQLNode > round(nSQL/nNode)) {sendmail}

    where nNode = total # of nodes in a windows cluster -- for e.g if it's a 2 node cluster, we are only interested in getting the count, not the node name. I know there is something cluster node to view the name of the nodes, but we just need the count

    nSQL = total # of SQL instances in the cluster --same applies here...we need count

    nSQLNode = # of Sql instances in any of the node.

    round() is a math function to round the decimal number.

    I know for sure, this can be implemented in powershell, but I don't know how I can get the count using cmdlets for each of these parameters. Can we take the parameter value using t-sql code, and incorporate it in powershell, and run it through windows task scheduler. Is that possible?

    I believe rest of the script will remain same.I don't have an idea whether we have anything called round() as a powershell function, but I do know there is something for e.g

    $param = [System.Math]::Round().

    Pls. provide your valuable inputs on how can programmatically get the counts.

    Regards,

    Faisal

  • Its easy, first setup a table in your DBA util database listing the nodes and the preferred location of any instances.

    Just use

    select serverproperty('computernamephysicalnetbios')

    in your audit job and it will return the node name of the cluster node the instance is active on. Use this tio determine when your instance moves and when more than one occupy the same node 😉

    -----------------------------------------------------------------------------------------------------------

    "Ya can't make an omelette without breaking just a few eggs" 😉

  • Thanks Perry,

    But that's not what I am looking for. Instead, something in powershell would work wonders. I would only like to have to get those 3 parameters from the system instead of hard coding them...anyone who could provide a little help over here.

    Regards,

    Faisal

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply