Problem
I’ve noticed that Telegraf sometimes fails to start on the first try on my demo machines. This doesn’t seem to happen on most of my production servers, but they have a lot more memory and CPU power. So I figured I would write a quick blog post showing a way to make sure the service starts when the machine is rebooted. This is a known issue, and a user has offered a bounty to get it fixed, so if you know some Go and have time, please check out the issue on GitHub.
Solution
The solution is relatively simple. I’ve created a PowerShell script that loops, trying to start the service and sleeping for 90 seconds between attempts (it usually starts on the second try). I’ve edited the Install-Telegraf.ps1 file provided in the presentation Collecting Performance Metrics so that it also creates a folder for the startup script, copies in a Start-Telegraf.ps1 file, and registers a job that runs that file when the server starts up, looping until the service starts. NOTE: The script below assumes you will be copying files from a network location to your servers; you will need to make some adjustments if that is not how you install it.
$servers = @(
    'server1', 'server2'
)
$servers | ForEach-Object {
    Write-Host "$($_)..."
    Write-Host "..Create folders and copy files..."
    New-Item -Path "\\$($_)\c$\Program Files\telegraf" -ItemType Directory -Force
    New-Item -Path "\\$($_)\c$\DBOps" -ItemType Directory -Force
    Copy-Item -Path "\\server\telegraf\telegraf.*" -Destination "\\$($_)\c$\Program Files\telegraf\" -Force
    Copy-Item -Path "\\server\telegraf\Start-Telegraf.ps1" -Destination "\\$($_)\c$\DBOps\Start-Telegraf.ps1" -Force
    Invoke-Command -ComputerName $_ -ScriptBlock {
        Write-Host '..Install service...'
        Stop-Service -Name telegraf -ErrorAction SilentlyContinue
        & "c:\program files\telegraf\telegraf.exe" --service install -config "c:\program files\telegraf\telegraf.conf"
        # Older sc.exe versions require the space after "start="
        sc.exe config telegraf start= delayed-auto
        Start-Service -Name telegraf
        Start-Sleep 90
        # Make sure it starts: retry every 90 seconds until the service is running
        $service = Get-Service | Where-Object {$_.Status -eq "Running" -and $_.Name -eq "telegraf"}
        while ($service.count -eq 0) {
            Start-Service -Name "telegraf"
            Start-Sleep 90
            $service = Get-Service | Where-Object {$_.Status -eq "Running" -and $_.Name -eq "telegraf"}
        }
        Write-Host '..Setup job to make sure it autostarts...'
        # Create a scheduled job that runs Start-Telegraf.ps1 at startup
        $trigger = New-JobTrigger -AtStartup -RandomDelay 00:00:30
        Register-ScheduledJob -Trigger $trigger -FilePath C:\DBOps\Start-Telegraf.ps1 -Name Start-Telegraf
    }
}

Start-Telegraf.ps1, the script the scheduled job runs at startup, contains the same retry loop:

$service = Get-Service | Where-Object {$_.Status -eq "Running" -and $_.Name -eq "telegraf"}
while ($service.count -eq 0) {
    Start-Service -Name "telegraf"
    Start-Sleep 90
    $service = Get-Service | Where-Object {$_.Status -eq "Running" -and $_.Name -eq "telegraf"}
}

The developers of Telegraf are looking into this issue on Windows, but until the cause is identified and fixed, I needed a way to make sure my demo machines would start the service without me having to do it manually.
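
One thing to be aware of: the retry loop in Start-Telegraf.ps1 runs forever if the service can never start (a broken telegraf.conf, for example). If that worries you, here is a rough sketch of a bounded version. The function name Start-ServiceWithRetry and the default of 10 attempts are my own assumptions, not part of the original scripts, so treat it as a starting point rather than a tested replacement.

```powershell
# Hypothetical helper (not part of the original scripts): retry Start-Service
# a limited number of times instead of looping forever.
function Start-ServiceWithRetry {
    param(
        [Parameter(Mandatory)]
        [string]$Name,
        [int]$MaxAttempts = 10,   # assumed cap; tune for your machines
        [int]$DelaySeconds = 90   # same pause the original loop uses
    )

    for ($attempt = 1; $attempt -le $MaxAttempts; $attempt++) {
        # Look the service up by name; $null if it is not installed.
        $service = Get-Service -Name $Name -ErrorAction SilentlyContinue
        if ($service -and $service.Status -eq 'Running') {
            return $true
        }

        # A failed start is expected here, so suppress the error and retry.
        Start-Service -Name $Name -ErrorAction SilentlyContinue
        Start-Sleep -Seconds $DelaySeconds
    }

    # One last check after the final attempt.
    $service = Get-Service -Name $Name -ErrorAction SilentlyContinue
    return [bool]($service -and $service.Status -eq 'Running')
}
```

In Start-Telegraf.ps1 you could then call Start-ServiceWithRetry -Name telegraf and write a warning or an event log entry when it returns $false, so a machine that never gets Telegraf running doesn't fail silently.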