Baseline Collector Solution V2

  • Comments posted to this topic are about the item Baseline Collector Solution V2

  • Thanks for the article. I'll need to look it over more closely.

  • Thanks Robert. I've been using TYPEPERF to collect counters from several servers. It appears that if I want IO detail at the database or database file level, I will need to use a solution like yours. Correct me if I'm wrong, but it appears there are no server-level counters for database or database file IO. The TYPEPERF solution is nice because a single server can efficiently collect counters from several servers into one database. How well do you think your solution would adapt to a central solution, perhaps using SSIS or PowerShell to collect the information from various instances?

    We have recently started using SCOM in our data center. I have not used it yet or tried the SQL Server Management Pack. Do you know how it would compare with your solution?

    RandyHelpdesk: Perhaps I'm not the only one that does not know what you are doing. 😉

  • Hi,

    The script checks for SERVERPROPERTY('EngineEdition') != 4, where 4 is SQL Express (which cannot support Agent jobs).

    However, further down, the script still attempts to create schedules:

    Msg 14262, Level 16, State 1, Procedure sp_verify_job_identifiers, Line 67

    The specified @job_name ('Baseline - Collect InstanceInfo/ConfigData/DatabaseInfo/FileInfo/WaitStats') does not exist.

    It should probably repeat this test there, or at least issue a warning.

    Dallas

  • rstone (9/21/2015)


    Thanks Robert. I've been using TYPEPERF to collect counters from several servers. It appears that if I want IO detail at the database or database file level, I will need to use a solution like yours. Correct me if I'm wrong, but it appears there are no server-level counters for database or database file IO. The TYPEPERF solution is nice because a single server can efficiently collect counters from several servers into one database. How well do you think your solution would adapt to a central solution, perhaps using SSIS or PowerShell to collect the information from various instances?

    We have recently started using SCOM in our data center. I have not used it yet or tried the SQL Server Management Pack. Do you know how it would compare with your solution?

    Thanks for your comment.

    Performance counters are just one piece of a well-implemented baseline. For features that use tempdb (versioning, for example), you should also monitor tempdb usage. It is also important to have historical wait stats data for better troubleshooting.

    Last but not least, there is the IO you mentioned. Understanding IO is complex; it cannot be captured with a single counter. As far as I know, the best source for analyzing IO at the file level in SQL Server is the [sys].[dm_io_virtual_file_stats] DMV.
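
    For illustration, a minimal ad-hoc query against that DMV might look like this (a generic sketch, not the collector's actual stored procedure; sys.master_files is only joined to resolve the physical file names):

    -- Generic sketch: cumulative file-level IO since the last SQL Server restart.
    SELECT DB_NAME(vfs.[database_id]) AS [database_name],
           mf.[physical_name],
           vfs.[num_of_reads],
           vfs.[num_of_bytes_read],
           vfs.[io_stall_read_ms],
           vfs.[num_of_writes],
           vfs.[num_of_bytes_written],
           vfs.[io_stall_write_ms]
    FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
    JOIN sys.master_files AS mf
      ON mf.[database_id] = vfs.[database_id]
     AND mf.[file_id] = vfs.[file_id];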

    I think my solution can be adapted to fit into a central solution (a rough sketch of the idea follows below). This topic is on my list, and I plan to add the feature in the future. I just have too many things on my plate 🙂

    Unfortunately, I have never used SCOM, so I cannot answer your last question.
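
    As a rough illustration of that central-collection idea, a central server could pull the collected rows from each monitored instance over a linked server. This is purely hypothetical: the linked server [MONITORED1] and the central table are assumptions, not part of the current solution; only [BaselineDB].[dbo].[iovf_stats] and [capture_date] come from the solution itself.

    -- Hypothetical sketch: pull the last day of collected file-IO rows from one monitored
    -- instance into a central table assumed to mirror [dbo].[iovf_stats] plus a source column.
    INSERT INTO [CentralBaselineDB].[dbo].[iovf_stats_central]
    SELECT 'MONITORED1' AS [source_instance],
           src.*
    FROM [MONITORED1].[BaselineDB].[dbo].[iovf_stats] AS src
    WHERE src.[capture_date] >= DATEADD(day, -1, SYSDATETIME());

    An SSIS package or a PowerShell loop over the instance list could run the same statement per server.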

  • dallas-1069889 (9/21/2015)


    Hi,

    The script checks for SERVERPROPERTY('EngineEdition') != 4, where 4 is SQL Express (which cannot support Agent jobs).

    However, further down, the script still attempts to create schedules:

    Msg 14262, Level 16, State 1, Procedure sp_verify_job_identifiers, Line 67

    The specified @job_name ('Baseline - Collect InstanceInfo/ConfigData/DatabaseInfo/FileInfo/WaitStats') does not exist.

    It should probably repeat this test there, or at least issue a warning.

    Dallas

    Thanks for your comment and your finding. I will fix it! To be alerted about updates to the Baseline Collector Solution, you can sign up for the Update Newsletter.
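
    For what it's worth, a minimal sketch of the guard Dallas describes might look like this (hypothetical; the real setup script may structure the check differently, and the schedule must already exist before it is attached):

    -- Sketch: only attach schedules when the engine is not SQL Express (EngineEdition = 4),
    -- since Express has no SQL Server Agent; otherwise just warn.
    IF CAST(SERVERPROPERTY('EngineEdition') AS int) <> 4
    BEGIN
        EXEC msdb.dbo.sp_attach_schedule
             @job_name = N'Baseline - Collect InstanceInfo/ConfigData/DatabaseInfo/FileInfo/WaitStats',
             @schedule_name = N'baseline - daily - every 6 hours /6AM/12PM/6PM/12AM/';
    END
    ELSE
    BEGIN
        PRINT 'SQL Express detected (EngineEdition = 4): Agent jobs and schedules were not created.';
    END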

  • Hello,

    I have executed your script on my SQL Server. I had a lot of errors because it is case sensitive.

    I corrected the mistakes and it works well. I think I'm not the only one; would you correct the script?

  • cyrille.pele (9/22/2015)


    Hello,

    I have executed your script on my SQL Server. I had a lot of errors because it is case sensitive.

    I corrected the mistakes and it works well. I think I'm not the only one; would you correct the script?

    Thanks for your comment. I will put this on my list with high priority and fix it.

    You can sign up for the Update Newsletter so you will be informed about updates and new versions.

  • cyrille.pele (9/22/2015)


    Hello,

    I have executed your script on my SQL Server. I had a lot of errors because it is case sensitive.

    I corrected the mistakes and it works well. I think I'm not the only one; would you correct the script?

    Here is a Quick Fix for Case Sensitive Instances:

    Replace the following in the setup script:

    ORIGINAL ---> NEW

    @Configname ---> @ConfigName

    @retention ---> @Retention

    @DestTAble ---> @DestTable

    @sqlcommand ---> @SQLCommand

    [Value] ---> [value]

    sp_collectconfigdata ---> sp_CollectConfigData

    sp_collectdatabaseinfo ---> sp_CollectDatabaseInfo

    sp_collectfileinfo ---> sp_CollectFileInfo

    sp_collectinstanceinfo ---> sp_CollectInstanceInfo

    sp_collectiovfstats ---> sp_CollectIOVFStats

    sp_collectperfmondata ---> sp_CollectPerfmonData

    sp_collecttempdbusage ---> sp_CollectTempDBUsage

    sp_collectwaitstats ---> sp_CollectWaitStats

    Sorry for the inconvenience caused.

    Please report every bug so I can fix it in the next update (probably V2.0.3).
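
    For reference, a quick way to check whether the instance or the BaselineDB database is case sensitive (a generic check, not part of the setup script):

    -- A collation name containing CS (e.g. ..._CS_AS) is case sensitive.
    SELECT SERVERPROPERTY('Collation')                   AS [instance_collation],
           DATABASEPROPERTYEX('BaselineDB', 'Collation') AS [baselinedb_collation];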

  • Robert, I'm running your solution on my desktop. Looks good to me. Added some views for fun. Single values for average MB/sec, IOPS, and stall per database? I'm going to try this on some of our VMs using the SAN. I know some of the numbers are going to be bad. (My typeperf counters already show this.) It will be interesting to see the effect on the database files. BTW, have you thought about using SSRS for reporting?

    USE [BaselineDB]

    GO

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    CREATE VIEW [dbo].[vw_iovf_stats_by_partition]

    AS

    SELECT MIN([capture_date]) AS [capture_date_start],

    MAX([capture_date]) AS [capture_date_end],

    [database_name],

    LEFT([physical_name], 1) AS [partition],

    SUM([sample_ms]) AS [sample_ms],

    SUM([num_of_reads]) AS [num_of_reads],

    SUM([num_of_bytes_read]) AS [num_of_bytes_read],

    SUM([io_stall_read_ms]) AS [io_stall_read_ms],

    SUM([num_of_writes]) AS [num_of_writes],

    SUM([num_of_bytes_written]) AS [num_of_bytes_written],

    SUM([io_stall_write_ms]) AS [io_stall_write_ms],

    1000. * SUM([num_of_reads]) / SUM([sample_ms]) AS [avg_iops_read],

    1000. * SUM([num_of_writes]) / SUM([sample_ms]) AS [avg_iops_write],

    1000. * (SUM([num_of_reads]) + SUM([num_of_writes])) / SUM([sample_ms]) AS avg_iops_total,

    1000. * SUM([num_of_bytes_read]) / SUM([sample_ms]) AS [avg_bps_read],

    1000. * SUM([num_of_bytes_written]) / SUM([sample_ms]) AS [avg_bps_write],

    1000. * (SUM([num_of_bytes_read]) + SUM([num_of_bytes_written])) / SUM([sample_ms]) AS [avg_bps_total],

    CASE WHEN SUM([num_of_reads]) = 0 THEN NULL

    ELSE 1.0 * SUM([io_stall_read_ms]) / SUM([num_of_reads])

    END AS [avg_io_stall_read_ms],

    CASE WHEN SUM([num_of_writes]) = 0 THEN NULL

    ELSE 1.0 * SUM([io_stall_write_ms]) / SUM([num_of_writes])

    END AS [avg_io_stall_write_ms],

    CASE WHEN SUM([num_of_reads]) + SUM([num_of_writes]) = 0 THEN NULL

    ELSE 1.0 * (SUM([io_stall_read_ms]) + SUM([io_stall_write_ms])) / (SUM([num_of_reads]) + SUM([num_of_writes]))

    END AS [avg_io_stall_ms]

    FROM dbo.iovf_stats

    GROUP BY [database_name], LEFT([physical_name], 1)

    GO

    USE [BaselineDB]

    GO

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    CREATE VIEW [dbo].[vw_iovf_stats_by_database]

    AS

    SELECT MIN([capture_date]) AS [capture_date_start],

    MAX([capture_date]) AS [capture_date_end],

    [database_name],

    SUM([sample_ms]) AS [sample_ms],

    SUM([num_of_reads]) AS [num_of_reads],

    SUM([num_of_bytes_read]) AS [num_of_bytes_read],

    SUM([io_stall_read_ms]) AS [io_stall_read_ms],

    SUM([num_of_writes]) AS [num_of_writes],

    SUM([num_of_bytes_written]) AS [num_of_bytes_written],

    SUM([io_stall_write_ms]) AS [io_stall_write_ms],

    1000. * SUM([num_of_reads]) / SUM([sample_ms]) AS [avg_iops_read],

    1000. * SUM([num_of_writes]) / SUM([sample_ms]) AS [avg_iops_write],

    1000. * (SUM([num_of_reads]) + SUM([num_of_writes])) / SUM([sample_ms]) AS avg_iops_total,

    1000. * SUM([num_of_bytes_read]) / SUM([sample_ms]) AS [avg_bps_read],

    1000. * SUM([num_of_bytes_written]) / SUM([sample_ms]) AS [avg_bps_write],

    1000. * (SUM([num_of_bytes_read]) + SUM([num_of_bytes_written])) / SUM([sample_ms]) AS [avg_bps_total],

    CASE WHEN SUM([num_of_reads]) = 0 THEN NULL

    ELSE 1.0 * SUM([io_stall_read_ms]) / SUM([num_of_reads])

    END AS [avg_io_stall_read_ms],

    CASE WHEN SUM([num_of_writes]) = 0 THEN NULL

    ELSE 1.0 * SUM([io_stall_write_ms]) / SUM([num_of_writes])

    END AS [avg_io_stall_write_ms],

    CASE WHEN SUM([num_of_reads]) + SUM([num_of_writes]) = 0 THEN NULL

    ELSE 1.0 * (SUM([io_stall_read_ms]) + SUM([io_stall_write_ms])) / (SUM([num_of_reads]) + SUM([num_of_writes]))

    END AS [avg_io_stall_ms]

    FROM dbo.iovf_stats

    GROUP BY [database_name]

    GO

    USE [BaselineDB]

    GO

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    CREATE VIEW [dbo].[vw_iovf_stats_by_type]

    AS

    SELECT MIN([capture_date]) as [DateStart]

    ,MAX([capture_date]) as [DateEnd]

    ,[database_name]

    ,CASE [type] WHEN 0 THEN 'ROWS' WHEN 1 THEN 'LOG' WHEN 2 THEN 'FILESTREAM' WHEN 4 THEN 'FULLTEXT' END as [type]

    ,SUM([sample_ms]) as [sample_ms]

    ,SUM([num_of_reads]) as [num_of_reads]

    ,SUM([num_of_bytes_read]) as [num_of_bytes_read]

    ,SUM([io_stall_read_ms]) as [io_stall_read_ms]

    ,SUM([num_of_writes]) as [num_of_writes]

    ,SUM([num_of_bytes_written]) as [num_of_bytes_written]

    ,SUM([io_stall_write_ms]) as [io_stall_write_ms]

    ,1000. * SUM([num_of_reads]) / SUM([sample_ms]) as [avg_iops_read]

    ,1000. * SUM([num_of_writes]) / SUM([sample_ms]) as [avg_iops_write]

    ,1000. * (SUM([num_of_reads]) + SUM([num_of_writes])) / SUM([sample_ms]) as [avg_iops_total]

    ,1000. * SUM([num_of_bytes_read]) / SUM([sample_ms]) as [avg_bps_read]

    ,1000. * SUM([num_of_bytes_written]) / SUM([sample_ms]) as [avg_bps_write]

    ,1000. * (SUM([num_of_bytes_read]) + SUM([num_of_bytes_written])) / SUM([sample_ms]) as [avg_bps_total]

    ,CASE WHEN SUM([num_of_reads]) = 0 THEN NULL

    ELSE 1.0 * SUM([io_stall_read_ms]) / SUM([num_of_reads]) END as [avg_io_stall_read_ms]

    ,CASE WHEN SUM([num_of_writes]) = 0 THEN NULL

    ELSE 1.0 * SUM([io_stall_write_ms]) / SUM([num_of_writes]) END as [avg_io_stall_write_ms]

    ,CASE WHEN SUM([num_of_reads]) + SUM([num_of_writes]) = 0 THEN NULL

    ELSE 1.0 * (SUM([io_stall_read_ms]) + SUM([io_stall_write_ms])) / (SUM([num_of_reads]) + SUM([num_of_writes])) END as [avg_io_stall_ms]

    FROM [dbo].[iovf_stats]

    GROUP BY [database_name], CASE [type] WHEN 0 THEN 'ROWS' WHEN 1 THEN 'LOG' WHEN 2 THEN 'FILESTREAM' WHEN 4 THEN 'FULLTEXT' END

    GO

    USE [BaselineDB]

    GO

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    -- Experimenting with averages by the hour. Daily, weekly, or monthly would also be easy, but perhaps a hard table for those, since old raw data can be purged (a sketch of such a table follows the views below).

    CREATE VIEW [dbo].[vw_iovf_stats_by_database_hourly]

    AS

    SELECT MIN([capture_date]) AS [capture_date_start],

    MAX([capture_date]) AS [capture_date_end],

    [database_name],

    DATEPART(hour, [capture_date]) as [hour],

    SUM([sample_ms]) AS [sample_ms],

    SUM([num_of_reads]) AS [num_of_reads],

    SUM([num_of_bytes_read]) AS [num_of_bytes_read],

    SUM([io_stall_read_ms]) AS [io_stall_read_ms],

    SUM([num_of_writes]) AS [num_of_writes],

    SUM([num_of_bytes_written]) AS [num_of_bytes_written],

    SUM([io_stall_write_ms]) AS [io_stall_write_ms],

    1000. * SUM([num_of_reads]) / SUM([sample_ms]) AS [avg_iops_read],

    1000. * SUM([num_of_writes]) / SUM([sample_ms]) AS [avg_iops_write],

    1000. * (SUM([num_of_reads]) + SUM([num_of_writes])) / SUM([sample_ms]) AS avg_iops_total,

    1000. * SUM([num_of_bytes_read]) / SUM([sample_ms]) AS [avg_bps_read],

    1000. * SUM([num_of_bytes_written]) / SUM([sample_ms]) AS [avg_bps_write],

    1000. * (SUM([num_of_bytes_read]) + SUM([num_of_bytes_written])) / SUM([sample_ms]) AS [avg_bps_total],

    CASE WHEN SUM([num_of_reads]) = 0 THEN NULL

    ELSE 1.0 * SUM([io_stall_read_ms]) / SUM([num_of_reads])

    END AS [avg_io_stall_read_ms],

    CASE WHEN SUM([num_of_writes]) = 0 THEN NULL

    ELSE 1.0 * SUM([io_stall_write_ms]) / SUM([num_of_writes])

    END AS [avg_io_stall_write_ms],

    CASE WHEN SUM([num_of_reads]) + SUM([num_of_writes]) = 0 THEN NULL

    ELSE 1.0 * (SUM([io_stall_read_ms]) + SUM([io_stall_write_ms])) / (SUM([num_of_reads]) + SUM([num_of_writes]))

    END AS [avg_io_stall_ms]

    FROM dbo.iovf_stats

    GROUP BY [database_name], DATEPART(hour, [capture_date])

    GO

    USE [BaselineDB]

    GO

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    -- If specific times are used for some tasks, perhaps grouping on the time slots?

    CREATE VIEW [dbo].[vw_iovf_stats_by_database_sla]

    AS

    SELECT MIN([capture_date]) AS [capture_date_start],

    MAX([capture_date]) AS [capture_date_end],

    [database_name],

    CASE WHEN DATEPART(hour, [capture_date]) = 0 THEN 'BACKUP'

    WHEN DATEPART(hour, [capture_date]) = 3 THEN 'IMAGE'

    WHEN DATEPART(hour, [capture_date]) BETWEEN 7 AND 18 THEN 'USER-SLA'

    WHEN DATEPART(hour, [capture_date]) BETWEEN 19 AND 20 THEN 'PROCESSING'

    WHEN DATEPART(hour, [capture_date]) BETWEEN 22 AND 23 THEN 'MAINT'

    ELSE 'NONE'

    END as [time_slot],

    SUM([sample_ms]) AS [sample_ms],

    SUM([num_of_reads]) AS [num_of_reads],

    SUM([num_of_bytes_read]) AS [num_of_bytes_read],

    SUM([io_stall_read_ms]) AS [io_stall_read_ms],

    SUM([num_of_writes]) AS [num_of_writes],

    SUM([num_of_bytes_written]) AS [num_of_bytes_written],

    SUM([io_stall_write_ms]) AS [io_stall_write_ms],

    1000. * SUM([num_of_reads]) / SUM([sample_ms]) AS [avg_iops_read],

    1000. * SUM([num_of_writes]) / SUM([sample_ms]) AS [avg_iops_write],

    1000. * (SUM([num_of_reads]) + SUM([num_of_writes])) / SUM([sample_ms]) AS avg_iops_total,

    1000. * SUM([num_of_bytes_read]) / SUM([sample_ms]) AS [avg_bps_read],

    1000. * SUM([num_of_bytes_written]) / SUM([sample_ms]) AS [avg_bps_write],

    1000. * (SUM([num_of_bytes_read]) + SUM([num_of_bytes_written])) / SUM([sample_ms]) AS [avg_bps_total],

    CASE WHEN SUM([num_of_reads]) = 0 THEN NULL

    ELSE 1.0 * SUM([io_stall_read_ms]) / SUM([num_of_reads])

    END AS [avg_io_stall_read_ms],

    CASE WHEN SUM([num_of_writes]) = 0 THEN NULL

    ELSE 1.0 * SUM([io_stall_write_ms]) / SUM([num_of_writes])

    END AS [avg_io_stall_write_ms],

    CASE WHEN SUM([num_of_reads]) + SUM([num_of_writes]) = 0 THEN NULL

    ELSE 1.0 * (SUM([io_stall_read_ms]) + SUM([io_stall_write_ms])) / (SUM([num_of_reads]) + SUM([num_of_writes]))

    END AS [avg_io_stall_ms]

    FROM dbo.iovf_stats

    GROUP BY [database_name],

    CASE WHEN DATEPART(hour, [capture_date]) = 0 THEN 'BACKUP'

    WHEN DATEPART(hour, [capture_date]) = 3 THEN 'IMAGE'

    WHEN DATEPART(hour, [capture_date]) BETWEEN 7 AND 18 THEN 'USER-SLA'

    WHEN DATEPART(hour, [capture_date]) BETWEEN 19 AND 20 THEN 'PROCESSING'

    WHEN DATEPART(hour, [capture_date]) BETWEEN 22 AND 23 THEN 'MAINT'

    ELSE 'NONE'

    END

    GO
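
    Following up on the comment in the hourly view about a hard table for daily aggregates, here is a rough sketch. The table name and shape are assumptions; only [dbo].[iovf_stats] and its columns come from the solution.

    USE [BaselineDB]
    GO
    -- Sketch: a hard table for daily aggregates so older raw rows in [dbo].[iovf_stats] can be purged.
    CREATE TABLE [dbo].[iovf_stats_daily]
    (
        [capture_day]          date    NOT NULL,
        [database_name]        sysname NOT NULL,
        [sample_ms]            bigint  NOT NULL,
        [num_of_reads]         bigint  NOT NULL,
        [num_of_bytes_read]    bigint  NOT NULL,
        [io_stall_read_ms]     bigint  NOT NULL,
        [num_of_writes]        bigint  NOT NULL,
        [num_of_bytes_written] bigint  NOT NULL,
        [io_stall_write_ms]    bigint  NOT NULL,
        CONSTRAINT [PK_iovf_stats_daily] PRIMARY KEY ([capture_day], [database_name])
    )
    GO
    -- Roll up yesterday's rows; the matching raw rows could then be deleted.
    INSERT INTO [dbo].[iovf_stats_daily]
    SELECT CONVERT(date, [capture_date]) AS [capture_day],
           [database_name],
           SUM([sample_ms]),
           SUM([num_of_reads]),
           SUM([num_of_bytes_read]),
           SUM([io_stall_read_ms]),
           SUM([num_of_writes]),
           SUM([num_of_bytes_written]),
           SUM([io_stall_write_ms])
    FROM [dbo].[iovf_stats]
    WHERE CONVERT(date, [capture_date]) = CONVERT(date, DATEADD(day, -1, SYSDATETIME()))
    GROUP BY CONVERT(date, [capture_date]), [database_name]
    GO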

    RandyHelpdesk: Perhaps I'm not the only one that does not know what you are doing. 😉

    In the past, I used a tool called ClearTrace to report on the worst of the worst. (I set high limits on the trace reads or writes.) Even great code can go bad. For example, a BizTalk polling procedure updated with a WHERE compare to N'' instead of to the non-Unicode '' caused scans. The procedure used more IO than all the other applications using the server combined. (It now uses 0 logical IO most of the time.) Such a simple thing went unnoticed for a long time. It would be really nice to have a process in place that tracked the worst offenders in a baseline. There will always be a worst offender, but it would be nice to see how bad they are or whether new ones appear.

    Along those lines, a schema change log (schema baseline?) would also be useful to relate to other counters if there is a sudden, sustained spike in IO (a rough sketch is below).

    Just thinking out loud.

    RandyHelpdesk: Perhaps I'm not the only one that does not know what you are doing. 😉
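
    A rough sketch of such a schema change log, as a generic database-level DDL trigger (hypothetical object names; not part of the Baseline Collector Solution, and it would be created in each database of interest):

    -- Sketch: capture DDL events so schema changes can be lined up against IO or wait-stats spikes.
    CREATE TABLE [dbo].[SchemaChangeLog]
    (
        [log_id]       int IDENTITY(1,1) NOT NULL PRIMARY KEY,
        [capture_date] datetime2(0)      NOT NULL DEFAULT (SYSDATETIME()),
        [login_name]   sysname           NOT NULL,
        [event_type]   nvarchar(128)     NOT NULL,
        [object_name]  nvarchar(256)     NULL,
        [tsql_command] nvarchar(max)     NULL
    )
    GO
    CREATE TRIGGER [trg_SchemaChangeLog]
    ON DATABASE
    FOR DDL_DATABASE_LEVEL_EVENTS
    AS
    BEGIN
        SET NOCOUNT ON;
        DECLARE @e xml = EVENTDATA();
        INSERT INTO [dbo].[SchemaChangeLog] ([login_name], [event_type], [object_name], [tsql_command])
        VALUES (ORIGINAL_LOGIN(),
                @e.value('(/EVENT_INSTANCE/EventType)[1]', 'nvarchar(128)'),
                @e.value('(/EVENT_INSTANCE/ObjectName)[1]', 'nvarchar(256)'),
                @e.value('(/EVENT_INSTANCE/TSQLCommand/CommandText)[1]', 'nvarchar(max)'));
    END
    GO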

  • rstone (9/23/2015)


    Robert, I'm running your solution on my desktop. Looks good to me. Added some views for fun. Single values for average MB/sec, IOPS, and stall per database? I'm going to try this on some of our VMs using the SAN. I know some of the numbers are going to be bad. (My typeperf counters already show this.) It will be interesting to see the effect on the database files. BTW, have you thought about using SSRS for reporting?

    <I truncated the code - Robert>

    Thanks for your comment and your code. See? This is exactly why I designed the solution to collect only raw data 🙂 so everyone can write his or her own code. I am glad you shared yours, which is based on my solution.

    Regarding your question about SSRS: yes, this part is already in progress.

    I just have too many things on my plate.

    Here are some improvements I plan for the next versions:

    - adding versioning inside the code (for my own benefit, to make development easier)

    - rewriting the SetupScript to use SQLCMD mode (see the sketch after this list)

    - schema separation, for security purposes

    - creating SSRS reports for the Solution

    So stay tuned, and don't forget to sign up for the Update Newsletter!
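
    Regarding the SQLCMD-mode item in the list above, a tiny sketch of the direction (assumed variable names; SSMS must be in SQLCMD Mode, or the script run via sqlcmd.exe):

    -- SQLCMD-mode sketch: the target database becomes a :setvar variable
    -- instead of being hard-coded throughout the setup script.
    :setvar BaselineDbName "BaselineDB"

    USE [$(BaselineDbName)]
    GO
    PRINT 'Installing Baseline Collector objects into $(BaselineDbName)';
    GO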

  • Robert, I look forward to the updates. Here is a script to enable the jobs and schedules after a fresh install; it might save a few clicks here and there.

    USE [msdb]

    GO

    EXEC msdb.dbo.sp_update_job @job_name=N'Baseline - Collect InstanceInfo/ConfigData/DatabaseInfo/FileInfo/WaitStats',

    @enabled=1

    GO

    EXEC msdb.dbo.sp_attach_schedule @job_name=N'Baseline - Collect InstanceInfo/ConfigData/DatabaseInfo/FileInfo/WaitStats',

    @schedule_name=N'baseline - daily - every 6 hours /6AM/12PM/6PM/12AM/'

    GO

    EXEC msdb.dbo.sp_update_schedule @name=N'baseline - daily - every 6 hours /6AM/12PM/6PM/12AM/',

    @enabled=1

    GO

    USE [msdb]

    GO

    EXEC msdb.dbo.sp_update_job @job_name=N'Baseline - CollectIOVFStats',

    @enabled=1

    GO

    EXEC msdb.dbo.sp_attach_schedule @job_name=N'Baseline - CollectIOVFStats',

    @schedule_name=N'baseline - every 4 weeks - 1week - every 3 hours /12AM/3AM/6AM/9AM/12PM/3PM/6PM/9PM/'

    GO

    EXEC msdb.dbo.sp_update_schedule @name=N'baseline - every 4 weeks - 1week - every 3 hours /12AM/3AM/6AM/9AM/12PM/3PM/6PM/9PM/',

    @enabled=1

    GO

    USE [msdb]

    GO

    EXEC msdb.dbo.sp_update_job @job_name=N'Baseline - CollectPerfmonData',

    @enabled=1

    GO

    EXEC msdb.dbo.sp_attach_schedule @job_name=N'Baseline - CollectPerfmonData',

    @schedule_name=N'baseline - every 4 weeks - 1week - every 1 hour - BT/6AM-7PM/'

    GO

    EXEC msdb.dbo.sp_update_schedule @name=N'baseline - every 4 weeks - 1week - every 1 hour - BT/6AM-7PM/',

    @enabled=1

    GO

    USE [msdb]

    GO

    EXEC msdb.dbo.sp_update_job @job_name=N'Baseline - CollectTempDBUsage',

    @enabled=1

    GO

    EXEC msdb.dbo.sp_attach_schedule @job_name=N'Baseline - CollectTempDBUsage',

    @schedule_name=N'baseline - every 4 weeks - 1week - every 1 hour - BT/6AM-7PM/'

    GO

    EXEC msdb.dbo.sp_update_schedule @name=N'baseline - every 4 weeks - 1week - every 1 hour - BT/6AM-7PM/',

    @enabled=1

    GO

    USE [msdb]

    GO

    EXEC msdb.dbo.sp_update_job @job_name=N'Baseline - Output File Cleanup',

    @enabled=1

    GO

    EXEC msdb.dbo.sp_attach_schedule @job_name=N'Baseline - Output File Cleanup',

    @schedule_name=N'baseline - daily - at 2351PM'

    GO

    EXEC msdb.dbo.sp_update_schedule @name=N'baseline - daily - at 2351PM',

    @enabled=1

    GO

    RandyHelpdesk: Perhaps I'm not the only one that does not know what you are doing. 😉

  • Hi Robert, the link to download V2 is broken (https://www.sqlservercentral.com/articles/baseline-collector-solution-v2); it lands us at

    http://www.sqlapprentice.net/download-bcs/

    Not Found

    The requested URL was not found on this server.

    Additionally, a 404 Not Found error was encountered while trying to use an ErrorDocument to handle the request.

    Thanks for sharing the right URL.
