Getting a rough average daily data transferred stat, will this work?

  • One of my co-workers came to me with a need.  He was being asked (by a customer) to determine how much data was going in / out for their system.  Now, they may be looking for the data movement from the application server to and from the clients, the customer wasn't clear on this, and not the data going in / out of their various databases on the SQL Server, but I looked into if and how I could grab such information.

    It seemed to me I should be able to query SYS.DM_IO_VIRTUAL_FILE_STATS for the server, filtering out any system databases, which led me to come up with the below query.

    My question, is this a reasonable presumption of the data I get out of this?  I'm not looking for exact numbers, I know the data in the NUM_OF_BYTES_* is cumulative since the last server start (thus why I'm aiming to get an "average" number instead of exact) so can I reasonably say that "yes, here's a rough average over {timespan} of data read and written to their databases, which does include reads by the backups."

    select sample_ms / 1000.0000 / 60.0000 / 60.0000 / 24.0000 as [Sample_in_days]
        , sum(num_of_bytes_read) / 1000000000.0000 as [Num_GB_Read]
        , (sum(num_of_bytes_read) / 1000000000.0000) / (sample_ms / 1000.0000 / 60.0000 / 60.0000 / 24.0000) as [Avg_GB_Read_day]
        , sum(num_of_bytes_written) /1000000000.0000 as [Num_GB_Written]
        , (sum(num_of_bytes_written) / 1000000000.0000) / (sample_ms / 1000.0000 / 60.0000 / 60.0000 / 24.0000) as [Avg_GB_written_day]
    from sys.dm_io_virtual_file_stats(NULL, NULL)
    where database_id not in (select database_id from sys.databases where name in ('master','model','msdb', 'tempdb'))
    group by sample_ms

  • That'll also keep track of any throughput due to backups, index maintenance, dbcc etc.  Depending on your job frequencies, this may be heavily skewing the IO.

    You could report on the backups as an indicator of data in / data modification.  Depending on the amount of backup history you have, that could also be given as a more granular report showing the ebb and flow of traffic through the day (and giving you the opportunity to point at a big spike at 2am Sunday, saying "and that's where we rebuild your indexes")

    Thomas Rushton
    blog: https://thelonedba.wordpress.com

  • A further thought - the virtual file stats dmv will also not account for any data requests that are served from cache.  Depending on the config of your server, that could be almost all of it.

    Thomas Rushton
    blog: https://thelonedba.wordpress.com

  • Since the reason is not clear why this information is needed, I would contact network admin to find out data movement between two server, they can probably setup a capture on firewall  to find out , or they already may have a tool that have this information.

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply