How to isolate that network is issue of bad query performance

  • Hi Friends,

    I would like to know a way to isolate sql server bad query performance is caused by NETWORK and not the sql server.

    Are there any specific tips and tricks in ssms, waitype of specific perfmon counter permissable values for network is concerned? How to baseline the network related counters and what values should I consider as good /better/poor.

    Any help would be appreciated.

    Thank you.

  • In my experience, the network is very rarely the cause of poor performance. When it appears to be the network it is more often one of the following:

    • An app server slow to process rows returned from a query
    • A query returning too much data
    • A domain controller throwing a wobbly and causing failed connections

    But sometimes network can be an issue, and there are a few ways to find out.

    First try pinging the server. Make sure you ping it from the PC or server that is running the application that has the problem.

    Then there's the ASYNC_NETWORK_IO wait type, but high wait time for this type does not prove anything as a poorly designed application could also be the cause.

    Check for packet errors from the SQL Server. Anything other than zero needs to be reported to your network team, especially if the value continues to rise.

    select @@PACKET_ERRORS

    There are also some performance counters accessible from SQL Server

    select *

    from sys.dm_os_performance_counters

    where counter_name like 'Network IO waits%'

    Again, anything non-zero needs investigation.

    There are more network-specific performance counters, but you'll need to use perfmon to look at those.

    And finally, look at the connectivity ring buffer. Entries here don't necessarily point to network problems. They could be AD glitches, or even high CPU on the server. Still worth looking at though.

    use master;

    set nocount on;

    select top 100 dateadd(ms, b.timestamp - i.ms_ticks, getdate()) notification_time,

    cast(b.record as xml).value('(//RecordType)[1]', 'nvarchar(100)') record_type,

    cast(b.record as xml).value('(//RecordSource)[1]', 'nvarchar(100)') record_source,

    cast(b.record as xml).value('(//Spid)[1]', 'int') spid,

    cast(b.record as xml).value('(//OSError)[1]', 'int') os_error,

    cast(b.record as xml).value('(//SniConsumerError)[1]', 'int') sni_consumer_error,

    cast(b.record as xml).value('(//SniProvider)[1]', 'int') sni_provider,

    cast(b.record as xml).value('(//State)[1]', 'int') state,

    cast(b.record as xml).value('(//RemoteHost)[1]', 'nvarchar(20)') remote_host,

    cast(b.record as xml).value('(//RemotePort)[1]', 'nvarchar(10)') remote_port,

    cast(b.record as xml).value('(//TdsInputBufferError)[1]', 'int') tds_input_buffer_error,

    cast(b.record as xml).value('(//TdsOutputBufferError)[1]', 'int') tds_output_buffer_error,

    cast(b.record as xml).value('(//TdsInputBufferBytes)[1]', 'int') tds_input_buffer_bytes,

    cast(b.record as xml).value('(//PhysicalConnectionIsKilled)[1]', 'int') physical_connection_is_killed,

    cast(b.record as xml).value('(//DisconnectDueToReadError)[1]', 'int') disconnect_due_to_read_error,

    cast(b.record as xml).value('(//NetworkErrorFoundInInputStream)[1]', 'int') network_error_found_in_input_stream,

    cast(b.record as xml).value('(//ErrorFoundBeforeLogin)[1]', 'int') error_found_before_login,

    cast(b.record as xml).value('(//SessionIsKilled)[1]', 'int') session_is_killed,

    cast(b.record as xml).value('(//TotalLoginTimeInMilliseconds)[1]', 'int') total_login_time_in_milliseconds,

    cast(b.record as xml).value('(//LoginTaskEnqueuedInMilliseconds)[1]', 'int') login_task_enqueued_in_milliseconds,

    cast(b.record as xml).value('(//NetworkWritesInMilliseconds)[1]', 'int') network_writes_in_milliseconds,

    cast(b.record as xml).value('(//NetworkReadsInMilliseconds)[1]', 'int') network_reads_in_milliseconds,

    cast(b.record as xml).value('(//SslProcessingInMilliseconds)[1]', 'int') ssl_processing_in_milliseconds,

    cast(b.record as xml).value('(//SspiProcessingInMilliseconds)[1]', 'int') sspi_processing_in_milliseconds,

    cast(b.record as xml).value('(//LoginTriggerAndResourceGovernorProcessingInMilliseconds)[1]', 'int') login_trigger_and_resource_governor_processing_in_milliseconds,

    m.text

    from sys.dm_os_ring_buffers b

    cross join sys.dm_os_sys_info i

    left join sys.messages m on m.message_id = cast(b.record as xml).value('(//SniConsumerError)[1]', 'int') and m.language_id = 1033

    where b.ring_buffer_type = 'RING_BUFFER_CONNECTIVITY'

    and cast(b.timestamp - i.ms_ticks as bigint) between -2000000000 and 2000000000

    order by 1 desc

    Hope this helps.

  • Yeah, networking is pretty near the bottom of my list for performance issues too. In addition to the network IO waits you can also look to bytes total/sec and % Net Utilization just to see load. I'll bet most of your issues are elsewhere.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Thank you much for the help.

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply