Compare trace from 2 different DB Servers

  • Hi Geeks,

    I have a situation where our customers are planning to migrate their application from one Datacenter to another. We are currently performing a test to understand any performance impact it might have in new DC with respect to Database. Hence we are trying to compare sql trace from both the DC's.

    So the setup is App and DB installed on separate servers in old DC and the same App and DB (with the exact version and backup from old DC) on New DC.

    >We start sql profiler on old DC and start the application and collect the trace. Also collect the similar trace in the new DC.

    >I see that trace in old DC shows it took 35sec whereas new DC took 65 Sec.

    > Checked for Showplan XML and both have exact same plans.

    >Only difference I see is in terms of read count logged in trace files where it is more in new DC than in old DC.

    My question would be if they are exact same database why would read count be more in DC and what can be done to optimize it.

    Thanks,

  • First: Are you moving from SQL Server 2012 (or older) to SQL Server 2014 (or newer)? In that case, the issues are most likely caused by the changed to cardinality estimation that were introduced in SQL Server 2014. Google "SQL Server 2014 new cardinality estimator" to find a lot of information and some possible fixes.

    Probably incomplete, but here are a few thoughts on possible explanations for the differences you are seeing.

    Different execution times:

    * Slower hardware (obviously, though unlikely in your case)

    * Data was already in the buffer pool on one server but not in cache on the other

    * Execution plans were already in cache on one server but had to be compiled on the other

    ... and of course it is possible that the extra execution time was a direct result of the extra IO and nothing else.

    Different amount of IO:

    * Did you look at logical reads or physical reads? If physical, we are back to data being or not being in the buffer pool.

    * Did you double check that all data is exactly identical on both servers?

    * There may be differences in fragmentation - an index that is just rebuilt with a 100% fill factor takes less pages than a fragmented index, which in turns takes less pages than an index that is just rebuilt with a low fill factor.

    If none of these explain the differences, then I suggest first checking the trace settings to make sure that you really capture all activity on the server, and then to compare the traces step by step. Look at the IO for each query, find the ones with a difference, then compare the execution plans for those. Perhaps you overlooked a difference? If they do have the same shape, look at differences in rowcounts - both actual and estimated. If the new server has a too low estimate, it is possible that a memory grant was too low which causes spilling to tempdb (which means IO). If the actual rowcount is different, then either the data is different after all, or the order of the data is it is read and processed is different and some weird combination of factors in the plan makes that cause a diifferent amount of activity, or there is a difference in the plan that is not visible in the graphical representation but burried deep into the properties of the various operators.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • Have you updated statistics and indices all the tables after the move? If not that is the first thing i would do before comparing baselines.

  • harikumar.mindi (1/8/2016)


    Hi Geeks,

    I have a situation where our customers are planning to migrate their application from one Datacenter to another. We are currently performing a test to understand any performance impact it might have in new DC with respect to Database. Hence we are trying to compare sql trace from both the DC's.

    So the setup is App and DB installed on separate servers in old DC and the same App and DB (with the exact version and backup from old DC) on New DC.

    >We start sql profiler on old DC and start the application and collect the trace. Also collect the similar trace in the new DC.

    >I see that trace in old DC shows it took 35sec whereas new DC took 65 Sec.

    > Checked for Showplan XML and both have exact same plans.

    >Only difference I see is in terms of read count logged in trace files where it is more in new DC than in old DC.

    My question would be if they are exact same database why would read count be more in DC and what can be done to optimize it.

    Thanks,

    You are using Loran-C when everyone else is using GPS, a trace of a task tells you twat if isn't like for like, the impact of doing a trace could potentially explain the difference in execution time. Look into using extended events which have far less impact on a system, especially if the system is bogged.

    😎

  • Those are some clear directions i can start from. I am sure the data is exactly the same because i restored it from source to destination. However fragmentation, updating statistics and few others are something i would need to try first and then take fresh trace.

    Even better if i start using Extended Events.

    I will keep posting my results after i compare!

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply