Memory_Parallelism Performance Problem

  • 1) How is it you can spend that much money on a server but can't hire a professional to help you tune/configure it? Hunting and pecking on a news group is like being penny wise and pound foolish!

    2) NUMA (which this box probably is) requires VERY special handling to get optimal performance.

    3) Did you bounce sql server after dropping the max memory setting back down? sql server doesn't necessarily give back the ram after you change that setting. Remember that you have lock pages in memory on.

    4) Have you spent any time tuning your queries/data structures?

    5) If this is a pure OLTP box, I would set maxdop to 1 (once you have things tuned out). If OLAP box, consider trying MAX DOP = 8 and increasing the cost threshold for parallelism setting.

    6) Has your wait-stat analysis, I/O stall analysis, profiler analysis or perf mon analysis revealed any areas for improvement?

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • If you manage to get some free time on your prod platform you may want to try the following...

    1. set your ss affinity to use just 4 CPUs.

    2. shutdown the server.

    3. disable core multiplexing within the bios (that is, if ibm didn't manage to remove opt).

    4. boot to the OS and make sure the sql services start cleanly (logs, etc.).

    We were able to prove our ss HT problem by doing this test by disabling the HT within the BIOS.

    ---

    I haven't seen any other reports like yours (not that they don't exist). We are running a quad-corex4, 128GB, Dual 1066 FSBs, 8xPCI-Express DELL R900 which is on a single chassis. I believe your sockets span 2 chassis which means the units are syncing over a bootstrap. You may want to have your hardware service provider take a second look.

  • See here (http://sqlblog.com/blogs/linchi_shea/archive/2007/08/07/reduce-the-contention-on-tempdb-with-trace-flag-1118-to-enable-or-not-to-enable.aspx) and read the associated links too. T1118 doesn't seem to affect performance on 2005 like it did with 2000.

    Also, unless you have a REALLY amazing I/O subsystem for tempdb, I would consider doing only 8 (or possibly even 6 or 4) files for tempdb, unless you are seeing the indications that you are getting SGAM page latch issues.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Frank Brennan (3/24/2008)


    No problem.

    Additially, setup the -T1118 trace flag for sql startup. The result is that your tempdb allocations will be scattered across multiple files all having their own allocation tables. This works even if the files are on the same LUN.

    I've been told by Microsoft that the -T1118 is no longer needed in 2005 due to the improvements made to the tempdb. Is this not true? I'd be curious to know ...

    Thanks

  • Yes, there have been improvements to the management of tempdb in ss 2k5 (most cache). However, on high volume system cache does not resolve the conflict of allocations on the physical files.

    On an IBM 3850/Win3k3EE-x64 my tempdb spans 6 database LUNs all properly alligned on 64k track boundary with AUs of 64k. There are 2 files on each LUN which (12) as the ss instances uses 14 of 16 CPUs. The reduction in contention is significant on this platform. Additionally, the primary objective here is to create multiple allocation tables by separating the files. Placement of the file for optimal i/o performance is very important but is a secondary objective. The i/o subsystem is as follows, emc/dmx3, 120 dedicated 15k spindles which a horizontal hyper striping with 4/2Gbit paths load balanced.

    Note:

    Not all Microsoft product support engineers are created equal as I'm sure you realize.:)

    ---

    http://www.microsoft.com/technet/prodtechnol/sql/2005/workingwithtempdb.mspx

    Troubleshooting contention caused by to DML operations

    If you determine that the throughput of your application has degraded because of contention in allocation structures, you can use the following techniques to minimize it.

    • Evaluate your application and the query plans to see if you can minimize the creation of work tables and temporary tables. Monitor the perfmon counters as described in Monitoring contention caused by DML operations. Then, use SQL Profiler to correlate the values of these counters with the currently running queries. This helps you identify the queries that are causing the contention in allocation structures.

    • Divide tempdb into multiple data files of equal size. These multiple files don’t necessarily be on different disks/spindles unless you are also encountering I/O bottlenecks as well. The general recommendation is to have one file per CPU because only one thread is active per CPU at one time. SQL Server allocates pages for tempdb objects in a round-robin fashion (also referred to as proportional fill) so that the latches on PFS and SGAM pages are distributed among multiple files. This is supported both in SQL Server 2000 and SQL Server 2005. There are improvements to the proportional fill algorithm in SQL Server 2005.

    • Use TF-1118. Under this trace flag SQL Server allocates full extents to each tempdb object, thereby eliminating the contention on SGAM page. This is done at the expense of some waste of disk space in tempdb. This trace flag has been available since SQL Server 2000. With improvements in tempdb object caching in SQL Server 2005, there should be significantly less contention in allocation structures. If you see contention in SGAM pages, you may want to use this trace flag. Cached tempdb objects may not always be available. For example, cached tempdb objects are destroyed when the query plan with which they are associated is recompiled or removed from the procedure cache.

  • MS PSS Solution to the discussed problem:

    I am a DBA with Noel (no longer with us) and we have a solution from PSS that seems to have worked (for the last 3 days).

    We turned on (undocumented) trace Flags 4618 & 4615. This apparently limits the amount of memory allocated to USERSTORE_TOKENPERM. You can view Knowledge Base article "Queries take a longer time to finish running when the size of the TokenAndPermUserStore cache grows in SQL Server 2005

    http://support.microsoft.com/default.aspx?scid=kb;EN-US;927396" to find info about the problem.

    Only the MS internal view of this knowledge base article gives the suggestion for the trace flags.

    Thanks for all ya'll's info, it did help us review the issue.

    Thanks and God Bless,

    ThomasLL

    Thomas LeBlanc, MVP Data Platform Consultant

  • Thank you for the info.

  • Other info:

    IBM and CMA (reseller) of these IBM machines can not find one customer using these machines for production server sql machines that will share information about their configuration.

    1 customer has them in a testing phase, another using them as virtual machines...

    SQLGuru also questioned the use of an "expert" if we can afford machines like this.

    We used Premier Software support from MS (highest level) and experts from IBM. We have even hired Scalibility Experts to help on other upgrades (recommended from MS), but they are limited on their knowledge as well.

    We post to news groups for 1) in case someone else has had this problem/solution, and 2) to have the solution posted for others to view.

    ThomasLL

    Thomas LeBlanc, MVP Data Platform Consultant

Viewing 8 posts - 16 through 22 (of 22 total)

You must be logged in to reply to this topic. Login to reply