|
|
|
SSC Rookie
      
Group: General Forum Members
Last Login: Monday, March 24, 2008 1:38 PM
Points: 32,
Visits: 69
|
|
Running Sql server 2005 on a 4XQuadcore Intel processor w128 gig of ram. Max server memory set to 120 Gig. We did a server migration from a 8Xdual core machine with less ram last week
Since this migration we have seen a sudden jump on the 3rd day in producton of locks, especially those involving Parallelism where we had no issues before. All other configuration settings on the box and in sql are set the same as the old server and have been verified. Here are my sp_config values:
Here is a copy of the memory standard report in .txt format name minimum maximum config_value run_value Ad Hoc Distributed Queries 0 1 0 0 affinity I/O mask -2147483648 2147483647 0 0 affinity mask -2147483648 2147483647 0 0 Agent XPs 0 1 1 1 allow updates 0 1 0 0 awe enabled 0 1 0 0 blocked process threshold 0 86400 0 0 c2 audit mode 0 1 0 0 clr enabled 0 1 1 1 common criteria compliance enabled 0 1 0 0 cost threshold for parallelism 0 32767 5 5 cross db ownership chaining 0 1 0 0 cursor threshold -1 2147483647 -1 -1 Database Mail XPs 0 1 1 1 default full-text language 0 2147483647 1033 1033 default language 0 9999 0 0 default trace enabled 0 1 1 1 disallow results from triggers 0 1 0 0 fill factor (%) 0 100 0 0 ft crawl bandwidth (max) 0 32767 100 100 ft crawl bandwidth (min) 0 32767 0 0 ft notify bandwidth (max) 0 32767 100 100 ft notify bandwidth (min) 0 32767 0 0 index create memory (KB) 704 2147483647 0 0 in-doubt xact resolution 0 2 0 0 lightweight pooling 0 1 0 0 locks 5000 2147483647 0 0 max degree of parallelism 0 64 0 0 max full-text crawl range 0 256 4 4 max server memory (MB) 16 2147483647 2147483647 2147483647 max text repl size (B) 0 2147483647 65536 65536 max worker threads 128 32767 0 0 media retention 0 365 0 0 min memory per query (KB) 512 2147483647 1024 1024 min server memory (MB) 0 2147483647 0 8 nested triggers 0 1 1 1 network packet size (B) 512 32767 4096 4096 Ole Automation Procedures 0 1 0 0 open objects 0 2147483647 0 0 PH timeout (s) 1 3600 60 60 precompute rank 0 1 0 0 priority boost 0 1 0 0 query governor cost limit 0 2147483647 0 0 query wait (s) -1 2147483647 -1 -1 recovery interval (min) 0 32767 0 0 remote access 0 1 1 1 remote admin connections 0 1 0 0 remote login timeout (s) 0 2147483647 20 20 remote proc trans 0 1 0 0 remote query timeout (s) 0 2147483647 600 600 Replication XPs 0 1 0 0 scan for startup procs 0 1 0 0 server trigger recursion 0 1 1 1 set working set size 0 1 0 0 show advanced options 0 1 1 1 SMO and DMO XPs 0 1 1 1 SQL Mail XPs 0 1 0 0 transform noise words 0 1 0 0 two digit year cutoff 1753 9999 2049 2049 user connections 0 32767 0 0 user options 0 32767 0 0 Web Assistant Procedures 0 1 0 0 xp_cmdshell 0 1 0 0
Memory Consumption
on svr-sql-prod03\prod at 3/21/2008 12:20:03 PM
This report provides detailed data on the memory consumption of components within the Instance as well as historical data f ri rddb h Df lT
Memory Grants Outstanding 0 Memory Grants Pending 0
Page life expectancy
Memory Usage By Components
Component Type Allocated memory(KB) Virtual Memory Reserved(KB) Virtual Memory Committed(KB) CACHESTORE_SQLCP 9,510,904 0 0 CACHESTORE_OBJCP 572,504 0 0 USERSTORE_TOKENPERM 286,320 0 0 MEMORYCLERK_SOSNODE 209,272 0 0 MEMORYCLERK_SQLGENERAL 147,568 0 0 OBJECTSTORE_LOCK_MANAGER 102,760 524,288 524,288 MEMORYCLERK_SQLUTILITIES 43,192 840 840 CACHESTORE_PHDR 42,976 0 0 USERSTORE_SCHEMAMGR 41,848 0 0 MEMORYCLERK_SQLCLR 41,824 6,312,960 56,204 OBJECTSTORE_SNI_PACKET 39,752 0 0 MEMORYCLERK_SQLSTORENG 34,592 12,096 12,096 OBJECTSTORE_LBSS 33,952 0 0 USERSTORE_DBMETADATA 27,640 0 0 MEMORYCLERK_SQLCONNECTIONPOOL 20,752 0 0 MEMORYCLERK_BHF 14,176 0 0 USERSTORE_SXC 5,192 0 0 CACHESTORE_SYSTEMROWSET 5,096 0 0 USERSTORE_OBJPERM 3,072 0 0 MEMORYCLERK_SNI 2,552 0 0 MEMORYCLERK_SQLOPTIMIZER 2,328 0 0 MEMORYCLERK_SQLBUFFERPOOL 2,056 134,365,184 131,072 CACHESTORE_BROKERTBLACS 640 0 0 OBJECTSTORE_SERVICE_BROKER 464 0 0 MEMORYCLERK_SQLSERVICEBROKER 440 0 0 CACHESTORE_TEMPTABLES 440 0 0 MEMORYCLERK_HOST 232 0 0 CACHESTORE_XPROC 232 0 0 CACHESTORE_BROKERREADONLY 128 0 0 MEMORYCLERK_SQLSERVICEBROKERTRANSPO 80 0 0 CACHESTORE_CLRPROC 72 0 0 CACHESTORE_XMLDBTYPE 32 0 0 MEMORYCLERK_SQLQUERYEXEC 32 0 0
CACHESTORE_VIEWDEFINITIONS 16 0 0 MEMORYCLERK_SQLXP 16 0 0 CACHESTORE_EVENTS 16 0 0 CACHESTORE_BROKERRSB 16 0 0 CACHESTORE_NOTIF 16 0 0 CACHESTORE_STACKFRAMES 16 0 0 MEMORYCLERK_FULLTEXT 16 0 0 CACHESTORE_XMLDBELEMENT 8 0 0 CACHESTORE_XMLDBATTRIBUTE 8 0 0 MEMORYCLERK_SQLHTTP 8 0 0 CACHESTORE_BROKERUSERCERTLOOKUP 8 0 0 CACHESTORE_BROKERDSH 8 0 0 CACHESTORE_BROKERTO 8 0 0 CACHESTORE_BROKERKEK 8 0 0 MEMORYCLERK_SQLCLRASSEMBLY 0 4,244 4,244 MEMORYCLERK_SQLXML 0 0 0 MEMORYCLERK_FULLTEXT_SHMEM 0 0 0 MEMORYCLERK_SQLSOAPSESSIONSTORE 0 0 0 MEMORYCLERK_SQLQERESERVATIONS 0 0 0 MEMORYCLERK_SQLQUERYCOMPILE 0 0 0 MEMORYCLERK_SQLQUERYPLAN 0 0 0 MEMORYCLERK_SQLSOAP 0 0 0 MEMORYCLERK_QSRANGEPREFETCH 0 0 0
11,193,288 141,219,612 728,744
46066
AWE Memory Allocated(KB) Shared Memory Reserved(KB) Shared Memory Committed(KB) 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 124,830,976 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Total of all memory columns 277972620 which is just over 265 GIGs. This is way above our max memory setting of 120 Gig. How can this be? Could this be part of the problem?
The system slows way way down with cpu usage NOT going up. The waits are showing up with high numbers in both EC, EX wait types and Parallelism. sorry I dont have exact numbers but they both pegged the graph and are triple or more our normal. We rarely saw either on performance dashboard at all. Our norm is Cachestore and Broker_recieve_wait_for
This is on our production box as there was no effective way to test this level of hardware in a prod environment prior to rollout (It was not done)
The fixes we have implemented so far that are helping was two days ago set max parellelism to 8 then today to 2 and its alleviated the issue but not fixed the underlying causation. The server ran fine for 2 days after setting it to 8 which is how long it ran fine after introduction to production and if this happens again in two days there is nowhere else to go but set it to 1.
What could be causing this is the question1 and 2 is how can we show memory usage higher than that set for max server memory? Are the two related or is it cause and effect. We have an open ticket with Microsoft but I think you guys can beat them to the right answer both in quality and speed. They didnt think we should change the Parallelism setting but that got us this far.
|
|
|
|
|
Ten Centuries
      
Group: General Forum Members
Last Login: Thursday, January 03, 2013 4:44 PM
Points: 1,336,
Visits: 595
|
|
Output of sp_configure indicates that you have not limited max memory setting for SQL Server.
max server memory (MB) 16 2147483647 2147483647 2147483647
If you want to limit max memory to 120 GB then you need to configure 122880. You may use following statement to do this:
sp_configure 'max server memory (MB)', 122880 GO reconfigure with override GO
--www.sqlvillage.com
|
|
|
|
|
SSC-Dedicated
           
Group: Administrators
Last Login: 2 days ago @ 1:47 PM
Points: 31,406,
Visits: 13,722
|
|
You've got a lot of info there; too much to really go through in details, but can you answer a few questions.
What is the amount of memory in the new machine? You say much less, but didn't specify what.
If you open SSMS, what do you have max memory configured to.
My guess is that your system was using lots of memory. The additional CPUs do not necessarily help here.
Also, I'm not sure that you can total up all the memory there as the memory being used. There will be things paged out to disk, which might be counted as memory in some counters.
Follow me on Twitter: @way0utwest
 Forum Etiquette: How to post data/code on a forum to get the best help
|
|
|
|
|
SSC Rookie
      
Group: General Forum Members
Last Login: Monday, March 24, 2008 1:38 PM
Points: 32,
Visits: 69
|
|
interesting. I'm sure I checked it as well as my boss and two other DBAs. How could we all have missed it. Its now set and I didnt change it. I need to check with the others and see who did make the change. I have since found out some more info. The old server had a large L3 Cache and the new one has larger L2 cache but no L3. I've read up that L2 can be shared across multiple cores but not across physical processors but the L3 is shared across the processors as well as the cores. I've looked up the specs for high end data warehousing and Database servers and all have large L3 caches. Being as L3 is twice as fast as L2 and is shared across the processors could this be a contributing factor?
The old server had 64 gig of ram with 124 L2 and 16mg L3 The new server has 400k+ L2 and no L3
Also we are now running with 4 quad cores vs 8 dual cores (two boxes configured by IBM to run one OS) I think this also would give us 2 L3 caches since its two boxed configured to work together.
Processor use drops from norm of 30 percent to 15 percent with locks high (not parallelism this time)
|
|
|
|
|
SSC-Dedicated
           
Group: Administrators
Last Login: 2 days ago @ 1:47 PM
Points: 31,406,
Visits: 13,722
|
|
In terms of CPU drop, is that between the two boxes (old and new) or after you set max memory?
Not sure about L3, have to ask someone else about that, but 4 quads should be slightly less efficient than 8 duals, though I'm not sure how much. Depending on how the workload goes, if there are some really separate tasks (not parallelism of one query), I'd think the 8 CPUs would be much better.
Is the old box still around for testing?
So the old server was 64GB of RAM and the new server is 128GB of RAM? It wasn't quite clear from how you're posting if this is the case.
Follow me on Twitter: @way0utwest
 Forum Etiquette: How to post data/code on a forum to get the best help
|
|
|
|
|
SSC Rookie
      
Group: General Forum Members
Last Login: Monday, March 24, 2008 1:38 PM
Points: 32,
Visits: 69
|
|
[color=R]In terms of CPU drop, is that between the two boxes (old and new) or after you set max memory?
Not sure about L3, have to ask someone else about that, but 4 quads should be slightly less efficient than 8 duals, though I'm not sure how much. Depending on how the workload goes, if there are some really separate tasks (not parallelism of one query), I'd think the 8 CPUs would be much better.
Is the old box still around for testing?
So the old server was 64GB of RAM and the new server is 128GB of RAM? It wasn't quite clear from how you're posting if this is the case. [/color]
That is the difference in when we are experiencing issues and when things are running smoothly on the NEW machine. When things are smooth it runs at 35-40%. When things are going nuts its throttled DOWN to 15%. Looks like the CPUs are waiting on something (MEMORY READ OR Disk I/O?) I am sceptical on the disk I/O as they are the same ones we used on the old system, we just switched the LUNS to the new machines.
the old box is still around but not currently online (were using the same LUNs on the new machine)
The old server was 64 gig and the new one is 128 gig.
recent perfmon trace shows that 3 of 4 cores on each physical processor are running at 5% and the other on on each physical processor is running at about 40%.
|
|
|
|
|
Grasshopper
      
Group: General Forum Members
Last Login: Thursday, March 01, 2012 6:02 AM
Points: 10,
Visits: 103
|
|
Depending upon your actual ss memory usage you could be negatively impacting the OS's ability to service resource (CPU, MEMORY, I/O) requests. I would reduce the max memory in ss to about 110 (of real memory) assuming you are only using the server for ss.
Also,
1. If you aren't already, you should think about implementing x64 on this hardware to take max advantage (cache above 4GB, etc.). It looks like you may be running x64 since you don't have AWE enabled. 2. You should be running on ss2k5 build 3175 or better. 3. You should also check to see if you have "lock pages in memory" local privileges on the OS granted to the NT account running your engine. Otherwise, you may be getting some VM paging.
A couple of notes:
The problem that you are describing is a bit like the issue that was found regarding thread ss thread management on smp systems with hyper-threading. Performance takes a hit when ever there is a high thread count. The waits spike and very little gets done. However, this was due to contention of background threads while sharing the floating point process and L2 cache on each core. On the quad-core there is no HT and thus no sharing. I also believe that you have dual FSBs (probably about 1066Mhz each) which provide much better real memory access performance. This helps a log in support of request coming from those threads.
Another issue which was found is a problem involving NUMA on various servers. On some IBM system like the 3950 I believe they still employ the old architecture from x445. The sticks are connected to separate expansion cards and sync'ed across a relative slow BUS.
I'm editing this post to add one more thing which I feel may be of some help...
One major contributor to increased thread count (maxdop permitting) is the creation of not-so-optimal execution plans. A simple way to verify this is by simply running one of the sp's which isn't performing well on your new box on both the old box and the new box. capture the execution plans for each via trace and compare the two. I was able to do this while working a like problem and after capturing the optimal plan on one server i forced it on the slow performing server and even though the hardware was different (CPU, memory, etc.) the sp execute optimally.
If you haven't done so already, you should run sp_createstats and update stats with a full sample. Then DBCC FREEPROCCACHE to ensure that you pickup new plans.
Good luck.
|
|
|
|
|
SSC Rookie
      
Group: General Forum Members
Last Login: Monday, March 24, 2008 1:38 PM
Points: 32,
Visits: 69
|
|
1. If you aren't already, you should think about implementing x64 on this hardware to take max advantage (cache above 4GB, etc.). It looks like you may be running x64 since you don't have AWE enabled. 2. You should be running on ss2k5 build 3175 or better. 3. You should also check to see if you have "lock pages in memory" local privileges on the OS granted to the NT account running your engine. Otherwise, you may be getting some VM paging.
1. We are running x64 2. checking into the build level upgrade we are at 3186 now 3. please explain more about this and how to check
Old server: X3950 CPUS Intel Zeon PM 7140N Intel Xeon CPU 3.33 ghz Famil F Model 6 Stepping 8 Ext. Family F Ext. Model 6 Revision B0 instructions MMX, SSE, SSE2, SSE3, EM64T Core Speed 3336.3 MHz Multiplier x20 Bus speed 166.8 Mhz Rated FSB 667.3 MHz
Cache L1 Data 2x16 kbytes L2 Trace 2x12Kuops Level2 2 2x1024 kbytes level 3 16 MBytes
Cores 2 Threads 2
New servers M2s Intel Zeon X7350 Socket 604 mpga
Intel Xeon x7350 @ 2.93 GHz Family 6 Model F Stepping B Ext Family 6 Ext Model F revision G0 Instructions MMX, SSE, SSE2, SSE3, SSSE3 EM64T
Core Speed 1599.3 MHz Multplier X6.0 bus Speed 266.5 Rated FSB 1066.2 MHz
cache L1 4 x 32 kb L1 inst 4x32 kb L2 2x4096 kb
Cores 4 Threads 4
|
|
|
|
|
Grasshopper
      
Group: General Forum Members
Last Login: Thursday, March 01, 2012 6:02 AM
Points: 10,
Visits: 103
|
|
| "lock pages in memory" privileges are set within the "Local Security Policy" mmc snapin. Within the snapin you will find this setting in the "User Rights Assignment" node of the tree. Select this polity from the right window pane and add either the nt account which is running your sql engine or the nt group which contain the account. With these privilges set the account will now have the local privileges necessary to lock real memory pages and prevent the OS from paging you out to vm. Restart the ss engine. However, be careful not to set your ss max server memory too high as you will negatively impact the OS and other apps.
|
|
|
|
|
SSC Rookie
      
Group: General Forum Members
Last Login: Monday, March 24, 2008 1:38 PM
Points: 32,
Visits: 69
|
|
| I checked and we have it set correctly with local rights. This is a dedicated SS box and we have max server mem to 120 gig of 128. 8 should be plenty for the OS but I do still see paging occurring. I dont understand that with this much more memory...
|
|
|
|