NUMA

Ant-Green

SSC Guru

Points: 113445
June 27, 2012 at 4:46 am

#262900

In preparation for my MCITP exam tomorrow I have been reading through the books, crossing the t's and dotting the i's. One thing I want to get straight in my head is NUMA. I have a feeling that I had the right answers and then changed them to wrong ones, so this is more of a confirmation than an actual question.
1. SQL ignores NUMA when Hard-NUMA is <=4 CPUs and at least 1 node has only 1 CPU.
So.....
1 physical proc with 4 cores, then SQL will ignore NUMA
2 physical procs with 2 cores, ignore
4 physical procs with 1 core ignore
Anything other than the above, SQL will use Hard-NUMA as long as it's not configured for interleaved memory
This is the one mainly confusing me
2. The typical use for Soft-NUMA is when there is no Hard-NUMA, but it can also be used to split Hard-NUMA into more NUMA nodes
3. Soft-NUMA doesn't provide memory to CPU affinity
4. Soft-NUMA can increase performance in relation to I/O as each NUMA node creates a new I/O path and new LazyWriter thread
5. Instead of doing point 4, you could use CPU affinity to spread the workload across multiple CPUs
6. Use SSCM to configure port to NUMA affinity
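To check the three layouts against point 1, here is a minimal sketch that encodes the rule exactly as worded above (four or fewer CPUs in total, and at least one node with only one CPU). The function name is hypothetical, and it assumes one NUMA node per physical processor. It is a literal reading of the sentence, not a statement of what SQL Server does internally.

```python
def sql_ignores_hard_numa(cpus_per_node):
    """Literal reading of point 1: hardware NUMA is ignored when the machine
    has four or fewer CPUs in total AND at least one node has exactly one CPU.

    cpus_per_node: list with the CPU count of each hardware NUMA node.
    """
    total_cpus = sum(cpus_per_node)
    has_single_cpu_node = any(n == 1 for n in cpus_per_node)
    return total_cpus <= 4 and has_single_cpu_node


# The three layouts from the post, assuming one NUMA node per physical processor.
layouts = {
    "1 proc x 4 cores": [4],
    "2 procs x 2 cores": [2, 2],
    "4 procs x 1 core": [1, 1, 1, 1],
}

for name, nodes in layouts.items():
    print(name, "->", sql_ignores_hard_numa(nodes))
```

Read this way, only the 4 x 1 layout satisfies both halves of the rule. The 2 x 2 layout has no single-CPU node, and the 1 x 4 layout is a single node, so there is no NUMA layout to ignore in the first place. Whether the exam expects this literal reading is worth confirming against Books Online.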
And now to cross my fingers and hope tomorrow is a good day.


Jeremy Brown

SSCarpal Tunnel

Points: 4223

I really want to know the answer to this as well.

I have to say, most of the information on SQL Server and NUMA that I've found online is sparse at best.

I'm dealing with a performance issue with stats gathering in production right now, and I suspect NUMA is a big part of the equation. All my lower lane servers support 2-4 NUMA nodes. Even my DR server shows 4 distinct memory clerks to use. However, my production server is only coming up with 1 node, and of course that's where my stats fullscan jobs are running slowly. Even restoring a full backup from production to a lower lane box with 2-4 nodes and running stats there is faster than on my prod box.

The crazy thing is - all these boxes are literally the same hardware class / model of servers. Some have less memory / cpu, and some have the same as my production server. However every single box has more than one memory clerk to use - except my production box. Frustrating.
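One way to compare the boxes side by side is to run the same query against `sys.dm_os_nodes` on each instance and summarize what comes back. The sketch below is only an illustration: the helper name and the sample rows are made up, and it assumes the dedicated admin connection node is reported with `DAC` in `node_state_desc` and should be left out of the count.

```python
# T-SQL to run on each instance; sys.dm_os_nodes is a documented DMV.
NODE_QUERY = """
SELECT node_id, node_state_desc, memory_node_id, online_scheduler_count
FROM sys.dm_os_nodes;
"""


def summarize_nodes(rows):
    """Summarize rows shaped like the NODE_QUERY result set.

    rows: iterable of (node_id, node_state_desc, memory_node_id,
          online_scheduler_count) tuples.
    The DAC node is excluded (assumption: its state text contains 'DAC').
    """
    usable = [r for r in rows if "DAC" not in r[1]]
    return {
        "node_count": len(usable),
        "memory_node_count": len({r[2] for r in usable}),
        "schedulers": sum(r[3] for r in usable),
    }


# Hypothetical result sets, invented purely for illustration.
prod_rows = [(0, "ONLINE", 0, 16), (64, "ONLINE DAC", 64, 0)]
lower_rows = [(0, "ONLINE", 0, 8), (1, "ONLINE", 1, 8), (64, "ONLINE DAC", 64, 0)]

print("prod :", summarize_nodes(prod_rows))
print("lower:", summarize_nodes(lower_rows))
```

If the production instance really does report a single node while identical hardware reports several, the difference is more likely in configuration (for example, memory interleaving in the BIOS) than in SQL Server itself, but that is a guess to verify rather than a diagnosis.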

Also, I'm curious to see how hyperthreading affects NUMA. Does it make a difference for localization?