Instance won't start after changing Processor Affinity

  • cryovenom: based on many things in your various posts you (and your company) REALLY REALLY need to do themselves a favor and get a good professional on board to help you get things set up optimally. You have made some VERY poor decisions so far, and if you continue down this path even if you get stuff working you will likely be very unhappy.

    As an aside I am simply astounded that you ponied up the cost for Enterprise Edition and then got such a crappy piece of hardware to run your SQL Server(s) on!!! My LAPTOP will process data MUCH better than that "server" you bought!!

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • In SQL Server 2008, SSMS didn't separate things into NUMA nodes at all. It just presented a list of Cores, 0 to 15, one column for Processor Affinity, one for I/O Affinity. In 2012 it breaks it into 4 NUMA nodes, but I didn't know what those were, so I just carried my settings over.

    So on this server, the GTA instance (for example) has its processor affinity set on three cores in NUMA Node 2, and its I/O affinity set on a core in NUMA node 3. The ENT and DOCINDEX instances both use cores in NUMA Node 4.

    If there are restrictions or limitations that SQL Server has related to which cores it can use across which nodes, I would like to know what they are. I just assumed that it was a hardware construct like "sockets" that it was exposing to me, but didn't think that it made a difference when it came to assigning affinity.

  • Thought just occurred to me: have you checked the BIOS settings in both servers to see if there is a difference there that could be causing "identical" machines to not behave identically?? NUMA (or at least memory interleaving) may be disabled in the BIOS on one of them and not the other...

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • TheSQLGuru (3/15/2013)


    cryovenom: based on many things in your various posts you (and your company) REALLY REALLY need to do themselves a favor and get a good professional on board to help you get things set up optimally. You have made some VERY poor decisions so far, and if you continue down this path even if you get stuff working you will likely be very unhappy.

    As an aside I am simply astounded that you ponied up the cost for Enterprise Edition and then got such a crappy piece of hardware to run your SQL Server(s) on!!! My LAPTOP will process data MUCH better than that "server" you bought!!

    I realize that this is the internet, and that in all forums there's a tendency to make replies like this. But all your post does is denigrate my decisions, makes no attempt to educate me on which ones were bad and why, and also makes no attempt to help with the problem at hand. Then, for fun, it drives the point home with an exaggerated comparison and copious amounts of exclamation marks.

    Yes, I'm not an expert. Yes, my company and I may not have made all the right decisions. I would much prefer to learn what's wrong, why, what do to going forward, and even maybe get some help with the actual problem as opposed to reading the kind of unhelpful and negative post you just made.

    People aren't born experts, and often they get thrown into situations where they have to make decisions based on the information they can gather and the resources they have at hand. I came here to learn, troubleshoot, fix, and become better, not to be given crap.

  • TheSQLGuru (3/15/2013)


    Thought just occurred to me: have you checked the BIOS settings in both servers to see if there is a difference there that could be causing "identical" machines to not behave identically?? NUMA (or at least memory interleaving) may be disabled in the BIOS on one of them and not the other...

    This could definitely be the case. We had some BIOS issues with the machines when they came in and had to make changes to a few of them to get them working. I'm going to read up on NUMA (seeing as I know so little about it) and at my next maintenance window I'll compare BIOS settings between the two. Thanks!

  • Post the results of the following two queries from BOTH servers for each instance:

    select name, value, value_in_use

    from sys.configurations

    where name = 'affinity mask';

    select

    dos.scheduler_id,

    dos.cpu_id,

    dos.is_online,

    don.cpu_affinity_mask,

    don.online_scheduler_count,

    don.online_scheduler_mask,

    don.memory_node_id

    from sys.dm_os_schedulers as dos

    inner join sys.dm_os_nodes as don

    on dos.parent_node_id = don.node_id

    where dos.status = 'visible online';

    Jonathan Kehayias | Principal Consultant | MCM: SQL Server 2008
    My Blog | Twitter | MVP Profile
    Training | Consulting | Become a SQLskills Insider
    Troubleshooting SQL Server: A Guide for Accidental DBAs[/url]

  • cryovenom (3/15/2013)


    Then, for fun, it drives the point home with an exaggerated comparison and copious amounts of exclamation marks.

    It actually isn't an exaggerated comparison at all. My Dell Precision M6700 laptop would outperform your server head to head. My M6700 has the i7-3940XM, 32GB RAM and 1.25TB of SSD storage and scores over 16000 for a GeekBench benchmark, where a AMD 6212 system scores in the 12000 range.

    Yes, I'm not an expert. Yes, my company and I may not have made all the right decisions. I would much prefer to learn what's wrong, why, what do to going forward, and even maybe get some help with the actual problem as opposed to reading the kind of unhelpful and negative post you just made.

    This was exactly what the recommendation was geared at, you've just taken it as a personal attack. Above you admit that you aren't an expert here and some decisions may not have been the right ones. Bringing an industry expert in can help you understand the what and why and can be the fastest way to solve your problem while also obtaining knowledge transfer. I used to bring consultants in as a DBA all the time because no one can be an expert in everything and having the right resource available prevents problems from occurring. If you are dedicated to doing it on your own, you have to expect that at some points people are going to point out mistakes you are making or have made, and that you might hit a limit to what information is practical to put into a forum thread. I used to write multi-page explanations of SQL Server internals on forum threads, I won't do that anymore because it requires to much time.

    The configuration you are trying to obtain doesn't make sense. With 4 NUMA nodes and 4 instances I would just use ALTER SERVER CONFIGURATION and set the process affinity for each instance to a specific node. The theory behind reserving CPU 0 and 15 for the OS is good in theory, but not in the reality of how OS scheduling occurs or other process scheduling is going to occur. In all my time working with SQL Server I've only once had to reserve CPUs for the OS and it was for a separate application that was running on the server and it had it's own process affinity settings to make it run on those CPUs. You are over complicating things here for no real net gain other than you promised something that I wouldn't have recommended promising.

    Jonathan Kehayias | Principal Consultant | MCM: SQL Server 2008
    My Blog | Twitter | MVP Profile
    Training | Consulting | Become a SQLskills Insider
    Troubleshooting SQL Server: A Guide for Accidental DBAs[/url]

  • Production Server\NCR:

    affinity mask1414

    111153140

    221153140

    331153140

    Development Server\NCR:

    affinity mask1414

    111153140

    221153140

    331153140

    Production Server\GTA:

    affinity mask224224

    55124032241

    66124032241

    77124032241

    Development Server\GTA:

    affinity mask224224

    55124032241

    66124032241

    77124032241

    Production Server\ENT:

    affinity mask35843584

    9913840335842

    101013840335842

    111113840335842

    Development Server\ENT:

    affinity mask35843584

    9913840335842

    101013840335842

    111113840335842

    Production Server\DOCINDEX:

    affinity mask81928192

    1313161440181923

    Development Server\DOCINDEX:

    affinity mask81928192

    1313161440181923

    And this is with the settings where if I stop and restart GTA or DOCINDEX on the Development Server, it will not start.

  • In my experience the best way BY FAR to become an expert (certainly at such a complex system as SQL Server) is to be MENTORED by an expert. So I stand by my recommendation 100%. Your company would come out with an optimal environment and you would learn along the way. Win-win.

    Oh, I did think of another thing, although it is reaching a bit: maybe try reseating the memory. I have experienced or heard about some very weird things happening with poorly-seated memory.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Jonathan Kehayias (3/15/2013)


    The configuration you are trying to obtain doesn't make sense. With 4 NUMA nodes and 4 instances I would just use ALTER SERVER CONFIGURATION and set the process affinity for each instance to a specific node. The theory behind reserving CPU 0 and 15 for the OS is good in theory, but not in the reality of how OS scheduling occurs or other process scheduling is going to occur. In all my time working with SQL Server I've only once had to reserve CPUs for the OS and it was for a separate application that was running on the server and it had it's own process affinity settings to make it run on those CPUs. You are over complicating things here for no real net gain other than you promised something that I wouldn't have recommended promising.

    [/quote]

    Makes sense. 4 instances, 4 NUMA nodes, I'll give it a shot and see if that works on DEV while I'm researching a better solution long-term, be it virtualization, Windows System Resource Manager, or something else.

    And I'll still check BIOS settings between the two boxes at my next maintenance window just to make sure.

  • TheSQLGuru (3/15/2013)


    In my experience the best way BY FAR to become an expert (certainly at such a complex system as SQL Server) is to be MENTORED by an expert. So I stand by my recommendation 100%. Your company would come out with an optimal environment and you would learn along the way. Win-win.

    I enjoy mentoring under people with more experience, and I don't disagree with your recommendation. Trying to sell it to management is always another story. It would take too long here to get into details, but suffice it to say that convincing management to do anything from sending you on a day course, to actual training, to bringing in outside experts is a difficult proposition at best, so I make do with what I've got. Booksonline, friends in the industry, and whatever research I can do on the internet or at the bookstore.

    TheSQLGuru (3/15/2013)


    Oh, I did think of another thing, although it is reaching a bit: maybe try reseating the memory. I have experienced or heard about some very weird things happening with poorly-seated memory.

    [/quote]

    I've noticed that memory keeps getting mentioned along side the NUMA nodes and cores. I haven't read up yet, but are these things connected? Could a memory issue or configuration cause problems with setting core preference? I really have to take a quick few minutes to google what the heck a NUMA node is.

  • And less than two minutes later I realize how silly my last post seems.

    For anybody finding this thread through google, NUMA (or Non-Uniform Memory Access) definitely has a lot to do with both. See here for a start:

    http://msdn.microsoft.com/en-ca/library/ms178144%28v=sql.105%29.aspx

  • Pull the cover on the Dev box and look at where the RDIMMs are located and see if they are all populated on one socket or balanced across both of them.

    To get started on learning about NUMA:

    http://www.sqlskills.com/blogs/jonathan/understanding-non-uniform-memory-accessarchitectures-numa/[/url]

    and then read the following and all the links at the bottom of it:

    http://blogs.msdn.com/b/psssql/archive/2011/11/11/sql-server-clarifying-the-numa-configuration-information.aspx

    Jonathan Kehayias | Principal Consultant | MCM: SQL Server 2008
    My Blog | Twitter | MVP Profile
    Training | Consulting | Become a SQLskills Insider
    Troubleshooting SQL Server: A Guide for Accidental DBAs[/url]

  • cryovenom (3/15/2013)


    And less than two minutes later I realize how silly my last post seems.

    For anybody finding this thread through google, NUMA (or Non-Uniform Memory Access) definitely has a lot to do with both. See here for a start:

    http://msdn.microsoft.com/en-ca/library/ms178144%28v=sql.105%29.aspx%5B/quote%5D

    To make this one easier: http://msdn.microsoft.com/en-ca/library/ms178144%28v=sql.105%29.aspx

  • Given the hardware budget squeeze I didn't expect "management" to understand the real value of a good consultant/mentor. Damn shame is about all you can say to that.

    Good luck finding much useful guidance on SQL Server and NUMA. There is a dearth of good information out there, especially deep/technical guidance, although it is certainly better than it used to be. IIRC there was a pretty good session at one of the SQL Bits conferences that you might be able to download.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

Viewing 15 posts - 16 through 30 (of 31 total)

You must be logged in to reply to this topic. Login to reply