• We've now seen trace flag 8048 go into numerous systems and have not observed any negative effects. We've been monitoring stolen memory, all waits, and spinlocks in 5-minute increments, among other things.
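
    As a rough illustration of the 5-minute sampling, here is a minimal Python sketch that turns two cumulative snapshots into per-interval deltas. It assumes the samples come from cumulative counters (SQL Server's sys.dm_os_wait_stats and sys.dm_os_spinlock_stats accumulate since instance start or since they were last cleared). The function name, the counter names, and the numbers are hypothetical, not our actual tooling or data.

```python
from typing import Dict


def counter_deltas(previous: Dict[str, int], current: Dict[str, int]) -> Dict[str, int]:
    """Return per-counter growth between two cumulative snapshots."""
    deltas = {}
    for name, value in current.items():
        baseline = previous.get(name, 0)
        # A value below the baseline means the counters were reset
        # (for example by an instance restart), so count from zero.
        deltas[name] = value - baseline if value >= baseline else value
    return deltas


# Hypothetical spin counts from two samples taken five minutes apart.
sample_earlier = {"SPINLOCK_A": 1_000, "SPINLOCK_B": 50_000}
sample_later = {"SPINLOCK_A": 1_400, "SPINLOCK_B": 50_100, "SPINLOCK_C": 7}

interval = counter_deltas(sample_earlier, sample_later)
print(interval)
```

    Comparing deltas rather than raw totals is what makes a change in contention visible from one interval to the next.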

    It helps that we are concerned with only two primary workflows, a batch ETL and a batch report window, both with query concurrency higher than the server core count.

    In our case, the best solution was to disable SQL Server NUMA optimization with trace flag 8015 and eliminate query memory spinlock contention with trace flag 8048. A single large database buffer pool and a single large scheduler group are much more advantageous for our workflows than smaller buffer pools and scheduler groups per NUMA node. With per-node resources it is too easy to stress CPU and memory unevenly across the NUMA nodes, given the round-robin distribution of connections, the tendency of parallel query workers to remain on the same NUMA node as their connection, and the fact that any database block a thread reads that is not already cached on another NUMA node is inserted into the cache of the NUMA node associated with that worker.
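
    To show how round-robin placement can load nodes unevenly, here is a toy Python sketch. It assumes each connection's work stays entirely on the node it was assigned to, and it uses a deliberately contrived workload in which every fourth connection runs a heavy query. The function name, the cost units, and the workload are hypothetical and are not measurements from our systems.

```python
from typing import List


def node_loads(connection_costs: List[int], node_count: int) -> List[int]:
    """Assign connections to NUMA nodes round robin and total the cost per node."""
    loads = [0] * node_count
    for index, cost in enumerate(connection_costs):
        loads[index % node_count] += cost
    return loads


# Hypothetical workload: every fourth connection runs a heavy query (cost 10),
# the others run light queries (cost 1).
costs = [10, 1, 1, 1] * 6  # 24 connections
loads = node_loads(costs, node_count=4)
print(loads)  # [60, 6, 6, 6]
```

    In this worst case one node carries 60 of the 78 cost units while the other three sit mostly idle. With a single scheduler group and buffer pool, the same 78 units would be spread across all cores.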

    Our internal testing (query concurrency of 120 on a 24-core server, four hex-core processors) showed a 25% drop in total I/O and a 10% drop in execution time with trace flags 8048 and 8015 versus no trace flags.