September 10, 2010 at 7:05 pm
hirro (9/10/2010)
Another interesting bit of information:
(Physical RAM Config) SQL Allocation - Time it took
(6x8GB) 48 GB allocated to SQL - Query takes 6 seconds
(12x8GB) 84 GB allocated to SQL - Query takes 6 seconds
(12x8GB) 96 GB allocated to SQL - Query takes 30 seconds
(18x8GB) 144 GB allocated to SQL - Query takes... still waiting... a long time. Ended up throwing an IO error before finishing.
Msg 823, Level 24, State 2, Line 3
The operating system returned error 1117(The request could not be performed because of an I/O device error.) to SQL Server during a read at offset 0x0000000314e000
BWAA-HAA!!!! Remember when 512K was good enough and 640K was a luxury? 😛 Now we have servers that pitch a fit when you give them a tenth of a Tera-Byte. This is insane.
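For anyone following along at home, the "allocated to SQL" figure in those tests is the instance's max server memory setting. A minimal sketch of how that gets changed between runs, assuming it was done through sp_configure (86016 is just 84 GB expressed in MB, not a recommendation):
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
-- 'max server memory (MB)' is specified in MB; 84 GB = 86016 MB
EXEC sp_configure 'max server memory (MB)', 86016;
RECONFIGURE;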
--Jeff Moden
Change is inevitable... Change for the better is not.
September 10, 2010 at 7:14 pm
hirro (9/10/2010)
Ran a test while tracking swap file disk usage.
SQL set to use 84 GB max memory
6 seconds for the query
SQL set to use 92 GB max memory
30 seconds for the query
Same swap file disk usage in both cases.
[font="Arial Black"]I don't think it has to do with memory not being available for Windows.[/font]
I said this in an above post, but if I change the maxdop to 8 instead of 0 (where SQL decides and uses all 24), performance is perfect even with 92 GB of memory set. It has to be something tied to the combination of the processor count and max memory settings.
Installing 2008 instead of 2008 R2 to see if this is only tied to R2.
Considering the other information you've provided, I don't believe it's simply memory not being available for Windows now, either.
BTW, thanks for the tests and the great feedback you're giving. Not many of us get a chance to put our hands on a system that has 100GB+ of RAM, never mind 8 processors. I really appreciate it.
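For reference, the knob hirro is toggling there is the instance-wide max degree of parallelism setting. A rough sketch of the change being described, assuming it was made with sp_configure:
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
-- 0 = let SQL Server pick the degree of parallelism (all 24 CPUs here); 8 = cap any single query at 8
EXEC sp_configure 'max degree of parallelism', 8;
RECONFIGURE;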
--Jeff Moden
Change is inevitable... Change for the better is not.
September 10, 2010 at 7:42 pm
Here is what I've gotten from Microsoft.
Looking through the plan I see that there is a difference in how SQL is generating the query plan for the same query under different memory setting configurations. For the bad plan I see that it is using MERGE Join -> SORT -> Parallelism, whereas in the good plan it is using HASH MATCH -> Parallelism. Because the bad plan is using a Merge Join, it needs an additional Sort step, which is contributing to the slowness.
By the way, the exact memory amount that causes this issue is 86489 MB. If we allocate 86488 MB, our performance is fine. Isn't that nuts?
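One way to sanity-check that diagnosis (not something the thread says was done, just a test idea) would be to pin the join type on the offending query and see whether the slowdown at the high memory setting disappears. Hypothetical sketch, with made-up table and column names:
SELECT f.SomeMeasure, d.SomeAttribute
FROM dbo.SomeFactTable AS f
JOIN dbo.SomeDimTable AS d
    ON d.KeyID = f.KeyID
OPTION (HASH JOIN);  -- query-level hint: forces hash joins, sidestepping the merge join + sort plan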
September 10, 2010 at 7:52 pm
hirro (9/10/2010)
Here is what I've gotten from Microsoft. Looking through the plan I see that there is a difference in how SQL is generating the query plan for the same query under different memory setting configurations. For the bad plan I see that it is using MERGE Join -> SORT -> Parallelism, whereas in the good plan it is using HASH MATCH -> Parallelism. Because the bad plan is using a Merge Join, it needs an additional Sort step, which is contributing to the slowness.
By the way, the exact memory amount that causes this issue is 86489 MB. If we allocate 86488 MB, our performance is fine. Isn't that nuts?
And setting the number of processors to something other than 0 fixes it. Yeah... that's a little crazy.
--Jeff Moden
Change is inevitable... Change for the better is not.
September 11, 2010 at 11:23 am
You have been playing with maxdop, which sets the maximum parallelism for a query, not restricting the maximum number of processors for the instance.
What happens if you restrict the number of processors for the instance?
Like Jeff said, not many of us get the occasion to work with that kind of big server.
Thank you for your feedback.
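To make the distinction concrete: maxdop only caps how many CPUs a single query's parallel plan can use, while the affinity mask restricts which CPUs the whole instance schedules work on. A sketch of that second knob, assuming sp_configure and, purely as an example, the first 8 CPUs (bitmask 0xFF = 255):
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
-- one bit per CPU; 255 = CPUs 0 through 7 only
EXEC sp_configure 'affinity mask', 255;
RECONFIGURE;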
Johan
Learn to play, play to learn !
Dont drive faster than your guardian angel can fly ...
but keeping both feet on the ground wont get you anywhere :w00t:
- How to post Performance Problems
- How to post data/code to get the best help
- How to prevent a sore throat after hours of presenting ppt
press F1 for solution, press shift+F1 for urgent solution 😀
Need a bit of Powershell? How about this
Who am I ? Sometimes this is me but most of the time this is me
September 20, 2010 at 9:14 am
The "fix" for this was enabling trace flag 2335
DBCC TRACEON (2335, -1)
More notes from MS:
Looking through the plans that you sent me, it looks like this is the query within the stored procedure that takes 3570 milliseconds to run in one case and took 38049 milliseconds to run when the max server memory was set to over 84GB.
Looking through the plan I see that there is a difference in how SQL is generating the query plan for the same query under different memory setting configurations. For the bad plan I see that it is using MERGE Join -> SORT -> Parallelism, whereas in the good plan it is using HASH MATCH -> Parallelism. Because the bad plan is using a Merge Join, it needs an additional Sort step, which is contributing to the slowness.
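A couple of housekeeping notes for anyone trying that flag: DBCC TRACEON with -1 enables it globally but it won't survive a service restart, so it's usually paired with a startup parameter, and DBCC TRACESTATUS confirms it's actually on. Sketch:
DBCC TRACEON (2335, -1);      -- enable globally on the running instance
DBCC TRACESTATUS (2335, -1);  -- verify the flag shows as enabled globally
-- to persist across restarts, add -T2335 to the service's startup parameters
-- (SQL Server Configuration Manager -> service properties)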
September 30, 2010 at 11:13 pm
So what did they say that particular trace flag does? It's not in BOL.
--Jeff Moden
Change is inevitable... Change for the better is not.
October 1, 2010 at 11:11 am
From MS:
"SQL Server will choose a plan with less cost; in your case SQL chooses the merge join plan as its cost is less than the hash join... this is by design. Having said that, we have noticed that even though the cost of the plan is less, the overall execution time is higher in many cases. Till the product group finishes its testing I do not know if this will be classified as a ‘bug’.
The trace flag that you have to enable alters the optimizer behavior. Trace flags are part of the normal fixes that the Microsoft SQL team releases. Since this optimizer behavior is within the specs for SQL 2008, there is a strong possibility that the behavior will not be altered in the updates for SQL 2008 but will be altered in the next major release. Use of this trace flag is supported and you can enable this trace in production without any issue."
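If anyone wants to see the effect before flipping the flag server-wide, it can also be enabled for just the current session and the timings compared. A rough sketch; dbo.YourSlowProc is a made-up stand-in for the affected procedure:
SET STATISTICS TIME ON;
EXEC dbo.YourSlowProc;   -- hypothetical procedure name: baseline run without the flag

DBCC TRACEON (2335);     -- no -1, so the flag applies only to this session
EXEC dbo.YourSlowProc;   -- re-run; a cached plan may need recompiling (sp_recompile) for the flag to matter
DBCC TRACEOFF (2335);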