April 19, 2011 at 12:00 pm
I suspect that would only be the case if the server was under physical memory pressure. In reality, we've seen none of the OP's code, and a unit test runs very differently from code hosted in SQL Server's CLR, so we really have gone far down a rabbit hole that probably has little to do with the actual issue 🙂
OP - could you explain in more detail what your code is doing and whether you've actually attempted to introduce threading - upload the code if you can. That might give a better idea of where the problem lies. As many have said, though, it sounds like an inefficient way of tackling the process anyway - if you really don't want to move it out of .NET code, I'd suggest SSIS, or converting it to a CLR function if it's a row-by-row calculation anyway.
April 19, 2011 at 12:43 pm
Something else worth checking immediately after an instance restart, while your process is running, and again after your process completes:
-- credit to Bob B: http://social.msdn.microsoft.com/Forums/en-US/sqlnetfx/thread/bbc1c753-c5df-4d52-a567-c982cf506051
SELECT SUM(pages_allocated_count) / 128.0 AS CLR_memory_MBs
FROM sys.dm_os_memory_objects
WHERE type LIKE '%CLR%' ;
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
April 20, 2011 at 12:33 am
I'm always prepared to reconsider 🙂 But I don't think I will rewrite it in T-SQL; my thought was to make it a console application and run it with xp_cmdshell instead. But I'm still very interested in finding an explanation for the performance problem.
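Roughly what I have in mind (the path, exe name and thread-count argument below are just placeholders):
-- Enable xp_cmdshell (off by default), then shell out to the console version
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'xp_cmdshell', 1;
RECONFIGURE;
EXEC master..xp_cmdshell 'D:\Apps\ReportScoreCalc.exe 4';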
April 20, 2011 at 1:34 am
opc.three (4/19/2011)
Something else worth checking immediately after an instance restart, as your process is running and after your process completes:
-- credit to Bob B: http://social.msdn.microsoft.com/Forums/en-US/sqlnetfx/thread/bbc1c753-c5df-4d52-a567-c982cf506051
SELECT SUM(pages_allocated_count) / 128.0 AS CLR_memory_MBs
FROM sys.dm_os_memory_objects
WHERE type LIKE '%CLR%' ;
I'll try that as soon as I'm allowed to restart the server. But I might have to wait until after the weekend.
April 20, 2011 at 1:47 am
Tomas Bergvall (4/20/2011)
I'm always prepared to reconsider 🙂 But I don't think I will rewrite it in T-SQL; my thought was to make it a console application and run it with xp_cmdshell instead. But I'm still very interested in finding an explanation for the performance problem.
Post your code and I'll have a look - the fact of the matter is that it was already effectively a console app in the first place, just deployed as a CLR.
In the same way we can't answer a question like "Why's my T-SQL code running slowly?" without the DDL, sample data and the statement in question, it's impossible to do so with a CLR.
April 20, 2011 at 2:47 am
-->HowardW
Yep, I'll try to post the code in the next post.
Btw, I tried running the code as a console app, with the following results.
Reading 100,000 rows into memory and running the calculations on them took:
UnitTest on local box: 7.5s
SQL CLR on server: 290s
Console app on server: 124s
To me it seems a bit strange that the same code is more than twice as fast as a console app on the server than as a SQL CLR procedure.
April 20, 2011 at 3:24 am
C# CLR code: this is an excerpt, since the actual code is probably 1000+ lines.
An example of what is returned from LoadReports() and loaded into the ReportInfo dictionary:
#Id 1
#country ESP
#date 1986-03-31
#gender 1
#age 74
#deathDate NULL
#Foo <foos><foo><fooId>1</fooId><fooDate>191801</fooDate><outcome>1</outcome></foo></foos>
#Bar <bars><bar><barId>1</barId><barDate>198303</barDate><barStopDate>198401</barStopDate></bar></bars>
#Text <Texts><Text><chapter>16</chapter><aText>YYYY YYYY: XXXXXXXXXXX.</aText></Text></Texts>
[SqlProcedure]
public static void DoCalc(SqlInt16 threads)
{
Factory DGF = new Factory()
{
NThreads = threads
};
DGF.MainNew();
}
public void MainNew()
{
    // th, ArrThreadIDs, maxReportId and NThreads are class-level fields (declarations elided)
    SqlConnection conn = new SqlConnection("context connection=true");
    conn.Open();
    SqlCommand sqlCmd = new SqlCommand();
    sqlCmd.Connection = conn;
    ReportInfoDict = new Dictionary<int, ReportInfo>(1000000);
    // Process the reports in batches of one million ids
    for (int i = 0; i <= maxReportId / 1000000; i++)
    {
        // *********************************
        // Load Reports
        // *********************************
        ReportInfoDict.Clear();
        LoadReports((i * 1000000) + 1, (i + 1) * 1000000, sqlCmd);
        // *********************************
        // Initiate threads
        // *********************************
        for (int k = 0; k < NThreads; k++)
        {
            th[k] = new Thread(new ThreadStart(DGStartCalcNew));
            th[k].Priority = ThreadPriority.BelowNormal;
            ArrThreadIDs[k] = (Int16)th[k].ManagedThreadId;
            th[k].Start();
        }
        // *********************************
        // Synchronize threads
        // *********************************
        for (int k = 0; k < NThreads; k++)
        {
            th[k].Join();
        }
    }
    conn.Close();
}
private void DGStartCalcNew()
{
    // threadId (this worker's 0-based index) is a field assigned elsewhere in the class
    ReportScoreFactory rsf = new ReportScoreFactory();
    // Each thread strides through the dictionary, taking every NThreads-th entry
    for (Int32 k = threadId; k < ReportInfoDict.Keys.Count; k = k + NThreads)
    {
        // Calculate score
        rsf.AddScore(CompletenessNew(ReportInfoDict.ElementAt(k).Value));
    }
    // Batch-inserts the calculated scores into table foo.bar
    rsf.StoreScores("foo.bar");
}
// ******************************
// This is the method that performs slowly,
// but it basically looks at the different values
// stored in ReportInfo and decides
// how to score those fields.
// There is one special field (#Text) which requires
// some more advanced string manipulation,
// like regular expressions.
// ******************************
private Scores[] CompletenessNew(ReportInfo reportInfo)
{
    // body omitted from the excerpt
}
}
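For completeness, this is roughly how the procedure is deployed and called (the assembly and class names below are placeholders; spawning threads from SQLCLR is why the assembly has to be registered as UNSAFE):
-- Deployment sketch; names are placeholders
CREATE ASSEMBLY ReportScore
FROM 'D:\Apps\ReportScore.dll'
WITH PERMISSION_SET = UNSAFE; -- Thread creation is not allowed under SAFE/EXTERNAL_ACCESS
GO
CREATE PROCEDURE dbo.DoCalc @threads SMALLINT
AS EXTERNAL NAME ReportScore.[StoredProcedures].DoCalc;
GO
EXEC dbo.DoCalc @threads = 1;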
April 20, 2011 at 3:58 am
How many threads are you calling the procedure with?
The hosted CLR environment is designed to prevent poorly implemented code from taking down the server, so if you spawn threads that don't yield, they're 'punished'. Run this while your code's running and post the results:
SELECT os.task_address, os.state, os.last_wait_type,
       clr.state, clr.forced_yield_count
FROM sys.dm_os_workers os
JOIN sys.dm_clr_tasks clr
    ON os.task_address = clr.sos_task_address
WHERE clr.type = 'E_TYPE_USER';
April 20, 2011 at 4:50 am
It is usually run with 4 threads, but for test purposes everything I've mentioned previously was done with only one thread.
Results from your query:
task_address        state      last_wait_type    state (clr)             forced_yield_count
0x000000086E15C508  SUSPENDED  CLR_JOIN          E_TASK_ATTACHED_TO_CLR  0
0x00000005801E0E08  RUNNING    CLR_MANUAL_EVENT  E_TASK_ATTACHED_TO_CLR  0
April 20, 2011 at 5:05 am
If this isn't a production server, could you also run the following (which clears the cumulative wait stats):
DBCC SQLPERF ('sys.dm_os_wait_stats', CLEAR);
GO
Then run the CLR again and send the results of the below after it has finished:
SELECT * FROM sys.dm_os_wait_stats
WHERE wait_type LIKE '%CLR%'
ORDER BY wait_time_ms desc
Although I suspect it won't tell us much. I'd also suggest taking the threading out entirely and running the code inline.
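Something like this inside MainNew, calling the worker directly instead of spawning threads (a sketch against your excerpt; it assumes threadId and NThreads are fields you can set):
// Replace the spawn/join loops with a direct call on the calling thread
threadId = 0;  // the stride loop then visits every dictionary entry
NThreads = 1;
DGStartCalcNew();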
April 20, 2011 at 5:17 am
wait_type                  waiting_tasks_count  wait_time_ms  max_wait_time_ms  signal_wait_time_ms
CLR_AUTO_EVENT             52285838414020656
CLR_MANUAL_EVENT           6164612300143
CLR_JOIN                   1318654185980
CLR_CRST                   0                    0             0                 0
CLR_SEMAPHORE              0                    0             0                 0
CLR_TASK_START             13                   0             0                 0
CLR_MONITOR                0                    0             0                 0
CLR_RWLOCK_READER          0                    0             0                 0
CLR_RWLOCK_WRITER          0                    0             0                 0
SQLCLR_QUANTUM_PUNISHMENT  0                    0             0                 0
SQLCLR_APPDOMAIN           0                    0             0                 0
SQLCLR_ASSEMBLY            0                    0             0                 0
SQLCLR_DEADLOCK_DETECTION  0                    0             0                 0
CLR_MEMORY_SPY             0                    0             0                 0
CLRHOST_STATE_ACCESS       0                    0             0                 0
PREEMPTIVE_OS_SQLCLROPS    0                    0             0                 0
I'll try removing the threading as well; I'll return with those results in a bit.
April 20, 2011 at 7:14 am
There's a blog post here that warns against holding large memory allocations in CLR procedures, as objects over ~85 KB are placed on the Large Object Heap, which can be bad for performance:
http://blogs.msdn.com/b/sqlclrperf/archive/2007/06/01/scalable-memory-usage-in-sqlclr.aspx
I'd use the console version of the app as the benchmark rather than the unit test (since it's a like-for-like comparison) and probably put the difference down to how memory is managed within the hosted CLR environment.
It seems like a row-by-row problem, so why not convert it to a scalar CLR function, or stream your data in and out, if you don't have to iterate over the rows multiple times and one row doesn't affect another?
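As a minimal sketch of the scalar-function route (the names and signature here are made up; the real logic would come from CompletenessNew):
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;

public partial class UserDefinedFunctions
{
    // Hypothetical scalar wrapper: SQL Server streams one row at a time,
    // so there's no need to hold a million-row dictionary in memory.
    [SqlFunction(DataAccess = DataAccessKind.None, IsDeterministic = true)]
    public static SqlDouble ScoreReport(SqlXml fooXml, SqlXml barXml, SqlString reportText)
    {
        if (reportText.IsNull)
            return SqlDouble.Null;

        // ... the per-row checks and regex work from CompletenessNew would go here ...
        double score = reportText.Value.Length > 0 ? 1.0 : 0.0; // placeholder scoring
        return new SqlDouble(score);
    }
}
You'd then call it set-based, e.g. SELECT dbo.ScoreReport(Foo, Bar, [Text]) FROM dbo.Reports (table and column names assumed from your sample row).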
April 20, 2011 at 3:00 pm
HowardW (4/19/2011)
No, it really is 8TB. VAS != available memory on the server. 64-bit systems can address 2^64 bytes of memory, but as this is hugely above any scale likely to be reached, it's capped at 8TB.
Thanks Howard, following this thread prompted me to go down the memory-management rabbit hole...end result, VAS != physical memory finally sunk in! I have used the -g startup option on 32-bit systems plenty of times to solve issues with CLR and linked servers, but never really understood the internals of why it was necessary...or the relationship/difference between VAS and physical memory. You set it off, and Gail schooled me. Thanks.
http://www.sqlservercentral.com/Forums/Topic1095860-391-1.aspx
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
April 21, 2011 at 2:56 am
Excellent - I'll have a watch of the videos Gail posted myself. To be honest, SQL Server memory management seems to be quite sparsely documented, and lots of articles were written before 64-bit was widely prevalent, so it's difficult to find definitive information on whether many of the 32-bit limitation workarounds are still relevant or not.
AWE is a good example. The BOL article on Memory Architecture states that AWE is not applicable to 64-bit versions, yet there are blogs from the MS CSS team saying it's still the mechanism used to lock pages in memory and that it has performance gains over allocating memory through the Virtual Address Space. This is compounded by the fact that there's still an 'AWE enabled' option on 64-bit versions, but it's ignored (although by 'ignored' they just seem to mean they'll always use AWE if they have the Lock Pages in Memory privilege). Go figure.
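One way to see whether pages are actually being locked on a 64-bit instance (SQL 2008 and later; the DMV doesn't exist on 2005):
-- Memory allocated through the locked-pages mechanism
SELECT locked_page_allocations_kb / 1024.0 AS locked_pages_MB
FROM sys.dm_os_process_memory;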