April 19, 2011 at 12:00 pm
I suspect that would only be the case if the server was under physical memory pressure. In reality, we've seen none of the OP's code, and a unit test runs very differently from code hosted in SQL Server's CLR, so we really have gone far down a rabbit hole that probably has little to do with the actual issue 🙂
OP - could you explain in more detail what your code is doing and whether you've actually attempted to introduce threading - upload the code if you can. That might give a better idea of where the problem lies. As many have said, though, it sounds like an inefficient way of tackling the process anyway - if you really don't want to move it out of .NET code, I'd suggest SSIS, or converting it to a CLR function if it's a row-by-row calculation anyway.
April 19, 2011 at 12:43 pm
Something else worth checking immediately after an instance restart, while your process is running, and again after your process completes:
-- credit to Bob B: http://social.msdn.microsoft.com/Forums/en-US/sqlnetfx/thread/bbc1c753-c5df-4d52-a567-c982cf506051
SELECT SUM(pages_allocated_count) / 128.0 AS CLR_memory_MBs
FROM sys.dm_os_memory_objects
WHERE type LIKE '%CLR%' ;
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
April 20, 2011 at 12:33 am
I'm always prepared to reconsider 🙂 But I don't think I will rewrite it in T-SQL; my thought was to make it a console application and run it with xp_cmdshell instead. But I'm still very interested in finding an explanation for the performance problem.
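Roughly what I have in mind (the path, exe name and thread-count argument below are just placeholders):
-- Enable xp_cmdshell (off by default), then shell out to the console version
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'xp_cmdshell', 1;
RECONFIGURE;
EXEC master..xp_cmdshell 'D:\Apps\ReportScoreCalc.exe 4';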
April 20, 2011 at 1:34 am
opc.three (4/19/2011)
Something else worth checking immediately after an instance restart, as your process is running and after your process completes:
-- credit to Bob B: http://social.msdn.microsoft.com/Forums/en-US/sqlnetfx/thread/bbc1c753-c5df-4d52-a567-c982cf506051
SELECT SUM(pages_allocated_count) / 128.0 AS CLR_memory_MBs
FROM sys.dm_os_memory_objects
WHERE type LIKE '%CLR%' ;
I'll try that as soon as I'm allowed to restart the server. But I might have to wait until after the weekend.
April 20, 2011 at 1:47 am
Tomas Bergvall (4/20/2011)
I'm always prepared to reconsider 🙂 But I don't think I will rewrite it in T-SQL; my thought was to make it a console application and run it with xp_cmdshell instead. But I'm still very interested in finding an explanation for the performance problem.
Post your code and I'll have a look - the fact of the matter is that it was already effectively a console app in the first place, just deployed as a CLR.
In the same way we can't answer a question like "Why's my T-SQL code running slowly?" without the DDL, sample data and the statement in question, it's impossible to do so with a CLR.
April 20, 2011 at 2:47 am
-->HowardW
Yep, I'll try to post the code in the next post.
Btw, I tried running the code as a console app, with the following results.
Reading 100,000 rows into memory and running the calculations on them took:
UnitTest on local box: 7.5s
SQL CLR on server: 290s
Console app on server: 124s
To me it seems a bit strange that the same code is more than twice as fast as a console app on the server than as a SQL CLR procedure.
April 20, 2011 at 3:24 am
C# CLR code: this is an excerpt, since the actual code is probably 1000+ lines.
An example of what is returned from LoadReports() and loaded into the ReportInfo dictionary:
#Id 1
#country ESP
#date 1986-03-31
#gender 1
#age 74
#deathDate NULL
#Foo <foos><foo><fooId>1</fooId><fooDate>191801</fooDate><outcome>1</outcome></foo></foos>
#Bar <bars><bar><barId>1</barId><barDate>198303</barDate><barStopDate>198401</barStopDate></bar></bars>
#Text <Texts><Text><chapter>16</chapter><aText>YYYY YYYY: XXXXXXXXXXX.</aText></Text></Texts>
[SqlProcedure]
public static void DoCalc(SqlInt16 threads)
{
Factory DGF = new Factory()
{
NThreads = threads
};
DGF.MainNew();
}
public void MainNew()
{
    // th, ArrThreadIDs, maxReportId and NThreads are class-level fields (declarations elided)
    SqlConnection conn = new SqlConnection("context connection=true");
    conn.Open();
    SqlCommand sqlCmd = new SqlCommand();
    sqlCmd.Connection = conn;
    ReportInfoDict = new Dictionary<int, ReportInfo>(1000000);
    // Process the reports in batches of one million ids
    for (int i = 0; i <= maxReportId / 1000000; i++)
    {
        // *********************************
        // Load Reports
        // *********************************
        ReportInfoDict.Clear();
        LoadReports((i * 1000000) + 1, (i + 1) * 1000000, sqlCmd);
        // *********************************
        // Initiate threads
        // *********************************
        for (int k = 0; k < NThreads; k++)
        {
            th[k] = new Thread(new ThreadStart(DGStartCalcNew));
            th[k].Priority = ThreadPriority.BelowNormal;
            ArrThreadIDs[k] = (Int16)th[k].ManagedThreadId;
            th[k].Start();
        }
        // *********************************
        // Synchronize threads
        // *********************************
        for (int k = 0; k < NThreads; k++)
        {
            th[k].Join();
        }
    }
    conn.Close();
}
private void DGStartCalcNew()
{
    // threadId (this worker's 0-based index) is a field assigned elsewhere in the class
    ReportScoreFactory rsf = new ReportScoreFactory();
    // Each thread strides through the dictionary, taking every NThreads-th entry
    for (Int32 k = threadId; k < ReportInfoDict.Keys.Count; k = k + NThreads)
    {
        // Calculate score
        rsf.AddScore(CompletenessNew(ReportInfoDict.ElementAt(k).Value));
    }
    // Batch-inserts the calculated scores into table foo.bar
    rsf.StoreScores("foo.bar");
}
// ******************************
// This is the method that performs slowly,
// but it basically looks at the different values
// stored in ReportInfo and decides
// how to score those fields.
// There is one special field (#Text) which requires
// some more advanced string manipulation,
// like regular expressions.
// ******************************
private Scores[] CompletenessNew(ReportInfo reportInfo)
{
    // body omitted from the excerpt
}
}
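For completeness, this is roughly how the procedure is deployed and called (the assembly and class names below are placeholders; spawning threads from SQLCLR is why the assembly has to be registered as UNSAFE):
-- Deployment sketch; names are placeholders
CREATE ASSEMBLY ReportScore
FROM 'D:\Apps\ReportScore.dll'
WITH PERMISSION_SET = UNSAFE; -- Thread creation is not allowed under SAFE/EXTERNAL_ACCESS
GO
CREATE PROCEDURE dbo.DoCalc @threads SMALLINT
AS EXTERNAL NAME ReportScore.[StoredProcedures].DoCalc;
GO
EXEC dbo.DoCalc @threads = 1;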
April 20, 2011 at 3:58 am
How many threads are you calling the procedure with?
The hosted CLR environment is designed to prevent poorly implemented code from taking down the server, so if you spawn threads that don't yield, they're 'punished'. Run this while your code's running and post the results:
SELECT os.task_address, os.state, os.last_wait_type,
       clr.state, clr.forced_yield_count
FROM sys.dm_os_workers os
JOIN sys.dm_clr_tasks clr
    ON os.task_address = clr.sos_task_address
WHERE clr.type = 'E_TYPE_USER';
April 20, 2011 at 4:50 am
It is usually run with 4 threads, but for test purposes everything I've mentioned previously was done with only one thread.
Results from your query:
task_address        state      last_wait_type    state (clr)             forced_yield_count
0x000000086E15C508  SUSPENDED  CLR_JOIN          E_TASK_ATTACHED_TO_CLR  0
0x00000005801E0E08  RUNNING    CLR_MANUAL_EVENT  E_TASK_ATTACHED_TO_CLR  0
April 20, 2011 at 5:05 am
If this isn't a production server, could you also run the following (which clears the cumulative wait stats):
DBCC SQLPERF ('sys.dm_os_wait_stats', CLEAR);
GO
Then run the CLR again and send the results of the below after it has finished:
SELECT * FROM sys.dm_os_wait_stats
WHERE wait_type LIKE '%CLR%'
ORDER BY wait_time_ms desc
Although I suspect it won't tell us much. I'd also suggest taking the threading out entirely and running the code inline.
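Something like this inside MainNew, calling the worker directly instead of spawning threads (a sketch against your excerpt; it assumes threadId and NThreads are fields you can set):
// Replace the spawn/join loops with a direct call on the calling thread
threadId = 0;  // the stride loop then visits every dictionary entry
NThreads = 1;
DGStartCalcNew();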
April 20, 2011 at 5:17 am
wait_type                  waiting_tasks_count  wait_time_ms  max_wait_time_ms  signal_wait_time_ms
CLR_AUTO_EVENT             52285838414020656
CLR_MANUAL_EVENT           6164612300143
CLR_JOIN                   1318654185980
CLR_CRST                   0                    0             0                 0
CLR_SEMAPHORE              0                    0             0                 0
CLR_TASK_START             13                   0             0                 0
CLR_MONITOR                0                    0             0                 0
CLR_RWLOCK_READER          0                    0             0                 0
CLR_RWLOCK_WRITER          0                    0             0                 0
SQLCLR_QUANTUM_PUNISHMENT  0                    0             0                 0
SQLCLR_APPDOMAIN           0                    0             0                 0
SQLCLR_ASSEMBLY            0                    0             0                 0
SQLCLR_DEADLOCK_DETECTION  0                    0             0                 0
CLR_MEMORY_SPY             0                    0             0                 0
CLRHOST_STATE_ACCESS       0                    0             0                 0
PREEMPTIVE_OS_SQLCLROPS    0                    0             0                 0
I'll try removing the threading as well; I'll return with those results in a bit.
April 20, 2011 at 7:14 am
There's a blog post here that warns against holding large memory allocations in CLR procedures, as objects over ~85 KB are placed on the Large Object Heap, which can be bad for performance:
http://blogs.msdn.com/b/sqlclrperf/archive/2007/06/01/scalable-memory-usage-in-sqlclr.aspx
I'd use the console version of the app as the benchmark rather than the unit test (since it's a like-for-like comparison) and probably put the difference down to how memory is managed within the hosted CLR environment.
It seems like a row-by-row problem, so why not convert it to a scalar CLR function, or stream your data in and out, if you don't have to iterate over the rows multiple times and one row doesn't affect another?
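As a minimal sketch of the scalar-function route (the names and signature here are made up; the real logic would come from CompletenessNew):
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;

public partial class UserDefinedFunctions
{
    // Hypothetical scalar wrapper: SQL Server streams one row at a time,
    // so there's no need to hold a million-row dictionary in memory.
    [SqlFunction(DataAccess = DataAccessKind.None, IsDeterministic = true)]
    public static SqlDouble ScoreReport(SqlXml fooXml, SqlXml barXml, SqlString reportText)
    {
        if (reportText.IsNull)
            return SqlDouble.Null;

        // ... the per-row checks and regex work from CompletenessNew would go here ...
        double score = reportText.Value.Length > 0 ? 1.0 : 0.0; // placeholder scoring
        return new SqlDouble(score);
    }
}
You'd then call it set-based, e.g. SELECT dbo.ScoreReport(Foo, Bar, [Text]) FROM dbo.Reports (table and column names assumed from your sample row).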
April 20, 2011 at 3:00 pm
HowardW (4/19/2011)
No, it really is 8TB. VAS != available memory on the server. 64-bit systems can address 2^64 bytes of memory, but as this is hugely above any scale likely to be reached, it's capped at 8TB.
Thanks Howard, following this thread prompted me to go down the memory-management rabbit hole...end result, VAS != physical memory finally sunk in! I have used the -g startup option on 32-bit systems plenty of times to solve issues with CLR and linked servers, but never really understood the internals of why it was necessary...or the relationship/difference between VAS and physical memory. You set it off, and Gail schooled me. Thanks.
http://www.sqlservercentral.com/Forums/Topic1095860-391-1.aspx
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
April 21, 2011 at 2:56 am
Excellent - I'll have a watch of the videos Gail posted myself. To be honest, SQL Server memory management seems to be quite sparsely documented, and lots of articles were written before 64-bit was widely prevalent, so it's difficult to find definitive information on whether many of the 32-bit limitation workarounds are still relevant or not.
AWE is a good example. The BOL article on Memory Architecture states that AWE is not applicable to 64-bit versions, yet there are blogs from the MS CSS team saying it's still the mechanism used to lock pages in memory and that it has performance gains over allocating memory through the Virtual Address Space. This is compounded by the fact that there's still an 'AWE enabled' option on 64-bit versions, but it's ignored (although by 'ignored' they just seem to mean they'll always use AWE if they have the Lock Pages in Memory privilege). Go figure.
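One way to see whether pages are actually being locked on a 64-bit instance (SQL 2008 and later; the DMV doesn't exist on 2005):
-- Memory allocated through the locked-pages mechanism
SELECT locked_page_allocations_kb / 1024.0 AS locked_pages_MB
FROM sys.dm_os_process_memory;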