RE: Understanding and Using APPLY (Part 1)

SSC Guru

Points: 1003875

January 9, 2012 at 8:39 pm

SAinCA (1/9/2012)
As requested, Jeff:

I thought the problem may have been with the split function you had listed but, looking at the code you provided for the split functions, I don't see a particular problem with them other than they're slower (because of the delimiter concatenation) than the new split function available at the URL coming up. The attachments to the article have been updated, so be sure to get the code from the attachments and not the article itself. The code in the attachments is about 20% faster after someone was good enough to provide yet another optimization.

Here's the link to the article with the new splitter functions...

http://www.sqlservercentral.com/articles/Tally+Table/72993/

One of the "problems" (in quotes because it's a human error, not a problem with CROSS APPLY or iTVFs themselves) with iTVFs and CROSS APPLY is a problem that is very similar to a problem with VIEWs. That is, it's "hidden" code or at least "out of site" code AND the code becomes a part of the execution plan as if it had been written as a part of the code. Because of that, you can miss some of the most terrible performance problems due to a very simple oversight... mismatched datatypes which, of course, will cause table and index scans and even full "rewinds" which are worse than mere table scans. That was the same problem that Brad Shultz ran into in the link that Paul previously posted (I never did post back on that thread to show what the problem was). In that case (IIRC... it was a while ago), it was caused by CHARINDEX returning BIGINT instead of just an INT because the first operand of the CHARINDEX Brad used was MAX datatype. The datatype mismatch was a real killer there. Instead of the Numbers table selecting only the numbers it needed, it "rewound" the table 121,316 times and generated a total of 121,317,000 rows internally through a Lazy Table Spool (I have the execution plan but have misplaced the analysis that showed the precise reason, so operating as best I can from memory).

Because you haven't included any information as to the datatypes you used for variables or any of the columns used in your code example, I can't even begin to guess where the datatype mismatch (or, possibly, accidental many-to-many join) may be. It may be in the splitters or it (more likely) is in the fn_GetPermissionByUser function you actually used in the CROSS APPLY. And, yeah... because of the lack of information about your code, I can only guess but anytime something like what you've described happens with a conversion to an iTVF and a CROSS APPLY, it's usually due to a datatype mismatch. You have to remember... iTVFs and VIEWs are a lot alike... they "become one" with the calling code and, as with any fully inline code, data type matches become rather critical for any decent performance.

I agree with your "word of caution" but I wouldn't blame the iTVF or the CROSS APPLY. Chances are it's something like a simple datatype mismatch and that's likely what you have to be most cautious about. Why does multi-line and scalar TVFs let you get away with it (datatype mismatches and accidental many-to-many joins)? Because the execution plan doesn't incorporate the plans for such things in the main-stream execution plan.

Paul is much more of an expert on execution plans than I am. If you were to follow the procedure (see the second link in my signature line at the end of this post) for posting performance problems and opened a new thread for the problem (we've diverged too much from the intent of this thread already), I'm sure that someone, or even Paul himself, would be happy to show you how to correctly incorporate the very high speed methods of combining CROSS APPLY and iTVFs for this particular problem.

As a side bar, you've got a huge amount of unnecessary code in your code example that you posted. Take the following snippet, for example...

WHERE GETDATE() BETWEEN eff_date AND term_date

AND ( @CompanyID IS NULL

OR ( @CompanyID IS NOT NULL

AND @CompanyID > 0

AND company_id = @CompanyID

)

Because NULLs cannot be related to anything else including other NULLs and assuming that you're not actually using negative CompanyIDs, the code can be simplified quite a bit as follows...

WHERE GETDATE() BETWEEN eff_date AND term_date

AND (@CompanyID IS NULL OR company_id = NULLIF(@CompanyID,0))

The fact that you're using BETWEEN to compare dates is a whole other chapter in performing SQL Harakiri but this thread isn't the place for that particular discussion. 🙂

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)