Understanding the Expression Service - avoiding expression limits

  • Hi Sql Experts,

    At my company, we have authored a piece of software that allows users to examine data by viewing it in a grid, and authoring expressions against that data. Our software ultimately distills those expressions authored in the application to sql expressions used in a SELECT. These expressions are all linear (no aggregations/grouping), and once a user has authored an expression, they can reference that expression in future expressions. This allows users to eventually (and quite accidentally) author an expression that is too complex for sql server. In that case, we will see the typical expression services limit error -

    Internal error: An expression services limit has been reached. Please look for potentially complex expressions in your query, and try to simplify them.

    The obvious answer is to simplify/reduce the expressions used in the query (and ultimately, in the application), but, that's difficult because from a technical perspective, it's difficult to offer advice on how the user can know they are writing something excessively complex (because we cannot predict the complexity of the expression they are writing), and also from a business perspective, when we do tell them to simplify the expressions, it makes the tool less useful.

    We've uncovered different techniques of carving up what would have been large expressions into digestible chunks. Pre-2019, taking those expressions and stuffing them in scalar functions worked. But in 2019, those scalar functions are inlined and negates the usefulness of those. Temp tables work, but it's very difficult to know how to pre-calculate certain expressions to break them up programatically. It also slows down the query.

    So I'd like to reach out to this community for opinions, or even optimistically, answers. Do you have insight into how the expression service behaves (where in the query planning/execution process the 65k limit is applied for instance), or have ideas on how to carve up expressions using querying techniques (CTE's, subqueries, etc), in a way that is reliable in the arbitrary nature of the expressions defined by the user?

    Thanks in advance.

  • ...By the way... we did ask Microsoft support for their thoughts on this, and they offered one nugget that didn't help us, but might help someone else.

    They suggested that in SQL Server 2019, using Trace Flag 8788 may increase that limit. We haven't had success with that, but maybe it will work for you.

  • Well, that's clearly a tough nut to crack.   Nothing could realistically guarantee a lack of complexity, save a hard limit on the total number of expressions, operators, and parentheses.   As you stated, such a limit would inherently make your tool less useful.   However, what about trying a TRY CATCH block on the resulting SQL ?   You could detect that specific error and at least be able to suggest that the user simplify their overall expression.   Finally, if you have the ability to break the expression down into pieces, you might be able to use CROSS APPLY to select the pieces separately within the sub-query you construct within the CROSS APPLY.   Very interesting problem, though...

    Steve?(aka sgmunson)?:) 🙂 :)?
    Health & Nutrition
    Make Guaranteed Income

  • Thanks a bunch for the reply. the TRY CATCH block is an awesome idea, i think we can start doing that right away. Great suggestion.

    We had been looking at the CROSS APPLY option, but have had limited success with it, as it appears that the expression engine in sql server isn't calculating the total number of expressions until after it has done its joins. so in many of our cases, the CROSS APPLY still caused us to hit a 65k limit just the same.

    Are there any other techniques like CROSS APPLY that might cause sql server to parse the query in pieces? similar to what scalar functions did prior to 2019?

  • Outside of CROSS APPLY, you'd probably have to use Global Temp Tables - they're that way when the table name starts with "##" (2 pound signs).   Meaning that you run a query to compute one result and store it in a global temp table, and you repeat this process for each expression you evaluate.   Then you join all the global temp tables in a final query.   It would be iterative, and might be slower, but it might work where other alternatives don't...   Global temp tables, however, also require deletion, as they don't just "disappear", like normal temp tables do at the end of a stored procedure, so keep that in mind.

    Steve?(aka sgmunson)?:) 🙂 :)?
    Health & Nutrition
    Make Guaranteed Income

  • Also, you might be able to use scalar functions you create in tempdb, based on making your parsing logic smart enough to detect a query length long enough to cause a problem, and then forcing the process to create a tempdb scalar function for each sub-expression that your parser then creates when the overall expression is too long or too complex.   You might also use a cascade of declared variables within a scalar function to deliver the overall computational result.   Each successive variable would further build the calculation on top of the previous one.    Not going to be easy to do, mind you...

    Steve?(aka sgmunson)?:) 🙂 :)?
    Health & Nutrition
    Make Guaranteed Income

  • That suggestion isn't too far-fetched; like you said it sounds difficult but I can imagine how that could solve the issue. We'd have to disable function inlining in 2019 to accomplish that, but that seems reasonable.

    I'll pursue that last approach with a few of our examples and see if i can come up with something that works - if it does I'll be happy to share the algorithm in case it helps the next person. Thanks for the suggestion!

Viewing 7 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply