I was writing some dynamic SQL that generates some SQL statements to load my facts and dimensions into a data warehouse. Some of those SQL statements can become very long, for example if a dimension has a lot of columns. When debugging, I noticed a couple of statements failing with various errors. Turns out, they were truncated after 4000 characters. What was going on?
I’m using the CONCAT_WS function to dynamically build my SQL statement. As a separator, I use a variable holding a carriage return and a line feed, so I get nice multi-line statements. Something like this:
DECLARE @crlf CHAR(2) = (CHAR(13) + CHAR(10)); SELECT SQLStatement = CONCAT_WS(@crlf ,'SELECT *' ,'FROM dbo.' + m.TableName) FROM dbo.MyMetadata m;
Well, obviously a bit longer than that, because such a short statement won’t be truncated. I checked the documentation, but it doesn’t state CONCAT_WS can only hold 4000 chars, so the issue is not with the function itself. The problem is that it’s a bit hard to debug, because when you run the query in SSMS there’s also a limit on how much characters the result window shows. And the PRINT command is limited as well (4000 chars for nvarchar, 8000 for varchar). So I created an SSIS package that runs the dynamic SQL and outputs the result to a CSV file (which was more painful than I expected. SSIS can be really frustrating sometimes). There I could verify the data was really truncated, and that it wasn’t just the output windows of SSMS or ADF.
But what’s causing the truncation? After some searching, I found out it has to do with data type precedence in the CONCAT_WS function (or any concatenation function in SQL Server). The rules are a bit better explained in the documentation of CONCAT. Specifically:
Since none of the input arguments to CONCAT_WS was larger than 4000 chars, the result was a NVARCHAR(4000). The fix is luckily quite simple:
DECLARE @crlf CHAR(2) = (CHAR(13) + CHAR(10)); SELECT SQLStatement = CONCAT_WS(@crlf ,CONVERT(NVARCHAR(MAX),'SELECT *') ,'FROM dbo.' + m.TableName) FROM dbo.MyMetadata m;
By doing an explicit conversion on one of the input parameters, we can force the function to return a NVARCHAR(MAX), which means the data is no longer truncated.