String concatenation when defining a varchar(max) variable with in-line assignment

  • Hi everyone,

    I've not been able to find much on what I'm interpreting as a bug in SQL Server 2008, so I'll try my best to describe it and see if it rings any bells with you folks...

    I've been looking at some of the Project Euler (http://projecteuler.net) problems, and found a peculiar thing when trying to solve problem 67.

    The problem asks you to take the contents of a file (one hundred rows of two-digit numbers in the shape of a triangle) and work out maximal value of the sum of the numbers following any path from top to bottom of the triangle.

    Now, I've solved the problem itself, that not my issue.

    My issue is that using the DECLARE @variable <data_type> = <value> syntax introduced in SQL Server 2008, I was not getting the string I expected. Instead of a string somewhere in the region of 15000 characters, I was getting only the first 4000 characters.

    My particular DECLARE looked something like:

    DECLARE @TriangleString NVARCHAR(MAX) = N'59' + NCHAR(9) +

    N'73 41' + NCHAR(9) +

    and so on for all the values from the file.

    Once I changed my syntax choice such that I was adding each new line of data to the string in the older "SET @TriangleString = @TriangleString + ..." format, everything worked as I expected.

    So I guess my question is:

    Has anyone else seen truncation of a varchar(max) or nvarchar(max) variable to just the first 4000 characters when using the DECLARE @variable <data_type> = <value> syntax for a sufficiently long string, or is this a known and not-very-well-documented part of the design?

    Thanks, Rob

  • A colleague suggested that my issue might be that the string literal I was trying to assign to the NVARCHAR(MAX) variable was being interpreted by SQL Server as an NVARCHAR(4000) before the assignment took place.

    In which case, it might be worked around by placing the concatenated string literal within an explicit cast to NVARCHAR(MAX).

    I have a feeling it will probably not work like I want until each individual sub-string is explicitly cast before the concatenation, but I'll try both later tonight and reply with my findings.

    Rob

  • You're on the right track.

    Please see this post for some more info:

    http://www.sqlservercentral.com/Forums/Topic1025419-1292-1.aspx#bm1025536

    -Eddie

    Eddie Wuerch
    MCM: SQL

  • @eddie-2 - Thanks, that's what I figured out last night, not obvious but makes sense when I stop to think about it. I'll bear this in mind when I'm dealing with long strings in the future.

    @celko - Your solution will get the correct answer, however there are two problems in Project Euler that are two variants of the same problem. The first is a triangle of numbers with 15 rows to it, equating to 16,384 paths to be evaluated. Your solution will get the answer within the desired one minute of execution (it was similar to my first attempt).

    The second is a triangle of numbers with 100 rows, giving a total number of POWER(2, 99) paths to evaluate. That might take some time to work though from the top down!

    Instead, I went with this:

    DECLARE @TriangleTable TABLE

    (

    RowId INT NOT NULL,

    CellId INT NOT NULL,

    CellValue INT NOT NULL,

    PRIMARY KEY (RowId, CellId)

    );

    INSERT INTO @TriangleTable (RowId, CellId, CellValue)

    SELECT[Rows].Position AS RowId,

    Cells.Position AS CellId,

    CONVERT(INT, Cells.Value) AS CellValue

    FROMdbo.InlineCTE_StringSplitter(@TriangleString, NCHAR(9)) AS [Rows]

    CROSS APPLYdbo.InlineCTE_StringSplitter([Rows].Value, N' ') AS Cells

    OPTION (MAXRECURSION 0);

    DECLARE @Row INT = (SELECT MAX(RowId) FROM @TriangleTable);

    WHILE @Row > 1

    BEGIN;

    WITH RolledUpRows AS

    (

    SELECTNextRow.RowId,

    NextRow.CellId,

    MAX(NextRow.CellValue + CurrentRow.CellValue) AS CellValue

    FROM@TriangleTable AS CurrentRow

    INNER JOIN@TriangleTable AS NextRow

    ONNextRow.RowId = CurrentRow.RowId - 1

    AND(

    NextRow.CellId = CurrentRow.CellId

    ORNextRow.CellId = CurrentRow.CellId - 1

    )

    WHERECurrentRow.RowId = @Row

    GROUP BYNextRow.RowId,

    NextRow.CellId

    )

    UPDATE@TriangleTable

    SETCellValue = RolledUpRows.CellValue

    FROM@TriangleTable AS TriangleTable

    INNER JOINRolledUpRows

    ONTriangleTable.CellId = RolledUpRows.CellId

    ANDTriangleTable.RowId = RolledUpRows.RowId;

    SET @Row -= 1;

    END;

    SELECTCellValue AS Result

    FROM@TriangleTable

    WHERERowId = 1;

    Where @TriangleString is the rows of numbers where each row is separated by a tab character and each number is separated by a space.

    As you can see, it's taking each row from the bottom up and adding each cell in the row above with the one cell in the previous row that gives the highest sum, all the way to the top of the triangle.

  • This was removed by the editor as SPAM

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply