SQLServerCentral Article

Parsing a Pairs of Values in a String While Preserving Order

,

Introduction

In a project I'm currently working on, I need to modify a stored procedure that returns a list of items based on criteria set by the procedure parameters. The new requirement is as follows: the stored procedure must be able to retrieve the list of items that should be included in the result set. That is, if the list is empty, the goods selected according to the given criteria are included in the result set. If the list is not empty, then the result set includes only items from the list.

Moreover, in the list are set pairs of values: the product identifier (integer) - the primary key of the table of goods, and the name of the product (string), which should replace the original name of the product. For example, the list of values: 10000, "Item #10000", 10100, "Item #10100",10200, "Item #10200". This list should be interpreted as a set of pairs (tuples) [10000, "Item #10000"], [10100, "Item #10100"] and [10200, "Item #10200"]. Important note: the order of the pairs is meaningful, and must be preserved in the result set.

Research

How to pass a list of value pairs to a stored procedure? Of course, you can use data structures such as XML or JSON to pass this kind of data to a stored procedure. But what if you pass the list of pairs as a delimited string? Strings are easier to form and recognise. Let's try to pass the list of pairs as a string passed to a stored procedure.

It would be good to formulate the list recognition in the form of a common table expression (CTE). This CTE should return a table, the columns of which are the identifier and the name of the item. Firstly, it would be beautiful and interesting. And secondly, it will allow you to easily modify the original, rather voluminous, query for forming the list of goods. All you need to do is add the CTE definition to the beginning of the query and slightly modify the FROM, WHERE and ORDER BY clauses. And thus, the query will work in two modes: with a list and without a list.

One more remark. Probably, you can somehow use the STRING_SPLIT function. But there is one problem: before version 2022 MS SQL Server does not support the Ordinal field. I am using version 2017. Therefore, it is impossible to provide the original order of pairs.

CTE of string parsing. Parts

So, it was decided that the procedure of recognising a string of value pairs will be implemented by means of common table expression (CTE). It seems that the query declaration should look as follows:

WITH List(field0, field1, rest, delimiter, ordinal) AS
( ... )

Where the parameter assignment is as follows:

  • field0  - the first field of the tuple. Type - integer;
  • field1 - he second field of the tuple. Type - string;
  • rest - remainder of the string to be parsed, passed to the next iteration. Initial value - the initial string to be parsed;
  • delimiter - delimiter of values in the initial string, it is better to set it as a parameter. A parameterised delimiter will allow, if necessary, to specify the necessary delimiter in only one place;
  • ordinal - ordinal number of the tuple

CTE of string parsing. Anchor part

The anchor part of the query is the actual string to be parsed and the initial values of the query parameters:

SELECT
      NULL field0
     ,CAST(N'' AS NVARCHAR(MAX)) field1
     ,CASE WHEN CHARINDEX(',', @OriginalString, LEN(@OriginalString)) <> LEN(@OriginalString)
          THEN @OriginalString + N','
          ELSE @OriginalString
      END rest
     ,',' delimiter
     , -1 ordinal

It should be noted that there is some magic: a delimiter is added to the initial string at the end. This is necessary, otherwise the query will not work. Yes, I deceived you - the delimiter value is set in three places, but in the same query string. The initial value for ordinal is -1 because we want to number tuples starting from 0, as it is customary in the programming world.

CTE of string parsing. Recursive part

The recursive part of the query computes:

  • first tuple field - the first left substring of the string that came from the previous iteration;
  • the second tuple field - the second left substring of the string that came from the previous iteration;
  • the rest of the string from the previous iteration without the first two left substrings;
  • separator - taken from the previous iteration;
  • sequence number - incremented by one the sequence number from the previous iteration.

The first tuple field is the first left substring of the string that came from the previous iteration:

CAST(
     LEFT(
          rest,
          CHARINDEX(delimiter, rest, 1) - 1
     ) AS INT
) field0

The second tuple field is the second left substring of the string that came from the previous iteration:

CASE WHEN CHARINDEX(delimiter, RIGHT(rest, LEN(rest) - CHARINDEX(delimiter, rest, 1)), 1) = 0
     THEN NULL
     ELSE
          CAST(
               LEFT(
                    RIGHT(rest, LEN(rest) - CHARINDEX(delimiter, rest, 1)),
                    CHARINDEX(delimiter, RIGHT(rest, LEN(rest) - CHARINDEX(delimiter, rest, 1)), 1) - 1
               ) AS NVARCHAR(MAX)
          )
END field1

The value to be passed to the next iteration is the remainder of the string from the previous iteration without the first two left substrings (field0 and field1):

CAST(
     RIGHT(
          RIGHT(rest, LEN(rest) - CHARINDEX(delimiter, rest, 1)),
          LEN(RIGHT(rest, LEN(rest) - CHARINDEX(delimiter, rest, 1)))
               - CHARINDEX(delimiter, RIGHT(rest, LEN(rest) - CHARINDEX(delimiter, rest, 1)), 1)
     ) AS NVARCHAR(MAX)
) rest

The delimiter is taken from the previous iteration.

The sequence number is the number from the previous iteration increased by one:

ordinal + 1 ordinal

The condition for exiting recursion is the absence of a delimiter in the remainder of the parsed string of the current iteration:

WHERE (CHARINDEX(delimiter, rest, 1) > 0)

Now let's put it all together.

CTE of string parsing. Solution

Below is the solution to the problem. This solution does not work with a space as separator. This is because the LEN function excludes end spaces. Thus, parsing a string using LEN will return an error.

WITH List(field0, field1, rest, delimiter, ordinal) AS (
     SELECT
           NULL field0
          ,CAST(N'' AS NVARCHAR(MAX)) field1
          ,CASE WHEN CHARINDEX(',', @OriginalString, LEN(@OriginalString)) <> LEN(@OriginalString)
               THEN @OriginalString + N','
               ELSE @OriginalString
          END rest
          ,',' delimiter
          , -1 ordinal
     UNION ALL
     SELECT
         CAST(
              LEFT(
                   rest,
                   CHARINDEX(delimiter, rest, 1) - 1
              ) AS INT
         ) field0
         ,CASE WHEN CHARINDEX(delimiter, RIGHT(rest, LEN(rest) - CHARINDEX(delimiter, rest, 1)), 1) = 0
              THEN NULL
              ELSE
                  CAST(
                       LEFT(
                            RIGHT(rest, LEN(rest) - CHARINDEX(delimiter, rest, 1)),
                            CHARINDEX(delimiter, RIGHT(rest, LEN(rest) - CHARINDEX(delimiter, rest, 1)), 1) - 1
                       ) AS NVARCHAR(MAX)
                  )
          END field1
          ,CAST(
               RIGHT(
                   RIGHT(rest, LEN(rest) - CHARINDEX(delimiter, rest, 1)),
                   LEN(RIGHT(rest, LEN(rest) - CHARINDEX(delimiter, rest, 1)))
                        - CHARINDEX(delimiter, RIGHT(rest, LEN(rest) - CHARINDEX(delimiter, rest, 1)), 1)
               ) AS NVARCHAR(MAX)
          ) rest
          ,delimiter
          ,ordinal + 1 ordinal
     FROM List
     WHERE (CHARINDEX(delimiter, rest, 1) > 0)
)

CTE of string parsing. Testing

The (real data) is used as the test string:

DECLARE @OriginalString NVARCHAR(MAX) = N'69328 ,Toner blue TN-227C Toner Cyan,64598,
Toner blue TN-328C Toner Cyan, 71209 ,Toner blue TONER-C-C834/844-EU-10K,72077,
Toner blue TN-227C H Toner cyan f. bizhub C257i,66048,
Cartridge Roland Eco-UV4 ( EUV4-CY ) Cyan (blue) 220ml,44227,Toner blue TN-619C,17272,
Monitor 20 LaCie 720 / Hood / Blue Eye Pro Calibrator / Blue Eye Pro Software,17288,
Monitor 30 LaCie 730 / Hood / Blue Eye Pro Calibrator / Blue Eye Pro Software,67304,
Toner blue TN-626C Toner C,1000001639,Toner blue Toner-C 6k C650 6000 pages.';

Test query:

WITH List(field0, field1, rest, delimiter, ordinal) AS
(
SELECT
 NULL field0
,CAST(N'' AS NVARCHAR(MAX)) field1
,CASE WHEN CHARINDEX(',', @OriginalString, LEN(@OriginalString)) <> LEN(@OriginalString)
THEN @OriginalString + N','
ELSE @OriginalString
END rest
,',' delimiter
, -1 ordinal

UNION ALL
SELECT
 CAST(
LEFT(
rest,
CHARINDEX(delimiter, rest, 1) - 1
) AS INT
) field0
,CASE WHEN CHARINDEX(delimiter, RIGHT(rest, LEN(rest) - CHARINDEX(delimiter, rest, 1)), 1) = 0
THEN NULL
ELSE
CAST(
LEFT(
RIGHT(rest, LEN(rest) - CHARINDEX(delimiter, rest, 1)),
CHARINDEX(delimiter, RIGHT(rest, LEN(rest) - CHARINDEX(delimiter, rest, 1)), 1) - 1
) AS NVARCHAR(MAX)
)
END field1
,CAST(
RIGHT(
RIGHT(rest, LEN(rest) - CHARINDEX(delimiter, rest, 1)),
LEN(RIGHT(rest, LEN(rest) - CHARINDEX(delimiter, rest, 1)))
- CHARINDEX(delimiter, RIGHT(rest, LEN(rest) - CHARINDEX(delimiter, rest, 1)), 1)
) AS NVARCHAR(MAX)
) rest
,delimiter
,ordinal + 1 ordinal
FROM List
WHERE (CHARINDEX(delimiter, rest, 1) > 0)
)
SELECT field0, field1, rest, ordinal
FROM List
WHERE ordinal >= 0
ORDER BY ordinal
OPTION (MAXRECURSION 0);

The result of the query execution:

Test results

 

The result is as per our expectation.

Modification of the initial request of the stored procedure

The initial query describes selection of goods from the table of goods according to the criteria passed as parameters. The query is quite complex and contains several connections. Schematically the query looks like this (Materials - table of goods, MaterialID - product identifier, PartID - product article):

SELECT
M.MaterialID
.
.
.
FROM
dbo.Materials M
INNER JOIN
.
.
.
WHERE
         <Criteria>
ORDER BY
CASE WHEN UPPER(@SortBy)=N'PARTID' AND UPPER(@SortType)=N'ASC' THEN M.PartID END ASC,
CASE WHEN UPPER(@SortBy)=N'PARTID' AND UPPER(@SortType)=N'DESC' THEN M.PartID END DESC,
.
.
.

I realise that the ORDER BY clause seems to raise questions for you. It is necessary for dynamic sorting - sorting by a field set by a parameter. But this is a separate story.

To solve the problem, the following changes (marked with *) should be made to the base query:

*WITH List(field0, field1, rest, delimiter, ordinal) AS (
.
.
.
*)
SELECT
.
.
.
FROM
dbo.Materials M
INNER JOIN
.
.
.
        *LEFT OUTER JOIN List ON List.field0 = M.MaterialID
WHERE
*(LEN(@TestString) > 0 AND Ordinal >= 0)
*OR
(
     *LEN(@TestString) = 0
     *AND
     <Criteria>
        )
ORDER BY
      *CASE WHEN LEN(@Pairs) <> 0 THEN IDT.ORDINAL END ASC,
      CASE WHEN UPPER(@SortBy)=N'PARTID' AND UPPER(@SortType)=N'ASC' THEN M.PartID END ASC,
      CASE WHEN UPPER(@SortBy)=N'PARTID' AND UPPER(@SortType)=N'DESC' THEN M.PartID END DESC,
.
.
.
*OPTION (MAXRECURSION 0);

It is worth noting the WHERE clause separately. The criterion for selecting results now consists of two parts joined by OR. The first part will work when the list of pairs is specified - the string (@TestString) of the list of pairs is not empty. The additional condition: Ordinal >= 0 - will allow to select items only from the list. The second part contains initial selection conditions and will work when the list is not specified - the list string is empty.

Conclusion

The proposed solution, with minor modifications, can be adapted both for recognizing a list of single values and for tuples with more than two elements.

Rate

5 (1)

You rated this post out of 5. Change rating

Share

Share

Rate

5 (1)

You rated this post out of 5. Change rating