Click here to monitor SSC
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


How to find values in a comma delimeted string with len>3


How to find values in a comma delimeted string with len>3

Author
Message
labri
labri
Forum Newbie
Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)

Group: General Forum Members
Points: 5 Visits: 54
Hi everyone,

can you please help me with the following? I have a nvarchar column with data like this: ', 55,85,1,4,9888,6587,'
How can I found the rows that include in their comma delimeted string, values with len()>3?

I.e. if I had the following rows:

', 55,85,1,4,9888,6587,'
', 55,85,1,4,98,65,'

I would like a statement that returns only the first row (beacuse it has values: 9888 and 6587) or only the values: 9888, 6587.

Thank you in advance
John Mitchell-245523
John Mitchell-245523
SSCrazy Eights
SSCrazy Eights (9.5K reputation)SSCrazy Eights (9.5K reputation)SSCrazy Eights (9.5K reputation)SSCrazy Eights (9.5K reputation)SSCrazy Eights (9.5K reputation)SSCrazy Eights (9.5K reputation)SSCrazy Eights (9.5K reputation)SSCrazy Eights (9.5K reputation)

Group: General Forum Members
Points: 9518 Visits: 15558
Here's one way of doing it. It may not be the most efficient way, so you'll want to test for performance if you're going to use it in production.

SELECT Mynvarcharcol
FROM Mytable
WHERE Mynvarcharol LIKE '%[^,][^,][^,][^,]%'


John
wolfkillj
wolfkillj
Ten Centuries
Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)

Group: General Forum Members
Points: 1230 Visits: 2582
[I think what the OP wants is to return only the comma-delimited strings where any one of the values in the string has LEN(value) > 3. If that's the case, this works, but may need performance tweaking depending on the OP's database schema and data:



-- NOTE: Be sure to remove the lowercase letter 'x' from the CREATE and DROP T-SQL keywords in this code - my company blocks internet traffic that includes certain T-SQL keywords.

CxREATE TABLE #TEMP (rowID int identity(1,1), delimitedString varchar(60))

GO

INSERT INTO #TEMP

VALUES (', 55,85,1,4,9888,6587,'), (', 55,85,1,4,98,65,' )

GO

WITH cte1 AS (

SELECT c.rowID, s.Item, LEN(s.Item) as valueLen

FROM #TEMP c

OUTER APPLY dbo.DelimitedSplit8K(c.delimitedString,',') s

WHERE LEN(s.Item) > 3

)

SELECT t.rowID, t.delimitedString

FROM #TEMP t

WHERE EXISTS (SELECT 1 FROM cte1 c WHERE t.rowID = c.rowID)

GO

DxROP TABLE #TEMP



OP, you'll need to have some column to uniquely identify each row of your input - if the table has a primary key, you can use that in place of the rowID column I created in my #TEMP table. In cte1, I parsed each string into its values using my version of Jeff Moden's string splitter function and included the rowID where LEN(<parsed value>Wink > 3. Then I selected rows from #TEMP where the rowID exists in cte1, which gives me only the rows where the LEN() > 3 for any parsed value in the comma-delimited string.

If the CTE in my code causes performance issues with your data, you could convert it to a temp table with an appropriate index to speed things up. No matter what you do, though, parsing delimited strings for any number of rows will be pretty slow.

Hope that helps!

Jason Wolfkill
Blog: SQLSouth
Twitter: @SQLSouth
Lynn Pettis
Lynn Pettis
One Orange Chip
One Orange Chip (27K reputation)One Orange Chip (27K reputation)One Orange Chip (27K reputation)One Orange Chip (27K reputation)One Orange Chip (27K reputation)One Orange Chip (27K reputation)One Orange Chip (27K reputation)One Orange Chip (27K reputation)

Group: General Forum Members
Points: 27774 Visits: 38157
labri (4/17/2013)
Hi everyone,

can you please help me with the following? I have a nvarchar column with data like this: ', 55,85,1,4,9888,6587,'
How can I found the rows that include in their comma delimeted string, values with len()>3?

I.e. if I had the following rows:

', 55,85,1,4,9888,6587,'
', 55,85,1,4,98,65,'

I would like a statement that returns only the first row (beacuse it has values: 9888 and 6587) or only the values: 9888, 6587.

Thank you in advance


Which do you want, the entire row in which there is at least one numeric value with a length greater than 3 or just those numeric values from that row with a length greater than 3?

Cool
Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
Alan.B
Alan.B
SSCrazy
SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)

Group: General Forum Members
Points: 2935 Visits: 7475
It's work noting that wolfkill and I are using the same splitter (I mine just has a different name).


-- sample data
IF OBJECT_ID('tempdb..#x') IS NOT NULL
DROP TABLE #x;

CREATE TABLE #x (id int identity primary key, val nvarchar(100) NOT NULL);

INSERT INTO #x
SELECT '55,85,1,4,9888,6587' UNION ALL --we want this one
SELECT '55,85,1,4,98,65' UNION ALL --we don't want this
SELECT '1122,33333,22,11,40' UNION ALL --we want this one
SELECT '312,9,8,7' --we don't want this
GO

-- code to get what you need
WITH legitIDs AS
( SELECT id
FROM #x x
CROSS APPLY dbo.splitString(x.val,',')
WHERE LEN(item)>=4
GROUP BY id)
SELECT val
FROM #x x
JOIN legitIDs l ON x.id=l.id;

--cleanup
DROP TABLE #x;
GO



-- Alan Burstein



Best practices for getting help on SQLServerCentral
Need to split a string? Try DelimitedSplit8K or DelimitedSplit8K_LEAD (SQL 2012+)
Need a pattern-based splitter? Try PatternSplitCM
Need to remove or replace those unwanted characters? Try PatExclude8K and PatReplace8K.

"I can't stress enough the importance of switching from a 'sequential files' mindset to 'set-based' thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code. " -- Itzek Ben-Gan 2001
labri
labri
Forum Newbie
Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)

Group: General Forum Members
Points: 5 Visits: 54
Thank you all for your answers! I found what I needed using the
Alan.B's query.
wolfkillj
wolfkillj
Ten Centuries
Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)

Group: General Forum Members
Points: 1230 Visits: 2582
labri (4/19/2013)
Thank you all for your answers! I found what I needed using the
Alan.B's query.


Glad you got what you needed. Thanks for posting back to let us know you did!

Jason Wolfkill
Blog: SQLSouth
Twitter: @SQLSouth
Lynn Pettis
Lynn Pettis
One Orange Chip
One Orange Chip (27K reputation)One Orange Chip (27K reputation)One Orange Chip (27K reputation)One Orange Chip (27K reputation)One Orange Chip (27K reputation)One Orange Chip (27K reputation)One Orange Chip (27K reputation)One Orange Chip (27K reputation)

Group: General Forum Members
Points: 27774 Visits: 38157
I was doing some testing, code below:



set nocount on;
-- sample data
IF OBJECT_ID('tempdb..#x') IS NOT NULL
DROP TABLE #x;

CREATE TABLE #x (id int identity primary key, val nvarchar(100) NOT NULL);
GO

INSERT INTO #x
SELECT '55,85,1,4,9888,6587' UNION ALL --we want this one
SELECT '55,85,1,4,98,65' UNION ALL --we don't want this
SELECT '1122,33333,22,11,40' UNION ALL --we want this one
SELECT '312,9,8,7' --we don't want this
GO 10000

set nocount on;
--===== Create the timer and bit-bucket variables and start the timer.
DECLARE @BitBucket nvarchar(100); --< change or add variables needed here to eliminate display distortion

Declare @cpu_ int,
@lreads_ bigint,
@eMsec_ bigint,
@Reads_ bigint,
@Writes_ bigint;
declare @CpuMs int,
@LogRds bigint,
@Elapsed bigint,
@Reads bigint,
@Writes bigint;

dbcc freeproccache with no_infomsgs;
dbcc freesystemcache('ALL') with no_infomsgs;
dbcc dropcleanbuffers with no_infomsgs;


Select
@cpu_ = cpu_time
, @lreads_ = logical_reads
, @eMsec_ = total_elapsed_time
, @Reads_ = reads
, @Writes_ = writes
From
sys.dm_exec_requests
Where
session_id = @@spid;

---
WITH legitIDs AS
( SELECT id
FROM #x x
CROSS APPLY dbo.DelimitedSplit8K(x.val,',')
WHERE LEN(item)>=4
GROUP BY id)
SELECT
@BitBucket = val
FROM
#x x
JOIN legitIDs l ON x.id=l.id;
---

Select
@CpuMs = cpu_time - @cpu_
, @LogRds = logical_reads - @lreads_
, @Elapsed = total_elapsed_time - @eMsec_
, @Reads = reads - @Reads_
, @Writes = writes - @Writes_
From
sys.dm_exec_requests
Where
session_id = @@spid;
--===== Display the duration
PRINT '----- Using DelimitedSplit/CROSS APPLY -----';
PRINT 'CPU(ms): ' + right(' ' + cast(@CpuMs as varchar(10)),10) +
' Logical Reads: ' + right(' ' + cast(@LogRds as varchar(10)),10) +
' Elapsed(ms): ' + right(' ' + cast(@Elapsed as varchar(10)),10) +
' Reads: ' + right(' ' + cast(@Reads as varchar(10)),10) +
' Writes: ' + right(' ' + cast(@Writes as varchar(10)),10);

set nocount off;
GO

set nocount on;
--===== Create the timer and bit-bucket variables and start the timer.
DECLARE @BitBucket nvarchar(100); --< change or add variables needed here to eliminate display distortion

Declare @cpu_ int,
@lreads_ bigint,
@eMsec_ bigint,
@Reads_ bigint,
@Writes_ bigint;
declare @CpuMs int,
@LogRds bigint,
@Elapsed bigint,
@Reads bigint,
@Writes bigint;

dbcc freeproccache with no_infomsgs;
dbcc freesystemcache('ALL') with no_infomsgs;
dbcc dropcleanbuffers with no_infomsgs;


Select
@cpu_ = cpu_time
, @lreads_ = logical_reads
, @eMsec_ = total_elapsed_time
, @Reads_ = reads
, @Writes_ = writes
From
sys.dm_exec_requests
Where
session_id = @@spid;

---
select
@BitBucket = val
FROM
#x x
where
patindex('%[^,][^,][^,][^,]%',val) > 0;
---

Select
@CpuMs = cpu_time - @cpu_
, @LogRds = logical_reads - @lreads_
, @Elapsed = total_elapsed_time - @eMsec_
, @Reads = reads - @Reads_
, @Writes = writes - @Writes_
From
sys.dm_exec_requests
Where
session_id = @@spid;
--===== Display the duration
PRINT '----- Using PATINDEX -----';
PRINT 'CPU(ms): ' + right(' ' + cast(@CpuMs as varchar(10)),10) +
' Logical Reads: ' + right(' ' + cast(@LogRds as varchar(10)),10) +
' Elapsed(ms): ' + right(' ' + cast(@Elapsed as varchar(10)),10) +
' Reads: ' + right(' ' + cast(@Reads as varchar(10)),10) +
' Writes: ' + right(' ' + cast(@Writes as varchar(10)),10);

set nocount off;
GO

--cleanup
DROP TABLE #x;
GO





Results:



Beginning execution loop
Batch execution completed 10000 times.

----- Using DelimitedSplit/CROSS APPLY -----
CPU(ms): 422 Logical Reads: 80564 Elapsed(ms): 1667 Reads: 80 Writes: 0

----- Using PATINDEX -----
CPU(ms): 109 Logical Reads: 245 Elapsed(ms): 110 Reads: 0 Writes: 0



Cool
Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
wolfkillj
wolfkillj
Ten Centuries
Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)

Group: General Forum Members
Points: 1230 Visits: 2582
Lynn Pettis (4/19/2013)
I was doing some testing, code below:

<snipped>

Results:



Beginning execution loop
Batch execution completed 10000 times.

----- Using DelimitedSplit/CROSS APPLY -----
CPU(ms): 422 Logical Reads: 80564 Elapsed(ms): 1667 Reads: 80 Writes: 0

----- Using PATINDEX -----
CPU(ms): 109 Logical Reads: 245 Elapsed(ms): 110 Reads: 0 Writes: 0




I see the cleverness of using PATINDEX to check for the existence of a substring of four consecutive non-comma characters and agree that it works for the minimal specifications that the OP provided.

If, however, the values in these comma-delimited strings are supposed to be numbers (and not strings of numerals) and what the OP really wants is any row where the comma-delimited string includes a value >= 1000 (on which I wish the OP had provided some clarification or that I had asked for such), the PATINDEX solution won't work. It will return 1 for the string '100, 192, 38, 3' because the second value consists of [space][1][9][2] - four non-comma characters - but 192 is not >= 1000. The string-split method using LEN() as both Alan.B and I wrote it will do the same thing, but it can be easily modified to convert the parsed values to numeric datatypes and compare them to the reference value as numbers to get only rows where at least one parsed, converted value >= 1000.

Jason Wolfkill
Blog: SQLSouth
Twitter: @SQLSouth
Lynn Pettis
Lynn Pettis
One Orange Chip
One Orange Chip (27K reputation)One Orange Chip (27K reputation)One Orange Chip (27K reputation)One Orange Chip (27K reputation)One Orange Chip (27K reputation)One Orange Chip (27K reputation)One Orange Chip (27K reputation)One Orange Chip (27K reputation)

Group: General Forum Members
Points: 27774 Visits: 38157
wolfkillj (4/19/2013)
Lynn Pettis (4/19/2013)
I was doing some testing, code below:

<snipped>

Results:



Beginning execution loop
Batch execution completed 10000 times.

----- Using DelimitedSplit/CROSS APPLY -----
CPU(ms): 422 Logical Reads: 80564 Elapsed(ms): 1667 Reads: 80 Writes: 0

----- Using PATINDEX -----
CPU(ms): 109 Logical Reads: 245 Elapsed(ms): 110 Reads: 0 Writes: 0




I see the cleverness of using PATINDEX to check for the existence of a substring of four consecutive non-comma characters and agree that it works for the minimal specifications that the OP provided.

If, however, the values in these comma-delimited strings are supposed to be numbers (and not strings of numerals) and what the OP really wants is any row where the comma-delimited string includes a value >= 1000 (on which I wish the OP had provided some clarification or that I had asked for such), the PATINDEX solution won't work. It will return 1 for the string '100, 192, 38, 3' because the second value consists of [space][1][9][2] - four non-comma characters - but 192 is not >= 1000. The string-split method using LEN() as both Alan.B and I wrote it will do the same thing, but it can be easily modified to convert the parsed values to numeric datatypes and compare them to the reference value as numbers to get only rows where at least one parsed, converted value >= 1000.


I had asked the same question and received no answer as well. I do have another piece of code that will deal with that.

Cool
Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search