# How to compare strings by letters in certain places

• BOR15K

SSCertifiable

Points: 5780

Hello All,

I need help to compare strings, please. For example, I have two strings with various words: one has six letters long words and the other - with five. I need to find out the words which have certain letters in certain positions. For example I want to see Betty for five letters and Bertha for six, if there is a request to find all the words / strings with same letters in the positions 1,2 and 4. I need some basic help, which I will then try to extend, please.

So far I have done only the following and obviously far away from what I need. Any help will be truly appreciated:

`DECLARE @six_letters NVARCHAR(MAX) = 'hurrah, buzzed, Bertha, patter, mettle, holler, collie, yuppie, rabble, tallow, brooks, tassel, meddle, callus, peddle, sobbed, kneels, getter, brooms, dulled, maggot, webbed'  ;DECLARE @five_letters NVARCHAR(MAX) = ' floor, beTtY, carry, trees, sleep, tells'; SELECT UPPER(LTRIM(value)) AS six_legthFROM STRING_SPLIT(@six_letters, ',') WHERE UPPER(LEFT(LTRIM(value),2)) IN (SELECT UPPER(LEFT(LTRIM(value),2))  FROM STRING_SPLIT(@five_letters, ',') )ORDER BY 1;SELECT UPPER(LTRIM(value)) AS five_lengthFROM STRING_SPLIT(@five_letters, ',') WHERE UPPER(LEFT(LTRIM(value),2)) IN (SELECT UPPER(LEFT(LTRIM(value),2)) FROM STRING_SPLIT(@six_letters, ',') )ORDER BY 1;`
• Phil Parkin

SSC Guru

Points: 244733

Based on the example data provided, what output are you hoping for?

If the answer to your question can be found with a brief Google search, please perform the search yourself, rather than expecting one of the SSC members to do it for you.
See https://www.sqlservercentral.com/articles/forum-etiquette-how-to-post-datacode-on-a-forum-to-get-the-best-help/ for details of how to post T-SQL code-related questions.

• Steve Collins

Ten Centuries

Points: 1080

`DECLARE @six_letters NVARCHAR(MAX) = 'hurrah, buzzed, Bertha, patter, mettle, holler, collie, yuppie, rabble, tallow, brooks, tassel, meddle, callus, peddle, sobbed, kneels, getter, brooms, dulled, maggot, webbed'  ;DECLARE @five_letters NVARCHAR(MAX) = ' floor, beTtY, carry, trees, sleep, tells'; declare @input_letter_pairs             table (letter       nchar(1),                                               position     int,                                               unique(letter, position));insert @input_letter_pairs(letter, position) values('b', 1),('e', 2);;with six_letters_cte(ItemNumber, Item, n, split_letter) as (    select dsl1.*, t.n, substring(dsl1_trim.item_trim, t.n, 1) split_letter    from      dbo.DelimitedSplitN4K(trim(@six_letters), ',') dsl1     cross apply      (select iif(left(dsl1.Item, 1)=' ', right(dsl1.Item, len(dsl1.Item)-1), dsl1.Item) item_trim ) dsl1_trim     cross apply      fnTally(1, len(dsl1_trim.item_trim)) t)select * /*count(*) matched_pair_count*/from  six_letters_cte slc join  @input_letter_pairs ilp on slc.n=ilp.position                             and slc.split_letter=ilp.letter;`

Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

• BOR15K

SSCertifiable

Points: 5780

I would expect to see Bertha and Betty (doesn't matter the order nor if it is all lower / UPPER case).

Thank you.

• Steve Collins

Ten Centuries

Points: 1080

Here's for six letters.  It returns Bertha.  To do for five letters would follow a similar pattern.

`DECLARE @six_letters NVARCHAR(MAX) = 'hurrah, buzzed, Bertha, patter, mettle, holler, collie, yuppie, rabble, tallow, brooks, tassel, meddle, callus, peddle, sobbed, kneels, getter, brooms, dulled, maggot, webbed'  ; declare @input_letter_pairs             table (letter       nchar(1),                                               position     int,                                               unique(letter, position));insert @input_letter_pairs(letter, position) values('b', 1),('e', 2);;withsix_letters_cte(ItemNumber, Item, n, split_letter) as (    select dsl1.*, t.n, substring(dsl1_trim.item_trim, t.n, 1) split_letter    from      dbo.DelimitedSplitN4K(trim(@six_letters), ',') dsl1     cross apply      (select iif(left(dsl1.Item, 1)=' ', right(dsl1.Item, len(dsl1.Item)-1), dsl1.Item) item_trim ) dsl1_trim     cross apply      dbo.fnTally(1, len(dsl1_trim.item_trim)) t)select ItemNumber, Item from    six_letters_cte slc    join    @input_letter_pairs ilp on slc.n=ilp.position                                and slc.split_letter=ilp.lettergroup by    ItemNumber, Itemhaving count(*)>1;`

Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

• BOR15K

SSCertifiable

Points: 5780

Thank you, Steve

Sadly none of your examples works for me from the Studio - getting various error messages.

• Steve Collins

Ten Centuries

Points: 1080

Maybe you do not have an fnTally function?  If so that's easily remedied.  Have a look at this thread.  What are the specific messages?

Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

• Jeff Moden

SSC Guru

Points: 997124

BOR15K wrote:

I need some basic help, which I will then try to extend, please.

Since very few people ever come back to tell us, I'll make you a deal...  Tell us what the extended use is for and I'll show you the solution I just wrote that might already be extended to do what you ultimately need it for.  😉

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
"Change is inevitable... change for the better is not".
"Dear Lord... I'm a DBA so please give me patience because, if you give me strength, I'm going to need bail money too!"

How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

• Jeff Moden

SSC Guru

Points: 997124

BOR15K wrote:

Thank you, Steve

Sadly none of your examples works for me from the Studio - getting various error messages.

The code probably worked just fine for Steve.  When you get such messages, you should post them so we can help you figure things out. 😉

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
"Change is inevitable... change for the better is not".
"Dear Lord... I'm a DBA so please give me patience because, if you give me strength, I'm going to need bail money too!"

How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

• Jason A. Long

SSC-Insane

Points: 23711

Take this for a spin...

`DECLARE	@array_6 varchar(MAX) = 'hurrah, buzzed, Bertha, patter, mettle, holler, collie, yuppie, rabble, tallow, brooks, tassel, meddle, callus, peddle, sobbed, kneels, getter, brooms, dulled, maggot, webbed',	@array_5 varchar(MAX) = ' floor, beTtY, carry, trees, sleep, tells',	@search_positions varchar(12) = '1,2,4';	WITH		cte_a6 AS (			SELECT 				val_6 = CONVERT(char(6), TRIM(ss6.value)),				match_on = STRING_AGG(lp6.letter_in_pos, '') WITHIN GROUP (ORDER BY t6.n)			FROM				STRING_SPLIT(@array_6, ',') ss6				CROSS APPLY (VALUES (1),(2),(3),(4),(5),(6) ) t6 (n)				CROSS APPLY ( VALUES (SUBSTRING(TRIM(ss6.value), t6.n, 1)) ) lp6 (letter_in_pos)			WHERE 				t6.n IN (SELECT TRY_CONVERT(int, sp.value) FROM STRING_SPLIT(@search_positions, ',') sp)			GROUP BY 				ss6.value			),		cte_a5 AS (			SELECT 				val_5 = CONVERT(char(5), TRIM(ss5.value)),				match_on = STRING_AGG(lp5.letter_in_pos, '') WITHIN GROUP (ORDER BY t5.n)			FROM				STRING_SPLIT(@array_5, ',') ss5				CROSS APPLY (VALUES (1),(2),(3),(4),(5) ) t5 (n)				CROSS APPLY ( VALUES (SUBSTRING(TRIM(ss5.value), t5.n, 1)) ) lp5 (letter_in_pos)			WHERE 				t5.n IN (SELECT TRY_CONVERT(int, sp.value) FROM STRING_SPLIT(@search_positions, ',') sp)			GROUP BY 				ss5.value			)SELECT 	a6.val_6,	--a6.match_on,	a5.val_5	--a5.match_onFROM	cte_a6 a6	JOIN cte_a5 a5		ON a6.match_on = a5.match_on;`

Returns...

`val_6  val_5------ -----Bertha beTtY`
• Jeff Moden

SSC Guru

Points: 997124

Jason A. Long wrote:

Take this for a spin...

`DECLARE	@array_6 varchar(MAX) = 'hurrah, buzzed, Bertha, patter, mettle, holler, collie, yuppie, rabble, tallow, brooks, tassel, meddle, callus, peddle, sobbed, kneels, getter, brooms, dulled, maggot, webbed',	@array_5 varchar(MAX) = ' floor, beTtY, carry, trees, sleep, tells',	@search_positions varchar(12) = '1,2,4';	WITH		cte_a6 AS (			SELECT 				val_6 = CONVERT(char(6), TRIM(ss6.value)),				match_on = STRING_AGG(lp6.letter_in_pos, '') WITHIN GROUP (ORDER BY t6.n)			FROM				STRING_SPLIT(@array_6, ',') ss6				CROSS APPLY (VALUES (1),(2),(3),(4),(5),(6) ) t6 (n)				CROSS APPLY ( VALUES (SUBSTRING(TRIM(ss6.value), t6.n, 1)) ) lp6 (letter_in_pos)			WHERE 				t6.n IN (SELECT TRY_CONVERT(int, sp.value) FROM STRING_SPLIT(@search_positions, ',') sp)			GROUP BY 				ss6.value			),		cte_a5 AS (			SELECT 				val_5 = CONVERT(char(5), TRIM(ss5.value)),				match_on = STRING_AGG(lp5.letter_in_pos, '') WITHIN GROUP (ORDER BY t5.n)			FROM				STRING_SPLIT(@array_5, ',') ss5				CROSS APPLY (VALUES (1),(2),(3),(4),(5) ) t5 (n)				CROSS APPLY ( VALUES (SUBSTRING(TRIM(ss5.value), t5.n, 1)) ) lp5 (letter_in_pos)			WHERE 				t5.n IN (SELECT TRY_CONVERT(int, sp.value) FROM STRING_SPLIT(@search_positions, ',') sp)			GROUP BY 				ss5.value			)SELECT 	a6.val_6,	--a6.match_on,	a5.val_5	--a5.match_onFROM	cte_a6 a6	JOIN cte_a5 a5		ON a6.match_on = a5.match_on;`

Returns...

`val_6  val_5------ -----Bertha beTtY`

Heh... it's amazing how similar that is to the code I wrote (still holding out on that because I really am curious what the end game is for this very interesting problem).  We even used similar column names.  You used "Match_On" and I used "MatchOn".

One difference is that I don't have a limit on the word widths and I don't use a numeric sequence.  I also used a single CTE for the "positions" whereas you used one in each WHERE clause... both will produce similar execution plans though because a CTE is re-executed every time it's called.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
"Change is inevitable... change for the better is not".
"Dear Lord... I'm a DBA so please give me patience because, if you give me strength, I'm going to need bail money too!"

How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

• Steve Collins

Ten Centuries

Points: 1080

Jeff Moden wrote:

BOR15K wrote:

Thank you, Steve

Sadly none of your examples works for me from the Studio - getting various error messages.

The code probably worked just fine for Steve.  When you get such messages, you should post them so we can help you figure things out. 😉

Yes, thank you for the benefit of the doubt here.  I'm afraid I've made a mess of it though.  I was playing around with splitters and I've updated the code several times with different ones.  At one point I changed it to string_split and that would've caused errors.  It should've been (the whole time) dbo.DelimitedSplitN4K.  Anyway, Jason's code looks great and it's doesn't use a tally tvf or an ordinal splitter so it's more appropriate

Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

• BOR15K

SSCertifiable

Points: 5780

Thank you All!

Steve, apologies - I had no intention to offend you. I have looked into fnTally / the link you have kindly provided.

Much obliged.

• Jason A. Long

SSC-Insane

Points: 23711

Jeff Moden wrote:

Heh... it's amazing how similar that is to the code I wrote (still holding out on that because I really am curious what the end game is for this very interesting problem).  We even used similar column names.  You used "Match_On" and I used "MatchOn".

One difference is that I don't have a limit on the word widths and I don't use a numeric sequence.  I also used a single CTE for the "positions" whereas you used one in each WHERE clause... both will produce similar execution plans though because a CTE is re-executed every time it's called.

Jeff - Considering the fact that that I'm consistently impressed by the solutions that you come up with, I'm forced to take that as a compliment. 😀  I hope you decide to post your solution no matter how the OP responds. I always seem to find some new nugget of gold in your solutions.

Anyway, your comment regarding the limit on word length got me thinking... I initially did it that way because the OP was pretty clear about the word lengths in the criteria so I figured that would be a safe move to make at the time. That said, your comment did shake something loose and it occurred to me that I didn't need the inline tally at all!!!

It, in fact, occurred to me that using a sequential tally to produce rows for every character and then filtering it later with another list of numbers was just plain stupid! Why not just use the list of numbers all by itself?

It eliminates the word length limits, the unneeded tally rows, the tally itself and the WHERE clause... and it makes for a cleaner bit of code and faster compile times.

`DECLARE	@array_6 varchar(MAX) = 'hurrah, buzzed, Bertha, patter, mettle, holler, collie, yuppie, rabble, tallow, brooks, tassel, meddle, callus, peddle, sobbed, kneels, getter, brooms, dulled, maggot, webbed',	@array_5 varchar(MAX) = ' floor, beTtY, carry, trees, sleep, tells',	@search_positions varchar(12) = '1,2,4';	WITH		cte_a6 AS (			SELECT 				val_6 = CONVERT(varchar(20), TRIM(ss6.value)),				match_on = STRING_AGG(lp6.letter_in_pos, '') WITHIN GROUP (ORDER BY sp6.n)			FROM				STRING_SPLIT(@array_6, ',') ss6				CROSS APPLY (SELECT TRY_CONVERT(int, sp.value) FROM STRING_SPLIT(@search_positions, ',') sp) sp6 (n)				CROSS APPLY ( VALUES (SUBSTRING(TRIM(ss6.value), sp6.n, 1)) ) lp6 (letter_in_pos)			GROUP BY 				ss6.value			),		cte_a5 AS (			SELECT 				val_5 = CONVERT(varchar(20), TRIM(ss5.value)),				match_on = STRING_AGG(lp5.letter_in_pos, '') WITHIN GROUP (ORDER BY sp5.n)			FROM				STRING_SPLIT(@array_5, ',') ss5				CROSS APPLY (SELECT TRY_CONVERT(int, sp.value) FROM STRING_SPLIT(@search_positions, ',') sp) sp5 (n)				CROSS APPLY ( VALUES (SUBSTRING(TRIM(ss5.value), sp5.n, 1)) ) lp5 (letter_in_pos)			GROUP BY 				ss5.value			)SELECT 	a6.val_6,	--a6.match_on,	a5.val_5	--a5.match_onFROM	cte_a6 a6	JOIN cte_a5 a5		ON a6.match_on = a5.match_on;GO`

• This reply was modified 1 month, 1 week ago by  Jason A. Long.
• Steve Collins

Ten Centuries

Points: 1080

BOR15K wrote:

Thank you All!

Steve, apologies - I had no intention to offend you. I have looked into fnTally / the link you have kindly provided.

Much obliged.

Thank you for the kind words!  I think I slightly misunderstood the requirements as well.  Jason's code made it clear.  Please keep at it with the questions if there's anything additional.  You said you wanted to extend it yourself and I respect that 🙂

But so... it's not over yet!  Or I hope it's not.  What would lead you to investigate a set based solution to a problem of this nature?  Apparently, Jeff has some code he references " both will produce similar execution plans..." but besides mine I only see one set of code.  Also, I have an alternate approach I'm trying as well and we'll see.  If it works then I want a speed test against these guys.

Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

Viewing 15 posts - 1 through 15 (of 41 total)