Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase

Split String into 3 parts with a twist! Expand / Collapse
Author
Message
Posted Tuesday, January 29, 2013 10:44 AM


Say Hey Kid

Say Hey KidSay Hey KidSay Hey KidSay Hey KidSay Hey KidSay Hey KidSay Hey KidSay Hey Kid

Group: General Forum Members
Last Login: Wednesday, April 09, 2014 2:57 AM
Points: 704, Visits: 3,270
I have the following sample data (original and desired results)

with cteOriginal (CompanyName, CompName1, CompName2, CompName3)
as
(
select 'A V H S L ALQUILER Y VENTA DE HOGARES', NULL, NULL, NULL union all
select 'TAXI GIL PIERRE', NULL, NULL, NULL union all
select 'DE GOUDSMID M HEIJKOOP', NULL, NULL, NULL union all
select 'S P S S', NULL, NULL, NULL union all
select 'BOUCHERIE CHARCUTERIE ST HENRI', NULL, NULL, NULL union all
select 'A A M RODERKERKEN', NULL, NULL, NULL union all
select 'O MUNDO E A NOSSA CASA - C H E', NULL, NULL, NULL)
select * from cteOriginal

;with cteResult (CompanyName, CompName1, CompName2, CompName3)
as
(
select 'A V H S L ALQUILER Y VENTA DE HOGARES', 'AVHS', 'ALQUILER', 'Y' union all
select 'TAXI GIL PIERRE', 'TAXI', 'GIL', 'PIERRE' union all
select 'DE GOUDSMID M HEIJKOOP', 'DE', 'GOUDSMID', 'M' union all
select 'S P S S', 'SPSS', NULL, NULL union all
select 'BOUCHERIE CHARCUTERIE ST HENRI', 'BOUCHERIE', 'CHARCUTERIE', 'ST' union all
select 'A A M RODERKERKEN', 'AAM', 'RODERKERKEN', NULL union all
select 'O MUNDO E A NOSSA CASA - C H E', 'O', 'MUNDO', 'E')
select * from cteResult

I have functions that extract first and second words but this isn't working for records like number 4.

Any clever suggestions?

Thanks.


---------------------------------------------------------


It takes a minimal capacity for rational thought to see that the corporate 'free press' is a structurally irrational and biased, and extremely violent, system of elite propaganda.
David Edwards - Media lens

Society has varying and conflicting interests; what is called objectivity is the disguise of one of these interests - that of neutrality. But neutrality is a fiction in an unneutral world. There are victims, there are executioners, and there are bystanders... and the 'objectivity' of the bystander calls for inaction while other heads fall.
Howard Zinn
Post #1413192
Posted Wednesday, January 30, 2013 1:31 AM


Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: Today @ 4:39 AM
Points: 3,594, Visits: 5,105
Not exactly clever nor pretty but this might get you close:

with cteOriginal (CompanyName, CompName1, CompName2, CompName3)
as
(
select 'A V H S L ALQUILER Y VENTA DE HOGARES', NULL, NULL, NULL union all
select 'TAXI GIL PIERRE', NULL, NULL, NULL union all
select 'DE GOUDSMID M HEIJKOOP', NULL, NULL, NULL union all
select 'S P S S', NULL, NULL, NULL union all
select 'BOUCHERIE CHARCUTERIE ST HENRI', NULL, NULL, NULL union all
select 'A A M RODERKERKEN', NULL, NULL, NULL union all
select 'O MUNDO E A NOSSA CASA - C H E', NULL, NULL, NULL)
SELECT CompanyName, CompName1, CompName2
,CompName3=LTRIM(LEFT(CompName3, CHARINDEX(' ', CompName3 + ' ')))
FROM (
SELECT CompanyName, CompName1
,CompName2=LTRIM(LEFT(CompName2, CHARINDEX(' ', CompName2 + ' ')))
,CompName3=LTRIM(RIGHT(CompName2, LEN(CompName2)-LEN(LEFT(CompName2, CHARINDEX(' ', CompName2 + ' ')))))
FROM (
SELECT CompanyName, CompName1=REPLACE(CompName1, ' ', '')
,CompName2=LTRIM(SUBSTRING(CompanyName, DATALENGTH(CompName1)+1, LEN(CompanyName)))
FROM (
SELECT CompName1=
CASE WHEN PATINDEX('[A-Z][ ][A-Z][ ][A-Z][ ][A-Z][ ]%', CompanyName + ' ') = 1
THEN LEFT(CompanyName, 8)
WHEN PATINDEX('[A-Z][ ][A-Z][ ][A-Z][ ]%', CompanyName + ' ') = 1
THEN LEFT(CompanyName, 6)
WHEN PATINDEX('[A-Z][ ][A-Z][ ]%', CompanyName + ' ') = 1
THEN LEFT(CompanyName, 4)
WHEN PATINDEX('[A-Z][ ]%', CompanyName + ' ') = 1
THEN LEFT(CompanyName, 2)
ELSE LEFT(CompanyName, CHARINDEX(' ', CompanyName)-1) END
,CompanyName
FROM cteOriginal) a) a) a





My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.


Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
Splitting strings based on patterns can be fast!
Post #1413508
Posted Wednesday, January 30, 2013 3:44 AM


Say Hey Kid

Say Hey KidSay Hey KidSay Hey KidSay Hey KidSay Hey KidSay Hey KidSay Hey KidSay Hey Kid

Group: General Forum Members
Last Login: Wednesday, April 09, 2014 2:57 AM
Points: 704, Visits: 3,270
Thanks dwain.

You should see what I came up with lol... ii's shockingly bad compared to yours.

Mine has so many IF statments lol!


---------------------------------------------------------


It takes a minimal capacity for rational thought to see that the corporate 'free press' is a structurally irrational and biased, and extremely violent, system of elite propaganda.
David Edwards - Media lens

Society has varying and conflicting interests; what is called objectivity is the disguise of one of these interests - that of neutrality. But neutrality is a fiction in an unneutral world. There are victims, there are executioners, and there are bystanders... and the 'objectivity' of the bystander calls for inaction while other heads fall.
Howard Zinn
Post #1413551
Posted Wednesday, January 30, 2013 6:10 AM


Say Hey Kid

Say Hey KidSay Hey KidSay Hey KidSay Hey KidSay Hey KidSay Hey KidSay Hey KidSay Hey Kid

Group: General Forum Members
Last Login: Wednesday, April 09, 2014 2:57 AM
Points: 704, Visits: 3,270
I should have been clearer with my problem definition!

The rules are as follows:

I have a string = 'A B C D EFG ProperWord QR S T'

I want to generate a comma delimited 3 part string with the following rules:


1) Part1 = Concatenate all single chacaters in the string until you reach part of string where two or more characters are joined

So based on the above string Part1 would be ABCD

2) Part2 = EFG

3) Part3 = ProperWord

This is some of the code I'be writing but I just can't get this to work well!

ALTER FUNCTION dbo.fn_NormaliseOrgName(@OrgName NVARCHAR(200))
RETURNS NVARCHAR(200)
AS
BEGIN

declare @OrgName NVARCHAR(200) = 'RA M I R A DI PETTI C & C SAS'
DECLARE @Part1 NVARCHAR(200) = ''
DECLARE @Part2 NVARCHAR(200) = ''
DECLARE @Part3 NVARCHAR(200) = ''

IF PATINDEX('%[A-Z][A-Z]%', @OrgName) > 1
BEGIN
SET @Part1 = LEFT(@OrgName, PATINDEX('%[A-Z][A-Z]%', @OrgName)-1)
IF LTRIM(RTRIM(REPLACE(@OrgName, @Part1, '') = ''
BEGIN
RETURN @Part1 + ',' + @Part2 + ',' + @Part3
END
ELSE




select @Part1 = case when PATINDEX('%[A-Z][A-Z]%', @OrgName) > 1
then left(@OrgName, PATINDEX('%[A-Z][A-Z]%', @OrgName)-1)
else LEFT(@OrgName, patindex('%[ ]%', @OrgName + ' ')-1) end
select @part1


select @Part2 = substring(@OrgName, len(@Part1)+1 , LEN(@OrgName))
select @Part2
--select @Part1

select @Part2 = left( ltrim(substring(@OrgName, len(@Part1)+1 , LEN(@OrgName))), charindex(' ', ltrim(substring(@OrgName, len(@Part1)+1 , LEN(@OrgName))))-1)
select @Part2
--RETURN @Part1
RETURN @Part2
--RETURN @Part3

END

select dbo.fn_NormaliseOrgName('R A M I R A DI PETTI C & C SAS')



---------------------------------------------------------


It takes a minimal capacity for rational thought to see that the corporate 'free press' is a structurally irrational and biased, and extremely violent, system of elite propaganda.
David Edwards - Media lens

Society has varying and conflicting interests; what is called objectivity is the disguise of one of these interests - that of neutrality. But neutrality is a fiction in an unneutral world. There are victims, there are executioners, and there are bystanders... and the 'objectivity' of the bystander calls for inaction while other heads fall.
Howard Zinn
Post #1413598
Posted Wednesday, January 30, 2013 5:23 PM


Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: Today @ 4:39 AM
Points: 3,594, Visits: 5,105
Abu,

I think my code will handle the sample you provided in your text, albeit it doesn't join the 3 parts into a comma delimited string (but that's easy).

On the other hand, I am unsure of how to handle the example in your code:

R A M I R A DI PETTI C & C SAS



If your intent is to combine to RAMIRA, you could do that by introducing two additional WHEN clauses at the top of my CASE on PATINDEX.



My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.


Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
Splitting strings based on patterns can be fast!
Post #1413803
Posted Wednesday, January 30, 2013 8:00 PM
SSC Eights!

SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!

Group: General Forum Members
Last Login: Thursday, April 17, 2014 4:39 AM
Points: 818, Visits: 2,485
Hi

This isn't exactly pretty ... but it's another way to skin the cat
;with cteOriginal (CompanyName, CompName1, CompName2, CompName3)
as
(
select 'A V H S L ALQUILER Y VENTA DE HOGARES', NULL, NULL, NULL union all
select 'TAXI GIL PIERRE', NULL, NULL, NULL union all
select 'DE GOUDSMID M HEIJKOOP', NULL, NULL, NULL union all
select 'S P S S', NULL, NULL, NULL union all
select 'BOUCHERIE CHARCUTERIE ST HENRI', NULL, NULL, NULL union all
select 'A A M RODERKERKEN', NULL, NULL, NULL union all
select 'O MUNDO E A NOSSA CASA - C H E', NULL, NULL, NULL)
-- Compress up the first single letters in the string
,preprocess as (
select CompanyName
,stuff(
CompanyName
,1
,isnull(nullif(patindex('% [a-z][a-z]%',substring(companyname,1,8)),0),isnull(nullif(charindex(' ',CompanyName,7),0),999))
,replace(substring(companyname,1,isnull(nullif(patindex('% [a-z][a-z]%',substring(companyname,1,8)),0),isnull(nullif(charindex(' ',CompanyName,7),0),999))),' ','') + ' '
) StuffedCompanyName
from cteOriginal
)
-- extract each company name in turn.
,r1 as (
select CompanyName
,rtrim(substring(StuffedCompanyName,1,isnull(nullif(charindex(' ',StuffedCompanyName),0),999))) Company1
,ltrim(substring(StuffedCompanyName,isnull(nullif(charindex(' ',StuffedCompanyName),0),999),999)) leftover
from preprocess
)
,r2 as (
select CompanyName
,Company1
,nullif(rtrim(substring(leftover,1,isnull(nullif(charindex(' ',leftover),0),999))),'') Company2
,ltrim(substring(leftover,isnull(nullif(charindex(' ',leftover),0),999),999)) leftover
from r1
)
,r3 as (
select CompanyName
,Company1
,Company2
,nullif(rtrim(substring(leftover,1,isnull(nullif(charindex(' ',leftover),0),999))),'') Company3
from r2
)
select * from r3

Micky
Post #1413823
Posted Thursday, January 31, 2013 7:13 AM


Say Hey Kid

Say Hey KidSay Hey KidSay Hey KidSay Hey KidSay Hey KidSay Hey KidSay Hey KidSay Hey Kid

Group: General Forum Members
Last Login: Wednesday, April 09, 2014 2:57 AM
Points: 704, Visits: 3,270
mickyT thanks so much for this! I forgot to thank you in my other thread!

Your help is much appreciated!


---------------------------------------------------------


It takes a minimal capacity for rational thought to see that the corporate 'free press' is a structurally irrational and biased, and extremely violent, system of elite propaganda.
David Edwards - Media lens

Society has varying and conflicting interests; what is called objectivity is the disguise of one of these interests - that of neutrality. But neutrality is a fiction in an unneutral world. There are victims, there are executioners, and there are bystanders... and the 'objectivity' of the bystander calls for inaction while other heads fall.
Howard Zinn
Post #1414122
Posted Thursday, January 31, 2013 11:10 AM
SSC Eights!

SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!

Group: General Forum Members
Last Login: Thursday, April 17, 2014 4:39 AM
Points: 818, Visits: 2,485
No problem and you're welcome
Post #1414262
Posted Friday, February 01, 2013 9:03 PM
SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Sunday, September 29, 2013 1:24 AM
Points: 429, Visits: 1,721
It took a little effort, but here's a replace function that will do the job and more. Won't swear it can't be improved upon or what kind of performance curve it may have. For this small example it's quick. But consider it nothing more than a first draft. If you have thoughts of using it in a production environment you should certainly do some stress testing since it's a scalar function. If you are going to use it (for example) to just import data once a day then it should work just fine. I just wouldn't stick it in the SELECT statement of a heavily-used query in production without testing it thoroughly.

EDIT: I read your notes above after posting this. The function should work fine as it is, but you might need to rearrange a few columns in the pivot table query to get exactly what you need.

CREATE FUNCTION dbo.svfSplitReplaceByDelimiter 
(
@str VARCHAR(8000)
,@delimiter CHAR(1)
,@subchar1 CHAR(1) = '@'
,@subchar2 CHAR(1) = '~'
,@replaceval VARCHAR(50) = '|'
)
RETURNS VARCHAR(MAX)
AS
BEGIN

/*
Parameters:

@str VARCHAR(8000) -- the input string
,@delimiter CHAR(1) -- the delimiter to use for the split
,@subchar1 CHAR(1) = '@' -- pick a character not in the input string (internal use)
,@subchar2 CHAR(1) = '~' -- pick a character not in the input string (internal use)
,@replaceval VARCHAR(50) = '|' -- the delimiter used for the final array; can be up to 50 chars
per tag and can be useful for building XML or HTML strings

*/

DECLARE
@pos INT
,@len INT
,@strRep VARCHAR(8000)
,@strTemp VARCHAR(8000)
,@strRFrag VARCHAR(50)
,@strOut VARCHAR(8000)
,@counter INT

SET @str = RTRIM(LTRIM(@str))
SET @strRep = REPLACE(@str,@delimiter,@subchar1)
SET @strTemp = @strRep
SET @strOut = ''

--get the last chunk for use later
SET @pos = CHARINDEX(@subchar1,REVERSE(@strTemp))-1
SET @strRFrag = RIGHT(@strTemp,@pos)


--start looking for single characters
--and if found aggregate them

SET @counter = 1
SET @pos = CHARINDEX(@subchar1,@strRep)
SET @len = LEN(@strTemp)

WHILE @counter <= @len
BEGIN

SET @strTemp = LEFT(@strRep,@pos)

IF RIGHT(@strTemp,1) = @subchar1
BEGIN

IF @pos = 2
SET @strTemp = REPLACE(@strTemp,@subchar1,@subchar2)
ELSE
SET @strTemp = REPLACE(@strTemp,@subchar1,@delimiter)

IF @pos > 2
SET @strOut = @strOut + @delimiter + @strTemp
ELSE
SET @strOut = @strOut + @strTemp

IF RIGHT(@strOut,1) = @subchar2
SET @strOut = LEFT(@strOut,LEN(@strOut)-1)

SET @pos = CHARINDEX(@subchar1,@strRep)
SET @strRep = RIGHT(@strRep,LEN(@strRep)-@pos)

END

SET @pos = CHARINDEX(@subchar1,@strRep)
SET @counter = @counter + 1

END


IF LEN(@strRFrag) = 1
SET @strOut = @strOut + @strRFrag
ELSE
SET @strOut = @strOut + @delimiter + @strRFrag

SET @strOut = REPLACE(REPLACE(@strOut,@delimiter+@delimiter,@delimiter),' ',' ')
SET @strOut = REPLACE(REPLACE(@strOut,@delimiter,@replaceval),' ',' ')

IF LEFT(@strOut,1) = @delimiter
SET @strOut = RIGHT(@strOut,LEN(@strOut)-1)

IF RIGHT(@strOut,1) = @delimiter
SET @strOut = LEFT(@strOut,LEN(@strOut)-1)

SET @strOut = RTRIM(LTRIM(@strOut))


RETURN @strOut

END
GO



Now load the sample data and test the function. (Requires the function DelimitedSplit8K which can be found by searching this site. It's been posted so many times and it's such an indispensable function that I figure most people already have it.)


IF OBJECT_ID('tempdb..#TempTable') IS NOT NULL
DROP TABLE #TempTable

CREATE TABLE #TempTable (
[ID] INT IDENTITY(1,1) NOT NULL,
[OrigCompanyName] NVARCHAR(150) NULL,
[CompanyName] NVARCHAR(150) NULL,
PRIMARY KEY (ID))

;WITH cteOriginal(CompanyName,CompName1,CompName2,CompName3)
AS (
SELECT
'A V H S L ALQUILER Y VENTA DE HOGARES'
,NULL
,NULL
,NULL
UNION ALL
SELECT
'TAXI GIL PIERRE'
,NULL
,NULL
,NULL
UNION ALL
SELECT
'DE GOUDSMID M HEIJKOOP'
,NULL
,NULL
,NULL
UNION ALL
SELECT
'S P S S'
,NULL
,NULL
,NULL
UNION ALL
SELECT
'BOUCHERIE CHARCUTERIE ST HENRI'
,NULL
,NULL
,NULL
UNION ALL
SELECT
'A A M RODERKERKEN'
,NULL
,NULL
,NULL
UNION ALL
SELECT
'O MUNDO E A NOSSA CASA - C H E'
,NULL
,NULL
,NULL
)
INSERT INTO #TempTable
SELECT
CompanyName AS OrigCompanyName
,(SELECT dbo.svfSplitReplaceByDelimiter(cte.CompanyName,' ','@','~','|')) AS CompanyName
FROM
cteOriginal AS cte

--get rid of stray leading and trailing characters
UPDATE #TempTable
SET CompanyName =
(CASE
WHEN LEFT(CompanyName,1) = '|' THEN RIGHT(CompanyName,LEN(CompanyName)-1)
WHEN RIGHT(CompanyName,1) = '|' THEN LEFT(CompanyName,LEN(CompanyName)-1)
ELSE CompanyName
END)

--create the pivot table result
SELECT
r1.ID
,(SELECT OrigCompanyName FROM #TempTable WHERE ID = r1.ID) AS CompanyName
,r1.Part1
,r1.Part2
,r1.Part3
,r1.Part4
,r1.Part5
,r1.Part6
FROM
(
SELECT
r.ID
,MAX(CASE WHEN r.ItemNumber = 1 THEN r.Item END) AS Part1
,MAX(CASE WHEN r.ItemNumber = 2 THEN r.Item END) AS Part2
,MAX(CASE WHEN r.ItemNumber = 3 THEN r.Item END) AS Part3
,MAX(CASE WHEN r.ItemNumber = 4 THEN r.Item END) AS Part4
,MAX(CASE WHEN r.ItemNumber = 5 THEN r.Item END) AS Part5
,MAX(CASE WHEN r.ItemNumber = 6 THEN r.Item END) AS Part6
FROM
(
SELECT
ID
,ItemNumber
,Item
FROM
#TempTable AS tt
CROSS APPLY
dbo.DelimitedSplit8K(tt.CompanyName,'|') AS dsk1
WHERE
dsk1.Item <> ''
) r
GROUP BY
r.ID
) r1


The output:


ID CompanyName Part1 Part2 Part3 Part4 Part5 Part6
1 A V H S L ALQUILER Y VENTA DE HOGARES AVHSL ALQUILER Y VENTA DE HOGARES
2 TAXI GIL PIERRE TAXI GIL PIERRE NULL NULL NULL
3 DE GOUDSMID M HEIJKOOP DE GOUDSMID M HEIJKOOP NULL NULL
4 S P S S SPSS NULL NULL NULL NULL NULL
5 BOUCHERIE CHARCUTERIE ST HENRI BOUCHERIE CHARCUTERIE ST HENRI NULL NULL
6 A A M RODERKERKEN AAM RODERKERKEN NULL NULL NULL NULL
7 O MUNDO E A NOSSA CASA - C H E O MUNDO EA NOSSA CASA -CHE


Post #1414900
« Prev Topic | Next Topic »

Add to briefcase

Permissions Expand / Collapse