|
|
|
SSC-Addicted
      
Group: General Forum Members
Last Login: Today @ 3:22 AM
Points: 494,
Visits: 2,153
|
|
I have the following sample data (original and desired results)
with cteOriginal (CompanyName, CompName1, CompName2, CompName3) as ( select 'A V H S L ALQUILER Y VENTA DE HOGARES', NULL, NULL, NULL union all select 'TAXI GIL PIERRE', NULL, NULL, NULL union all select 'DE GOUDSMID M HEIJKOOP', NULL, NULL, NULL union all select 'S P S S', NULL, NULL, NULL union all select 'BOUCHERIE CHARCUTERIE ST HENRI', NULL, NULL, NULL union all select 'A A M RODERKERKEN', NULL, NULL, NULL union all select 'O MUNDO E A NOSSA CASA - C H E', NULL, NULL, NULL) select * from cteOriginal
;with cteResult (CompanyName, CompName1, CompName2, CompName3) as ( select 'A V H S L ALQUILER Y VENTA DE HOGARES', 'AVHS', 'ALQUILER', 'Y' union all select 'TAXI GIL PIERRE', 'TAXI', 'GIL', 'PIERRE' union all select 'DE GOUDSMID M HEIJKOOP', 'DE', 'GOUDSMID', 'M' union all select 'S P S S', 'SPSS', NULL, NULL union all select 'BOUCHERIE CHARCUTERIE ST HENRI', 'BOUCHERIE', 'CHARCUTERIE', 'ST' union all select 'A A M RODERKERKEN', 'AAM', 'RODERKERKEN', NULL union all select 'O MUNDO E A NOSSA CASA - C H E', 'O', 'MUNDO', 'E') select * from cteResult I have functions that extract first and second words but this isn't working for records like number 4.
Any clever suggestions?
Thanks.
----------------------------------- http://www.SQL4n00bs.com
|
|
|
|
|
SSCrazy
      
Group: General Forum Members
Last Login: Yesterday @ 6:03 AM
Points: 2,345,
Visits: 3,191
|
|
Not exactly clever nor pretty but this might get you close:
with cteOriginal (CompanyName, CompName1, CompName2, CompName3) as ( select 'A V H S L ALQUILER Y VENTA DE HOGARES', NULL, NULL, NULL union all select 'TAXI GIL PIERRE', NULL, NULL, NULL union all select 'DE GOUDSMID M HEIJKOOP', NULL, NULL, NULL union all select 'S P S S', NULL, NULL, NULL union all select 'BOUCHERIE CHARCUTERIE ST HENRI', NULL, NULL, NULL union all select 'A A M RODERKERKEN', NULL, NULL, NULL union all select 'O MUNDO E A NOSSA CASA - C H E', NULL, NULL, NULL) SELECT CompanyName, CompName1, CompName2 ,CompName3=LTRIM(LEFT(CompName3, CHARINDEX(' ', CompName3 + ' '))) FROM ( SELECT CompanyName, CompName1 ,CompName2=LTRIM(LEFT(CompName2, CHARINDEX(' ', CompName2 + ' '))) ,CompName3=LTRIM(RIGHT(CompName2, LEN(CompName2)-LEN(LEFT(CompName2, CHARINDEX(' ', CompName2 + ' '))))) FROM ( SELECT CompanyName, CompName1=REPLACE(CompName1, ' ', '') ,CompName2=LTRIM(SUBSTRING(CompanyName, DATALENGTH(CompName1)+1, LEN(CompanyName))) FROM ( SELECT CompName1= CASE WHEN PATINDEX('[A-Z][ ][A-Z][ ][A-Z][ ][A-Z][ ]%', CompanyName + ' ') = 1 THEN LEFT(CompanyName, 8) WHEN PATINDEX('[A-Z][ ][A-Z][ ][A-Z][ ]%', CompanyName + ' ') = 1 THEN LEFT(CompanyName, 6) WHEN PATINDEX('[A-Z][ ][A-Z][ ]%', CompanyName + ' ') = 1 THEN LEFT(CompanyName, 4) WHEN PATINDEX('[A-Z][ ]%', CompanyName + ' ') = 1 THEN LEFT(CompanyName, 2) ELSE LEFT(CompanyName, CHARINDEX(' ', CompanyName)-1) END ,CompanyName FROM cteOriginal) a) a) a
No loops! No CURSORs! No RBAR! Hoo-uh!
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
Need to UNPIVOT? Why not CROSS APPLY VALUES instead? Since random numbers are too important to be left to chance, let's generate some! Are you too recursively challenged? Splitting strings based on patterns can be fast!
|
|
|
|
|
SSC-Addicted
      
Group: General Forum Members
Last Login: Today @ 3:22 AM
Points: 494,
Visits: 2,153
|
|
Thanks dwain.
You should see what I came up with lol... ii's shockingly bad compared to yours.
Mine has so many IF statments lol!
----------------------------------- http://www.SQL4n00bs.com
|
|
|
|
|
SSC-Addicted
      
Group: General Forum Members
Last Login: Today @ 3:22 AM
Points: 494,
Visits: 2,153
|
|
I should have been clearer with my problem definition!
The rules are as follows:
I have a string = 'A B C D EFG ProperWord QR S T'
I want to generate a comma delimited 3 part string with the following rules:
1) Part1 = Concatenate all single chacaters in the string until you reach part of string where two or more characters are joined
So based on the above string Part1 would be ABCD
2) Part2 = EFG
3) Part3 = ProperWord
This is some of the code I'be writing but I just can't get this to work well!
ALTER FUNCTION dbo.fn_NormaliseOrgName(@OrgName NVARCHAR(200)) RETURNS NVARCHAR(200) AS BEGIN declare @OrgName NVARCHAR(200) = 'RA M I R A DI PETTI C & C SAS' DECLARE @Part1 NVARCHAR(200) = '' DECLARE @Part2 NVARCHAR(200) = '' DECLARE @Part3 NVARCHAR(200) = ''
IF PATINDEX('%[A-Z][A-Z]%', @OrgName) > 1 BEGIN SET @Part1 = LEFT(@OrgName, PATINDEX('%[A-Z][A-Z]%', @OrgName)-1) IF LTRIM(RTRIM(REPLACE(@OrgName, @Part1, '') = '' BEGIN RETURN @Part1 + ',' + @Part2 + ',' + @Part3 END ELSE select @Part1 = case when PATINDEX('%[A-Z][A-Z]%', @OrgName) > 1 then left(@OrgName, PATINDEX('%[A-Z][A-Z]%', @OrgName)-1) else LEFT(@OrgName, patindex('%[ ]%', @OrgName + ' ')-1) end select @part1
select @Part2 = substring(@OrgName, len(@Part1)+1 , LEN(@OrgName)) select @Part2 --select @Part1
select @Part2 = left( ltrim(substring(@OrgName, len(@Part1)+1 , LEN(@OrgName))), charindex(' ', ltrim(substring(@OrgName, len(@Part1)+1 , LEN(@OrgName))))-1) select @Part2 --RETURN @Part1 RETURN @Part2 --RETURN @Part3
END
select dbo.fn_NormaliseOrgName('R A M I R A DI PETTI C & C SAS')
----------------------------------- http://www.SQL4n00bs.com
|
|
|
|
|
SSCrazy
      
Group: General Forum Members
Last Login: Yesterday @ 6:03 AM
Points: 2,345,
Visits: 3,191
|
|
|
|
|
|
Old Hand
      
Group: General Forum Members
Last Login: Yesterday @ 9:08 PM
Points: 301,
Visits: 1,130
|
|
Hi
This isn't exactly pretty ... but it's another way to skin the cat
;with cteOriginal (CompanyName, CompName1, CompName2, CompName3) as ( select 'A V H S L ALQUILER Y VENTA DE HOGARES', NULL, NULL, NULL union all select 'TAXI GIL PIERRE', NULL, NULL, NULL union all select 'DE GOUDSMID M HEIJKOOP', NULL, NULL, NULL union all select 'S P S S', NULL, NULL, NULL union all select 'BOUCHERIE CHARCUTERIE ST HENRI', NULL, NULL, NULL union all select 'A A M RODERKERKEN', NULL, NULL, NULL union all select 'O MUNDO E A NOSSA CASA - C H E', NULL, NULL, NULL) -- Compress up the first single letters in the string ,preprocess as ( select CompanyName ,stuff( CompanyName ,1 ,isnull(nullif(patindex('% [a-z][a-z]%',substring(companyname,1,8)),0),isnull(nullif(charindex(' ',CompanyName,7),0),999)) ,replace(substring(companyname,1,isnull(nullif(patindex('% [a-z][a-z]%',substring(companyname,1,8)),0),isnull(nullif(charindex(' ',CompanyName,7),0),999))),' ','') + ' ' ) StuffedCompanyName from cteOriginal ) -- extract each company name in turn. ,r1 as ( select CompanyName ,rtrim(substring(StuffedCompanyName,1,isnull(nullif(charindex(' ',StuffedCompanyName),0),999))) Company1 ,ltrim(substring(StuffedCompanyName,isnull(nullif(charindex(' ',StuffedCompanyName),0),999),999)) leftover from preprocess ) ,r2 as ( select CompanyName ,Company1 ,nullif(rtrim(substring(leftover,1,isnull(nullif(charindex(' ',leftover),0),999))),'') Company2 ,ltrim(substring(leftover,isnull(nullif(charindex(' ',leftover),0),999),999)) leftover from r1 ) ,r3 as ( select CompanyName ,Company1 ,Company2 ,nullif(rtrim(substring(leftover,1,isnull(nullif(charindex(' ',leftover),0),999))),'') Company3 from r2 ) select * from r3 Micky
|
|
|
|
|
SSC-Addicted
      
Group: General Forum Members
Last Login: Today @ 3:22 AM
Points: 494,
Visits: 2,153
|
|
mickyT thanks so much for this! I forgot to thank you in my other thread!
Your help is much appreciated!
----------------------------------- http://www.SQL4n00bs.com
|
|
|
|
|
Old Hand
      
Group: General Forum Members
Last Login: Yesterday @ 9:08 PM
Points: 301,
Visits: 1,130
|
|
| No problem and you're welcome
|
|
|
|
|
SSC Veteran
      
Group: General Forum Members
Last Login: Tuesday, May 21, 2013 7:19 AM
Points: 283,
Visits: 1,239
|
|
It took a little effort, but here's a replace function that will do the job and more. Won't swear it can't be improved upon or what kind of performance curve it may have. For this small example it's quick. But consider it nothing more than a first draft. If you have thoughts of using it in a production environment you should certainly do some stress testing since it's a scalar function. If you are going to use it (for example) to just import data once a day then it should work just fine. I just wouldn't stick it in the SELECT statement of a heavily-used query in production without testing it thoroughly.
EDIT: I read your notes above after posting this. The function should work fine as it is, but you might need to rearrange a few columns in the pivot table query to get exactly what you need.
CREATE FUNCTION dbo.svfSplitReplaceByDelimiter ( @str VARCHAR(8000) ,@delimiter CHAR(1) ,@subchar1 CHAR(1) = '@' ,@subchar2 CHAR(1) = '~' ,@replaceval VARCHAR(50) = '|' ) RETURNS VARCHAR(MAX) AS BEGIN
/* Parameters: @str VARCHAR(8000) -- the input string ,@delimiter CHAR(1) -- the delimiter to use for the split ,@subchar1 CHAR(1) = '@' -- pick a character not in the input string (internal use) ,@subchar2 CHAR(1) = '~' -- pick a character not in the input string (internal use) ,@replaceval VARCHAR(50) = '|' -- the delimiter used for the final array; can be up to 50 chars per tag and can be useful for building XML or HTML strings */
DECLARE @pos INT ,@len INT ,@strRep VARCHAR(8000) ,@strTemp VARCHAR(8000) ,@strRFrag VARCHAR(50) ,@strOut VARCHAR(8000) ,@counter INT
SET @str = RTRIM(LTRIM(@str)) SET @strRep = REPLACE(@str,@delimiter,@subchar1) SET @strTemp = @strRep SET @strOut = ''
--get the last chunk for use later SET @pos = CHARINDEX(@subchar1,REVERSE(@strTemp))-1 SET @strRFrag = RIGHT(@strTemp,@pos)
--start looking for single characters --and if found aggregate them SET @counter = 1 SET @pos = CHARINDEX(@subchar1,@strRep) SET @len = LEN(@strTemp)
WHILE @counter <= @len BEGIN SET @strTemp = LEFT(@strRep,@pos) IF RIGHT(@strTemp,1) = @subchar1 BEGIN
IF @pos = 2 SET @strTemp = REPLACE(@strTemp,@subchar1,@subchar2) ELSE SET @strTemp = REPLACE(@strTemp,@subchar1,@delimiter) IF @pos > 2 SET @strOut = @strOut + @delimiter + @strTemp ELSE SET @strOut = @strOut + @strTemp IF RIGHT(@strOut,1) = @subchar2 SET @strOut = LEFT(@strOut,LEN(@strOut)-1) SET @pos = CHARINDEX(@subchar1,@strRep) SET @strRep = RIGHT(@strRep,LEN(@strRep)-@pos) END SET @pos = CHARINDEX(@subchar1,@strRep) SET @counter = @counter + 1 END
IF LEN(@strRFrag) = 1 SET @strOut = @strOut + @strRFrag ELSE SET @strOut = @strOut + @delimiter + @strRFrag SET @strOut = REPLACE(REPLACE(@strOut,@delimiter+@delimiter,@delimiter),' ',' ') SET @strOut = REPLACE(REPLACE(@strOut,@delimiter,@replaceval),' ',' ') IF LEFT(@strOut,1) = @delimiter SET @strOut = RIGHT(@strOut,LEN(@strOut)-1)
IF RIGHT(@strOut,1) = @delimiter SET @strOut = LEFT(@strOut,LEN(@strOut)-1) SET @strOut = RTRIM(LTRIM(@strOut))
RETURN @strOut
END GO
Now load the sample data and test the function. (Requires the function DelimitedSplit8K which can be found by searching this site. It's been posted so many times and it's such an indispensable function that I figure most people already have it.)
IF OBJECT_ID('tempdb..#TempTable') IS NOT NULL DROP TABLE #TempTable
CREATE TABLE #TempTable ( [ID] INT IDENTITY(1,1) NOT NULL, [OrigCompanyName] NVARCHAR(150) NULL, [CompanyName] NVARCHAR(150) NULL, PRIMARY KEY (ID)) ;WITH cteOriginal(CompanyName,CompName1,CompName2,CompName3) AS ( SELECT 'A V H S L ALQUILER Y VENTA DE HOGARES' ,NULL ,NULL ,NULL UNION ALL SELECT 'TAXI GIL PIERRE' ,NULL ,NULL ,NULL UNION ALL SELECT 'DE GOUDSMID M HEIJKOOP' ,NULL ,NULL ,NULL UNION ALL SELECT 'S P S S' ,NULL ,NULL ,NULL UNION ALL SELECT 'BOUCHERIE CHARCUTERIE ST HENRI' ,NULL ,NULL ,NULL UNION ALL SELECT 'A A M RODERKERKEN' ,NULL ,NULL ,NULL UNION ALL SELECT 'O MUNDO E A NOSSA CASA - C H E' ,NULL ,NULL ,NULL ) INSERT INTO #TempTable SELECT CompanyName AS OrigCompanyName ,(SELECT dbo.svfSplitReplaceByDelimiter(cte.CompanyName,' ','@','~','|')) AS CompanyName FROM cteOriginal AS cte --get rid of stray leading and trailing characters UPDATE #TempTable SET CompanyName = (CASE WHEN LEFT(CompanyName,1) = '|' THEN RIGHT(CompanyName,LEN(CompanyName)-1) WHEN RIGHT(CompanyName,1) = '|' THEN LEFT(CompanyName,LEN(CompanyName)-1) ELSE CompanyName END)
--create the pivot table result SELECT r1.ID ,(SELECT OrigCompanyName FROM #TempTable WHERE ID = r1.ID) AS CompanyName ,r1.Part1 ,r1.Part2 ,r1.Part3 ,r1.Part4 ,r1.Part5 ,r1.Part6 FROM ( SELECT r.ID ,MAX(CASE WHEN r.ItemNumber = 1 THEN r.Item END) AS Part1 ,MAX(CASE WHEN r.ItemNumber = 2 THEN r.Item END) AS Part2 ,MAX(CASE WHEN r.ItemNumber = 3 THEN r.Item END) AS Part3 ,MAX(CASE WHEN r.ItemNumber = 4 THEN r.Item END) AS Part4 ,MAX(CASE WHEN r.ItemNumber = 5 THEN r.Item END) AS Part5 ,MAX(CASE WHEN r.ItemNumber = 6 THEN r.Item END) AS Part6 FROM ( SELECT ID ,ItemNumber ,Item FROM #TempTable AS tt CROSS APPLY dbo.DelimitedSplit8K(tt.CompanyName,'|') AS dsk1 WHERE dsk1.Item <> '' ) r GROUP BY r.ID ) r1
The output:
ID CompanyName Part1 Part2 Part3 Part4 Part5 Part6 1 A V H S L ALQUILER Y VENTA DE HOGARES AVHSL ALQUILER Y VENTA DE HOGARES 2 TAXI GIL PIERRE TAXI GIL PIERRE NULL NULL NULL 3 DE GOUDSMID M HEIJKOOP DE GOUDSMID M HEIJKOOP NULL NULL 4 S P S S SPSS NULL NULL NULL NULL NULL 5 BOUCHERIE CHARCUTERIE ST HENRI BOUCHERIE CHARCUTERIE ST HENRI NULL NULL 6 A A M RODERKERKEN AAM RODERKERKEN NULL NULL NULL NULL 7 O MUNDO E A NOSSA CASA - C H E O MUNDO EA NOSSA CASA -CHE
|
|
|
|