Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase 12»»

Extracting a specific number of words from a string in sql Expand / Collapse
Author
Message
Posted Wednesday, February 27, 2008 6:50 AM
SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Thursday, October 23, 2014 7:06 AM
Points: 247, Visits: 703
I'm currently trying to write a query that will extract the first two words from a string. I'm using charindex and substrings and it's getting quite messy! Has anyone ever written a function may be to extract a specific number of words i.e. text around spaces please? It would need to be able to cope with data where there are different lengths and number of words such as:

This is the first line
Second line
Word
This is the fourth line

and if I was asking it to return a string containing the first two words the results would be:

This is
Second line
Word
This is

Thanks.
Post #460837
Posted Wednesday, February 27, 2008 6:59 AM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Today @ 2:35 PM
Points: 12,962, Visits: 32,498
I know that by using regular expressions, the \b is a word boundary; it might be a space/tab/Crlf, or it might be punctuation...period comma exclamation semicolon etc.
using regualr expressions, you'd be able to grab x number of words, based on word boundaries.

that would be the best way to go, but I've only installed the extended stored procedures from the SSC Toolkit on SQL2000.
let me install on 2005 and test a few expressions.


Lowell

--There is no spoon, and there's no default ORDER BY in sql server either.
Actually, Common Sense is so rare, it should be considered a Superpower. --my son
Post #460845
Posted Wednesday, February 27, 2008 7:06 AM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Today @ 2:12 PM
Points: 7,179, Visits: 15,771
Lowell - if you're on 2005 - use CLR instead to create the REGEX expressions instead. The XP is nice in 2000, but you can do better in 2005.

----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
Post #460853
Posted Wednesday, February 27, 2008 7:16 AM
SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Thursday, January 9, 2014 3:33 AM
Points: 1,559, Visits: 672
This may look a little funky, but it does the job :)


declare @x table ( words varchar(50) not null )
insert @x
select 'This is the first line' union all
select 'Second line' union all
select 'Word' union all
select 'This is the fourth line'


select case when charindex(' ', words, charindex(' ', words) + 1) = 0
then words
else left(words, charindex(' ', words, charindex(' ', words) + 1))
end
from @x
go

--------------------------------------------------
This is
Second line
Word
This is

(4 row(s) affected)

/Kenneth



Post #460865
Posted Wednesday, February 27, 2008 7:22 AM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 7:27 AM
Points: 35,769, Visits: 32,437
martin.griffiths (2/27/2008)
I'm currently trying to write a query that will extract the first two words from a string. I'm using charindex and substrings and it's getting quite messy! Has anyone ever written a function may be to extract a specific number of words i.e. text around spaces please? It would need to be able to cope with data where there are different lengths and number of words such as:

This is the first line
Second line
Word
This is the fourth line

and if I was asking it to return a string containing the first two words the results would be:

This is
Second line
Word
This is

Thanks.


Just curious... what will this be used for? In other words, why do you need to do this?


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #460871
Posted Wednesday, February 27, 2008 7:30 AM
SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Thursday, October 23, 2014 7:06 AM
Points: 247, Visits: 703
I need to do this to parse out data required by an agency we send address details too to place the contents of our address line 1 to their separate Flat No, house number, and street name fields. So I'm basically going to try to select the first two words if address line 1 starts with Flat in to Flat No, if it begins with a number then place the number in to house number field, and then put the remainder (or the else) in to the street name field. Not entirely 100% reliable but will do. Kenneth's response above will work nicely if I'm only looking for two words but will not work with instances like my fourth line where there's multiple spaces. Would prefer a function though that I can re-use taking in a string and a number representing the number of words required. Like the way it works though thanks Kenneth.
Post #460876
Posted Wednesday, February 27, 2008 7:32 AM
SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Tuesday, December 16, 2014 9:58 AM
Points: 2,280, Visits: 3,069
I found a function that may be more along the lines of what you are looking for. This function requires two inputs and the number of words you would like returned. I did not write the code and will give credit to the programmer. His link is http://www.tom-muck.com/blog/index.cfm?newsid=72

CREATE FUNCTION udf_GetNumberOfWords (
@stringToSplit varchar(8000),
@numberOfWords int
)

RETURNS varchar(8000) AS

BEGIN

DECLARE @currentword varchar(8000)
DECLARE @returnstring varchar(8000)
DECLARE @wordcount int
SET @wordcount = 0
SET @returnstring = ''
SET @currentword = ''
SET @stringToSplit = ltrim(rtrim(@stringToSplit))
Declare @index int

WHILE @wordcount 0
BEGIN
Select @index = CHARINDEX(' ', @stringToSplit)
if @index = 0
BEGIN
SELECT @currentword = ltrim(rtrim(@stringToSplit))
SELECT @wordcount = @numberOfWords
END
else
BEGIN
IF (len(@stringToSplit) - @index > 0) BEGIN
SELECT @currentword = ltrim(rtrim(LEFT(@stringToSplit, @index-1)))--the new shortened string
SELECT @stringToSplit = RIGHT(@stringToSplit,LEN(@stringToSplit) - @index) -- the rest
END
END
SELECT @returnstring = @returnstring + ' ' + @currentword
SELECT @wordcount = @wordcount + 1
END

SET @returnstring = LTRIM(@returnstring)
RETURN @returnstring

END

The function can be called like this: (2 is the number of words to return)
select dbo.udf_GetNumberOfWords(mycolumn,2)
from mytable





My blog: http://jahaines.blogspot.com
Post #460880
Posted Wednesday, February 27, 2008 7:38 AM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Today @ 2:35 PM
Points: 12,962, Visits: 32,498
Matt Miller (2/27/2008)
Lowell - if you're on 2005 - use CLR instead to create the REGEX expressions instead. The XP is nice in 2000, but you can do better in 2005.

agreed Matt; I think I'm waiting for someone to put together a package of "CLR must have functions", they way someone did for the Toolkit.
Regular expressions, with specific calls for common string manipulations (strip html for example) is something I'm playing with now, but i get distracted easily.


Lowell

--There is no spoon, and there's no default ORDER BY in sql server either.
Actually, Common Sense is so rare, it should be considered a Superpower. --my son
Post #460889
Posted Wednesday, February 27, 2008 7:40 AM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Today @ 2:12 PM
Points: 7,179, Visits: 15,771
I have the start to one. I need to spiffy them up and I will see what I can do about posting them.



----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
Post #460891
Posted Wednesday, February 27, 2008 8:02 AM
SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Thursday, October 23, 2014 7:06 AM
Points: 247, Visits: 703
That function's worked a dream thanks. I added a little while loop in to take care of the multiple spaces and it looks really good now. Many thanks.
Post #460912
« Prev Topic | Next Topic »

Add to briefcase 12»»

Permissions Expand / Collapse