Click here to monitor SSC
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Extracting a specific number of words from a string in sql


Extracting a specific number of words from a string in sql

Author
Message
Griffster
Griffster
SSC Veteran
SSC Veteran (251 reputation)SSC Veteran (251 reputation)SSC Veteran (251 reputation)SSC Veteran (251 reputation)SSC Veteran (251 reputation)SSC Veteran (251 reputation)SSC Veteran (251 reputation)SSC Veteran (251 reputation)

Group: General Forum Members
Points: 251 Visits: 713
I'm currently trying to write a query that will extract the first two words from a string. I'm using charindex and substrings and it's getting quite messy! Has anyone ever written a function may be to extract a specific number of words i.e. text around spaces please? It would need to be able to cope with data where there are different lengths and number of words such as:

This is the first line
Second line
Word
This is the fourth line

and if I was asking it to return a string containing the first two words the results would be:

This is
Second line
Word
This is

Thanks.
Lowell
Lowell
SSChampion
SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)

Group: General Forum Members
Points: 14912 Visits: 38896
I know that by using regular expressions, the \b is a word boundary; it might be a space/tab/Crlf, or it might be punctuation...period comma exclamation semicolon etc.
using regualr expressions, you'd be able to grab x number of words, based on word boundaries.

that would be the best way to go, but I've only installed the extended stored procedures from the SSC Toolkit on SQL2000.
let me install on 2005 and test a few expressions.

Lowell

--
help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

Matt Miller (#4)
Matt Miller (#4)
SSCertifiable
SSCertifiable (7.6K reputation)SSCertifiable (7.6K reputation)SSCertifiable (7.6K reputation)SSCertifiable (7.6K reputation)SSCertifiable (7.6K reputation)SSCertifiable (7.6K reputation)SSCertifiable (7.6K reputation)SSCertifiable (7.6K reputation)

Group: General Forum Members
Points: 7624 Visits: 18043
Lowell - if you're on 2005 - use CLR instead to create the REGEX expressions instead. The XP is nice in 2000, but you can do better in 2005.

----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
Kenneth Wilhelmsson
Kenneth Wilhelmsson
SSCommitted
SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)

Group: General Forum Members
Points: 1575 Visits: 699
This may look a little funky, but it does the job Smile



declare @x table ( words varchar(50) not null )
insert @x
select 'This is the first line' union all
select 'Second line'    union all
select 'Word' union all
select 'This is the fourth line'


select   case when charindex(' ', words, charindex(' ', words) + 1) = 0
then words
    else left(words, charindex(' ', words, charindex(' ', words) + 1))
    end
from @x
go

--------------------------------------------------
This is
Second line
Word
This is

(4 row(s) affected)



/Kenneth



Jeff Moden
Jeff Moden
SSC-Forever
SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)

Group: General Forum Members
Points: 44802 Visits: 39845
martin.griffiths (2/27/2008)
I'm currently trying to write a query that will extract the first two words from a string. I'm using charindex and substrings and it's getting quite messy! Has anyone ever written a function may be to extract a specific number of words i.e. text around spaces please? It would need to be able to cope with data where there are different lengths and number of words such as:

This is the first line
Second line
Word
This is the fourth line

and if I was asking it to return a string containing the first two words the results would be:

This is
Second line
Word
This is

Thanks.


Just curious... what will this be used for? In other words, why do you need to do this?

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
     Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
Although they tell us that they want it real bad, our primary goal is to ensure that we dont actually give it to them that way.
Although change is inevitable, change for the better is usually not.
Just because you can do something in PowerShell, doesnt mean you should. Wink

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Griffster
Griffster
SSC Veteran
SSC Veteran (251 reputation)SSC Veteran (251 reputation)SSC Veteran (251 reputation)SSC Veteran (251 reputation)SSC Veteran (251 reputation)SSC Veteran (251 reputation)SSC Veteran (251 reputation)SSC Veteran (251 reputation)

Group: General Forum Members
Points: 251 Visits: 713
I need to do this to parse out data required by an agency we send address details too to place the contents of our address line 1 to their separate Flat No, house number, and street name fields. So I'm basically going to try to select the first two words if address line 1 starts with Flat in to Flat No, if it begins with a number then place the number in to house number field, and then put the remainder (or the else) in to the street name field. Not entirely 100% reliable but will do. Kenneth's response above will work nicely if I'm only looking for two words but will not work with instances like my fourth line where there's multiple spaces. Would prefer a function though that I can re-use taking in a string and a number representing the number of words required. Like the way it works though thanks Kenneth.
Adam Haines
Adam Haines
SSCrazy
SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)

Group: General Forum Members
Points: 2324 Visits: 3135
I found a function that may be more along the lines of what you are looking for. This function requires two inputs and the number of words you would like returned. I did not write the code and will give credit to the programmer. His link is http://www.tom-muck.com/blog/index.cfm?newsid=72


CREATE FUNCTION udf_GetNumberOfWords (
@stringToSplit varchar(8000),
@numberOfWords int
)

RETURNS varchar(8000) AS

BEGIN

DECLARE @currentword varchar(8000)
DECLARE @returnstring varchar(8000)
DECLARE @wordcount int
SET @wordcount = 0
SET @returnstring = ''
SET @currentword = ''
SET @stringToSplit = ltrim(rtrim(@stringToSplit))
Declare @index int

WHILE @wordcount 0
BEGIN
Select @index = CHARINDEX(' ', @stringToSplit)
if @index = 0
BEGIN
SELECT @currentword = ltrim(rtrim(@stringToSplit))
SELECT @wordcount = @numberOfWords
END
else
BEGIN
IF (len(@stringToSplit) - @index > 0) BEGIN
SELECT @currentword = ltrim(rtrim(LEFT(@stringToSplit, @index-1)))--the new shortened string
SELECT @stringToSplit = RIGHT(@stringToSplit,LEN(@stringToSplit) - @index) -- the rest
END
END
SELECT @returnstring = @returnstring + ' ' + @currentword
SELECT @wordcount = @wordcount + 1
END

SET @returnstring = LTRIM(@returnstring)
RETURN @returnstring

END



The function can be called like this: (2 is the number of words to return)

select dbo.udf_GetNumberOfWords(mycolumn,2)
from mytable





My blog: http://jahaines.blogspot.com
Lowell
Lowell
SSChampion
SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)

Group: General Forum Members
Points: 14912 Visits: 38896
Matt Miller (2/27/2008)
Lowell - if you're on 2005 - use CLR instead to create the REGEX expressions instead. The XP is nice in 2000, but you can do better in 2005.

agreed Matt; I think I'm waiting for someone to put together a package of "CLR must have functions", they way someone did for the Toolkit.
Regular expressions, with specific calls for common string manipulations (strip html for example) is something I'm playing with now, but i get distracted easily.

Lowell

--
help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

Matt Miller (#4)
Matt Miller (#4)
SSCertifiable
SSCertifiable (7.6K reputation)SSCertifiable (7.6K reputation)SSCertifiable (7.6K reputation)SSCertifiable (7.6K reputation)SSCertifiable (7.6K reputation)SSCertifiable (7.6K reputation)SSCertifiable (7.6K reputation)SSCertifiable (7.6K reputation)

Group: General Forum Members
Points: 7624 Visits: 18043
I have the start to one. I need to spiffy them up and I will see what I can do about posting them.

----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
Griffster
Griffster
SSC Veteran
SSC Veteran (251 reputation)SSC Veteran (251 reputation)SSC Veteran (251 reputation)SSC Veteran (251 reputation)SSC Veteran (251 reputation)SSC Veteran (251 reputation)SSC Veteran (251 reputation)SSC Veteran (251 reputation)

Group: General Forum Members
Points: 251 Visits: 713
That function's worked a dream thanks. I added a little while loop in to take care of the multiple spaces and it looks really good now. Many thanks.
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search