splitting long sentence into multiple lines

  • T.S.Giridharan-371646

    SSC-Addicted

    Points: 455

    I am new to MS Sql. I am trying to find a function which will split a long sentence into multiple sentences of fixed length...

    For example

    'This is a long sentence and I want to split it into multiple sentences of 25 character each with out splitting a word into multiple lines'

    needs to be split as

    'This is a long sentence'

    'and I want to split it'

    'into multiple '

    'sentences of 25'

    'character each with '

    'out splitting a word '

    'into multiple lines'

    Any help on this is greatly appreciated.

    regards

    giri

  • Jeff Moden

    SSC Guru

    Points: 993661

    giri,

    If you'd share the business reason behind this requirement (I'm always curious), I believe I may have a way to pull it off.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
    "If you think its expensive to hire a professional to do the job, wait until you hire an amateur."--Red Adair
    "Change is inevitable... change for the better is not."
    When you put the right degree of spin on it, the number 3|8 is also a glyph that describes the nature of a DBAs job. 😉

    Helpful Links:
    How to post code problems

  • T.S.Giridharan-371646

    SSC-Addicted

    Points: 455

    The data needs to get pulled into a columnar report. There are 5 columns of fixed length. When the data gets into this report, I don't want the words to get split into 2 lines as it would affect the readability. One way of doing this is by manipulating the front end application. The other way which I thought was easier one was through the stored procedures.

  • Jeff Moden

    SSC Guru

    Points: 993661

    I am SO sorry about this post. I totally lost track of it. Thank you for taking the time to explain. I've got the code somewhere... I just need to find it again.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
    "If you think its expensive to hire a professional to do the job, wait until you hire an amateur."--Red Adair
    "Change is inevitable... change for the better is not."
    When you put the right degree of spin on it, the number 3|8 is also a glyph that describes the nature of a DBAs job. 😉

    Helpful Links:
    How to post code problems

  • Jeff Moden

    SSC Guru

    Points: 993661

    Found it. Again, sorry for the delay.

    First, you need a Tally table which is nothing more than a table with a single column of sequential numbers. For more information on how a Tally table works to replace a loop, please see the following article...

    http://www.sqlservercentral.com/articles/T-SQL/62867/

    Here's the code to build a "standard" 11,000 row Tally table but do read the article above to really understand what it does...

    --===== Create and populate the Tally table on the fly

    SELECT TOP 11000 IDENTITY(INT,1,1) AS N

    INTO dbo.Tally

    FROM Master.dbo.SysColumns sc1,

    Master.dbo.SysColumns sc2

    --===== Add a Primary Key to maximize performance

    ALTER TABLE dbo.Tally

    ADD CONSTRAINT PK_Tally_N

    PRIMARY KEY CLUSTERED (N) WITH FILLFACTOR = 100

    --===== Allow the general public to use it

    GRANT SELECT ON dbo.Tally TO PUBLIC

    Here's the code you need for your 25 character sentences. You could paramaterize the "25" in the code to make a User Defined Function that would have some bit a flexability...

    --===== Declare a variable to hold a long string.

    -- This would likely be a parameter in a User Defined Function

    DECLARE @pLongString VARCHAR(8000);

    SELECT @pLongString = 'This is a long sentence and I want to split it into multiple sentences of 25 character each with out splitting a word into multiple lines.';

    --===== We need a place to hold split words and determine which line of 25 characters each belongs to.

    DECLARE @Return TABLE

    (

    WordNum INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,

    Word VARCHAR(25),

    Line INT

    );

    WITH

    --====== cteSplit splits all of the words and their trailing spaces and stores them in a table variable

    cteSplit AS

    (

    SELECT Word = SUBSTRING(@pLongString,t.N,CHARINDEX(' ',@pLongString+' ',t.N)-t.N+1)

    FROM dbo.Tally t

    WHERE N <= LEN(@pLongString)+1

    AND SUBSTRING(' '+@pLongString,t.N,1) = ' '

    )

    INSERT INTO @Return

    (Word)

    SELECT Word

    FROM cteSplit;

    --===== Declare some obviously named variables and preset them for use

    DECLARE @Width INT,

    @Line INT,

    @Dummy INT;

    SELECT @Width = 0,

    @Line = 1;

    --===== This counts up the length of each word and assigns them to groups of 25 characters

    -- which will eventually become the 25 character "sentences". The method used is

    -- known as the "Quirky Update".

    UPDATE @Return

    SET @Width = @Width + LEN(Word),

    @Line = Line = CASE WHEN @Width > 25 THEN @Line + 1 ELSE @Line END,

    @Width = CASE WHEN @Width > 25 THEN LEN(Word) ELSE @Width END,

    @Dummy = WordNum

    FROM @Return

    OPTION (MAXDOP 1);

    --===== Concatenate all the words for each numbered sentence.

    -- This would be the return for a User Defined Function.

    SELECT t1.Line,

    (SELECT ''+t2.Word

    FROM @return t2

    WHERE t2.Line = t1.Line --Correlation here

    ORDER BY t2.Line,t2.WordNum

    FOR XML PATH('')

    ) AS SomeLetters2

    FROM @return t1

    GROUP BY t1.Line -- without GROUP BY multiple rows are returned

    ORDER BY t1.Line -- Remove this line for a User Defined Function

    ;

    GO

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
    "If you think its expensive to hire a professional to do the job, wait until you hire an amateur."--Red Adair
    "Change is inevitable... change for the better is not."
    When you put the right degree of spin on it, the number 3|8 is also a glyph that describes the nature of a DBAs job. 😉

    Helpful Links:
    How to post code problems

  • Jeff Moden

    SSC Guru

    Points: 993661

    Did it help?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
    "If you think its expensive to hire a professional to do the job, wait until you hire an amateur."--Red Adair
    "Change is inevitable... change for the better is not."
    When you put the right degree of spin on it, the number 3|8 is also a glyph that describes the nature of a DBAs job. 😉

    Helpful Links:
    How to post code problems

  • mister boom

    Mr or Mrs. 500

    Points: 523

    Jeff

    That code doesn't quite work. For example, row 1 is returned as "This is a long sentence and I", which is 29 characters. I think you've forgotten to count spaces between words. If you change line

    "SET @Width = @Width + LEN(Word)" to

    "SET @Width = @Width + LEN(Word)+1", that fixes it for me.

    I know it's RBAR, but this also works for me, and may be easier for newbies to follow:

    ALTER FUNCTION FNSplitSentenceInto25Chars

    (@sentence varchar(1000))

    /************************************************************************

    Alun Basten

    26/10/2009

    This function takes a sentence and splits it into lines of up to

    25 characters in length without splitting any words

    ************************************************************************/

    RETURNS @CharSplit table

    (RowNumber tinyint,

    LineOfText varchar(25),

    LengthOfText tinyint)

    AS

    BEGIN

    --Declare variables

    DECLARE @lineoftext varchar(26)

    DECLARE @lastspace tinyint

    DECLARE @charsleft tinyint

    DECLARE @rownumber tinyint

    --Initialise variables

    SELECT @charsleft=len(@sentence)

    SELECT @rownumber=0

    --Loop through the sentence parameter, 26 chars at a time

    WHILE @charsleft>0

    BEGIN

    --Retrieve the first 26 chars. If the 26th char is a space, we can just take the first 25 chars

    --as our string. If the 26th char is not a space, find the space nearest to the end

    SELECT @lineoftext=LEFT(@sentence,26)

    SELECT @lastspace=CHARINDEX(' ',REVERSE(@lineoftext),1)

    SELECT @lineoftext=LEFT(@lineoftext,26-@lastspace)

    SELECT @sentence=LTRIM(SUBSTRING(@sentence,27-@lastspace,1000))

    SELECT @charsleft=LEN(@sentence)

    SELECT @rownumber=@rownumber+1

    INSERT @CharSplit

    SELECT @rownumber,@lineoftext,LEN(@lineoftext)

    END

    RETURN

    END

    /*************************************************************************************

    Example of how to run:

    SELECT * from FNSplitSentenceInto25Chars('This is a very long sentence and I would like to split it into multiple sentences of 25 characters each without splitting a word into multiple lines')

    **************************************************************************************/

    Regards

    Alun

  • T.S.Giridharan-371646

    SSC-Addicted

    Points: 455

    Sorry for the delay in my reply. Your code worked for me. Thanks.

  • Jeff Moden

    SSC Guru

    Points: 993661

    T.S.

    Alun was correct... I forgot to count the trailing space after each word. By changing LEN to DATALENGTH in my previous code, the answer is correctly resolved. Here's the code...

    --===== Declare a variable to hold a long string.

    -- This would likely be a parameter in a User Defined Function

    DECLARE @pLongString VARCHAR(8000);

    SELECT @pLongString = 'This is a long sentence and I want to split it into multiple sentences of 25 character each with out splitting a word into multiple lines.';

    --===== We need a place to hold split words and determine which line of 25 characters each belongs to.

    DECLARE @Return TABLE

    (

    WordNum INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,

    Word VARCHAR(25),

    Line INT

    );

    WITH

    --====== cteSplit splits all of the words and their trailing spaces and stores them in a table variable

    cteSplit AS

    (

    SELECT Word = SUBSTRING(@pLongString,t.N,CHARINDEX(' ',@pLongString+' ',t.N)-t.N+1)

    FROM dbo.Tally t

    WHERE N <= LEN(@pLongString)+1

    AND SUBSTRING(' '+@pLongString,t.N,1) = ' '

    )

    INSERT INTO @Return

    (Word)

    SELECT Word

    FROM cteSplit;

    --===== Declare some obviously named variables and preset them for use

    DECLARE @Width INT,

    @Line INT,

    @Dummy INT;

    SELECT @Width = 0,

    @Line = 1;

    --===== This counts up the length of each word and assigns them to groups of 25 characters

    -- which will eventually become the 25 character "sentences". The method used is

    -- known as the "Quirky Update".

    UPDATE @Return

    SET @Width = @Width + LEN(Word),

    @Line = Line = CASE WHEN @Width > 25 THEN @Line + 1 ELSE @Line END,

    @Width = CASE WHEN @Width > 25 THEN LEN(Word) ELSE @Width END,

    @Dummy = WordNum

    FROM @Return

    OPTION (MAXDOP 1);

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
    "If you think its expensive to hire a professional to do the job, wait until you hire an amateur."--Red Adair
    "Change is inevitable... change for the better is not."
    When you put the right degree of spin on it, the number 3|8 is also a glyph that describes the nature of a DBAs job. 😉

    Helpful Links:
    How to post code problems

  • Jeff Moden

    SSC Guru

    Points: 993661

    mister boom (10/26/2009)


    I know it's RBAR, but this also works for me, and may be easier for newbies to follow:

    Thanks for the feedback, Alun, but that's exactly what I'm worried about... 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
    "If you think its expensive to hire a professional to do the job, wait until you hire an amateur."--Red Adair
    "Change is inevitable... change for the better is not."
    When you put the right degree of spin on it, the number 3|8 is also a glyph that describes the nature of a DBAs job. 😉

    Helpful Links:
    How to post code problems

  • alexis.c.cesar

    Valued Member

    Points: 61

    Jeff Moden - Friday, October 30, 2009 10:35 PM

    T.S.Alun was correct... I forgot to count the trailing space after each word. By changing LEN to DATALENGTH in my previous code, the answer is correctly resolved. Here's the code...--===== Declare a variable to hold a long string. -- This would likely be a parameter in a User Defined FunctionDECLARE @pLongString VARCHAR(8000); SELECT @pLongString = 'This is a long sentence and I want to split it into multiple sentences of 25 character each with out splitting a word into multiple lines.';--===== We need a place to hold split words and determine which line of 25 characters each belongs to.DECLARE @Return TABLE ( WordNum INT IDENTITY(1,1) PRIMARY KEY CLUSTERED, Word VARCHAR(25), Line INT );WITH--====== cteSplit splits all of the words and their trailing spaces and stores them in a table variablecteSplit AS( SELECT Word = SUBSTRING(@pLongString,t.N,CHARINDEX(' ',@pLongString+' ',t.N)-t.N+1) FROM dbo.Tally t WHERE N <= LEN(@pLongString)+1 AND SUBSTRING(' '+@pLongString,t.N,1) = ' ') INSERT INTO @Return (Word) SELECT Word FROM cteSplit;--===== Declare some obviously named variables and preset them for useDECLARE @Width INT, @Line INT, @Dummy INT; SELECT @Width = 0, @Line = 1;--===== This counts up the length of each word and assigns them to groups of 25 characters -- which will eventually become the 25 character "sentences". The method used is -- known as the "Quirky Update". UPDATE @Return SET @Width = @Width + LEN(Word), @Line = Line = CASE WHEN @Width > 25 THEN @Line + 1 ELSE @Line END, @Width = CASE WHEN @Width > 25 THEN LEN(Word) ELSE @Width END, @Dummy = WordNum FROM @Return OPTION (MAXDOP 1);

    Hello,

    This is my first post and I know this is a really old thread but the information here almost has me to the end of what I need to do. First of all thank you to everyone who posted. 

    I need to split an entire table of long sentences into 30 characters each, which sounds about what this does. It works great for the first record but the second record is actually very short and so the third record begins in place of the second. Source Data:

    Split Data:
    Record #4 Text should say "OUTSIDE PROCESS -"
    Record #5 Text should begin with "1" D-SERIES DRIVE"

    The "Line" column is not working out correctly because of how short the Text is in recrod 2 above.

    My code to create the Tally table was taken from above and I modified the code above to read from the source table.  I have pasted it below.  Any assistance would be greatly appreciated.

    Sorry I'm unable to find out how to post the code properly so it is just showing up on one line unfortunately.  It is almost the same, just set for 30 and pulling from a table shown in the first screenshot above.


    --===== Declare a variable to hold a long string.  -- This would likely be a parameter in a User Defined FunctionDECLARE @pLongString VARCHAR(8000);DECLARE @StockCode VARCHAR(50);DECLARE @PrevStockCode VARCHAR(50);--===== We need a place to hold split words and determine which line of 30 characters each belongs to.DECLARE @Return TABLE   (   WordNum INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,StockCode VARCHAR(50),TextType CHAR(1),   Word  VARCHAR(30),   Line  INT   );DECLARE @MyCursor CURSOR;BEGINSET @MyCursor = CURSOR FORSELECT StockCode, UpdatedDesc FROM SharpeNarrations..ConcatTechNotes_P WHERE id Between 1 and 3OPEN @MyCursorFETCH NEXT FROM @MyCursorINTO @StockCode, @pLongStringWHILE @@FETCH_STATUS = 0BEGINWITH--====== cteSplit splits all of the words and their trailing spaces and stores them in a table variablecteSplit AS( SELECT Word  = SUBSTRING(@pLongString,t.N,CHARINDEX(' ',@pLongString+' ',t.N)-t.N+1) FROM tempdb.dbo.Tally tWHERE N <= LEN(@pLongString)+1AND SUBSTRING(' '+@pLongString,t.N,1) = ' ') INSERT INTO @Return(StockCode, TextType, Word) SELECT @StockCode, 'P', Word FROM cteSplit;DECLARE @q XML = (SELECT * FROM @Return FOR XML AUTO)--===== Declare some obviously named variables and preset them for useDECLARE @Width INT,@Line INT,@Dummy INT; SELECT @Width = 0,@Line = 1;--Trying to increment the line number by testing against previous stock code to see if I can get record #3 '1 IN D-SERIES DRIVE' to not--overlap record #2 '010003'SELECT @Line = CASE WHEN (@StockCode = @PrevStockCode) OR (@PrevStockCode = '') THEN @Line ELSE @Line + 1 END--===== This counts up the length of each word and assigns them to groups of 30 characters -- which will eventually become the 30 character "sentences". The method used is -- known as the "Quirky Update". UPDATE @ReturnSET @Width = @Width + DATALENGTH(Word),@Line = Line = CASE WHEN @Width > 30 THEN @Line + 1 ELSE @Line END,@Width =   CASE WHEN @Width > 30 THEN DATALENGTH(Word) ELSE @Width END,@Dummy = WordNum FROM @Return OPTION (MAXDOP 1); SELECT @PrevStockCode = @StockCodeFETCH NEXT FROM @MyCursorINTO @StockCode, @pLongStringEND;CLOSE @MyCursor;DEALLOCATE @MyCursor;END;--===== Concatenate all the words for each numbered sentence.  -- This would be the return for a User Defined Function. SELECT t1.StockCode, TextType, t1.Line,   (SELECT ''+t2.Word          FROM @Return t2         WHERE t2.Line = t1.Line --Correlation here         ORDER BY t2.Line,t2.WordNum          FOR XML PATH('')      ) AS Text FROM @Return t1 GROUP BY t1.StockCode, t1.TextType, t1.Line -- without GROUP BY multiple rows are returned ORDER BY t1.StockCode, t1.TextType, t1.Line -- Remove this line for a User Defined Function;GO

    Thank you in advance,

Viewing 11 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply