ProperCase Function

  • Some excellent work

    DECLARE @TestVar VARCHAR(16);

    DECLARE @StartTime DATETIME;

    PRINT 'Old Method Scalar Function'

    SELECT @StartTime = GETDATE();

    SELECT@TestVar = Utility.f_ProperCase(PT.String)

    FROM#ProperTest AS PT;

    PRINT CAST(DATEDIFF(ms,@StartTime,GETDATE()) AS VARCHAR) + ' MilliSeconds';

    PRINT 'Dohsan'

    SELECT @StartTime = GETDATE();

    SELECT@TestVar = CA1.ProperCaseString

    FROM#ProperTest AS PT

    CROSS

    APPLYdbo.ProperCase(PT.String) AS CA1;

    PRINT CAST(DATEDIFF(ms,@StartTime,GETDATE()) AS VARCHAR) + ' MilliSeconds';

    PRINT 'Luis'

    SELECT @StartTime = GETDATE();

    SELECT@TestVar = CA1.ProperCased

    FROM#ProperTest AS PT

    CROSS

    APPLYdbo.ProperCase1(PT.String) AS CA1;

    PRINT CAST(DATEDIFF(ms,@StartTime,GETDATE()) AS VARCHAR) + ' MilliSeconds';

    PRINT 'Dwain'

    SELECT @StartTime = GETDATE();

    SELECT@TestVar = CA1.ProperCaseStr

    FROM#ProperTest AS PT

    CROSS

    APPLYdbo.ProperCase2(PT.String) AS CA1;

    PRINT CAST(DATEDIFF(ms,@StartTime,GETDATE()) AS VARCHAR) + ' MilliSeconds';

    PRINT 'Magoo varchar255'

    SELECT @StartTime = GETDATE();

    SELECT@TestVar = CA1.[text]

    FROM#ProperTest AS PT

    CROSS

    APPLYdbo.ProperCase3(PT.String) AS CA1;

    PRINT CAST(DATEDIFF(ms,@StartTime,GETDATE()) AS VARCHAR) + ' MilliSeconds';

    PRINT 'Luis V2'

    SELECT @StartTime = GETDATE();

    SELECT@TestVar = CA1.ProperCaseString

    FROM#ProperTest AS PT

    CROSS

    APPLYdbo.ProperCase4(PT.String) AS CA1;

    PRINT CAST(DATEDIFF(ms,@StartTime,GETDATE()) AS VARCHAR) + ' MilliSeconds';

    Here as the results:

    300k

    Old Method Scalar Function

    47306 MilliSeconds

    Dohsan

    19113 MilliSeconds

    Luis

    5946 MilliSeconds

    Dwain

    54486 MilliSeconds

    Magoo varchar255

    5440 MilliSeconds

    Luis V2

    6433 MilliSeconds

    1million

    Old Method Scalar Function

    160710 MilliSeconds

    Dohsan

    63750 MilliSeconds

    Luis

    15450 MilliSeconds

    Dwain

    175366 MilliSeconds

    Magoo varchar255

    15513 MilliSeconds

    Luis V2

    17843 MilliSeconds

    How do they handle some special cases?

    DECLARE @SingleTestString VARCHAR(255) = 'ándre, ándre, luís and o''brien had a low-budget 3d printer in washington d.c.',

    @TestVar2 VARCHAR(255);

    PRINT 'Original String'

    PRINT @SingleTestString;

    PRINT 'Old Method Scalar Function'

    SELECT @TestVar2 = Utility.f_ProperCase(@SingleTestString);

    PRINT @TestVar2

    PRINT 'Dohsan'

    SELECT @TestVar2 = ProperCaseString FROM dbo.ProperCase(@SingleTestString);

    PRINT @TestVar2

    PRINT 'Luis'

    SELECT @TestVar2 = ProperCased FROM dbo.ProperCase1(@SingleTestString);

    PRINT @TestVar2

    PRINT 'Dwain'

    SELECT @TestVar2 = ProperCaseStr FROM dbo.ProperCase2(@SingleTestString);

    PRINT @TestVar2

    PRINT 'Magoo varchar255'

    SELECT @TestVar2 = [text] FROM dbo.ProperCase3(@SingleTestString);

    PRINT @TestVar2

    PRINT 'Luis V2'

    SELECT @TestVar2 = ProperCaseString FROM dbo.ProperCase4(@SingleTestString);

    PRINT @TestVar2

    Results:

    Original String

    ándre, ándre, luís and o'brien had a low-budget 3d printer in washington d.c.

    Old Method Scalar Function

    Ándre, Ándre, Luís And O'Brien Had A Low-Budget 3d Printer In Washington D.c.

    Dohsan

    Ándre, Ándre, Luís And O'Brien Had A Low-Budget 3d Printer In Washington D.c.

    Luis

    Ándre, Ándre, Luís And O'brien Had A Low-budget 3d Printer In Washington D.c.

    Dwain

    Ándre, Ándre, Luís And O'Brien Had A Low-Budget 3D Printer In Washington D.C.

    Magoo varchar255

    Ándre, Ándre, Luís And O'Brien Had A Low-Budget 3D Printer In Washington D.C.

    Luis V2

    Ándre, Ándre, Luís And O'Brien Had A Low-Budget 3d Printer In Washington D.C.

  • Original Test strings were small (16 chars), the original function was only for 255 although a few of the functions presented have been able to handle far more. So upping it to nearer the 255 limit to see how they cope (so all functions can be run)

    IF OBJECT_ID('tempdb..#ProperTest','U') IS NOT NULL

    DROP TABLE #ProperTest;

    SELECTREPLICATE(STUFF(STUFF(CHAR(ABS(CHECKSUM(NEWID()))%26+65)

    + CHAR(ABS(CHECKSUM(NEWID()))%26+65)

    + CHAR(ABS(CHECKSUM(NEWID()))%26+65)

    + CHAR(ABS(CHECKSUM(NEWID()))%26+65)

    + CHAR(ABS(CHECKSUM(NEWID()))%26+65)

    + CHAR(ABS(CHECKSUM(NEWID()))%26+65)

    + CHAR(ABS(CHECKSUM(NEWID()))%26+65)

    + CHAR(ABS(CHECKSUM(NEWID()))%26+65)

    + CHAR(ABS(CHECKSUM(NEWID()))%26+65)

    + CHAR(ABS(CHECKSUM(NEWID()))%26+65)

    + CHAR(ABS(CHECKSUM(NEWID()))%26+65)

    + CHAR(ABS(CHECKSUM(NEWID()))%26+65)

    + CHAR(ABS(CHECKSUM(NEWID()))%26+65)

    + CHAR(ABS(CHECKSUM(NEWID()))%26+65)

    + CHAR(ABS(CHECKSUM(NEWID()))%26+65)

    + CHAR(ABS(CHECKSUM(NEWID()))%26+65),N%16+1,1,' '),N%3+1,1,' '),14) AS String

    INTO #ProperTest

    FROMdbo.GetNums(1,300000);

    Results:

    Old Method Scalar Function

    617393 MilliSeconds

    Dohsan

    162690 MilliSeconds

    Luis

    36680 MilliSeconds

    Dwain

    424590 MilliSeconds

    Magoo varchar255

    44063 MilliSeconds

    Luis V2

    49436 MilliSeconds

  • I ran some similar comparisons, but I had run them up to handle varchar(8000) and ran against a million-row test table with the strings filled to capacity. They were running pretty close to each other (with MM being the best at ~28 seconds), but I haven't tested everything with special characters or multiple trailing spaces. Dropping down to a max length of 255 definitely improved performance (dropping the best to ~22 seconds), but I hadn't completed testing last night before I needed to get some sleep. 😛

    All in all, I think the work done on this thread has been excellent.

  • That is some excellent work on the testing there Dohsan - well done!:-)

    MM



    select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);

  • Forum Etiquette: How to post Reporting Services problems
  • [/url]
  • Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
  • [/url]
  • How to Post Performance Problems - by Gail Shaw
  • [/url]

  • BWAAA-HAAA!!!! Dare I say IT? DARE I? Oh, what the heck...

    [font="Arial Black"]"Ahhhhhhh MAGOO! You've done it again." [/font]:-P (Sorry ol' friend... I've been holding it back for more than a month and couldn't resist).

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (1/17/2014)


    BWAAA-HAAA!!!! Dare I say IT? DARE I? Oh, what the heck...

    [font="Arial Black"]"Ahhhhhhh MAGOO! You've done it again." [/font]:-P (Sorry ol' friend... I've been holding it back for more than a month and couldn't resist).

    If I am honest, I kind of missed it :hehe:

    MM



    select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);

  • Forum Etiquette: How to post Reporting Services problems
  • [/url]
  • Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
  • [/url]
  • How to Post Performance Problems - by Gail Shaw
  • [/url]

  • I agree that you've done some very thorough testing here. Excellent work!

    Dohsan (1/17/2014)


    Dwain

    Ándre, Ándre, Luís And O'Brien Had A Low-Budget 3D Printer In Washington D.C.

    I wasn't sure how you wanted to handle "Low-Budget' and "3D" but either can be handled in the opposite manner by adjusting the match pattern.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • dwain.c (1/17/2014)


    I agree that you've done some very thorough testing here. Excellent work!

    Dohsan (1/17/2014)


    Dwain

    Ándre, Ándre, Luís And O'Brien Had A Low-Budget 3D Printer In Washington D.C.

    I wasn't sure how you wanted to handle "Low-Budget' and "3D" but either can be handled in the opposite manner by adjusting the match pattern.

    Don't make me start with contractions or possessives. You'Ll Say It'S My Fault.:hehe:

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Luis Cazares (1/17/2014)


    dwain.c (1/17/2014)


    I agree that you've done some very thorough testing here. Excellent work!

    Dohsan (1/17/2014)


    Dwain

    Ándre, Ándre, Luís And O'Brien Had A Low-Budget 3D Printer In Washington D.C.

    I wasn't sure how you wanted to handle "Low-Budget' and "3D" but either can be handled in the opposite manner by adjusting the match pattern.

    Don't make me start with contractions or possessives. You'Ll Say It'S My Fault.:hehe:

    hah very true!

  • This thread has been fun to watch, thanks folks! Here's an option I'm surprised nobody posted up:

    CREATE FUNCTION [dbo].[IF_ProperWithREPLACE]

    (

    @inputstring VARCHAR(8000)

    )

    RETURNS TABLE WITH SCHEMABINDING

    AS

    RETURN

    (

    SELECT ProperisedString =

    REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(

    REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(

    REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(

    ' '+@inputstring COLLATE Latin1_General_BIN,

    ' a',' A'),' b',' B'),' c',' C'),' d',' D'),' e',' E'),' f',' F'),' g',' G'),' h',' H'),' i',' I'),

    ' j',' J'),' k',' K'),' l',' L'),' m',' M'),' n',' N'),' o',' O'),' p',' P'),' q',' Q'),' r',' R'),

    ' s',' S'),' t',' T'),' u',' U'),' v',' V'),' w',' W'),' x',' X'),' y',' Y'),' z',' Z')

    )

    It's about 8x faster than any of the solutions posted so far, leaving some scope for dealing with weird stuff like accents.


    [font="Arial"]Low-hanging fruit picker and defender of the moggies[/font]

    For better assistance in answering your questions, please read this[/url].


    Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

    Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

  • I came back to this thread today because I was trying to proper case some stuff around here. I needed to exclude tokens that contain at least a single digit but all other tokens should be init caps.

    I made a slight modification to Dwain's that worked well for my purposes.

    CREATE FUNCTION [dbo].[ProperCaseWithNumbers]

    (

    @StrIn VARCHAR(255)

    )

    RETURNS TABLE WITH SCHEMABINDING

    AS

    RETURN

    SELECT STUFF((

    SELECT CASE

    WHEN split.Item LIKE '%[0-9]%'

    THEN ' ' + split.Item

    ELSE ' ' + Upper(LEFT(split.Item, 1)) + Lower(Substring(split.Item, 2, 255))

    END AS word

    FROM dbo.Delimitedsplit8k(@StrIn, ' ') split

    FOR XML PATH('')

    ,TYPE

    ).value('.', 'varchar(255)'), 1, 1, '') AS ProperCased

    We have some data coming from the mainframe and it is all caps but I need to make this data more "user friendly". Here is an example of this one in action.

    declare @MyVals table (SeriesName varchar(25))

    insert @MyVals

    select '9400 SERIES' union all

    select '9760 SERIES' union all

    select 'ADE360 SERIES' union all

    select 'CUSTOM' union all

    select 'M16360 SERIES' union all

    select 'N1260 SERIES' union all

    select 'N3160 SERIES' union all

    select 'PA360 SERIES' union all

    select 'S260 SERIES'

    select *

    from @MyVals v

    cross apply dbo.ProperCaseWithNumbers(v.SeriesName)

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Sean Lange (2/26/2014)


    I made a slight modification to Dwain's that worked well for my purposes.

    Can't say I see that much of a resemblance but it is always good to hear success stories.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • dwain.c (2/26/2014)


    Sean Lange (2/26/2014)


    I made a slight modification to Dwain's that worked well for my purposes.

    Can't say I see that much of a resemblance but it is always good to hear success stories.

    Doh!!! That's because it was a modification of the code Luis posted.

    Thanks Luis!!!

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Sean Lange (2/27/2014)


    dwain.c (2/26/2014)


    Sean Lange (2/26/2014)


    I made a slight modification to Dwain's that worked well for my purposes.

    Can't say I see that much of a resemblance but it is always good to hear success stories.

    Doh!!! That's because it was a modification of the code Luis posted.

    Thanks Luis!!!

    I knew the code look familiar 😀

    I guess that's the benefit of collective intelligence.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • I wonder if a CLR function would be worth considering here? .NET includes the System.Globalization.TextInfo.ToTitleCase() function which seems to do a pretty good job getting things properly capitalized. Not perfect but perhaps an extension method could handle the edge cases ?

  • Viewing 15 posts - 16 through 30 (of 30 total)

    You must be logged in to reply to this topic. Login to reply