Detective Stories - Changing the Case

  • So many different solutions... suggests SQL should have a proper case function of it's own?

    Many moons ago I came up with this - our need is quite straightforward, lower case and capitalise each word delimited by spaces and/or hyphens. Not pretty, but it does the job it's designed to do:

    CREATE FUNCTION [dbo].[ToProperCase](@string NVARCHAR(4000)) RETURNS NVARCHAR(4000)

    AS

    BEGIN

    SET @string = LOWER(LTRIM(RTRIM(@string)))

    DECLARE @i INT

    SET @i = ASCII('a')

    WHILE @i <= ASCII('z')

    BEGIN

    SET @string = REPLACE( @string, ' ' + CHAR(@i), ' ' + CHAR(@i-32))

    SET @string = REPLACE( @string, '-' + CHAR(@i), '-' + CHAR(@i-32))

    SET @i = @i + 1

    END

    IF (ASCII(LEFT(@string, 1)) BETWEEN ASCII('a') AND ASCII ('z'))

    BEGIN

    SET @string = CHAR(ASCII(LEFT(@string, 1))-32) + RIGHT(@string, LEN(@string)-1)

    END

    RETURN @string

    END

  • Same Userdefined Function, tried with PATINDEX.

    CREATE FUNCTION [CapitalCase]

    (

    @Input varchar(255)

    )

    RETURNS varchar(255)

    AS

    BEGIN

    DECLARE @Results varchar(255)

    Declare @NextWord int, @NextSpace int

    if len(@Input)>0

    Begin

    -- Trimming the input

    Set @Input = lower(ltrim(rtrim(@Input)))

    -- Finding next word in the input

    Set @NextWord = patindex('%[a-zA-Z]%',@Input)

    -- Initializing the result

    Set @Results = ''

    -- If there is a word beginning, then make it Capital case

    While @NextWord <> 0

    Begin

    -- Capital casing first character of first word and adding to result

    Set @Results = @Results + UPPER(substring(@Input,@NextWord,1))

    -- finding next word from next space

    Set @NextSpace = charindex(char(32),@Input,1)

    -- If there is no space then words are over, write till the end of string to result and break out of the loop

    If @NextSpace = 0

    BEGIN

    SET @Results = @Results +SUBSTRING(@Input,@NextWord+1,LEN(@Input))

    BREAK

    END

    -- If space is present then there might be one more word existing

    ELSE

    BEGIN

    -- Set result set from the capital case character till the next space

    SET @Results = @Results + SUBSTRING(@Input,@NextWord+1,(@NextSpace - @NextWord))

    -- Making input as remaining part of Input from the next space

    Set @Input = SUBSTRING(@Input,@NextSpace+1,LEN(@Input))

    Set @NextWord = patindex('%[a-zA-Z]%',@Input)

    END

    End

    End

    Else

    Set @Results= ''

    Return @Results

    END

  • I also too the function route

    create function uf_mixed (@p_col varchar(max))

    returns varchar(max)

    as

    begin

    declare @lgth int, @ctr int, @found varchar(1), @v_col varchar(max), @nxt int

    select @lgth = LEN(@p_col), @ctr = 1, @found = 'x', @nxt = 0

    if @lgth > 0

    begin

    select @v_col = upper(SUBSTRING(@p_col,1,1))

    select @ctr = @ctr + 1

    while (@ctr <= @lgth)

    begin

    select @found = SUBSTRING(@p_col,@ctr,1)

    if @found = ' ' select @nxt = @ctr+1

    if @nxt = @ctr

    select @v_col = @v_col + upper(SUBSTRING(@p_col,@ctr,1))

    else

    select @v_col = @v_col + lower(SUBSTRING(@p_col,@ctr,1))

    select @ctr = @ctr + 1

    end

    end

    return(@v_col)

    end

    go

    update Import_Data_Filter set Location = dbo.uf_mixed(Location)


    Best Regards,

    Mark Tierney

  • Here's a UDF that also removes extraneous spaces.

    ALTER FUNCTION [dbo].[udfCamelCase]

    (

    -- Add the parameters for the function here

    @VStringvarchar(1000)

    )

    RETURNS varchar(1000)

    AS

    BEGIN

    -- Table used to get the words in the string

    DECLARE @PartTable table (StringPart varchar(100));

    -- Variable to hold each part of the return string

    DECLARE @HWordvarchar(100);

    -- These variables are used to "clean" the string, i.e., make sure only

    -- one space exists between the parts of the string

    DECLARE@CNvarchar(200);

    DECLARE@Xvarchar(100);

    DECLARE@Iint;

    DECLARE @HString varchar(1000);

    DECLARE@RetVarvarchar(1000);

    IF @VString = '' OR @VString IS NULL

    RETURN '';

    SET @HString = RTRIM(LTRIM(@VString));

    SET @I = CHARINDEX(' ',@HString);

    -- If there are no spaces just make the first character upper case and return it

    IF @I = 0

    BEGIN

    SET @RetVar = UPPER(SUBSTRING(@HString,1,1));

    SET @RetVar = @RetVar + LOWER(SUBSTRING(@HString,2,LEN(@HString) - 1));

    RETURN @RetVar;

    END

    WHILE (@I != 0)

    BEGIN

    SET @RetVar = UPPER(SUBSTRING(@HString,1,1));

    SET @RetVar = @RetVar + LOWER(SUBSTRING(@HString,2,@I - 1));

    INSERT@PartTable

    SELECT@RetVar;

    SET @HString = LTRIM(RTRIM(SUBSTRING(@HString,@I+1,LEN(@HString) - @I)));

    SET @I = CHARINDEX(' ',@HString);

    END

    IF LEN(@HString) > 0

    BEGIN

    SET @RetVar = UPPER(SUBSTRING(@HString,1,1));

    SET @RetVar = @RetVar + LOWER(SUBSTRING(@HString,2,LEN(@HString) - 1));

    INSERT@PartTable

    SELECT@RetVar;

    END

    SET @HString = '';

    WHILE ((SELECT COUNT(*) FROM @PartTable) > 0)

    BEGIN

    SELECTTOP (1) @HWord = LTRIM(RTRIM(StringPart))

    FROM@PartTable;

    DELETETOP (1)

    FROM@PartTable;

    IF @HString = ''

    SET @HString = @HWord;

    ELSE

    SET @HString = @HString + ' ' + @HWord;

    END

    --RETURN @DBVar;

    RETURN @HString;

    END

  • Similar to some of the posts above - here's my version using a function. It performs well.

    Here's the function:

    CREATE FUNCTION dbo.fn_title_case

    (

    @string varchar(max)

    )

    RETURNS varchar(max)

    AS

    BEGIN

    DECLARE @curr char(1);

    DECLARE @len int;

    DECLARE @loc int;

    DECLARE @out_string varchar(max);

    DECLARE @prev_alpha char(1);

    SET @out_string = '';

    SET @prev_alpha = 'N'

    SET @len = LEN(@string);

    SET @loc = 1;

    WHILE @loc <= @len

    BEGIN

    SET @curr = SUBSTRING(@string,@loc,1);

    IF @curr LIKE '[A-Z0-9]'

    BEGIN

    IF @prev_alpha = 'Y'

    SET @out_string = @out_string + LOWER(@curr);

    ELSE

    SET @out_string = @out_string + UPPER(@curr);

    SET @prev_alpha = 'Y';

    END

    ELSE

    BEGIN

    SET @out_string = @out_string + @curr;

    SET @prev_alpha = 'N';

    END

    SET @loc = @loc + 1;

    END

    RETURN @out_string;

    END

    Then apply it as follows (using the table name from the original post):

    SELECT dbo.fn_decapitalise(Location) into #result from Import_Data_Filter;

  • Jan Van der Eecken (10/18/2010)


    Hi Brandie,

    I guess the final update back will fail on a database with a case sensitive collation. Shouldn't it rather read:

    Update idf

    Set Location = mt1.Location

    from dbo.Import_Data_Filter idf

    join dbo.#MyTemp1 mt1

    on UPPER(idf.Location) = UPPER(mt1.Location);

    Sorry it's been so long since you posted. I was working on massive projects.

    Actually, Jan, your code would set everything back to Upper case which is what the customer did NOT want. He wanted Camel Case names.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • Carl Federl (10/18/2010)


    Perhaps I am missing something but what is wrong with a solution that has no cursors, whiles or UDFs ?

    Maybe I'm misreading your solution, but it doesn't seem to account for spaces between names. Such as "St. Augustine". Your solution would make it "St. augustine".

    I haven't tested this, though. I'm going off what I see.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • Thanks to everyone who's posted a response. I see I have a lot of testing to do.

    Someone mentioned removing extraneous spaces. In this particular case, there are no "extraneous" spaces. The spaces are supposed to be there. We wouldn't want "Little Rock", as in the city from Arkansas, to end up as "Littlerock" or "LittleRock" as that would be bad data. And if there were extraneous spaces, I wouldn't have had to jump through hoops to get them. Just do a REPLACE(location,Space(1),'') and then capitalize the first letter without a looping process.

    Still, I appreciate all the input. I do agree that if there are this many solutions to the problem that SQL Server aught to have a proper function for this.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • Why nobody likes my solution without UDF, WHILEs, Dynamic SQL statments, etc ...

    ๐Ÿ™‚

    Create Table #Import_Data_Filter (MyID Int Identity(1,1) NOT NULL, Location varchar(100))

    Insert into #Import_Data_Filter (Location)

    (Select Lower('Mandarin') UNION ALL Select Lower('San Jose') UNION ALL Select Lower('Baymeadows') UNION ALL

    Select Lower('My FH Locale') UNION ALL Select Lower('St. Augustine') UNION ALL Select Lower('Test For Three Spaces')

    UNION ALL Select Lower('Test for being Four Spaces') UNION ALL Select Lower('Test for being Five More Spaces')

    UNION ALL Select Lower('Baymeadows') UNION ALL Select Lower('St. Augustine'))

    update #Import_Data_Filter set

    location = char(160)+replace(location,' ',char(160))

    r:

    update #import_data_filter set

    location = replace(

    location,

    char(160)+substring(location,charindex(char(160),location)+1,1),

    ' '+UPPER(substring(location,charindex(char(160),location)+1,1))

    )

    where charindex(char(160),location)<>0

    if @@rowcount>0 goto r

    select ltrim(location) from #Import_Data_Filter

    drop table #Import_Data_Filter

  • arty 15255 (11/15/2010)


    Why nobody likes my solution without UDF, WHILEs, Dynamic SQL statments, etc ...

    ๐Ÿ™‚

    ...snip...

    Because it has a loop?

    Or because it has to perform n table scans, where n = the maximum number of words in the target column?

    โ€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.โ€ - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Does that mean that looping through each character is better?

  • Hi Brandie,

    No it would not, it is just in the join that it compares the upper-case version of the camel-cased string to the upper-case version of the original string. If you don't do that on a case-sensitive database, then that join would not return any matches, and no updates would take place at all (assuming that the original and the camel-cased versions to indeed differ).

    --------------------------------------------------------------------------
    A little knowledge is a dangerous thing (Alexander Pope)
    In order for us to help you as efficiently as possible, please read this before posting (courtesy of Jeff Moden)[/url]

  • arty 15255 (11/15/2010)


    Does that mean that looping through each character is better?

    Yes, if it means avoiding an unnecessary table scan!

    if @@rowcount>0 goto r

    "If the last iteration f the UPDATE performed any work, then scan the table again"

    โ€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.โ€ - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Now it is clear.

    Thank you,

  • Then, here is mine version of UDF:

    create function fn_convert_ucase

    (@s nvarchar(max))

    returns nvarchar(max)

    as

    begin

    if @s-2 is null goto ex

    set @s-2 = char(160)+replace(@s,' ',char(160))

    while charindex(char(160),@s)<>0 begin

    set @s-2 = replace(@s,char(160)+substring(@s,charindex(char(160),@s)+1,1),' '+UPPER(substring(@s,charindex(char(160),@s)+1,1)))

    end

    ex:

    return @s-2

    end

Viewing 15 posts - 16 through 30 (of 31 total)

You must be logged in to reply to this topic. Login to reply