Detective Stories - Changing the Case

  • Brandie Tarvin

    SSC Guru

    Points: 172743

    Comments posted to this topic are about the item Detective Stories - Changing the Case

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • mister.magoo

    SSC-Forever

    Points: 47068

    There was a thread about this a while back where Paul White showed an excellent SQLCLR method http://www.sqlservercentral.com/Forums/FindPost910545.aspx

    MM



    select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);

  • Forum Etiquette: How to post Reporting Services problems
  • [/url]
  • Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
  • [/url]
  • How to Post Performance Problems - by Gail Shaw
  • [/url]

  • LutzM

    SSC Guru

    Points: 107049

    How about this?

    ;WITH cte AS

    -- split each string by character and decide UPPER or lower based on previous character or being the first char

    (

    SELECT

    N,

    Id,

    CASE

    WHEN N=1 OR (N>1 AND SUBSTRING(location ,N-1,1 ) = ' ')

    THEN UPPER(SUBSTRING(location ,N ,1))

    ELSE LOWER(SUBSTRING(location ,N ,1))

    END AS split

    FROM Tally

    CROSS APPLY -- apply the code to each location

    ( SELECT Location, min(MyId) AS Id

    FROM Import_Data_Filter

    GROUP BY Location

    )y

    WHERE N < LEN(' ' + location + ' ')

    )

    -- and put it back together

    SELECT

    REPLACE((SELECT '' + split FROM cte c2 WHERE c2.Id = c1.Id ORDER BY N FOR XML PATH('')),' ',' ')

    FROM cte c1

    GROUP BY Id



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • robr-793239

    SSC-Addicted

    Points: 471

    I've read through the examples and find the variety of techniques interesting. When I had to solve the problem I was looking at fixing the data as it went into the database, so my solution is a generalized function that can be applied with an Insert or Update process. I offer it as another solution if anyone finds it of interest.

    CREATE FUNCTION [dbo].[FormalCase]

    (

    @Input varchar(255)

    )

    RETURNS varchar(255)

    AS

    BEGIN

    DECLARE @Results varchar(255)

    if len(@Input)>0

    Begin

    Set @Input = lower(ltrim(rtrim(@Input)))

    Declare @NextSpace int, @LastSpace int

    Set @LastSpace = 0

    Set @NextSpace = charindex(char(32),@Input,1)

    While @NextSpace>@LastSpace

    Begin

    Set @Input = Left(@Input, @NextSpace) + upper(substring(@Input,@NextSpace + 1, 1)) + Right(@Input, len(@Input)-(@Nextspace+1))

    Set @LastSpace = @NextSpace

    Set @NextSpace = charindex(char(32),@Input,@LastSpace + 1)

    End

    SELECT @Results = Upper(left(@Input,1)) + right(@Input, len(@Input)-1)

    End

    Else

    Set @Results= ''

    RETURN @Results

    END

    /* UNIT TESTING

    Select dbo.formalcase('a stitch in time')

    */

    GO

  • arty 15255

    Old Hand

    Points: 341

    my solution:

    Create Table #Import_Data_Filter (MyID Int Identity(1,1) NOT NULL, Location varchar(100))

    Insert into #Import_Data_Filter (Location)

    (Select Lower('Mandarin') UNION ALL Select Lower('San Jose') UNION ALL Select Lower('Baymeadows') UNION ALL

    Select Lower('My FH Locale') UNION ALL Select Lower('St. Augustine') UNION ALL Select Lower('Test For Three Spaces')

    UNION ALL Select Lower('Test for being Four Spaces') UNION ALL Select Lower('Test for being Five More Spaces')

    UNION ALL Select Lower('Baymeadows') UNION ALL Select Lower('St. Augustine'))

    update #Import_Data_Filter set

    location = char(160)+replace(location,' ',char(160))

    r:

    update #import_data_filter set

    location = replace(

    location,

    char(160)+substring(location,charindex(char(160),location)+1,1),

    ' '+UPPER(substring(location,charindex(char(160),location)+1,1))

    )

    where charindex(char(160),location)<>0

    if @@rowcount>0 goto r

    select ltrim(location) from #Import_Data_Filter

    drop table #Import_Data_Filter

    p.s.: You can change char(160) to anything.

  • sjones

    Grasshopper

    Points: 20

    I inherited this function which uses regular expressions to look for where a letter should be capitalised. In the current incarnation, it looks for the pattern, space, comma, apostrophe (uses '', because the apostrophe will be embedded in a string, followed by a letter.

    ALTER FUNCTION [dbo].[fnProperCase]

    (

    @String varchar(1023)

    )

    RETURNS varchar(1023)

    AS

    BEGIN

    DECLARE @Pos int,

    @Temp varchar(1023),

    @Result varchar(1023)

    SET @Temp = LOWER(LTRIM(RTRIM(@String)))

    SET @Result = UPPER(SUBSTRING(@Temp,1,1))

    SET @Temp = SUBSTRING(@Temp,2,1022)

    SET @Pos = PATINDEX('%[ ,''-][a-z]%', @Temp)

    WHILE @Pos > 0

    BEGIN

    SET @Result = @Result + SUBSTRING(@Temp,1,@Pos) + UPPER(SUBSTRING(@Temp,@Pos+1,1))

    SET @Temp = SUBSTRING(@Temp, @Pos + 2, 1022)

    SET @Pos = PATINDEX('%[ ,''-][a-z]%', @Temp)

    END

    SET @Result = @Result + @Temp

    RETURN @Result

    END

  • Jan Van der Eecken

    SSCrazy Eights

    Points: 8890

    Hi Brandie,

    I guess the final update back will fail on a database with a case sensitive collation. Shouldn't it rather read:

    Update idf

    Set Location = mt1.Location

    from dbo.Import_Data_Filter idf

    join dbo.#MyTemp1 mt1

    on UPPER(idf.Location) = UPPER(mt1.Location);

    --------------------------------------------------------------------------
    A little knowledge is a dangerous thing (Alexander Pope)
    In order for us to help you as efficiently as possible, please read this before posting (courtesy of Jeff Moden)[/url]

  • ronmoses@gmail.com

    SSCarpal Tunnel

    Points: 4480

    I'll throw in my proper case function just for laughs...

    ALTER FUNCTION [cf_ProperCase] (@String varchar(MAX))

    RETURNS varchar(MAX)

    AS BEGIN

    DECLARE @ReturnString varchar(max),

    @i int;

    SET @ReturnString = ''

    SET @i = 1

    WHILE @i <= LEN(@String)

    BEGIN

    SET @ReturnString = @ReturnString +

    (CASE WHEN (@i = 1 OR SUBSTRING(@String, @i - 1, 1) = ' ')

    THEN UPPER(SUBSTRING(@String, @i, 1))

    ELSE LOWER(SUBSTRING(@String, @i, 1))

    END)

    SET @i = @i + 1

    END

    RETURN @ReturnString

    END

    Very brute-force, but it works.

    Ron Moses

    -----
    a haiku...

    NULL is not zero
    NULL is not an empty string
    NULL is the unknown

  • mrpolecat

    SSCertifiable

    Points: 6934

    I think I originally got this from Steve Jones or at least one of his posts. Made a couple of tweaks to handle Roman Numerals because we mostly use it for names.

    SET QUOTED_IDENTIFIER ON

    GO

    SET ANSI_NULLS ON

    GO

    ALTER procedure sp_ProperCase

    @table varchar(50),@field varchar(50)

    as

    -- Replace spaces with the "special" character

    exec('update ' + @table + '

    set ' + @field + ' = lower(replace( ' + @field + ' , '' '', ''@''))')

    -- Handle case 1 - First item

    exec('update ' + @table + '

    set ' + @field + ' = upper( substring( ltrim( ' + @field + ' ), 1, 1)) + substring( ltrim( ' + @field + ' ), 2, 80)')

    -- loop while there are rows with the flag

    exec('while exists(

    select *

    from ' + @table + '

    where ' + @field + ' like ''%@%''

    )

    begin

    -- Proper case the word after the flag.

    update ' + @table + '

    set ' + @field + ' = substring( ' + @field + ' , 1, charindex( ''@'', ' + @field + ' )) +

    upper( substring( ' + @field + ' , charindex( ''@'', ' + @field + ' )+1, 1 )) +

    substring( ' + @field + ' , charindex( ''@'', ' + @field + ' )+2, 80)

    where ' + @field + ' like ''%@%''

    -- Remove the first flag encountered in each row

    update ' + @table + '

    set ' + @field + ' = substring( ' + @field + ' , 1, charindex( ''@'', ' + @field + ' )-1) +

    '' '' + substring( ' + @field + ' , charindex( ''@'', ' + @field + ' ) + 1, 80)

    where ' + @field + ' like ''%@%''

    end')

    exec('update ' + @table + ' set ' + @field + ' = replace(' + @field + ','' ii'','' II'') where ' + @field + ' like ''% ii''')

    exec('update ' + @table + ' set ' + @field + ' = replace(' + @field + ','' iii'','' III'') where ' + @field + ' like ''% iii''')

    GO

    SET QUOTED_IDENTIFIER OFF

    GO

    SET ANSI_NULLS ON

    GO


  • sdsiedentop

    Grasshopper

    Points: 23

    Back in the "olden days" when I did FoxPro development, I used a function called PROPER() quite frequently. I'm not sure why the SQL development team didn't include such a function in their product, but I digress...

    I've come up with a number of ways over the years to replicate the FoxPro PROPER() function in SQL and it makes me happy to see that I'm not the only one that needed such functionality.

    SQL is great and I still love FoxPro. In fact, I can do anything in FoxPro....except get a job.

  • mark-1062149

    Newbie

    Points: 8

    I came up with this: It's a set based approach, rather than working all the way through the string. Check for any strings which are suspect then fix the first instance within the string. I also used a DONE flag so as not to hit every row in the table at every iteration.

    Then loop round again for any rows which still have an instance of the suspect string. I tried it with some bodged data from AdventureWorksDW..DimCustomer (18,000+ rows) and it only took 4 seconds.

    -- CREATE TEST DATA

    drop table ImportDataFilter

    go

    Create Table ImportDataFilter (MyID Int Identity(1,1) NOT NULL, Location varchar(100), Comments varchar(20))

    GO

    Insert into ImportDataFilter (Location, Comments)

    (Select Upper('Mandarin'),'' UNION ALL Select Upper('San Jose'),'' UNION ALL Select Upper('Baymeadows'),'' UNION ALL

    Select Upper('My FH Locale'),'' UNION ALL Select Upper('St. Augustine'),'' UNION ALL Select Upper('Test For Three Spaces'),'Concatenated Spaces'

    UNION ALL Select Upper('Test for being Four Spaces'),'Concatenated Spaces' UNION ALL Select Upper('Test for being Five More Spaces '),'Concatenated Spaces'

    UNION ALL Select Upper('Baymeadows'),'' UNION ALL Select Upper('St. Augustine'),'' )

    -- test with leading space

    UNION ALL Select Upper(' Oswestry'),''

    GO

    If (Select Object_ID('tempdb..#MyTemp2')) is NOT NULL

    Drop table #MyTemp2; -- Drop temp table if it already exists

    Select Distinct Location into dbo.#MyTemp2 from dbo.ImportDataFilter; --Get distinct values for all locations

    Update mt1 Set Location = Lower(mt1.Location) from dbo.#MyTemp2 mt1; -- Set entire string to lower case letters

    -- Brandie's article mentioned double spaces - this will do the trick

    -- Now get rid of conatenated spaces

    -- simply loop while there are still instances of SPACESPACE

    Declare @spFlag int , @debugcount int

    select @spFlag = 1, @debugcount = 0

    while @spFlag <> 0

    BEGIN

    Update #MyTemp2 set Location = replace(Location, ' ',' ')

    -- we only want to do the rows affected and not the whole table every time

    where charindex(' ', Location) > 0

    -- how many rows are left : if zero then quit

    Select @spFlag = count(*) from #MyTemp2 where charindex(' ', Location) > 0

    -- just in case we spiral out of control

    Select @debugcount = @debugcount +1

    if @debugcount = 100

    SELECT @spFlag = 0 -- just in case

    -- print @debugcount -- debug

    END

    -- select * from #MyTemp2

    -- ===================================================================

    -- Now for the title case

    -- First: do the Very First char: a blanket approach for ease

    Update #MyTemp2 set Location = upper(left(Location,1)) + substring(Location,2,300)

    -- add a column to record the position of the next space which needs attention

    Alter table #MyTemp2 add spacepos int, DONE int

    go

    -- initialise

    update #MyTemp2 set spacepos = charindex(' ',Location), DONE = 0

    -- NOW LOOP

    Declare @ucFlag int

    select @ucFlag = 1

    WHILE @ucFlag <> 0

    BEGIN

    -- process the Location

    UPDATE #MyTemp2 SET

    Location = LEFT(Location, spacepos) + upper(substring(Location,spacepos + 1,1)) + substring(Location,spacepos + 2,250)

    WHERE spacepos <> 0 and DONE = 0

    -- get new spacepos

    UPDATE #MyTemp2 SET spacepos = charindex(' ',Location, spacepos + 1)

    WHERE spacepos <> 0 and DONE = 0

    SELECT @ucFlag = count(*) from #MyTemp2 where spacepos <> 0

    -- are any spacepos now = 0 ?

    -- if so then set DONE

    UPDATE #MyTemp2 SET DONE = 1

    where spacepos = 0 and DONE = 0

    END

    select * from #MyTemp2

    -- then you can update your original table

  • Carl Federl

    One Orange Chip

    Points: 25384

    Perhaps I am missing something but what is wrong with a solution that has no cursors, whiles or UDFs ?

    use tempdb

    go

    Create Table Import_Data_Filter (MyID Int Identity(1,1) NOT NULL, Location varchar(100))

    GO

    Insert into Import_Data_Filter (Location)

    (Select Upper('Mandarin') UNION ALL Select Upper('San Jose') UNION ALL Select Upper('Baymeadows') UNION ALL

    Select Upper('My FH Locale') UNION ALL Select Upper('St. Augustine') UNION ALL Select Upper('Test For Three Spaces')

    UNION ALL Select Upper('Test for being Four Spaces') UNION ALL Select Upper('Test for being Five More Spaces')

    UNION ALL Select Upper('Baymeadows') UNION ALL Select Upper('St. Augustine'))

    GO

    The simplist SQL:

    use tempdb

    go

    BEGIN TRAN;

    UPDATE IDF

    SET Location = Import_Data_Fixed.LocationMixed

    FROM Import_Data_Filter as IDF

    JOIN (select MyID

    , UPPER(Substring(Location,1,1)) -- First Character is always UPPER CASE

    + REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE

    (REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE

    (REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(

    LOWER(SUBSTRING(Location,2,100))

    , ' a', ' A'), ' b', ' B'), ' c', ' C'), ' d', ' D'), ' e', ' E'), ' f', ' F')

    , ' g', ' G'), ' h', ' H'), ' i', ' I'), ' j', ' J'), ' k', ' K'), ' l', ' L')

    , ' m', ' M'), ' n', ' N'), ' o', ' O'), ' p', ' P'), ' q', ' Q'), ' r', ' R')

    , ' s', ' S'), ' t', ' T'), ' u', ' U'), ' v', ' V'), ' w', ' W'), ' x', ' X')

    , ' y', ' Y'), ' z', ' Z')

    as LocationMixed

    from Import_Data_Filter

    ) AS Import_Data_Fixed

    on Import_Data_Fixed.MyID = IDF.MyID

    AND Import_Data_Fixed.LocationMixed <> IDF.Location

    ;

    select * from Import_Data_Filter;

    rollback;

    SQL = Scarcely Qualifies as a Language

  • Alan G-436699

    SSC-Addicted

    Points: 474

    Assuming these functions have to be used for English, don't forget all the special cases like "McDonald" and "MacDonald". But if you fix these, then you don't want this to happen: "BraMcOte" or "MaHon". Addresses are an interesting one too: You probably want "Apt 1a" rather than "Apt 1A". I think a function that uses UPPER and LOWER that can be simply called in any SQL update is the way to go. Perhaps you could use a parameter to tell the function whether you are proper casing a name, address or something else so you can code in the special rules discussed above?

  • Carl Federl

    One Orange Chip

    Points: 25384

    If you collation is Case Insensitive, then to only update rows when values would have changed, the SQL needs to cast to a Case Sensitive collation:

    AND CAST( Import_Data_Fixed.LocationMixed as varchar(100) ) COLLATE SQL_Latin1_General_CP1_CS_AS

    <> CAST( IDF.Location as varchar(100) ) COLLATE SQL_Latin1_General_CP1_CS_AS

    SQL = Scarcely Qualifies as a Language

  • SQLRNNR

    SSC Guru

    Points: 281243

    Thanks Brandie

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • Viewing 15 posts - 1 through 15 (of 32 total)

    You must be logged in to reply to this topic. Login to reply