Detective Stories - Changing the Case

  • Comments posted to this topic are about the item Detective Stories - Changing the Case

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • There was a thread about this a while back where Paul White showed an excellent SQLCLR method http://www.sqlservercentral.com/Forums/FindPost910545.aspx

    MM



    select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);

  • Forum Etiquette: How to post Reporting Services problems
  • [/url]
  • Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
  • [/url]
  • How to Post Performance Problems - by Gail Shaw
  • [/url]

  • How about this?

    ;WITH cte AS

    -- split each string by character and decide UPPER or lower based on previous character or being the first char

    (

    SELECT

    N,

    Id,

    CASE

    WHEN N=1 OR (N>1 AND SUBSTRING(location ,N-1,1 ) = ' ')

    THEN UPPER(SUBSTRING(location ,N ,1))

    ELSE LOWER(SUBSTRING(location ,N ,1))

    END AS split

    FROM Tally

    CROSS APPLY -- apply the code to each location

    ( SELECT Location, min(MyId) AS Id

    FROM Import_Data_Filter

    GROUP BY Location

    )y

    WHERE N < LEN(' ' + location + ' ')

    )

    -- and put it back together

    SELECT

    REPLACE((SELECT '' + split FROM cte c2 WHERE c2.Id = c1.Id ORDER BY N FOR XML PATH('')),' ',' ')

    FROM cte c1

    GROUP BY Id



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • I've read through the examples and find the variety of techniques interesting. When I had to solve the problem I was looking at fixing the data as it went into the database, so my solution is a generalized function that can be applied with an Insert or Update process. I offer it as another solution if anyone finds it of interest.

    CREATE FUNCTION [dbo].[FormalCase]

    (

    @Input varchar(255)

    )

    RETURNS varchar(255)

    AS

    BEGIN

    DECLARE @Results varchar(255)

    if len(@Input)>0

    Begin

    Set @Input = lower(ltrim(rtrim(@Input)))

    Declare @NextSpace int, @LastSpace int

    Set @LastSpace = 0

    Set @NextSpace = charindex(char(32),@Input,1)

    While @NextSpace>@LastSpace

    Begin

    Set @Input = Left(@Input, @NextSpace) + upper(substring(@Input,@NextSpace + 1, 1)) + Right(@Input, len(@Input)-(@Nextspace+1))

    Set @LastSpace = @NextSpace

    Set @NextSpace = charindex(char(32),@Input,@LastSpace + 1)

    End

    SELECT @Results = Upper(left(@Input,1)) + right(@Input, len(@Input)-1)

    End

    Else

    Set @Results= ''

    RETURN @Results

    END

    /* UNIT TESTING

    Select dbo.formalcase('a stitch in time')

    */

    GO

  • my solution:

    Create Table #Import_Data_Filter (MyID Int Identity(1,1) NOT NULL, Location varchar(100))

    Insert into #Import_Data_Filter (Location)

    (Select Lower('Mandarin') UNION ALL Select Lower('San Jose') UNION ALL Select Lower('Baymeadows') UNION ALL

    Select Lower('My FH Locale') UNION ALL Select Lower('St. Augustine') UNION ALL Select Lower('Test For Three Spaces')

    UNION ALL Select Lower('Test for being Four Spaces') UNION ALL Select Lower('Test for being Five More Spaces')

    UNION ALL Select Lower('Baymeadows') UNION ALL Select Lower('St. Augustine'))

    update #Import_Data_Filter set

    location = char(160)+replace(location,' ',char(160))

    r:

    update #import_data_filter set

    location = replace(

    location,

    char(160)+substring(location,charindex(char(160),location)+1,1),

    ' '+UPPER(substring(location,charindex(char(160),location)+1,1))

    )

    where charindex(char(160),location)<>0

    if @@rowcount>0 goto r

    select ltrim(location) from #Import_Data_Filter

    drop table #Import_Data_Filter

    p.s.: You can change char(160) to anything.

  • I inherited this function which uses regular expressions to look for where a letter should be capitalised. In the current incarnation, it looks for the pattern, space, comma, apostrophe (uses '', because the apostrophe will be embedded in a string, followed by a letter.

    ALTER FUNCTION [dbo].[fnProperCase]

    (

    @String varchar(1023)

    )

    RETURNS varchar(1023)

    AS

    BEGIN

    DECLARE @Pos int,

    @Temp varchar(1023),

    @Result varchar(1023)

    SET @Temp = LOWER(LTRIM(RTRIM(@String)))

    SET @Result = UPPER(SUBSTRING(@Temp,1,1))

    SET @Temp = SUBSTRING(@Temp,2,1022)

    SET @Pos = PATINDEX('%[ ,''-][a-z]%', @Temp)

    WHILE @Pos > 0

    BEGIN

    SET @Result = @Result + SUBSTRING(@Temp,1,@Pos) + UPPER(SUBSTRING(@Temp,@Pos+1,1))

    SET @Temp = SUBSTRING(@Temp, @Pos + 2, 1022)

    SET @Pos = PATINDEX('%[ ,''-][a-z]%', @Temp)

    END

    SET @Result = @Result + @Temp

    RETURN @Result

    END

  • Hi Brandie,

    I guess the final update back will fail on a database with a case sensitive collation. Shouldn't it rather read:

    Update idf

    Set Location = mt1.Location

    from dbo.Import_Data_Filter idf

    join dbo.#MyTemp1 mt1

    on UPPER(idf.Location) = UPPER(mt1.Location);

    --------------------------------------------------------------------------
    A little knowledge is a dangerous thing (Alexander Pope)
    In order for us to help you as efficiently as possible, please read this before posting (courtesy of Jeff Moden)[/url]

  • I'll throw in my proper case function just for laughs...

    ALTER FUNCTION [cf_ProperCase] (@String varchar(MAX))

    RETURNS varchar(MAX)

    AS BEGIN

    DECLARE @ReturnString varchar(max),

    @i int;

    SET @ReturnString = ''

    SET @i = 1

    WHILE @i <= LEN(@String)

    BEGIN

    SET @ReturnString = @ReturnString +

    (CASE WHEN (@i = 1 OR SUBSTRING(@String, @i - 1, 1) = ' ')

    THEN UPPER(SUBSTRING(@String, @i, 1))

    ELSE LOWER(SUBSTRING(@String, @i, 1))

    END)

    SET @i = @i + 1

    END

    RETURN @ReturnString

    END

    Very brute-force, but it works.

    Ron Moses

    -----
    a haiku...

    NULL is not zero
    NULL is not an empty string
    NULL is the unknown

  • I think I originally got this from Steve Jones or at least one of his posts. Made a couple of tweaks to handle Roman Numerals because we mostly use it for names.

    SET QUOTED_IDENTIFIER ON

    GO

    SET ANSI_NULLS ON

    GO

    ALTER procedure sp_ProperCase

    @table varchar(50),@field varchar(50)

    as

    -- Replace spaces with the "special" character

    exec('update ' + @table + '

    set ' + @field + ' = lower(replace( ' + @field + ' , '' '', ''@''))')

    -- Handle case 1 - First item

    exec('update ' + @table + '

    set ' + @field + ' = upper( substring( ltrim( ' + @field + ' ), 1, 1)) + substring( ltrim( ' + @field + ' ), 2, 80)')

    -- loop while there are rows with the flag

    exec('while exists(

    select *

    from ' + @table + '

    where ' + @field + ' like ''%@%''

    )

    begin

    -- Proper case the word after the flag.

    update ' + @table + '

    set ' + @field + ' = substring( ' + @field + ' , 1, charindex( ''@'', ' + @field + ' )) +

    upper( substring( ' + @field + ' , charindex( ''@'', ' + @field + ' )+1, 1 )) +

    substring( ' + @field + ' , charindex( ''@'', ' + @field + ' )+2, 80)

    where ' + @field + ' like ''%@%''

    -- Remove the first flag encountered in each row

    update ' + @table + '

    set ' + @field + ' = substring( ' + @field + ' , 1, charindex( ''@'', ' + @field + ' )-1) +

    '' '' + substring( ' + @field + ' , charindex( ''@'', ' + @field + ' ) + 1, 80)

    where ' + @field + ' like ''%@%''

    end')

    exec('update ' + @table + ' set ' + @field + ' = replace(' + @field + ','' ii'','' II'') where ' + @field + ' like ''% ii''')

    exec('update ' + @table + ' set ' + @field + ' = replace(' + @field + ','' iii'','' III'') where ' + @field + ' like ''% iii''')

    GO

    SET QUOTED_IDENTIFIER OFF

    GO

    SET ANSI_NULLS ON

    GO


  • Back in the "olden days" when I did FoxPro development, I used a function called PROPER() quite frequently. I'm not sure why the SQL development team didn't include such a function in their product, but I digress...

    I've come up with a number of ways over the years to replicate the FoxPro PROPER() function in SQL and it makes me happy to see that I'm not the only one that needed such functionality.

    SQL is great and I still love FoxPro. In fact, I can do anything in FoxPro....except get a job.

  • I came up with this: It's a set based approach, rather than working all the way through the string. Check for any strings which are suspect then fix the first instance within the string. I also used a DONE flag so as not to hit every row in the table at every iteration.

    Then loop round again for any rows which still have an instance of the suspect string. I tried it with some bodged data from AdventureWorksDW..DimCustomer (18,000+ rows) and it only took 4 seconds.

    -- CREATE TEST DATA

    drop table ImportDataFilter

    go

    Create Table ImportDataFilter (MyID Int Identity(1,1) NOT NULL, Location varchar(100), Comments varchar(20))

    GO

    Insert into ImportDataFilter (Location, Comments)

    (Select Upper('Mandarin'),'' UNION ALL Select Upper('San Jose'),'' UNION ALL Select Upper('Baymeadows'),'' UNION ALL

    Select Upper('My FH Locale'),'' UNION ALL Select Upper('St. Augustine'),'' UNION ALL Select Upper('Test For Three Spaces'),'Concatenated Spaces'

    UNION ALL Select Upper('Test for being Four Spaces'),'Concatenated Spaces' UNION ALL Select Upper('Test for being Five More Spaces '),'Concatenated Spaces'

    UNION ALL Select Upper('Baymeadows'),'' UNION ALL Select Upper('St. Augustine'),'' )

    -- test with leading space

    UNION ALL Select Upper(' Oswestry'),''

    GO

    If (Select Object_ID('tempdb..#MyTemp2')) is NOT NULL

    Drop table #MyTemp2; -- Drop temp table if it already exists

    Select Distinct Location into dbo.#MyTemp2 from dbo.ImportDataFilter; --Get distinct values for all locations

    Update mt1 Set Location = Lower(mt1.Location) from dbo.#MyTemp2 mt1; -- Set entire string to lower case letters

    -- Brandie's article mentioned double spaces - this will do the trick

    -- Now get rid of conatenated spaces

    -- simply loop while there are still instances of SPACESPACE

    Declare @spFlag int , @debugcount int

    select @spFlag = 1, @debugcount = 0

    while @spFlag <> 0

    BEGIN

    Update #MyTemp2 set Location = replace(Location, ' ',' ')

    -- we only want to do the rows affected and not the whole table every time

    where charindex(' ', Location) > 0

    -- how many rows are left : if zero then quit

    Select @spFlag = count(*) from #MyTemp2 where charindex(' ', Location) > 0

    -- just in case we spiral out of control

    Select @debugcount = @debugcount +1

    if @debugcount = 100

    SELECT @spFlag = 0 -- just in case

    -- print @debugcount -- debug

    END

    -- select * from #MyTemp2

    -- ===================================================================

    -- Now for the title case

    -- First: do the Very First char: a blanket approach for ease

    Update #MyTemp2 set Location = upper(left(Location,1)) + substring(Location,2,300)

    -- add a column to record the position of the next space which needs attention

    Alter table #MyTemp2 add spacepos int, DONE int

    go

    -- initialise

    update #MyTemp2 set spacepos = charindex(' ',Location), DONE = 0

    -- NOW LOOP

    Declare @ucFlag int

    select @ucFlag = 1

    WHILE @ucFlag <> 0

    BEGIN

    -- process the Location

    UPDATE #MyTemp2 SET

    Location = LEFT(Location, spacepos) + upper(substring(Location,spacepos + 1,1)) + substring(Location,spacepos + 2,250)

    WHERE spacepos <> 0 and DONE = 0

    -- get new spacepos

    UPDATE #MyTemp2 SET spacepos = charindex(' ',Location, spacepos + 1)

    WHERE spacepos <> 0 and DONE = 0

    SELECT @ucFlag = count(*) from #MyTemp2 where spacepos <> 0

    -- are any spacepos now = 0 ?

    -- if so then set DONE

    UPDATE #MyTemp2 SET DONE = 1

    where spacepos = 0 and DONE = 0

    END

    select * from #MyTemp2

    -- then you can update your original table

  • Perhaps I am missing something but what is wrong with a solution that has no cursors, whiles or UDFs ?

    use tempdb

    go

    Create Table Import_Data_Filter (MyID Int Identity(1,1) NOT NULL, Location varchar(100))

    GO

    Insert into Import_Data_Filter (Location)

    (Select Upper('Mandarin') UNION ALL Select Upper('San Jose') UNION ALL Select Upper('Baymeadows') UNION ALL

    Select Upper('My FH Locale') UNION ALL Select Upper('St. Augustine') UNION ALL Select Upper('Test For Three Spaces')

    UNION ALL Select Upper('Test for being Four Spaces') UNION ALL Select Upper('Test for being Five More Spaces')

    UNION ALL Select Upper('Baymeadows') UNION ALL Select Upper('St. Augustine'))

    GO

    The simplist SQL:

    use tempdb

    go

    BEGIN TRAN;

    UPDATEIDF

    SETLocation = Import_Data_Fixed.LocationMixed

    FROMImport_Data_Filter as IDF

    JOIN(selectMyID

    ,UPPER(Substring(Location,1,1)) -- First Character is always UPPER CASE

    +REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE

    (REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE

    (REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(

    LOWER(SUBSTRING(Location,2,100))

    , ' a', ' A'), ' b', ' B'), ' c', ' C'), ' d', ' D'), ' e', ' E'), ' f', ' F')

    , ' g', ' G'), ' h', ' H'), ' i', ' I'), ' j', ' J'), ' k', ' K'), ' l', ' L')

    , ' m', ' M'), ' n', ' N'), ' o', ' O'), ' p', ' P'), ' q', ' Q'), ' r', ' R')

    , ' s', ' S'), ' t', ' T'), ' u', ' U'), ' v', ' V'), ' w', ' W'), ' x', ' X')

    , ' y', ' Y'), ' z', ' Z')

    as LocationMixed

    fromImport_Data_Filter

    ) AS Import_Data_Fixed

    onImport_Data_Fixed.MyID= IDF.MyID

    AND Import_Data_Fixed.LocationMixed <> IDF.Location

    ;

    select * from Import_Data_Filter;

    rollback;

    SQL = Scarcely Qualifies as a Language

  • Assuming these functions have to be used for English, don't forget all the special cases like "McDonald" and "MacDonald". But if you fix these, then you don't want this to happen: "BraMcOte" or "MaHon". Addresses are an interesting one too: You probably want "Apt 1a" rather than "Apt 1A". I think a function that uses UPPER and LOWER that can be simply called in any SQL update is the way to go. Perhaps you could use a parameter to tell the function whether you are proper casing a name, address or something else so you can code in the special rules discussed above?

  • If you collation is Case Insensitive, then to only update rows when values would have changed, the SQL needs to cast to a Case Sensitive collation:

    AND CAST( Import_Data_Fixed.LocationMixed as varchar(100) ) COLLATE SQL_Latin1_General_CP1_CS_AS

    <> CAST( IDF.Location as varchar(100) ) COLLATE SQL_Latin1_General_CP1_CS_AS

    SQL = Scarcely Qualifies as a Language

  • Thanks Brandie

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • Viewing 15 posts - 1 through 15 (of 31 total)

    You must be logged in to reply to this topic. Login to reply