Can I Proper case a UK address string with postcode with a function

  • Hi All

    Thanks in advance for any solutions I have been doing some searching and found pattern matching for UK post code but need to find a way to get this to work
    Here are some example of the strings

    999, Main Road, Some Town, London, NN66 2BN
    THE SQUAT, END OF MY ROAD, OTHER PLACE, LONDON, NX1 2BD -- Some with ,

    999  Main Road  Some Town  London  NN66 2BN
    THE SQUAT  END OF MY ROAD  OTHER PLACE  LONDON  NX1 2BD -- Some with no ,

    999, Main Road Some Town, London  NN66  2BN
    THE SQUAT, END OF MY ROAD OTHER  PLACE, LONDON, NX1  2BD -- Some with & without and double space or tab,

    I need the address string to all be first letter of each word UPPER and rest LOWER except the post code which I need in ALL UPPER please I have tried a few things but no joy sorry to say 🙁

    And I know having the address in one long string is not the best but this is a DB we have inherited so I have no control

    Kind regards
    Stig of the dump

  • Well, if you at least have spaces between each element, you could use a string splitter (see Jeff Moden's article here:  http://www.sqlservercentral.com/articles/72993/ ) to get to each element, then as long as an element is not either of the last 2 elements, you UPPER() the leftmost character, and LOWER() the remainder.  Otherwise, the last 2 elements are treated to UPPER() in their entirety.   Then you reassemble the address using STUFF and FOR XML PATH('') and a space for the separator.

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • sgmunson - Thursday, July 12, 2018 6:24 AM

    Well, if you at least have spaces between each element, you could use a string splitter (see Jeff Moden's article here:  http://www.sqlservercentral.com/articles/72993/ ) to get to each element, then as long as an element is not either of the last 2 elements, you UPPER() the leftmost character, and LOWER() the remainder.  Otherwise, the last 2 elements are treated to UPPER() in their entirety.   Then you reassemble the address using STUFF and FOR XML PATH('') and a space for the separator.

    Hi Steve

    Thanks for the post I had thought of having to do that but just wanted to know if anyone had found another way

    Kind regards
    Stig of the dump

  • Northern Monkey - Thursday, July 12, 2018 7:32 AM

    sgmunson - Thursday, July 12, 2018 6:24 AM

    Well, if you at least have spaces between each element, you could use a string splitter (see Jeff Moden's article here:  http://www.sqlservercentral.com/articles/72993/ ) to get to each element, then as long as an element is not either of the last 2 elements, you UPPER() the leftmost character, and LOWER() the remainder.  Otherwise, the last 2 elements are treated to UPPER() in their entirety.   Then you reassemble the address using STUFF and FOR XML PATH('') and a space for the separator.

    Hi Steve

    Thanks for the post I had thought of having to do that but just wanted to know if anyone had found another way

    Kind regards
    Stig of the dump

    Yep.  Understand.   This is one of those "consequences of bad database design" scenarios.   I honestly can't see ANY other way that has a chance at working.   At least Jeff's function is lightning fast....

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • A tightly-written standard scalar function is another option.  I'll give you mine.  But you'll have to add the UK post code part to it, since this code was not written with that requirement.


    SET ANSI_NULLS ON;
    SET QUOTED_IDENTIFIER ON;
    GO
    CREATE FUNCTION [dbo].[Proper_Case] (
        @string varchar(2000)
    )
    RETURNS varchar(2000)
    AS
    BEGIN /*FUNCTION*/

    DECLARE @previousByte int
    DECLARE @byte int

    SET @string = LOWER(@string)
    IF LEFT(@string, 1) LIKE '[a-z]'
        SET @string = STUFF(@string, 1, 1, UPPER(LEFT(@string, 1)))
    SET @previousByte = 2

    WHILE 1 = 1
    BEGIN
      SET @byte = PATINDEX('%[^a-z0-9][a-z]%', SUBSTRING(@string, @previousByte, 2000))
      IF @byte = 0
          BREAK
      SET @string = STUFF(@string, @previousByte + @byte, 1,
          UPPER(SUBSTRING(@string, @previousByte + @byte, 1)))
      SET @previousByte = @previousByte + @byte + 1
    END /*WHILE*/

    RETURN @string

    END /*FUNCTION*/

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • Northern Monkey - Thursday, July 12, 2018 4:32 AM

    Hi All

    Thanks in advance for any solutions I have been doing some searching and found pattern matching for UK post code but need to find a way to get this to work
    Here are some example of the strings

    999, Main Road, Some Town, London, NN66 2BN
    THE SQUAT, END OF MY ROAD, OTHER PLACE, LONDON, NX1 2BD -- Some with ,

    999  Main Road  Some Town  London  NN66 2BN
    THE SQUAT  END OF MY ROAD  OTHER PLACE  LONDON  NX1 2BD -- Some with no ,

    999, Main Road Some Town, London  NN66  2BN
    THE SQUAT, END OF MY ROAD OTHER  PLACE, LONDON, NX1  2BD -- Some with & without and double space or tab,

    I need the address string to all be first letter of each word UPPER and rest LOWER except the post code which I need in ALL UPPER please I have tried a few things but no joy sorry to say 🙁

    And I know having the address in one long string is not the best but this is a DB we have inherited so I have no control

    Kind regards
    Stig of the dump

    Quick question, why is the case important, normally one just throws everything in upper or lower case when matching? Are you doing something like a case sensitive edit distance algorithm?
    😎

    Another approach could be to normalize the data, plenty of referential data available, makes a set based approach easy.

  • Rather than repeatedly having to identity the post code, add an AFTER INSERT, UPDATE trigger that finds the post code and then store its byte location and length in a column(s).  Then you can quickly and easily upper-case the post code after the entire thing has been proper-cased.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • This is the code we have been using to do this, it's almost certainly not the best way of doing it but it works

    DECLARE @index int,@Char CHAR(1),@InputString VARCHAR(255), @OutputString VARCHAR(255),@postlength int, @postcode char (10)
    SET @InputString = 'THE SQUAT, END OF MY ROAD OTHER PLACE, LONDON NX1 2BD'

    select @postcode = ltrim(right(rtrim(@InputString),8))
    if @postcode <> '' and right(left(@postcode,3),1) not like '[0-9]'
    select @postcode = '', @postlength = 0
    else if right(left(@postcode,4),1) not like '[0-9]'
    select @postlength = 8
    else
    select @postlength = 9

    select @outputstring = LOWER(substring(@InputString,1,len(@InputString)-@postlength))
    SET @index = 2
    SET @OutputString = STUFF(@OutputString, 1, 1,UPPER(SUBSTRING(@OutputString,1,1)))

    WHILE @index <= LEN(@InputString)-@postlength
    BEGIN
    SET @Char = SUBSTRING(@InputString, @index, 1)
    IF @Char IN (' ', ';', ':', '!', '?', ',', '.', '_', '-', '/', '&','''', '(')
    IF @index + 1 <= LEN(@InputString)-@postlength
    BEGIN
    IF @Char != ''''
    SET @OutputString =
     STUFF(@OutputString, @index + 1, 1,UPPER(SUBSTRING(@InputString, @index + 1, 1)))
    END
    SET @index = @index + 1
    END

    select locaddress1 =
    CASE WHEN @postlength > 0 then
    ISNULL(@OutputString+(right(@InputString,@postlength)),'')
    ELSE
    ISNULL(@OutputString,'')
    END

  • Thanks for all the reply's I've been fire fighting for the last two days and now playing catch up so I will take a look at the options next week 🙂

    Many Thanks
    S

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply