Does anyone have a good 'validate email' function?

  • With all of the rocket scientist SQL programmers here I was wondering if anyone had a validate email function they would be willing to share?

    We started one and quickly went down the rabbit hole and got over-complicated.

    Or should I look for a CLR solution? Although that might have too much overhead and we'd have to enable it on our PROD server.

    Thanks ahead of time,

    Doug

  • Depends how accurate you want it.

    Want a complete, perfect validation? Use DB_Mail to send an email and have the app ask for the sent code, or validate a clicked on link

    Just the basics? Check that there's at least 1 ., that it's not at the beginning or the end and that there's a single @.

    And read this: http://haacked.com/archive/2007/08/21/i-knew-how-to-validate-an-email-address-until-i.aspx/

    Because, despite being flagged as wrong by many 'validators', this is perfectly valid

    !def!xyz%ab.c@example.com

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (8/27/2014)


    Depends how accurate you want it.

    Want a complete, perfect validation? Use DB_Mail to send an email and have the app ask for the sent code, or validate a clicked on link

    Just the basics? Check that there's at least 1 ., that it's not at the beginning or the end and that there's a single @.

    And read this: http://haacked.com/archive/2007/08/21/i-knew-how-to-validate-an-email-address-until-i.aspx/

    Because, despite being flagged as wrong by many 'validators', this is perfectly valid

    !def!xyz%ab.c@example.com

    I used to use this. Not perfect by any stretch but got most of the junk out. I have another version with about 45 more checks that various PMs insisted on, that I found ridiculous and that slowed it down a bunch. I suppose the bit where it checks domains is out of date now as well, with all the new additions, but you can update if you see fit. The 'REGEX' at the very beginning takes care of most malformed emails, from various counts I took of the column that marked which exception got caught.

    Thanks

    ALTER PROC [dbo].[EmailExcludes]

    @projectID int,

    @email varchar(64)= 'email'

    as

    DECLARE @ProcessTable varchar(64)

    select @ProcessTable=ds_ProcessTable from sample.dbo.ds_dataset where ds_ProjectID=@ProjectID

    declare @SQLString varchar(max)

    -- Remove leading and trailing spaces from email first

    set @sqlString='UPDATE

    SET = ltrim(rtrim())

    FROM

    '

    SET @SQLString=REPLACE(@SQLString,'

    ',@ProcessTable)

    SET @SQLString=REPLACE(@SQLString,'',@email)

    exec (@SQLString)

    --Flag Invalid Emails

    set @sqlString='UPDATE

    SET statusflag = case

    when len() < 5 or is null then 101

    when not like ''[A-Za-z0-9_-.'''']%@[A-Za-z0-9_-]%.[A-Za-z]%[A-Za-z]'' then 106

    when charindex(''@'', ) is null then 108

    when charindex(''.'', ) is null then 103

    when charindex(''_'', ) > charindex(''@'', ) then 105

    when patindex (''%[ &,":;!+=\/()<>?]%'',) > 0 then 106

    when like ''%.@%'' or like ''%@.%'' then 109

    when like ''%@%@%'' then 102

    when like ''%..%'' then 110

    when len(substring(, len() - charindex(''.'', reverse()) + 2, len())) = 3

    and substring(, len() - charindex(''.'', reverse()) + 2, len())

    not in (''biz'',''cat'',''com'',''int'',''net'',''org'',''pro'',''tel'',''xxx'',''edu'',''gov'',''mil'') then 111

    when len(substring(, len() - charindex(''.'', reverse()) + 2, len())) > 3

    and substring(, len() - charindex(''.'', reverse()) + 2, len())

    not in (''aero'',''asia'',''coop'',''info'',''jobs'',''mobi'',''museum'',''name'',''travel'') then 111

    else statusflag

    end

    where statusflag = 0'

    SET @SQLString=REPLACE(@SQLString,'

    ',@ProcessTable)

    SET @SQLString=REPLACE(@SQLString,'',@email)

    exec (@SQLString)

  • sqldriver (8/27/2014)


    I used to use this. Not perfect by any stretch but got most of the junk out.

    Does it flag this as invalid?

    MyAccount+SSC@Gmail.com

    http://gmailblog.blogspot.com/2008/03/2-hidden-ways-to-get-more-from-your.html

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (8/27/2014)


    sqldriver (8/27/2014)


    I used to use this. Not perfect by any stretch but got most of the junk out.

    Does it flag this as invalid?

    MyAccount+SSC@Gmail.com

    http://gmailblog.blogspot.com/2008/03/2-hidden-ways-to-get-more-from-your.html

    It does.

    DECLARE @email VARCHAR(100) = 'MyAccount+SSC@Gmail.com'

    SELECT CASE

    when len(@email) < 5 or @email is null then 101

    when @email not like '[A-Za-z0-9_-.'']%@[A-Za-z0-9_-]%.[A-Za-z]%[A-Za-z]' then 106

    when charindex('@', @email) is null then 108

    when charindex('.', @email) is null then 103

    when charindex('_', @email) > charindex('@', @email) then 105

    when patindex ('%[ &,":;!+=\/()<>?]%',@email) > 0 then 106

    when @email like '%.@%' or @email like '%@.%' then 109

    when @email like '%@%@%' then 102

    when @email like '%..%' then 110

    when len(substring(@email, len(@email) - charindex('.', reverse(@email)) + 2, len(@email))) = 3

    and substring(@email, len(@email) - charindex('.', reverse(@email)) + 2, len(@email))

    not in ('biz','cat','com','int','net','org','pro','tel','xxx','edu','gov','mil') then 111

    when len(substring(@email, len(@email) - charindex('.', reverse(@email)) + 2, len(@email))) > 3

    and substring(@email, len(@email) - charindex('.', reverse(@email)) + 2, len(@email))

    not in ('aero','asia','coop','info','jobs','mobi','museum','name','travel') then 111

    else 'Gail Wins a Pony!'

    END

    So I suppose you could do this. The line in bold is likely made redundant by the initial REGEX-ish pattern match, but at the time I wrote this I felt some degree of empathy for the human being who sifted through our bounceback/reply-to email box after participant emails went out, and wanted to be thorough.

    DECLARE @email VARCHAR(100) = 'MyAccount+SSC@Gmail.com'

    SELECT CASE

    when len(@email) < 5 or @email is null then '101'

    when @email not like '[A-Za-z0-9_-.''+]%@[A-Za-z0-9_-]%.[A-Za-z]%[A-Za-z]' then '106'

    when charindex('@', @email) is null then '108'

    when charindex('.', @email) is null then '103'

    when charindex('_', @email) > charindex('@', @email) then '105'

    when patindex ('%[ &,":;!=\/()<>?]%',@email) > 0 then '106'

    when @email like '%.@%' or @email like '%@.%' then '109'

    when @email like '%@%@%' then '102'

    when @email like '%..%' then '110'

    when len(substring(@email, len(@email) - charindex('.', reverse(@email)) + 2, len(@email))) = 3

    and substring(@email, len(@email) - charindex('.', reverse(@email)) + 2, len(@email))

    not in ('biz','cat','com','int','net','org','pro','tel','xxx','edu','gov','mil') then '111'

    when len(substring(@email, len(@email) - charindex('.', reverse(@email)) + 2, len(@email))) > 3

    and substring(@email, len(@email) - charindex('.', reverse(@email)) + 2, len(@email))

    not in ('aero','asia','coop','info','jobs','mobi','museum','name','travel') then '111'

    else 'Gail Wins a Pony!'

    END

    These were the additional checks that were put in place. On a really big table with mostly valid email addresses this thing was dead meat.

    when like ''no@%''

    or like ''none%@%''

    or like ''non@%''

    or like ''nope@%''

    or like ''noe@%''

    or like ''nop@%''

    or like ''nonone%@%''

    or like ''noemail%@%''

    or like ''no%mail@%''

    or like ''no%comp%@%''

    or like ''noeam%@%''

    or like ''noem%@%''

    or like ''nomeail%@%''

    or like ''noname%@%''

    or like ''noone@%''

    or like ''noprime@%''

    or like ''no%reply%@%''

    or like ''noreply%@%''

    or like ''novalid%@%''

    or like ''notvalid%@%''

    or like ''nowhere@%''

    or like ''unk%''

    or like ''abc%''

    or like ''123%''

    or like ''[0-9]@%''

    or like ''[0-9][0-9]@%''

    or like ''[0-9][0-9][0-9]@%''

    or like ''[0-9][0-9][0-9][0-9]@%''

    or like ''[0-9][0-9][0-9][0-9][0-9]@%''

    or like ''[0-9][0-9][0-9][0-9][0-9][0-9]@%''

    or like ''[0-9][0-9][0-9][0-9][0-9][0-9][0-9]@%''

    or like ''[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]@%''

    or like ''[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]@%''

    or like ''[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]@%''

    or like ''%@no.%''

    or like ''%@non.%''

    or like ''%@noe.%''

    or like ''%@none.%''

    or like ''%@nope.%''

    or like ''%@XX%''

    then 113

    when like replicate(substring(,1,1),5) + ''%'' then 113

    else statusflag

  • sqldriver (8/27/2014)


    GilaMonster (8/27/2014)


    sqldriver (8/27/2014)


    I used to use this. Not perfect by any stretch but got most of the junk out.

    Does it flag this as invalid?

    MyAccount+SSC@Gmail.com

    http://gmailblog.blogspot.com/2008/03/2-hidden-ways-to-get-more-from-your.html

    It does.

    Which is why I point people at that linked blog post on this topic, as what is allowed is much broader than most regex statements allow. That, and the fact that I get tired of having valid emails rejected. I still sometimes get told my .za.net email is invalid (since za is usually a top-level domain)

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (8/27/2014)


    sqldriver (8/27/2014)


    GilaMonster (8/27/2014)


    sqldriver (8/27/2014)


    I used to use this. Not perfect by any stretch but got most of the junk out.

    Does it flag this as invalid?

    MyAccount+SSC@Gmail.com

    http://gmailblog.blogspot.com/2008/03/2-hidden-ways-to-get-more-from-your.html

    It does.

    Which is why I point people at that linked blog post on this topic, as what is allowed is much broader than most regex statements allow. That, and the fact that I get tired of having valid emails rejected. I still sometimes get told my .za.net email is invalid (since za is usually a top-level domain)

    It will be a very snowy day in .za before I disagree with you, however all of the email addresses I was processing were for customers/clients in North America, except for a few isolated projects where foreign email addresses were all business users which used <naming convention>@(sometimes country prefix).<business name>.com, so it was acceptable to screen out what I was screening out. That script does not screen out two letter country prefixes, or co.uk style email addresses, etc.

    Thanks

  • take a look at this thread, where we did some analysis a while back;

    i threw a SQL CLR function out, that seemed to do pretty well performance wise, that leverages the .Net Mail object, which has some pretty expansive, built in rules for handling email addresses.

    http://www.sqlservercentral.com/Forums/Topic1406771-386-1.aspx

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • GilaMonster (8/27/2014)


    sqldriver (8/27/2014)


    I used to use this. Not perfect by any stretch but got most of the junk out.

    Does it flag this as invalid?

    MyAccount+SSC@Gmail.com

    http://gmailblog.blogspot.com/2008/03/2-hidden-ways-to-get-more-from-your.html

    That is pretty slick. I didn't know gmail could do that. 😛

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • GilaMonster (8/27/2014)


    sqldriver (8/27/2014)


    GilaMonster (8/27/2014)


    sqldriver (8/27/2014)


    I used to use this. Not perfect by any stretch but got most of the junk out.

    Does it flag this as invalid?

    MyAccount+SSC@Gmail.com

    http://gmailblog.blogspot.com/2008/03/2-hidden-ways-to-get-more-from-your.html

    It does.

    Which is why I point people at that linked blog post on this topic, as what is allowed is much broader than most regex statements allow. That, and the fact that I get tired of having valid emails rejected. I still sometimes get told my .za.net email is invalid (since za is usually a top-level domain)

    And I sometimes get my .org.uk address rejected because .org is usually a top level domain. More often I get sites I'm providing an address to asking if I really mean .org.uk, not just .org - and in fact both work, so it isn't really a problem.

    Ages ago I saw the same problem for .ac.uk, but I think that was people trying to use the new address format before the agreed date. It didn't affect my address, because I was tom@oxprg until I was actually required to change to tom@prg.ox.ac.uk - and in fact I don't think anyone at prg was ever bitten. But I remember someone somewere else (someone at bath.ac.uk I think) being caught out when trying to jump the gun.

    edit: I also had addresses full of "!" and/or "%" back in the days before the shiny new email address RFC - in fact I suspect I was tom%oxprg rather that tom@oxprg, it was a long time ago and I don't really remember.

    Tom

  • Thanks Gail, I'll give it a go.

    Best,

    Doug

  • Sean Lange (8/27/2014)


    GilaMonster (8/27/2014)


    sqldriver (8/27/2014)


    I used to use this. Not perfect by any stretch but got most of the junk out.

    Does it flag this as invalid?

    MyAccount+SSC@Gmail.com

    http://gmailblog.blogspot.com/2008/03/2-hidden-ways-to-get-more-from-your.html

    That is pretty slick. I didn't know gmail could do that. 😛

    Haha yes I agree. I just sent an email to my team about this. I referenced this article and did not take credit. 🙂

    I always thought .(dot) was important in my email but I guess it has no meaning.

  • brad.mason5 (8/27/2014)


    I always thought .(dot) was important in my email but I guess it has no meaning.

    Depends on the mail provider. My work email starts gail.shaw@ ... and if you leave that . out and try sending to gailshaw@ ..., it definitely won't get to be.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • TomThomson (8/27/2014)


    GilaMonster (8/27/2014)


    sqldriver (8/27/2014)


    GilaMonster (8/27/2014)


    sqldriver (8/27/2014)


    I used to use this. Not perfect by any stretch but got most of the junk out.

    Does it flag this as invalid?

    MyAccount+SSC@Gmail.com

    http://gmailblog.blogspot.com/2008/03/2-hidden-ways-to-get-more-from-your.html

    It does.

    Which is why I point people at that linked blog post on this topic, as what is allowed is much broader than most regex statements allow. That, and the fact that I get tired of having valid emails rejected. I still sometimes get told my .za.net email is invalid (since za is usually a top-level domain)

    And I sometimes get my .org.uk address rejected because .org is usually a top level domain. More often I get sites I'm providing an address to asking if I really mean .org.uk, not just .org - and in fact both work, so it isn't really a problem.

    Ages ago I saw the same problem for .ac.uk, but I think that was people trying to use the new address format before the agreed date. It didn't affect my address, because I was tom@oxprg until I was actually required to change to tom@prg.ox.ac.uk - and in fact I don't think anyone at prg was ever bitten. But I remember someone somewere else (someone at bath.ac.uk I think) being caught out when trying to jump the gun.

    edit: I also had addresses full of "!" and/or "%" back in the days before the shiny new email address RFC - in fact I suspect I was tom%oxprg rather that tom@oxprg, it was a long time ago and I don't really remember.

    Alumni of OUCL are you Tom? Good stuff 🙂

  • Gazareth (8/28/2014)


    Alumni of OUCL are you Tom? Good stuff 🙂

    Tony's PRG used to have what were called "research officers" which essentially were representatives of industrial concerns that they were doing research collaborations with, and I was one of them for a few years in teh 80s/90s. Of course I was a "senior member of the University" as well, having been given my MA decades before, which may have made it easier to fit in; and I had done a couple of one week doses of bring-up-to-date training at PRG a while before, which meant I spoke the local language too. But I was never a genuine member of PRG.

    Tom

Viewing 15 posts - 1 through 15 (of 28 total)

You must be logged in to reply to this topic. Login to reply