TIP: Simple Regular Expression for Email Validation

I was playing around with Ruby on Rails and needed an email field for an application. I wanted to validate the email address before actually saving it. I Googled for a pattern and got tons of results, but none of them was perfect. A few were somewhat exhaustive but horribly difficult to understand, and those that were simple were not exhaustive enough for practical use. So I decided to write the regex myself and came up with the following. I have tested it against a few weird cases, like an email hosted on a.a.a.a.a.a.a.a.a.a.a.a.a.a.a.a.a.a.a.a.a.a.a.example.com (a subdomain deep down the hierarchy).

^[A-Za-z0-9._%+-]+@([A-Za-z0-9-]+\.)+([A-Za-z0-9]{2,4}|museum)$
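A minimal sketch of how the pattern might be used in plain Ruby. The constant and method names are illustrative and not part of the original post. The \A and \z anchors are an assumption on top of the pattern as written: in Ruby, ^ and $ match at line boundaries rather than at the ends of the string, so a value containing a newline could otherwise pass.

```ruby
# Sketch only: EMAIL_PATTERN and valid_email? are hypothetical names.
# Same character classes as the pattern above, but anchored with \A and \z
# (an assumption) so the whole string has to match, not just one line of it.
EMAIL_PATTERN = /\A[A-Za-z0-9._%+-]+@([A-Za-z0-9-]+\.)+([A-Za-z0-9]{2,4}|museum)\z/

# Returns true when the address matches the pattern, false otherwise.
# to_s lets nil (an empty form field) fall through as "no match".
def valid_email?(address)
  EMAIL_PATTERN.match?(address.to_s)
end
```

Something like valid_email?(params[:email]) could then guard the save. Note that the {2,4} limit still rejects longer top-level domains other than museum.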

I hope it’ll be of help. And don’t forget to pen down your suggestions via comments.

 

11 thoughts on “TIP: Simple Regular Expression for Email Validation”

  1. Here is my validation regex, which matches a bit more than yours. Specifically, it matches emails with apostrophes, which I would prefer not be allowed, but several corporations allow them. Also, it only allows a-z in the domain name.

    This regex is designed to run in “case-insensitive mode”.

    This link was a very nice source on why certain characters that are allowed in the email spec probably shouldn’t be, and aren’t allowed by most systems.

    http://www.remote.org/jochen/mail/info/chars.html

    ^(?:(?:[\-+%=_'a-z0-9]+)(?:\.(?![\.@]))?)+@(?:[a-z0-9\-]+\.)+[a-z]+

    Though personally I like the regex on this page

    http://ex-parrot.com/~pdw/Mail-RFC822-Address.html

    That’s a good one to scare anyone.
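As a rough illustration of the shorter pattern in this comment, here is how it might look in Ruby with the case-insensitive flag switched on. The names are hypothetical, and the \A and \z anchors are an assumption added so that trailing junk after the domain is rejected; the pattern as posted has no end anchor.

```ruby
# Sketch only: APOSTROPHE_FRIENDLY and apostrophe_friendly? are hypothetical names.
# The trailing "i" enables the case-insensitive mode the comment asks for;
# \A and \z are assumed anchors, not part of the pattern as posted.
APOSTROPHE_FRIENDLY =
  /\A(?:(?:[\-+%=_'a-z0-9]+)(?:\.(?![\.@]))?)+@(?:[a-z0-9\-]+\.)+[a-z]+\z/i

# Returns true when the address matches, false otherwise.
def apostrophe_friendly?(address)
  APOSTROPHE_FRIENDLY.match?(address.to_s)
end
```

The negative lookahead after the dot is what rejects a doubled dot or a dot directly before the @ in the local part.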

  2. Your regex does not appear to be based on RFC 5322.

    What was your source? It looks to me like you’re choosing to reinvent a standard which has carefully been considered by more appropriate people.

    This is exactly the sort of thing the F/OSS community accuses proprietary software houses of doing.

  3. Here is the jQuery email verification function:
    function isValidEmailAddress(emailAddress) {
        // Case-insensitive pattern; the quotes inside it are plain ASCII double quotes.
        var pattern = new RegExp(/^(("[\w-\s]+")|([\w-]+(?:\.[\w-]+)*)|("[\w-\s]+")([\w-]+(?:\.[\w-]+)*))(@((?:[\w-]+\.)*\w[\w-]{0,66})\.([a-z]{2,6}(?:\.[a-z]{2})?)$)|(@\[?((25[0-5]\.|2[0-4][0-9]\.|1[0-9]{2}\.|[0-9]{1,2}\.))((25[0-5]|2[0-4][0-9]|1[0-9]{2}|[0-9]{1,2})\.){2}(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[0-9]{1,2})\]?$)/i);
        // test() returns true when the address matches the pattern.
        return pattern.test(emailAddress);
    }
    Regards

  4. @Edward, Tushar Thanks for sharing your regex(s). I’ll test them out.
    @Bernie Take it easy dude. I am not defining a standard. I needed a simple regex for email validation in my app and I wrote it. Period.

  5. E-mail validation means RFC 5322. If your regex doesn’t match it, you are perceived as redefining this standard. How angry would you be if you had a valid e-mail address according to the real standard and someone else’s web page told you that you don’t, according to their regex?

    Here’s what to do.

    1. Check there’s a valid MX record for the domain.
    2. Do an SMTP call-forward to that domain: connect to the MX with the lowest preference value (the highest-priority one) on port 25 and issue
    EHLO my.machine.name[crlf]
    MAIL FROM:[crlf]
    RCPT TO:[crlf]
    RSET
    QUIT

    Wait for a banner first, and a response each time beginning “2nn “.

    After the RCPT TO, “2nn ” means e-mail address deliverable, “4nn ” means not currently deliverable, “5nn ” means not deliverable. Up to you how you treat “4nn “.
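The reply handling described here could be sketched roughly as follows. This is only the classification step, with no network code; the helper name and the symbols it returns are hypothetical. Accepting a hyphen after the code is an assumption made to cover multi-line SMTP replies.

```ruby
# Sketch only: classify_rcpt_reply and its return symbols are hypothetical names.
# Takes one reply line received after RCPT TO and maps its three-digit
# code class ("2nn", "4nn", "5nn") to a verdict.
def classify_rcpt_reply(reply_line)
  # Grab the first digit of a three-digit code followed by a space or hyphen.
  code_class = reply_line.to_s[/\A(\d)\d\d[ -]/, 1]

  case code_class
  when "2" then :deliverable
  when "4" then :temporarily_undeliverable
  when "5" then :undeliverable
  else :unknown
  end
end
```

How a caller treats :temporarily_undeliverable is left open, as in the comment.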

    If the above is too difficult, just send them an e-mail containing a token needed to complete whatever your process is. Then not only do you know it’s a valid address, but even that it is the user’s actual address.
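The token approach might look something like this in Ruby. Everything here is a sketch under assumptions: the class name, the method names, and the in-memory Hash standing in for real storage are all hypothetical, and actually sending the e-mail is left out.

```ruby
require "securerandom"

# Sketch only: EmailConfirmations is a hypothetical class; a real application
# would persist pending tokens (and likely expire them) instead of using a Hash.
class EmailConfirmations
  def initialize
    @pending = {} # token => address awaiting confirmation
  end

  # Creates a hard-to-guess token for the address. The caller would
  # e-mail this token (for example inside a link) to the address.
  def issue(address)
    token = SecureRandom.urlsafe_base64(32)
    @pending[token] = address
    token
  end

  # Returns the confirmed address and forgets the token,
  # or returns nil when the token is unknown or already used.
  def confirm(token)
    @pending.delete(token)
  end
end
```

A token that comes back proves both that the address accepts mail and that the person completing the process can read it.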

  6. Come on, Bernie! Your suggestion is a joke, right? You didn’t think it through, right? Verifying an email your way would mean

    1. any email address that’s not established yet is reported as invalid.
    2. For existing emails, your method would be the slowest imaginable and would of course not only
    3. lead to unnecessary network traffic, but also to
    4. the requirement of being always connected.

  7. Here is a regular expression for RFC 5322:

    /^(?:[a-zA-Z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-zA-Z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-zA-Z0-9](?:[a-zA-Z0-9-]*[a-zA-Z0-9])?\.)*[a-zA-Z0-9](?:[a-zA-Z0-9-]*[a-zA-Z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-zA-Z0-9-]*[a-zA-Z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\]|(?:\[(?:(?:IPv6:(?:(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){7})|(?:(?!(?:.*[a-f0-9][:\]]){7,})(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,5})?::(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,5})?)))|(?:(?:IPv6:(?:(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){5}:)|(?:(?!(?:.*[a-f0-9]:){5,})(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,3})?::(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,3}:)?)))?(?:(?:25[0-5])|(?:2[0-4][0-9])|(?:1[0-9]{2})|(?:[1-9]?[0-9]))(?:\.(?:(?:25[0-5])|(?:2[0-4][0-9])|(?:1[0-9]{2})|(?:[1-9]?[0-9]))){3}))\]))/;

    This will validate the following valid and invalid mail IDs:

    Valid email addresses

    niceandsimple@example.com
    simplewith+symbol@example.com
    less.common@example.com
    a.little.more.unusual@dept.example.com
    user@[IPv6:2001:db8:1ff::a0b:dbd0]
    "much.more\ unusual"@example.com
    "very.unusual.@.unusual.com"@example.com
    "very.(),:;[]\".VERY.\"very@\\\ \"very\".unusual"@strange.example.com
    0@a
    !#$%&'*+-/=?^_`{}|~@example.org
    "()[]:,;@\\\"!#$%&'*+-/=?^_`{}|\ \ ~\ \ \ ?\ \ \ ^_`{}|~.a"@example.org
    ""@example.org
    postbox@com (top-level domains are valid hostnames)

    Invalid email addresses

    Abc.example.com (an @ character must separate the local and domain parts)
    Abc.@example.com (character dot(.) is last in local part)
    Abc..123@example.com (character dot(.) is double)
    A@b@c@example.com (only one @ is allowed outside quotation marks)
    a"b(c)d,e:f;gi[j\k]l@example.com (none of the special characters in this local part is allowed outside quotation marks)
    just"not"right@example.com (quoted strings must be dot separated, or the only element making up the local-part)
    this is"not\allowed@example.com (spaces, quotes, and backslashes may only exist when within quoted strings and preceded by a backslash)
    this\ still\"not\\allowed@example.com (even if escaped (preceded by a backslash), spaces, quotes, and backslashes must
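For comparison, a short Ruby sketch that runs a handful of the addresses from the lists in this comment through the simple pattern from the post and reports where the two disagree. The constant and method names are hypothetical, and the \A and \z anchors on the pattern are an assumption.

```ruby
# Sketch only: SIMPLE_PATTERN repeats the regex from the post with \A and \z
# assumed as anchors; SAMPLES and disagreements are hypothetical names.
SIMPLE_PATTERN = /\A[A-Za-z0-9._%+-]+@([A-Za-z0-9-]+\.)+([A-Za-z0-9]{2,4}|museum)\z/

# Each address is paired with the verdict given in the lists above
# (true = listed as valid, false = listed as invalid).
SAMPLES = {
  "niceandsimple@example.com"              => true,
  "simplewith+symbol@example.com"          => true,
  "a.little.more.unusual@dept.example.com" => true,
  "postbox@com"                            => true,
  "Abc.example.com"                        => false,
  "Abc.@example.com"                       => false,
  "Abc..123@example.com"                   => false,
  "A@b@c@example.com"                      => false
}.freeze

# Returns the addresses where the pattern's verdict differs from the listed one.
def disagreements(pattern, samples)
  samples.reject { |address, expected| pattern.match?(address) == expected }.keys
end
```

Calling disagreements(SIMPLE_PATTERN, SAMPLES) returns postbox@com, Abc.@example.com and Abc..123@example.com: the simple pattern rejects the dotless domain and accepts the trailing and doubled dots in the local part.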
