Colin McDonnell @colinhacks
published March 7th, 2023
I maintain Zod, which is a schema library. Zod lets people declare an email schema like this:
import {z} from 'zod';
z.string().email();
A lot of people think every technically valid email address should pass validation by that schema. I used to think that too.
Over the years I've merged several PRs to make the email regex more "technically correct". Most recently, I merged a PR that adds support for IPv6 addresses as the domain part of an email. They look like this and I hate them:
jonny@[ipv6:7e95:0559:10f2:21e9:9dab:7309:c116:ca3b]
Turns out that PR also broke plain old subdomains: [email protected]. That means Zod currently fails to parse my mom's current email address but Jonny up there can parse his freakish IPv6 email.
This caused a spiritual crisis and made me re-evaluate my whole stance on what z.string().email() should do. Zod's users are mostly engineers who are building apps. When you're building an app, you want to make sure your users are providing normal-ass email addresses. So that's what z.string().email() is going to do.
So I rewrote the regex from scratch to be simple and reasonable. Here's what I came up with:
/^([A-Z0-9_+-]+\.?)*[A-Z0-9_+-]@([A-Z0-9][A-Z0-9-]*\.)+[A-Z]{2,}$/i;
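A quick way to get a feel for the regex is to run it against a handful of addresses. The sketch below is only an illustration: the helper name `looksLikeEmail` and the sample addresses are made up for this example and are not part of Zod's API. The regex itself is copied verbatim from the line above.

```typescript
// The regex from the post, copied verbatim.
const emailRegex =
  /^([A-Z0-9_+-]+\.?)*[A-Z0-9_+-]@([A-Z0-9][A-Z0-9-]*\.)+[A-Z]{2,}$/i;

// Hypothetical helper for this sketch; not a Zod function.
function looksLikeEmail(input: string): boolean {
  return emailRegex.test(input);
}

// Made-up sample addresses paired with the result the regex gives.
const samples: Array<[string, boolean]> = [
  ["mom@mail.example.com", true],        // subdomains are fine
  ["first.last+tag@example.com", true],  // single dots and "+" are in the local-part class
  ["A@EXAMPLE.COM", true],               // the /i flag makes it case-insensitive
  ["a..b@example.com", false],           // consecutive dots
  [".a@example.com", false],             // leading dot
  ["a.@example.com", false],             // trailing dot
  ["a@example.c", false],                // last segment needs at least 2 letters
  ["a@localhost", false],                // domain needs at least one dot
  ["user@-example.com", false],          // a domain label cannot start with a hyphen
];

for (const [address, expected] of samples) {
  const actual = looksLikeEmail(address);
  console.log(`${address} -> ${actual} (expected ${expected})`);
}
```

The regex has no `g` flag, so calling `test` repeatedly on the same object is safe here; there is no `lastIndex` state carried between calls.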
Let's break that down in plain English:
The username (AKA "local part")
Allowed characters: [a-zA-Z0-9-._]
The domain
Allowed characters: [a-zA-Z0-9-.]
TLD: only [a-zA-Z] allowed
It is not trying to be RFC 5322 compliant. It's not going to check if the TLD is real. And it's not going to implement any of this craziness:
wtf-!#$%&'*/=?^_{|}[email protected]
"zod is cool"@mail.com
regexiscool(kinda)@mail.com
billie@[1.2.3.4]
jonny@[ipv6:7e95:0559:10f2:21e9:9dab:7309:c116:ca3b]
π@mail.com
But for reasonable people, it'll do.
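To check that claim, here is a small sketch that feeds several of the exotic addresses listed above to the regex. The names `simpleEmailRegex`, `isNormalEmail`, and `exoticAddresses` are hypothetical and exist only for this example; the regex is the one quoted earlier in the post.

```typescript
// Same regex as quoted earlier in the post.
const simpleEmailRegex =
  /^([A-Z0-9_+-]+\.?)*[A-Z0-9_+-]@([A-Z0-9][A-Z0-9-]*\.)+[A-Z]{2,}$/i;

// Hypothetical helper for this sketch; not a Zod function.
function isNormalEmail(input: string): boolean {
  return simpleEmailRegex.test(input);
}

// Addresses taken from the list above.
const exoticAddresses: string[] = [
  '"zod is cool"@mail.com',       // quoted local part
  "regexiscool(kinda)@mail.com",  // comment in parentheses
  "billie@[1.2.3.4]",             // IPv4 literal as the domain
  "jonny@[ipv6:7e95:0559:10f2:21e9:9dab:7309:c116:ca3b]", // IPv6 literal
  "π@mail.com",                   // non-ASCII local part
];

// Collect any address the regex accepts; for this list none should be.
const accepted = exoticAddresses.filter(isNormalEmail);
console.log(`accepted ${accepted.length} of ${exoticAddresses.length}`);
```

Each of these fails for a simple reason: the quotes, parentheses, square brackets, colons, and `π` are not in either character class, so the match cannot get past them.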