Regular Expression for extracting US Zip Codes

There are many such expressions out there, but for some reason none of them worked in my particular situation: a single line extracted from a multi-line address, or a single form field, which may or may not be in zip+4 format, and in which the zip code may be on its own, may follow a city and/or state, and may precede a country. This was for a C# web application; syntax may vary depending on platform. Some example data are as follows:

  • 12345
  • 12345-6789
  • Anytown, ST 12345
  • Anytown, ST 123456 (too many numbers! not valid)
  • Anytown, ST 1234 (not enough numbers! not valid)
  • Anytown ST, 12345-6789
  • Anytown, ST 12345 USA
  • 12345 USA
  • etc.

And here is the expression:

(^[0-9]{5}(\-[0-9]{4}){0,1}$)|((?<=[^\d\w]+)[0-9]{5}(\-[0-9]{4}){0,1}$|((?<=[^\d\w]+)[0-9]{5}(\-[0-9]{4}){0,1}(?=\D)))
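For reference, here is a minimal sketch of how the expression might be called from C#. The pattern is copied verbatim from above; the class name `ZipCodeDemo` and the sample array are illustrative and not taken from the original application.

```csharp
using System;
using System.Text.RegularExpressions;

class ZipCodeDemo
{
    // The expression from this post, unchanged. A verbatim string (@"...")
    // avoids having to double every backslash.
    const string ZipPattern =
        @"(^[0-9]{5}(\-[0-9]{4}){0,1}$)|((?<=[^\d\w]+)[0-9]{5}(\-[0-9]{4}){0,1}$|((?<=[^\d\w]+)[0-9]{5}(\-[0-9]{4}){0,1}(?=\D)))";

    static readonly Regex ZipRegex = new Regex(ZipPattern);

    static void Main()
    {
        // Sample inputs taken from the list of example data above.
        string[] samples =
        {
            "12345",
            "12345-6789",
            "Anytown, ST 12345",
            "Anytown, ST 123456",
            "Anytown, ST 1234",
            "Anytown ST, 12345-6789",
            "Anytown, ST 12345 USA",
            "12345 USA"
        };

        foreach (string sample in samples)
        {
            // Match returns the first (leftmost) match, if any.
            Match match = ZipRegex.Match(sample);
            Console.WriteLine("{0} => {1}", sample,
                match.Success ? match.Value : "(no match)");
        }
    }
}
```

Tracing the pattern by hand, the samples containing a valid zip code yield `12345` or `12345-6789`, and the two invalid samples yield no match. The exception appears to be `12345 USA`, which also yields no match: the first alternative requires the string to end right after the zip code, and both lookbehind alternatives require at least one non-word character before the digits. The variable-length lookbehind `(?<=[^\d\w]+)` is accepted by .NET but rejected by some other regex engines, which is one place where the syntax varies by platform.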

I know that there are still a few flaws. For example, Anytown, ST 12345-678 will return 12345, even though the whole thing is obviously an invalid attempt at zip+4. I’m not worried about that in this context, but it may be worth improving at some point. And I’m pretty sure that this is not the most efficient expression, but it was the best I came up with at the time. Improvements welcome…
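One possible tightening, offered only as a sketch: replace the three alternatives with a single pattern that uses negative lookarounds. The pattern below is an assumption of this example, not the expression from the post, and `StrictZipDemo` is a hypothetical name. It rejects a zip code that is immediately preceded by a word character or hyphen, or immediately followed by a digit or hyphen, so a partial zip+4 is not reported as a bare five-digit code.

```csharp
using System;
using System.Text.RegularExpressions;

class StrictZipDemo
{
    // Assumed alternative pattern (not the one from the post):
    //   (?<![\w-])       not preceded by a letter, digit, underscore, or hyphen
    //   [0-9]{5}         the five-digit zip code
    //   (?:-[0-9]{4})?   an optional +4 suffix
    //   (?![0-9-])       not followed by another digit or a hyphen
    static readonly Regex StrictZip =
        new Regex(@"(?<![\w-])[0-9]{5}(?:-[0-9]{4})?(?![0-9-])");

    static void Main()
    {
        string[] samples =
        {
            "12345 USA",
            "Anytown ST, 12345-6789",
            "Anytown, ST 12345-678",
            "Anytown, ST 123456"
        };

        foreach (string sample in samples)
        {
            Match match = StrictZip.Match(sample);
            Console.WriteLine("{0} => {1}", sample,
                match.Success ? match.Value : "(no match)");
        }
    }
}
```

Traced by hand, the first two samples yield `12345` and `12345-6789`, and the last two yield no match. The trade-off is that the hyphen in the lookarounds also skips inputs such as `ST-12345` or `12345-USA`, which may or may not suit a given data set.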
