For validation of email addresses, Cal Henderson's RFC 822 and RFC 2822 is_valid_email() functions rule all:
http://code.iamcal.com/php/rfc822/
preg_match
(PHP 4, PHP 5)
preg_match — Perform a regular expression match
Description
Searches subject for a match to the regular expression given in pattern.
Parameters
- pattern
The pattern to search for, as a string.
- subject
The input string.
- matches
If matches is provided, it is filled with the results of the search. $matches[0] will contain the text that matched the full pattern, $matches[1] will have the text that matched the first captured parenthesized subpattern, and so on.
- flags
flags can be the following flag:
- PREG_OFFSET_CAPTURE
- If this flag is passed, the string offset of each match is returned along with the matched text. Note that this changes the value of matches into an array where every element is itself an array consisting of the matched string at index 0 and its byte offset into subject at index 1.
- offset
Normally, the search starts from the beginning of the subject string. The optional parameter offset can be used to specify an alternate place from which to start the search (in bytes).
Note: Using offset is not equivalent to passing substr($subject, $offset) to preg_match() in place of the subject string, because pattern can contain assertions such as ^, $ or (?<=x). Compare:
<?php
$subject = "abcdef";
$pattern = '/^def/';
preg_match($pattern, $subject, $matches, PREG_OFFSET_CAPTURE, 3);
print_r($matches);
?>
The above example will output:
Array
(
)
while this example
<?php
$subject = "abcdef";
$pattern = '/^def/';
preg_match($pattern, substr($subject,3), $matches, PREG_OFFSET_CAPTURE);
print_r($matches);
?>
will produce
Array
(
    [0] => Array
        (
            [0] => def
            [1] => 0
        )

)
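To make the shape of matches under PREG_OFFSET_CAPTURE concrete, here is a minimal sketch. The subject string and pattern are made up for illustration; only preg_match() and the flag itself come from the description above.

```php
<?php
$subject = "abcdef";

// Capture "cd" as the full match and "d" as the first subpattern,
// recording the byte offset at which each one starts
preg_match('/c(d)/', $subject, $matches, PREG_OFFSET_CAPTURE);

// Each element of $matches is array(matched text, offset into $subject)
list($fullText, $fullPos) = $matches[0];
list($groupText, $groupPos) = $matches[1];

echo "full match '$fullText' at offset $fullPos\n";   // full match 'cd' at offset 2
echo "group 1 '$groupText' at offset $groupPos\n";    // group 1 'd' at offset 3
```

Without the flag, $matches[0] and $matches[1] would be the plain strings "cd" and "d".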
Return Values
preg_match() returns the number of times pattern matches. This is either 0 (no match) or 1, because preg_match() stops searching after the first match. preg_match_all(), by contrast, continues until it reaches the end of subject. preg_match() returns FALSE if an error occurred.
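Because 0 and FALSE both evaluate as false in an if statement, a strict comparison is needed to tell "no match" apart from "error". The sketch below shows one way to do that. describe_match() is a hypothetical helper name, and the unbalanced pattern '/(/' is simply a convenient way to provoke an error.

```php
<?php
// Hypothetical helper: turns preg_match()'s return value into a label
function describe_match($pattern, $subject) {
    // @ suppresses the warning PHP raises when a pattern fails to compile
    $result = @preg_match($pattern, $subject);
    if ($result === false) {
        return 'error';
    }
    return $result === 1 ? 'match' : 'no match';
}

echo describe_match('/php/i', 'PHP is fun'), "\n";  // match
echo describe_match('/perl/', 'PHP is fun'), "\n";  // no match
echo describe_match('/(/', 'PHP is fun'), "\n";     // error
```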
ChangeLog
| Version | Description |
|---|---|
| 4.3.3 | The offset parameter was added |
| 4.3.0 | The PREG_OFFSET_CAPTURE flag was added |
| 4.3.0 | The flags parameter was added |
Examples
Example #1 Find the string of text "php"
<?php
// The "i" after the pattern delimiter indicates a case-insensitive search
if (preg_match("/php/i", "PHP is the web scripting language of choice.")) {
    echo "A match was found.";
} else {
    echo "A match was not found.";
}
?>
Example #2 Find the word "web"
<?php
/* The \b in the pattern indicates a word boundary, so only the distinct
 * word "web" is matched, and not a word partial like "webbing" or "cobweb" */
if (preg_match("/\bweb\b/i", "PHP is the web scripting language of choice.")) {
    echo "A match was found.";
} else {
    echo "A match was not found.";
}
if (preg_match("/\bweb\b/i", "PHP is the website scripting language of choice.")) {
    echo "A match was found.";
} else {
    echo "A match was not found.";
}
?>
Example #3 Getting the domain name out of a URL
<?php
// get host name from URL
preg_match('@^(?:http://)?([^/]+)@i',
"http://www.php.net/index.html", $matches);
$host = $matches[1];
// get last two segments of host name
preg_match('/[^.]+\.[^.]+$/', $host, $matches);
echo "domain name is: {$matches[0]}\n";
?>
The above example will output:
domain name is: php.net
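The pattern in Example #3 only recognizes an optional "http://" prefix. As a sketch of a different technique (not the method used in the example), parse_url() can extract the host for other schemes as well. The https URL below is an assumed variation of the example's input, and keeping the last two labels remains a simplification that does not handle suffixes such as "co.uk".

```php
<?php
// parse_url() replaces the first regular expression of Example #3
$host = parse_url('https://www.php.net/index.html', PHP_URL_HOST);

// Keep the last two dot-separated labels of the host name
$labels = explode('.', $host);
$domain = implode('.', array_slice($labels, -2));

echo "domain name is: $domain\n";   // domain name is: php.net
```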
Example #4 Using named subpattern
<?php
$str = 'foobar: 2008';
preg_match('/(?<name>\w+): (?<digit>\d+)/', $str, $matches);
print_r($matches);
?>
The above example will output:
Array
(
    [0] => foobar: 2008
    [name] => foobar
    [1] => foobar
    [digit] => 2008
    [2] => 2008
)
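As the output shows, every named subpattern appears twice: once under its name and once under its numeric index. If only the named entries are wanted, one possible approach is to filter on the key type. This is a sketch that assumes a PHP version providing ARRAY_FILTER_USE_KEY; $named is an illustrative variable name.

```php
<?php
$str = 'foobar: 2008';
preg_match('/(?<name>\w+): (?<digit>\d+)/', $str, $matches);

// Keep only the entries whose key is a string, i.e. the named subpatterns
$named = array_filter($matches, 'is_string', ARRAY_FILTER_USE_KEY);

print_r($named);   // contains only the [name] and [digit] entries
```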
Notes
10-Aug-2008 11:12
09-Jul-2008 01:11
preg_match and preg_replace_callback don't match up in the structure of the array that they fill for a match.
preg_match, as the example shows, supports named patterns, whereas preg_replace_callback doesn't seem to support them at all. It seems to ignore any named pattern matched.
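The behaviour described in this note may depend on the PHP version. As far as I can tell, more recent PHP releases do pass named keys to the callback, so it is worth checking on your own installation with a small sketch like the one below. The pattern is borrowed from Example #4, and the swap performed by the callback is only for illustration.

```php
<?php
$str = 'foobar: 2008';

// Swap the two named parts; the callback reads them by name
$swapped = preg_replace_callback(
    '/(?<name>\w+): (?<digit>\d+)/',
    function ($m) {
        return $m['digit'] . ': ' . $m['name'];
    },
    $str
);

echo $swapped, "\n";
```

If the named keys are available, this prints "2008: foobar". If they are not, PHP reports undefined index notices instead.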
08-Jul-2008 05:01
I made a mistake in my previous post. Mail addresses may of course only be "exotic" in their local parts, not in the domain part. Therefore, an exotic mail address would be "exotic#%$mail@domain.com".
07-Jul-2008 11:51
For those not so familiar with regexes, I post my algorithmic email validation routine. It can be changed for individual needs more easily than a regex. My function does NOT recognize exotic email addresses as allowed by the RFC. (For example, info@exotic%&$#mail.com is a legal email address but not allowed by my function.)
-Tim
<?php
function email_is_valid($email) {
    // Exactly one "@" is required, and it must not be the first character
    if (substr_count($email, '@') != 1)
        return false;
    if ($email[0] == '@')
        return false;
    // At least one dot, and no two dots in a row
    if (substr_count($email, '.') < 1)
        return false;
    if (strpos($email, '..') !== false)
        return false;
    // Only letters, digits, "@", ".", "_" and "-" are allowed
    $length = strlen($email);
    for ($i = 0; $i < $length; $i++) {
        $c = $email[$i];
        if ($c >= 'A' && $c <= 'Z')
            continue;
        if ($c >= 'a' && $c <= 'z')
            continue;
        if ($c >= '0' && $c <= '9')
            continue;
        if ($c == '@' || $c == '.' || $c == '_' || $c == '-')
            continue;
        return false;
    }
    // The TLD must be a two-letter code or one of the generic TLDs below
    $TLD = array (
        'COM', 'NET',
        'ORG', 'MIL',
        'EDU', 'GOV',
        'BIZ', 'NAME',
        'MOBI', 'INFO',
        'AERO', 'JOBS',
        'MUSEUM'
    );
    $tld = strtoupper(substr($email, strrpos($email, '.') + 1));
    if (strlen($tld) != 2 && !in_array($tld, $TLD))
        return false;
    return true;
}
?>
03-Jul-2008 11:30
The regexp in the note below thinks that the e-mail address 'me@de.com' is invalid, which it is not.
I modified it as follows, and it seems to work for me in my limited tests of it. YMMV.
'/^([a-z0-9])(([-a-z0-9._])*([a-z0-9]))*\@([a-z0-9])([-a-z0-9_])+([a-z0-9])*(\.([a-z0-9])([-a-z0-9_-])([a-z0-9])+)*$/i'
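A pattern like this is easiest to judge by running it against a handful of addresses. The sketch below uses the modified pattern from this note, written on a single line. The sample addresses and the $results array are my own additions for illustration.

```php
<?php
// The modified pattern from the note above, joined onto a single line
$pattern = '/^([a-z0-9])(([-a-z0-9._])*([a-z0-9]))*\@([a-z0-9])([-a-z0-9_])+([a-z0-9])*(\.([a-z0-9])([-a-z0-9_-])([a-z0-9])+)*$/i';

// Sample addresses chosen for illustration only
$samples = array('me@de.com', 'john@localhost', 'john@foo.bar.com', 'user@example.de');

$results = array();
foreach ($samples as $address) {
    // preg_match() returns 1 when the whole address fits the pattern
    $results[$address] = (preg_match($pattern, $address) === 1);
    echo $address, ' => ', ($results[$address] ? 'accepted' : 'rejected'), "\n";
}
```

The first three samples are accepted. The last one is rejected, because the pattern requires every dot-separated part after the first to be at least three characters long, which rules out two-letter country-code TLDs.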
26-Jun-2008 04:48
Paperweight, this pattern worked fine for me (even for intranet addresses, like "john@localhost", and for subdomain e-mails, like "john@foo.bar.com"):
'/([a-z0-9])([-a-z0-9._])+([a-z0-9])\@([a-z0-9])([-a-z0-9_])+([a-z0-9])(\.([a-z0-9])([-a-z0-9_-])([a-z0-9])+)*/i'
Still, this won't replace the "activation link", which is the better way to check whether an e-mail address is valid.
26-May-2008 09:50
Because making a truly correct email validation function is harder than one may think, consider using this one which comes with PHP through the filter_var function (http://www.php.net/manual/en/function.filter-var.php):
<?php
$email = "someone@domain .local";
if (!filter_var($email, FILTER_VALIDATE_EMAIL)) {
    echo "E-mail is not valid";
} else {
    echo "E-mail is valid";
}
?>
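One detail worth knowing when using filter_var() this way: on success it returns the filtered address itself rather than TRUE, and on failure it returns FALSE. A short sketch, with sample addresses made up for illustration:

```php
<?php
// Sample addresses are illustrative only
$good = filter_var('someone@example.com', FILTER_VALIDATE_EMAIL);
$bad  = filter_var('someone@domain .local', FILTER_VALIDATE_EMAIL);

var_dump($good);  // the address string itself when it is valid
var_dump($bad);   // bool(false), because of the space in the domain part
```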
04-Apr-2008 11:36
In addition to reiner-keller's comment about umlauts using setlocale(LC_ALL, 'de_DE'):
To enable 'de_DE' on my Debian 4 machine I first had to:
- uncomment 'de_DE' in the file /etc/locale.gen, and afterwards
- run locale-gen from the shell
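Whether those steps worked can be checked from PHP, since setlocale() returns FALSE when none of the requested locales is available. A minimal sketch follows. The 'de_DE.UTF-8' name is an assumption about how the locale might be called on a given system.

```php
<?php
// Try a UTF-8 variant first, then plain 'de_DE'; both names are assumptions
// about what locale-gen produced on the machine
$locale = setlocale(LC_ALL, 'de_DE.UTF-8', 'de_DE');

if ($locale === false) {
    echo "de_DE is not installed on this system\n";
} else {
    echo "Locale set to $locale\n";
}
```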
