Programming Tutorials

preg_match(), preg_match_all(), preg_grep() in PHP

By: Andi, Stig and Derick in PHP Tutorials on 2008-11-23  

preg_match() matches one pattern against the subject string and returns 1 if the subject matched the pattern, 0 if it did not, and false if an error occurred. If you pass an array as the third argument, it also fills that array with the contents of the different sub-pattern matches.

The X modifier turns on extra features in the PCRE engine. At the moment, the only feature it turns on is that the engine will throw an error when it detects an unknown escape sequence. Normally, such a sequence would just be treated as a literal. (Note that a backslash in the pattern may still have to be escaped for PHP itself.)
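As a minimal sketch of how preg_match() reports its result: in PHP it returns the integer 1 on a match and 0 otherwise, which behave like true and false in a condition. The subject string and the date pattern below are hypothetical, chosen only for illustration.

```php
<?php
// Hypothetical subject and pattern, used only to illustrate the return values.
$subject = 'Released on 2008-11-23';
$pattern = '/(\d{4})-(\d{2})-(\d{2})/';

// The optional third argument receives the full match and the sub-pattern matches.
$result = preg_match($pattern, $subject, $matches);

var_dump($result);   // int(1): the pattern matched
print_r($matches);   // [0] is the whole match, [1] to [3] are the sub-patterns

// When nothing matches, preg_match() returns 0 and the array is left empty.
$miss = preg_match($pattern, 'no date here', $none);
var_dump($miss);     // int(0)
```

Because an error is reported as false rather than 0, a strict comparison (=== 1 or === false) is the safer check when the two cases need to be told apart.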

The function preg_match_all() is similar, except that it matches the pattern with the subject repeatedly. Finding all the matches is useful when extracting information from documents. Take, for example, the situation in which you want to extract email addresses from a web site:

<?php
// Fetching a URL with file_get_contents() requires allow_url_fopen to be enabled.
$raw_document = file_get_contents('http://www.w3.org/TR/CSS21');
$doc = html_entity_decode($raw_document);
$count = preg_match_all(
    '/<(?P<email>([a-z.]+).?@[a-z0-9]+\.[a-z]{1,6})>/Ui',
    $doc,
    $matches
);

print_r($matches);
?>

This outputs:

Array
(
    [0] => Array
        (
            [0] => <bert @w3.org>
            [1] => <tantekc @microsoft.com>
            [2] => <ian @hixie.ch>
            [3] => <howcome @opera.com>
        )

    [email] => Array
        (
            [0] => bert @w3.org
            [1] => tantekc @microsoft.com
            [2] => ian @hixie.ch
            [3] => howcome @opera.com
        )

    [1] => Array
        (
            [0] => bert @w3.org
            [1] => tantekc @microsoft.com
            [2] => ian @hixie.ch
            [3] => howcome @opera.com
        )

    [2] => Array
        (
            [0] => bert
            [1] => tantekc
            [2] => ian
            [3] => howcome
        )

)

This example reads the contents of the CSS 2.1 specification into a string and decodes the HTML entities in it. The script then calls preg_match_all() on the document, using a pattern that matches < + an email address + >, and stores the email addresses in the $matches array. The output shows that, by default, preg_match_all() doesn't store all the sub-patterns belonging to one match in one element of the $matches array. Instead, each element of $matches holds the results for one sub-pattern across all the matches: $matches[0] contains every full match, $matches['email'] and $matches[1] contain every address, and $matches[2] contains every user name.
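If grouping by match is more convenient, preg_match_all() accepts the PREG_SET_ORDER flag as its fourth argument. The sketch below contrasts the two orderings. It uses a short hypothetical inline string instead of the downloaded document and a simplified version of the pattern, so the addresses are made up for illustration.

```php
<?php
// Hypothetical inline document standing in for the downloaded page.
$doc = 'Contact <bert@example.org> or <ian@example.com> for details.';
$pattern = '/<(?P<email>([a-z.]+)@[a-z0-9]+\.[a-z]{1,6})>/i';

// Default order (PREG_PATTERN_ORDER): one element per sub-pattern.
preg_match_all($pattern, $doc, $byPattern);
echo $byPattern['email'][1], "\n";   // address found by the second match

// PREG_SET_ORDER: one element per match, each holding all its sub-patterns.
$count = preg_match_all($pattern, $doc, $bySet, PREG_SET_ORDER);
foreach ($bySet as $match) {
    echo $match['email'], ' (user: ', $match[2], ")\n";
}
```

With PREG_SET_ORDER each loop iteration sees one complete match, which tends to read more naturally when every match is processed on its own.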

preg_grep() performs similarly to the UNIX egrep command. It compares a pattern against elements of an array containing the subjects. It returns an array containing the elements that were successfully matched against the pattern. See the next example, which returns all valid IP addresses from the array $addresses:

<?php
$addresses = array('212.187.38.47', '188.141.21.91', '2.9.256.7', '<<empty>>');
// Each octet must be in the range 0-255; the pattern is anchored at both ends.
$pattern = '@^((\d?\d|1\d\d|2[0-4]\d|25[0-5])\.){3}' .
           '(\d?\d|1\d\d|2[0-4]\d|25[0-5])$@';
$addresses = preg_grep($pattern, $addresses);
print_r($addresses);
?>
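Two details of preg_grep() are easy to miss: the returned array keeps the keys of the input array, and the PREG_GREP_INVERT flag returns the elements that do not match. The following sketch shows both. The $octet helper variable and the reordered input list are assumptions made for illustration, not part of the example above.

```php
<?php
// Input reordered (hypothetically) so that the preserved keys are visible.
$addresses = array('212.187.38.47', '2.9.256.7', '188.141.21.91', '<<empty>>');

// $octet is an illustrative helper: one number in the range 0-255.
$octet   = '(\d?\d|1\d\d|2[0-4]\d|25[0-5])';
$pattern = '@^(' . $octet . '\.){3}' . $octet . '$@';

// Matching elements keep their original keys (0 and 2 here).
$valid = preg_grep($pattern, $addresses);
print_r($valid);

// PREG_GREP_INVERT returns the elements that do NOT match the pattern.
$invalid = preg_grep($pattern, $addresses, PREG_GREP_INVERT);
print_r($invalid);

// array_values() reindexes the result when consecutive keys are needed.
$reindexed = array_values($valid);
```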




