preg_replace() and preg_replace_callback() in PHP

By: Andi, Stig and Derick in PHP Tutorials on 2008-11-23  

PHP's regular expression functions can also replace text based on pattern matching. The replacement functions replace each substring that matches a pattern with different text, and in the replacement you can refer to the parts matched by subpatterns using back references. The following example uses preg_replace() to turn a pseudo-link, such as [link url="http://php.net"]PHP[/link], into a real HTML link:

<?php
$str = '[link url="http://php.net"]PHP[/link] is cool.';
$pattern = '@\[link\ url="([^"]+)"\](.*?)\[/link\]@';
$replacement = '<a href="\\1">\\2</a>';
$str = preg_replace($pattern, $replacement, $str);
echo $str;
?>

The script outputs

<a href="http://php.net">PHP</a> is cool.

The pattern consists of two subpatterns: ([^"]+) for the URL and (.*?) for the link text. The PCRE engine assigns the substrings matched by the two subpatterns to back references, which you can access with \\1 and \\2 in the replacement string. If you don't want to use \\1, you may use $1 instead. Be careful when putting the replacement string in double quotes, because PHP processes escape sequences and variable interpolation there before preg_replace() sees the string: "\1" is read as an octal character escape, so the backslash must still be doubled ("\\1"), and the ${1} form is treated as variable interpolation unless the dollar sign is escaped (\${1}). The simplest rule is to always put the replacement string in single quotes.
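
As a minimal sketch of the quoting rules described above, the following script builds the same replacement three ways and compares the results. The variable names are illustrative only and do not come from the original example:

<?php
$subject = '[link url="http://php.net"]PHP[/link]';
$pattern = '@\[link\ url="([^"]+)"\](.*?)\[/link\]@';

// Single quotes: both notations reach preg_replace() exactly as written.
$with_backslash = preg_replace($pattern, '<a href="\\1">\\2</a>', $subject);
$with_dollar = preg_replace($pattern, '<a href="$1">$2</a>', $subject);

// Double quotes: the inner quotes need escaping and the backslash stays doubled.
$double_quoted = preg_replace($pattern, "<a href=\"\\1\">\\2</a>", $subject);

// All three variables now hold the same HTML link.
var_dump($with_backslash === $with_dollar && $with_dollar === $double_quoted);
?>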

The full pattern match is assigned to back reference 0, just like the element with key 0 in the matches array of the preg_match() function.
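
For illustration, here is a small sketch that wraps every match in bold tags by using back reference 0. The sample text and the pattern are assumptions made up for this example:

<?php
$text = 'Visit php.net or pecl.php.net today.';

// \\0 (or $0) stands for the entire match, not for a single subpattern.
$marked = preg_replace('@[a-z.]+\.net@', '<b>\\0</b>', $text);

echo $marked, "\n";
// Visit <b>php.net</b> or <b>pecl.php.net</b> today.
?>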

Tip:

If a back reference must be followed immediately by a literal digit, write it as ${1}1: the first back reference, followed by the number 1. Writing \\11 instead would be read as back reference 11.
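
The difference can be sketched as follows. The input string and the variable names are hypothetical, and the goal is to append a literal 1 to the captured digit:

<?php
$version = 'php-5';

// \\11 is parsed as back reference 11, which this pattern does not define,
// so the captured digit does not end up in the result.
$ambiguous = preg_replace('@php-(\d)@', 'php-\\11', $version);

// ${1}1 is back reference 1 followed by a literal "1".
$explicit = preg_replace('@php-(\d)@', 'php-${1}1', $version);

echo $explicit, "\n";
// php-51
?>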

preg_replace() can replace more than one subject at the same time by using an array of subjects. For instance, the following example script changes the format of the names in the array $names:

<?php
$names = array(
    'rethans, derick',
    'sæther bakken, stig',
    'gutmans, andi'
);
// Subpattern 1 is the surname (everything before the comma), subpattern 2 the first name.
$names = preg_replace('@([^,]+),\ (.*)@', '\\2 \\1', $names);
?>

The names array is changed to

array('derick rethans', 'stig sæther bakken', 'andi gutmans');

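However, names usually start with an uppercase letter. You can uppercase the first letters by using either the /e modifier or preg_replace_callback(). With the /e modifier, the replacement string is evaluated as PHP code after the back references have been substituted, and the return value of that code becomes the replacement. Note that the /e modifier was deprecated in PHP 5.5 and removed in PHP 7.0, so the following script runs only on older PHP versions: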

<?php
$names = array(
    'rethans, derick',
    'sæther bakken, stig',
    'gutmans, andi'
);
// The space between \\2 and \\1 keeps the first name and surname apart.
$names = preg_replace('@([^,]+),\ (.*)@e', 'ucwords("\\2 \\1")', $names);
?>

If you need to do more complex manipulation with the matched patterns, evaluating replacement strings becomes complicated. You can use the preg_replace_callback() function instead:

<?php
// $matches[0] is the full match; $matches[1] and $matches[2] are the subpatterns.
function format_string($matches)
{
    return ucwords("{$matches[2]} {$matches[1]}");
}

$names = array(
    'rethans, derick',
    'sæther bakken, stig',
    'gutmans, andi'
);

$names = preg_replace_callback(
    '@([^,]+),\ (.*)@', // pattern
    'format_string', // callback function
    $names // array with 'subjects'
);

print_r($names);

?>
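
On PHP 5.3 and later, the callback can also be written inline as an anonymous function, which avoids defining a named function for a one-off replacement. The following is a sketch of that variant, not part of the original example; its pattern matches the comma literally:

<?php
$names = array(
    'rethans, derick',
    'sæther bakken, stig',
    'gutmans, andi'
);

$names = preg_replace_callback(
    '@([^,]+),\ (.*)@',
    // The closure receives the same $matches array as a named callback.
    function ($matches) {
        return ucwords("{$matches[2]} {$matches[1]}");
    },
    $names
);

print_r($names);
?>

The resulting array holds 'Derick Rethans', 'Stig Sæther Bakken' and 'Andi Gutmans'.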

Here's one more useful example:

<?php

$show_with_vat = true;
$format = '&euro; %.2f';
$exchange_rate = 1.2444;

// Callback: $data[1] is the net price, $data[2] the VAT percentage.
function currency_output_vat($data)
{
    $price = $data[1];
    $vat_percent = $data[2];
    // The settings live in the global scope, so read them through $GLOBALS.
    $show_vat = isset($GLOBALS['show_with_vat']) &&
        $GLOBALS['show_with_vat'];
    $amount = ($show_vat)
        ? $price * (1 + $vat_percent / 100)
        : $price;
    return sprintf(
        $GLOBALS['format'],
        $amount / $GLOBALS['exchange_rate']
    );
}

$data = "This item costs {amount: 27.95 %19%} ".
    "and the other one costs {amount: 29.95 %0%}.\n";
echo preg_replace_callback(
    '/\{amount\:\ ([0-9.]+)\ \%([0-9.]+)\%\}/',
    'currency_output_vat',
    $data
);

?>

This example originates from a webshop where the format and exchange rate are decoupled from the text, which is stored in a cache file. With this solution, it is possible to use caching techniques and still have a dynamic exchange rate.
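
The example above reads its settings through $GLOBALS. As an alternative sketch, assuming PHP 5.3 or later, the settings can be handed to an anonymous callback with a use clause. The names $settings, $render_price and $cached_text are hypothetical and chosen only for this illustration:

<?php
$settings = array(
    'show_with_vat' => true,
    'format' => '&euro; %.2f',
    'exchange_rate' => 1.2444,
);

// The closure captures $settings by value, so no global state is needed.
$render_price = function ($data) use ($settings) {
    $price = (float) $data[1];
    $vat_percent = (float) $data[2];
    $amount = $settings['show_with_vat']
        ? $price * (1 + $vat_percent / 100)
        : $price;
    return sprintf($settings['format'], $amount / $settings['exchange_rate']);
};

// Stands in for text that would be loaded from the cache file.
$cached_text = 'This item costs {amount: 27.95 %19%}.';
$output = preg_replace_callback(
    '/\{amount: ([0-9.]+) %([0-9.]+)%\}/',
    $render_price,
    $cached_text
);

echo $output, "\n";
// This item costs &euro; 26.73.
?>

Because the cached text contains only the placeholders, changing the exchange rate in $settings changes the output without regenerating the cache.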

preg_replace() and preg_replace_callback() allow the pattern to be an array of patterns. When an array is passed as the first parameter, every pattern is matched against the subject. preg_replace() also enables you to pass an array for the replacement string when the first parameter is an array with patterns:

<?php
$text = "This is a nice text; with punctuation AND capitals";
$patterns = array('@[A-Z]@e', '@[\W]@', '@_+@');
// Quote the back reference so the evaluated code receives a string, not a bare word.
$replacements = array('strtolower("\\0")', '_', '_');
$text = preg_replace($patterns, $replacements, $text);
echo $text."\n";
?>

The first pattern, @[A-Z]@e, matches any uppercase character and, because the e modifier is used, the accompanying replacement string (a call to strtolower() on back reference 0) is evaluated as PHP code. The second pattern, [\W], matches all non-word characters and, because the second replacement string is simply _, every non-word character is replaced by an underscore. Because the replacements are done in order, the third pattern is matched against the already modified subject and collapses each run of consecutive underscores into a single one. The script therefore prints this_is_a_nice_text_with_punctuation_and_capitals.
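
Since the e modifier is not available from PHP 7.0 on, the same transformation can be sketched without it by moving the lowercase step into preg_replace_callback() and leaving the two plain replacements to preg_replace(). This is one possible rewrite under that assumption, not the original authors' code:

<?php
$text = "This is a nice text; with punctuation AND capitals";

// Step 1: lowercase every uppercase letter through a callback instead of /e.
$text = preg_replace_callback(
    '@[A-Z]@',
    function ($matches) {
        return strtolower($matches[0]);
    },
    $text
);

// Step 2: the remaining patterns are plain replacements, applied in array order.
$patterns = array('@\W@', '@_+@');
$replacements = array('_', '_');
$text = preg_replace($patterns, $replacements, $text);

echo $text, "\n";
// this_is_a_nice_text_with_punctuation_and_capitals
?>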