
preg_replace() - PHP Regular Expressions (PCRE)

(PHP 4, PHP 5, PHP 7)



preg_replace(mixed $pattern, mixed $replacement, mixed $subject[,int $limit= -1[,int &$count]]): mixed










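When the deprecated e modifier is used, this function escapes some characters (namely ', ", \ and NULL) and then substitutes the backreferences. Make sure that, once the backreferences have been resolved, no syntax errors arise from single or double quotes (e.g. 'strlen('$1')+strlen("$2")'). The result must be valid PHP string syntax and valid input for eval, because after the substitution the engine evaluates the resulting string as PHP code with eval and uses the return value as the final replacement string.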













As of PHP 5.5.0, passing the "e" modifier raises an E_DEPRECATED error; as of PHP 7.0.0, it raises an E_WARNING and the "e" modifier no longer has any effect.









The bear black slow jumps over the lazy dog.



The slow black bear jumps over the lazy dog.



$startDate = 5/27/1999










Because I search for this a lot:
The following should be escaped if you are trying to match that character
\ ^ . $ | ( ) [ ]
* + ? { } ,
Special Character Definitions
\ Quote the next metacharacter
^ Match the beginning of the line
. Match any character (except newline)
$ Match the end of the line (or before newline at the end)
| Alternation
() Grouping
[] Character class
* Match 0 or more times
+ Match 1 or more times
? Match 1 or 0 times
{n} Match exactly n times
{n,} Match at least n times
{n,m} Match at least n but not more than m times
More Special Character Stuff
\t tab (HT, TAB)
\n newline (LF, NL)
\r return (CR)
\f form feed (FF)
\a alarm (bell) (BEL)
\e escape (think troff) (ESC)
\033 octal char (think of a PDP-11)
\x1B hex char
\c[ control char
\l lowercase next char (think vi)
\u uppercase next char (think vi)
\L lowercase till \E (think vi)
\U uppercase till \E (think vi)
\E end case modification (think vi)
\Q quote (disable) pattern metacharacters till \E
Even More Special Characters
\w Match a "word" character (alphanumeric plus "_")
\W Match a non-word character
\s Match a whitespace character
\S Match a non-whitespace character
\d Match a digit character
\D Match a non-digit character
\b Match a word boundary
\B Match a non-(word boundary)
\A Match only at beginning of string
\Z Match only at end of string, or before newline at the end
\z Match only at end of string
\G Match only where previous m//g left off (works only with /g)
Post slug generator, for creating clean urls from titles.
It works with many languages.

Example: post_slug(' -Lo#&@rem IPSUM //dolor-/sit - amet-/-consectetur! 12 -- ')
will output: lorem-ipsum-dolor-sit-amet-consectetur-12
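The note's own post_slug() is not shown here, so the following is only a minimal sketch of how such a function could look. It is an assumption, not the author's code, and it handles ASCII input only (no accent folding, so it does not cover the "many languages" claim).

```php
<?php
// Hypothetical reconstruction: the original post_slug() is not shown.
// ASCII only; accented or non-Latin characters are simply dropped.
function post_slug(string $title): string
{
    $slug = preg_replace(
        [
            '/[^a-zA-Z0-9 -]/', // 1. drop everything except letters, digits, spaces, hyphens
            '/[ -]+/',          // 2. collapse runs of spaces/hyphens into a single hyphen
            '/^-|-$/',          // 3. trim a leading or trailing hyphen
        ],
        ['', '-', ''],
        $title
    );
    return strtolower($slug);
}

echo post_slug(' -Lo#&@rem IPSUM //dolor-/sit - amet-/-consectetur! 12 -- '), "\n";
// prints: lorem-ipsum-dolor-sit-amet-consectetur-12
```

Passing arrays for the pattern and the replacement applies the three steps in order, each one working on the result of the previous step.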
If you want to match characters from any language (European, Russian, Chinese, Japanese, Korean, or whatever), just:
- use mb_internal_encoding('UTF-8');
- use preg_replace('`...`u', '...', $string) with the u (unicode) modifier
For further information, the complete list of preg_* modifiers can be found at:
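As a small illustration of the u modifier, the sketch below replaces every run of characters that are neither letters nor digits with a hyphen. The sample string is made up for this example.

```php
<?php
// With the u modifier, pattern and subject are treated as UTF-8, so
// \p{L} (any letter) and \p{N} (any number) match whole characters
// rather than single bytes.
$text = 'Привет, 世界 2024!';

echo preg_replace('/[^\p{L}\p{N}]+/u', '-', $text), "\n";
// prints: Привет-世界-2024-
```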
Note that it is in most cases much more efficient to use preg_replace_callback(), with a named function or an anonymous function (a closure, or create_function() on older PHP versions; create_function() was removed in PHP 8.0), instead of the /e modifier. When preg_replace() is called with the /e modifier, the interpreter must parse the replacement string into PHP code once for every replacement made, while preg_replace_callback() uses a function that only needs to be parsed once.
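A minimal sketch of the preg_replace_callback() approach, using a closure. The pattern and input are invented for illustration.

```php
<?php
// The callback receives the match array ($m[0] is the full match,
// $m[1] the first group) and returns the replacement text.
$result = preg_replace_callback(
    '/\bid:(\w+)/',
    function (array $m): string {
        return 'id:' . strtoupper($m[1]);
    },
    'id:abc and id:xyz'
);

echo $result, "\n"; // prints: id:ABC and id:XYZ
```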
It may be useful to note that if you pass an associative array as the $subject parameter, the keys are preserved in the returned array.
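A short sketch of key preservation, assuming the associative array is passed as the subject (the array whose elements are searched). The array contents are made up.

```php
<?php
// Each element of the subject array is processed separately and the
// result array keeps the same string keys.
$subject = ['first' => 'foo bar', 'second' => 'bar foo'];
$result  = preg_replace('/foo/', 'baz', $subject);

var_export($result);
// 'first' => 'baz bar', 'second' => 'bar baz'
```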
If you want to replace only the n-th occurrence of $pattern, you can use this function:

this outputs |aa|b|cc|dd is the 4th|e|ff|gg|kkk| 
backreferences are accepted in $replacement
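The function from this note is not shown, so here is one possible sketch. The name preg_replace_nth() and the sample input are assumptions. It counts matches in a callback and re-applies the pattern to the n-th match only, so backreferences in $replacement still work. Patterns that depend on surrounding context (anchors, lookarounds) may behave differently with this approach.

```php
<?php
// Hypothetical helper: replace only the $nth match of $pattern.
function preg_replace_nth(string $pattern, string $replacement, string $subject, int $nth = 1): ?string
{
    $seen = 0;
    return preg_replace_callback(
        $pattern,
        function (array $m) use (&$seen, $pattern, $replacement, $nth) {
            $seen++;
            if ($seen !== $nth) {
                return $m[0]; // not the n-th match: keep it unchanged
            }
            // Re-run the pattern on the matched text so $1, $2, ... resolve.
            return preg_replace($pattern, $replacement, $m[0], 1);
        },
        $subject
    );
}

echo preg_replace_nth('/(\w+)\|/', '$1 is the 4th|', '|aa|b|cc|dd|e|ff|gg|kkk|', 4), "\n";
// prints: |aa|b|cc|dd is the 4th|e|ff|gg|kkk|
```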
preg_replace (and other preg-functions) return null instead of a string when encountering problems you probably did not think about!
It may not be obvious to everybody that the function returns NULL if an error of any kind occurs. An error I stumble upon quite often is the backtrack limit:
When parsing HTML documents, you will encounter documents that are more than 100,000 characters long, which may cause certain regular expressions to fail because of the backtrack limit mentioned above.
A regular expression that is ungreedy ("U", http://de.php.net/manual/de/reference.pcre.pattern.modifiers.php) often does the job, but still: sometimes you just need a greedy regular expression working on long strings ...
Since an unhandled return value of NULL usually causes follow-up errors in the application, with unwanted and unforeseen consequences, I found the following solution to be quite helpful. It at least saves the application from crashing:

You may, and probably should, also put a log message or the sending of an email into the if-condition, so that you are informed as soon as one of your regular expressions does not have the effect you wanted.
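The note's original code is not shown. A minimal sketch of the idea, with a hypothetical helper name safe_preg_replace(), could fall back to the unmodified subject whenever preg_replace() returns NULL:

```php
<?php
// Hypothetical wrapper: never returns NULL for a string subject.
function safe_preg_replace($pattern, $replacement, string $subject, int $limit = -1): string
{
    $result = preg_replace($pattern, $replacement, $subject, $limit);
    if ($result === null) {
        // A natural place for the log message or email mentioned above.
        error_log('preg_replace() failed, preg_last_error() = ' . preg_last_error());
        return $subject; // fall back to the original text
    }
    return $result;
}

echo safe_preg_replace('/a/', 'b', 'banana'), "\n"; // prints: bbnbnb
```

Whether returning the original subject is the right fallback depends on the application; for output that must be sanitized, failing loudly may be safer.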
[Editor's note: in this case it would be wise to rely on the preg_quote() function instead which was added for this specific purpose]
If your replacement string contains a dollar sign or a backslash, it may accidentally be interpreted as a backreference! This will fix it.
I want to replace 'text' with '$12345', but this is read as a backreference to $12 (which doesn't exist), followed by the remaining '345'. The function below returns a string with the backreferences escaped.
string(8) "some 345"
string(11) "some \12345"
string(8) "some 345"
string(11) "some $12345" 
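The note's function is not shown. One way to sketch it, under the assumption that escaping every backslash and dollar sign in the replacement is acceptable, is the hypothetical helper below:

```php
<?php
// Hypothetical helper: make a string safe to use as a literal replacement.
// Backslashes and dollar signs are prefixed with a backslash so that
// preg_replace() does not read them as \n or $n backreferences.
function escape_replacement(string $replacement): string
{
    return strtr($replacement, ['\\' => '\\\\', '$' => '\\$']);
}

var_dump(preg_replace('/text/', '$12345', 'some text'));
// string(8) "some 345"
var_dump(preg_replace('/text/', escape_replacement('$12345'), 'some text'));
// string(11) "some $12345"
```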
This code is meant to convert numeric HTML entities to UTF-8, and it does, with one small exception: it mishandles codes starting with 
The reason is that code2utf will be called with the leading zero, exactly as the pattern matched it: code2utf(039).
And it does matter! PHP treats 039 as an octal number.
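The code2utf() function and the original /e pattern are not shown. As a sketch of how the octal pitfall can be avoided today, the hypothetical function below uses preg_replace_callback(), where the captured digits arrive as a string and (int) parses them as decimal:

```php
<?php
// Hypothetical replacement for the /e-based code: decode decimal
// numeric entities such as &#039; or &#0233;.
function decode_numeric_entities(string $html): ?string
{
    return preg_replace_callback(
        '/&#(\d+);/',
        function (array $m): string {
            // (int) "039" is 39: leading zeros are ignored, not read as octal.
            $code = (int) $m[1];
            return html_entity_decode('&#' . $code . ';', ENT_QUOTES | ENT_HTML5, 'UTF-8');
        },
        $html
    );
}

echo decode_numeric_entities('It&#039;s caf&#0233;'), "\n"; // prints: It's café
```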
There seems to be some confusion over how greediness works. For those familiar with Regular Expressions in other languages, particularly Perl: it works like you would expect, and as documented. Greedy by default, un-greedy if you follow a quantifier with a question mark.
There is a PHP/PCRE-specific U pattern modifier that flips the greediness, so that quantifiers are by default un-greedy, and become greedy if you follow the quantifier with a question mark: http://www.php.net/manual/en/reference.pcre.pattern.modifiers.php
To make things clear, a series of examples:

Results in this:
Default, no ?: a bunch of stuff this that and more stuff with a second code block then extra at the end
Default, with ?: a bunch of stuff this that and more stuff with a second code block then extra at the end
U flag, no ?: a bunch of stuff this that and more stuff with a second code block then extra at the end
U flag, with ?: a bunch of stuff this that and more stuff with a second code block then extra at the end
As expected: greedy by default, ? inverts it to ungreedy. With the U flag, un-greedy by default, ? makes it greedy.
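Because the HTML tags in the output above did not survive, here is a separate minimal sketch of the same four cases on a made-up input, with the captured text wrapped in square brackets so the difference is visible:

```php
<?php
$s = '<b>one</b> and <b>two</b>';

// Default (greedy): .* runs to the LAST </b>.
echo preg_replace('/<b>(.*)<\/b>/', '[$1]', $s), "\n";
// prints: [one</b> and <b>two]

// Default with ?: lazy, stops at the FIRST </b>.
echo preg_replace('/<b>(.*?)<\/b>/', '[$1]', $s), "\n";
// prints: [one] and [two]

// U flag, no ?: quantifiers are lazy by default.
echo preg_replace('/<b>(.*)<\/b>/U', '[$1]', $s), "\n";
// prints: [one] and [two]

// U flag with ?: greedy again.
echo preg_replace('/<b>(.*?)<\/b>/U', '[$1]', $s), "\n";
// prints: [one</b> and <b>two]
```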
Warning: a common mistake when trying to remove all characters except letters, digits and underscores from a string is to use a regex like preg_replace('[^A-Za-z0-9_]', '', ...). Without explicit delimiters, PHP treats the square brackets themselves as the delimiters, so the pattern is no longer a character class and the input comes back unchanged, for example when it contains double quotes:
echo preg_replace('[^A-Za-z0-9_]', '', 'D"usseldorfer H"auptstrasse')
D"usseldorfer H"auptstrasse
It is important not to forget the pattern delimiters, here a leading and a trailing forward slash:
echo preg_replace('/[^A-Za-z0-9_]/', '', 'D"usseldorfer H"auptstrasse');
DusseldorferHauptstrasse
PS An alternative is to use preg_replace('/\W/', '', $t), which keeps all alphanumeric characters and underscores.
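A runnable sketch of the delimiter pitfall. With '[...]' the brackets act as delimiters and the actual pattern is ^A-Za-z0-9_, a literal string anchored at the start, which does not occur in the input:

```php
<?php
$input = 'D"usseldorfer H"auptstrasse';

// Brackets are consumed as delimiters: nothing matches, nothing is removed.
echo preg_replace('[^A-Za-z0-9_]', '', $input), "\n";
// prints: D"usseldorfer H"auptstrasse

// With real delimiters the character class works. Note that the space
// is removed too, since it is not in the class.
echo preg_replace('/[^A-Za-z0-9_]/', '', $input), "\n";
// prints: DusseldorferHauptstrasse
```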
Take care when you try to strip whitespaces out of an UTF-8 text. Using something like:

breaks, in my case, the letter à, which is hex c3a0; a0 on its own, however, counts as whitespace. So use

to strip all spaces and tabs, or better, use a multibyte function like mb_ereg_replace.
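The two snippets from this note are not shown. As a sketch of byte-safe alternatives (the sample string is made up), either restrict the class to ASCII space and tab, or add the u modifier so the subject is read as UTF-8 characters instead of single bytes:

```php
<?php
$text = "à  la\tcarte"; // two spaces, then a tab

// Only ASCII space and tab are touched, so multibyte characters stay intact.
echo preg_replace('/[ \t]+/', ' ', $text), "\n";
// prints: à la carte

// With the u modifier, \s is matched against whole UTF-8 characters.
echo preg_replace('/\s+/u', ' ', $text), "\n";
// prints: à la carte
```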
If there's a chance your replacement text contains any strings such as "$0.95", you'll need to escape those $n backreferences: 
As I wasn't able to find another way to do this, I wrote a function that converts any UTF-8 string into a valid NTFS filename (see http://en.wikipedia.org/wiki/Filename).

It converts all control characters and filename characters which are reserved by Windows ('\/:*?"') into an underscore.
This way you can safely create an NTFS filename out of any UTF-8 string.
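The function itself is not shown, so the following is only a sketch with a hypothetical name. It assumes the usual Windows reserved set (backslash, slash, colon, asterisk, question mark, double quote, angle brackets and pipe) plus ASCII control characters, and it does not handle reserved device names or trailing dots and spaces.

```php
<?php
// Hypothetical helper: replace control characters and characters that
// Windows reserves in filenames with an underscore.
function ntfs_safe_filename(string $name): ?string
{
    // \x00-\x1F and \x7F are control characters; \\\\ is one literal backslash.
    return preg_replace('~[\x00-\x1F\x7F\\\\/:*?"<>|]~u', '_', $name);
}

echo ntfs_safe_filename('re:port/2024*final?.txt'), "\n";
// prints: re_port_2024_final_.txt
```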
$firstname = htmlspecialchars($_POST['campo']);
$firstname = preg_replace("/[^a-zA-Z0-9]/", "", $firstname, -1, $count_fn);
// $count_fn counts how many characters were replaced.
// $firstname is the variable that holds the input.
preg_replace to only show alpha numeric characters
$info = "The Development of code . http://www.";
$info = preg_replace("/[^a-zA-Z0-9]+/", "", $info);
echo $info;
OUTPUTS: TheDevelopmentofcodehttpwww
This is good, workable code.
If your intention is to encode and decode mod_rewrite URLs and handle them with PHP and MySQL, this should work.
To convert a string to a URL:
$url = preg_replace('/[^A-Za-z0-9_-]+/', '-', $string);
To look up the URL value in MySQL, use the same expression without the '-'.
First rewrite the URL value in PHP using preg_replace, then use the result with MySQL REGEXP:
$sql = "select * from table where fieldname_to_check REGEXP '".preg_replace("/-+/",'[^A-Za-z0-9_]+',$url)."'";
There seems to be some unexpected behavior when using the /m modifier when the line terminators are win32 or mac format.
If you have a string like below, and try to replace dots, the regex won't replace correctly:

The /m modifier doesn't seem to work properly when CRLFs or CRs are used. Make sure to convert line endings to LFs (*nix format) in your input string.
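The example string from this note is not shown. A minimal sketch of the suggested workaround, with made-up input, normalizes CRLF and lone CR to LF before applying a /m pattern:

```php
<?php
$input = "first.\r\nsecond.\rthird."; // mixed Windows and old Mac line endings

// Convert every CRLF or lone CR to LF.
$normalized = preg_replace('/\r\n?/', "\n", $input);

// Now $ with /m matches at the end of every line.
echo preg_replace('/\.$/m', '', $normalized), "\n";
// prints three lines: first, second, third
```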
Also worth noting is that you can use array_keys()/array_values() with preg_replace like: 
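The snippet for this note is not shown. A small sketch of the idea keeps patterns and replacements together in one map and splits it at the call site. The map contents are invented.

```php
<?php
// pattern => replacement, kept side by side for readability
$map = [
    '/\bcolour\b/' => 'color',
    '/\bcentre\b/' => 'center',
];

echo preg_replace(array_keys($map), array_values($map), 'The colour of the centre'), "\n";
// prints: The color of the center
```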
Why is there no offset parameter for replacing in the string? It would be helpful:
mixed preg_replace (mixed $pattern, mixed $replacement, mixed $subject [, int $limit = -1 [, int & $count [, int $offset = 0]]]) 
1 $pattern
2 $replacement 
3 $subject
4 $limit
5 $count 
6 $offset 
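preg_replace() has no such $offset parameter. As a rough sketch of how it could be emulated (the helper name is hypothetical), the subject can be split at the offset and only the tail processed. Note that anchors and lookbehinds then see the tail as if it were the whole string, and the offset is counted in bytes.

```php
<?php
// Hypothetical helper: apply preg_replace() only from byte $offset onwards.
function preg_replace_from(string $pattern, string $replacement, string $subject, int $offset = 0, int $limit = -1): ?string
{
    $head = substr($subject, 0, $offset);
    $tail = preg_replace($pattern, $replacement, substr($subject, $offset), $limit);
    return $tail === null ? null : $head . $tail;
}

echo preg_replace_from('/a/', 'X', 'banana', 3), "\n";
// prints: banXnX
```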
A variable can handle a huge quantity of data but preg_replace can't.
Example :

$head can have the desired content, or be empty, depends on the length of $data.
For this application, just add :
$data = substr($data, 0, 4096);
before using preg_replace, and it will work fine.
This function will strip all the HTML-like content in a string.
I know you can find a lot of similar content on the web, but this one is simple, fast and robust. Don't simply rely on built-in functions like strip_tags(); they don't work so well.
Careful, however: this is not proper validation of a string; you should use additional functions like mysql_real_escape_string and filter_var, as well as custom tests, before putting a submission into your database.

Hope this helps someone else out there trying to do the same thing :)
If you have issues where preg_replace returns an empty string, please take a look at these two ini parameters:
The default is set to 100K. If your buffer is larger than this, look to increase these two values.
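The two parameter names did not survive in this note. Assuming they are pcre.backtrack_limit and pcre.recursion_limit (the PCRE limits that PHP exposes as ini settings), raising them at runtime could look like the sketch below. The value 1000000 is an arbitrary example, not a recommendation.

```php
<?php
// Assumption: these are the two ini parameters the note refers to.
ini_set('pcre.backtrack_limit', '1000000');
ini_set('pcre.recursion_limit', '1000000');

$result = preg_replace('/\s+/', ' ', "a  lot   of    text");
if ($result === null) {
    // preg_last_error() tells which limit, if any, was hit.
    echo 'PCRE error code: ', preg_last_error(), "\n";
} else {
    echo $result, "\n"; // prints: a lot of text
}
```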
To convert a string to an SEO-friendly form, do this:

This will print: this-is-the-string-to-be-made-seo-friendly
The function seofy() creates an SEO-friendly version of a string. Umlauts and other letters not contained in the ASCII character set are either reduced to their base form (e.g. é becomes e and ú becomes u) or converted completely (e.g. ß becomes ss and ü becomes ue).
This works, on the one hand, because the PHP function preg_replace performs the replacement using Unicode regular expressions and, on the other hand, because the PHP function iconv attempts an approximate transliteration with the TRANSLIT option.
Quoting php.net about iconv and TRANSLIT:
"If you append the character string //TRANSLIT to out_charset, transliteration is activated. This means that a character that cannot be displayed in the target character set can be approximated with one or more similar-looking characters.[…]"
Hello there,
I would like to share a snippet of PHP regex code
that I wrote for myself in 2012. It is also used in the
Yerico scriptmerge plugin for Joomla, marked as simple code.
It compresses JavaScript code and removes all comments from it.
It also works with MooTools. It is fast
(in comparison to other PHP solutions), does not damage the
JavaScript itself, and resolves lots of comment removal issues.
//START Remove comments.
  $buffer = str_replace('/// ', '///', $buffer);    
  $buffer = str_replace(',//', ', //', $buffer);
  $buffer = str_replace('{//', '{ //', $buffer);
  $buffer = str_replace('}//', '} //', $buffer);
  $buffer = str_replace('*//*', '*/ /*', $buffer);
  $buffer = str_replace('/**/', '/* */', $buffer);
  $buffer = str_replace('*///', '*/ //', $buffer);
  $buffer = preg_replace("/\/\/.*\n\/\/.*\n/", "", $buffer);
  $buffer = preg_replace("/\s\/\/\".*/", "", $buffer);
  $buffer = preg_replace("/\/\/\n/", "\n", $buffer);
  $buffer = preg_replace("/\/\/\s.*.\n/", "\n \n", $buffer);
  $buffer = preg_replace('/\/\/w[^w].*/', '', $buffer);
  $buffer = preg_replace('/\/\/s[^s].*/', '', $buffer);
  $buffer = preg_replace('/\/\/\*\*\*.*/', '', $buffer);
  $buffer = preg_replace('/\/\/\*\s\*\s\*.*/', '', $buffer);
  $buffer = preg_replace('/[^\*]\/\/[*].*/', '', $buffer);
  $buffer = preg_replace('/([;])\/\/.*/', '$1', $buffer);
  $buffer = preg_replace('/((\r)|(\n)|(\R)|([^0]1)|([^\"]\s*\-))(\/\/)(.*)/', '$1', $buffer);
  $buffer = preg_replace("/([^\*])[\/]+\/\*.*[^a-zA-Z0-9\s\-=+\|!@#$%^&()`~\[\]{};:\'\",?]/", "$1", $buffer);
 $buffer = preg_replace("/\/\*/", "\n/*dddpp", $buffer);
 $buffer = preg_replace('/((\{\s*|:\s*)[\"\']\s*)(([^\{\};\"\']*)dddpp)/','$1$4', $buffer);
 $buffer = preg_replace("/\*\//", "xxxpp*/\n", $buffer);
 $buffer = preg_replace('/((\{\s*|:\s*|\[\s*)[\"\']\s*)(([^\};\"\']*)xxxpp)/','$1$4', $buffer);
 $buffer = preg_replace('/([\"\'])\s*\/\*/', '$1/*', $buffer);
 $buffer = preg_replace('/(\n)[^\'"]?\/\*dddpp.*?xxxpp\*\//s', '', $buffer);
 $buffer = preg_replace('/\n\/\*dddpp([^\s]*)/', '$1', $buffer);
 $buffer = preg_replace('/xxxpp\*\/\n([^\s]*)/', '*/$1', $buffer);
 $buffer = preg_replace('/xxxpp\*\/\n([\"])/', '$1', $buffer);
 $buffer = preg_replace('/(\*)\n*\s*(\/\*)\s*/', '$1$2$3', $buffer);
 $buffer = preg_replace('/(\*\/)\s*(\")/', '$1$2', $buffer);
 $buffer = preg_replace('/\/\*dddpp(\s*)/', '/*', $buffer);
 $buffer = preg_replace('/\n\s*\n/', "\n", $buffer);
 $buffer = preg_replace("/([^\'\"]\s*)(?!()).*/","$1", $buffer);
 $buffer = preg_replace('/([^\n\w\-=+\|!@#$%^&*()`~\[\]{};:\'",\/?\\\\])(\/\/)(.*)/', '$1', $buffer);
//END Remove comments.  
//START Remove all whitespaces
 $buffer = preg_replace('/\s+/', ' ', $buffer);
 $buffer = preg_replace('/\s*(?:(?=[=\-\+\|%&\*\)\[\]\{\};:\,\.\\!\@\#\^`~]))/', '', $buffer);
 $buffer = preg_replace('/(?:(?

