Differences from POSIX regex
The POSIX Regex extension has been deprecated since PHP 5.3.0 and was
removed in PHP 7.0.0. POSIX regex and PCRE regex differ in a number of
ways. This page lists the most notable differences to be aware of when
converting to PCRE.
-
The PCRE functions require the pattern to be enclosed in delimiters,
such as the forward slashes in /pattern/.
-
Unlike POSIX, the PCRE extension does not have dedicated functions for
case-insensitive matching. Instead, this is supported using the
i (PCRE_CASELESS) pattern modifier. Other
pattern modifiers are also available for changing the matching strategy.
-
The POSIX functions return the longest of the leftmost matches, whereas
PCRE stops at the first valid match. If the string does not match at all,
this makes no difference, but if it does match, the choice can
significantly affect both the resulting match and the matching speed.
To illustrate this difference, consider the following example from
"Mastering Regular Expressions" by Jeffrey Friedl. Applying the pattern
one(self)?(selfsufficient)? to the string
oneselfsufficient with PCRE matches
oneself, whereas POSIX matches the full
string oneselfsufficient. Both are valid matches for the
pattern, but POSIX requires that the longest one be the result.
-
The POSIX definition of a "character class" differs from that of PCRE.
Simple bracket expressions to match a set of explicit characters are
supported in the form of PCRE
character classes.
POSIX named classes such as [:alpha:] are recognized by PCRE only when
nested inside a bracket expression, for example [[:alpha:]]. POSIX
collating elements and equivalence classes are not supported.
Supplying an expression with a character class that both starts and ends
with a :, . or =
character to PCRE, such as [:alpha:] on its own, is interpreted as an
attempt to use POSIX syntax in an unsupported way and causes a
compilation error.
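The first three points above can be sketched in PHP as follows. This is a minimal illustration rather than a definitive reference: the sample strings and variable names are assumptions chosen for the example, and only the preg_match calls themselves are part of the PCRE extension.

```php
<?php
// Delimiters: the pattern sits between a matching pair of characters.
// "/" is the most common choice; "#" is convenient when the pattern
// itself contains slashes.
$hasDigits = preg_match('/[0-9]+/', 'PHP 5.3.0');
$hasPrefix = preg_match('#^/usr/local#', '/usr/local/bin');

// Case-insensitive matching: there is no separate function, so the
// "i" modifier is appended after the closing delimiter.
$caseSensitive = preg_match('/php/', 'PHP 5.3.0');   // no match
$caseless      = preg_match('/php/i', 'PHP 5.3.0');  // match

// First valid match: PCRE takes "self" for the first optional group,
// then lets the second optional group match nothing, and stops there.
preg_match('/one(self)?(selfsufficient)?/', 'oneselfsufficient', $m);
$firstMatch = $m[0];

var_dump($hasDigits, $hasPrefix, $caseSensitive, $caseless, $firstMatch);
```

With these inputs, $firstMatch holds oneself rather than the full string, which is the behaviour described for PCRE in the example from Friedl's book.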
Function replacements
| POSIX | PCRE |
| --- | --- |
| ereg_replace | preg_replace |
| ereg | preg_match |
| eregi_replace | preg_replace |
| eregi | preg_match |
| split | preg_split |
| spliti | preg_split |
| sql_regcase | No equivalent |
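As a rough sketch of how a conversion based on this table might look, the snippet below shows PCRE calls next to the POSIX calls they could replace. The POSIX calls appear only in comments, and the input values and variable names are hypothetical. The case-insensitive variants map to the same PCRE function with the i modifier added.

```php
<?php
// Hypothetical sample inputs for the illustration.
$date = '2009-06-30';
$text = 'Color and COLOUR';
$csv  = 'a,b;c';

// Was: ereg('([0-9]{4})-([0-9]{2})', $date, $regs)
preg_match('/([0-9]{4})-([0-9]{2})/', $date, $regs);

// Was: eregi_replace('colou?r', 'shade', $text)
// The "i" modifier replaces the dedicated case-insensitive function.
$replaced = preg_replace('/colou?r/i', 'shade', $text);

// Was: split('[,;]', $csv)
$parts = preg_split('/[,;]/', $csv);

var_dump($regs, $replaced, $parts);
```

Patterns that contain the chosen delimiter need that character escaped or a different delimiter selected, and patterns whose result depends on longest-match behaviour should be rechecked after conversion.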