Perl Compatible Regular Expressions 10.32

PCRE is a widely used Unicode-compatible regular expression engine. It implements Perl 5 regex syntax and semantics, plus some Python, .NET and Oniguruma extensions. It supports just-in-time compilation, has consistent escaping rules, and allows recursion, assertions, conditional patterns, subroutines and callouts, so it goes far beyond classic regular expressions.

Tags c regex pcre perl
License BSDL
State initial

Recent Releases

10.32 12 Sep 2018 03:15 minor feature: This is another mainly bug-fix and tidying release with a few minor enhancements. These are the main ones:
1. pcre2grep now supports the inclusion of binary zeros in patterns that are read from files via the -f option.
2. ./configure now supports --enable-jit=auto, which automatically enables JIT if the hardware supports it.
3. In pcre2_dfa_match(), internal recursive calls no longer use the stack for local workspace and local ovectors. Instead, an initial block of stack is reserved, but if this is insufficient, heap memory is used. The heap limit parameter now applies to pcre2_dfa_match().
4. Updated to Unicode version 11.0.0.
5. (*ACCEPT:ARG), (*FAIL:ARG), and (*COMMIT:ARG) are now supported.
6. Added support for \N{U+dddd}, but only in Unicode mode.
7. Added support for (?^) to unset all imnsx options.
10.31 16 Feb 2018 15:45 minor feature: This is mainly a bug-fix and tidying release (see ChangeLog for full details). However, there are some minor enhancements:
1. New pcre2_config() options: PCRE2_CONFIG_NEVER_BACKSLASH_C and PCRE2_CONFIG_COMPILED_WIDTHS.
2. New pcre2_pattern_info() option PCRE2_INFO_EXTRAOPTIONS to retrieve the extra compile time options.
3. There are now public names for all the pcre2_compile() error numbers.
4. Added PCRE2_CALLOUT_STARTMATCH and PCRE2_CALLOUT_BACKTRACK bits to a new field callout_flags in callout blocks.
10.30 16 Aug 2017 17:45 minor feature: The full list of changes that includes bug fixes and tidies is, as always, in ChangeLog. These are the most important new features:
1. The main interpreter, pcre2_match(), has been refactored into a new version that does not use recursive function calls (and therefore the system stack) for remembering backtracking positions. This makes --disable-stack-for-recursion a no-op. The new implementation allows backtracking into recursive group calls in patterns, making it more compatible with Perl, and also fixes some other previously hard-to-do issues. For patterns that have a lot of backtracking, the heap is now used, and there is an explicit limit on the amount, settable by pcre2_set_heap_limit() or (*LIMIT_HEAP=xxx). The "recursion limit" is retained, but is renamed as "depth limit" (though the old names remain for compatibility). There is also a change in the way callouts from pcre2_match() are handled. The offset_vector field in the callout block is no longer a pointer to the actual ovector that was passed to the matching function in the match data block. Instead it points to an internal ovector of a size large enough to hold all possible captured substrings in the pattern.
2. The new option PCRE2_ENDANCHORED insists that a pattern match must end at the end of the subject.
3. The new option PCRE2_EXTENDED_MORE implements Perl's /xx feature, and pcre2test is upgraded to support it. Setting within the pattern by (?xx) is also supported.
4. (?n) can be used to set PCRE2_NO_AUTO_CAPTURE, because Perl now has this.
5. Additional compile options in the compile context are now available, and the first two are: PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES and PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL.
6. The newline type PCRE2_NEWLINE_NUL is now available.
7. The match limit value now also applies to pcre2_dfa_match() as there are patterns that can use up a lot of resources without necessarily recursing very deeply.
8. The option REG_PEND (a GNU extension) is now available for the POSIX wrapper.
10.23 15 Feb 2017 03:45 major bugfix:
1. ChangeLog has the details of a lot of bug fixes and tidies.
2. There has been a major re-factoring of the pcre2_compile.c file. Most syntax checking is now done in the pre-pass that identifies capturing groups. This has reduced the amount of duplication and made the code tidier. While doing this, some minor bugs and Perl incompatibilities were fixed (see ChangeLog for details).
3. Back references are now permitted in lookbehind assertions when there are no duplicated group numbers (that is, (?| has not been used), and, if the reference is by name, there is only one group of that name. The referenced group must, of course, be of fixed length.
4. \g{+<number>} (e.g. \g{+2}) is now supported. It is a "forward back reference" and can be useful in repetitions (compare \g{-<number>}). Perl does not recognize this syntax.
5. pcre2grep now automatically expands its buffer up to a maximum set by --max-buffer-size.
6. The -t option (grand total) has been added to pcre2grep.
7. A new function called pcre2_code_copy_with_tables() exists to copy a compiled pattern along with a private copy of the character tables that it uses.
8. A user supplied a number of patches to upgrade pcre2grep under Windows and tidy the code.
9. Several updates have been made to pcre2test and test scripts (see ChangeLog).
10.22 31 Jul 2016 20:45 minor feature:
1. ChangeLog has the details of a number of bug fixes.
2. The POSIX wrapper function regcomp() did not previously support back references and subroutine calls if called with the REG_NOSUB option. It now does.
3. A new function, pcre2_code_copy(), is added, to make a copy of a compiled pattern.
4. Support for string callouts is added to pcre2grep.
5. Added the PCRE2_NO_JIT option to pcre2_match().
6. The pcre2_get_error_message() function now returns with a negative error code if the error number it is given is unknown.
7. Several updates have been made to pcre2test and test scripts (see ChangeLog).
10.21 14 Jan 2016 00:45 minor feature:
1. Many bugs have been fixed. A large number of them were provoked only by very strange pattern input, and were discovered by fuzzers. Some others were discovered by code auditing. See ChangeLog for details.
2. The Unicode tables have been updated to Unicode version 8.0.0.
3. For Perl compatibility in EBCDIC environments, ranges such as a-z in a class, where both values are literal letters in the same case, omit the non-letter EBCDIC code points within the range.
4. There have been a number of enhancements to the pcre2_substitute() function, giving more flexibility to replacement facilities. It is now also possible to cause the function to return the needed buffer size if the one given is too small.
5. The PCRE2_ALT_VERBNAMES option causes the "name" parts of special verbs such as (*THEN:name) to be processed for backslashes and to take note of PCRE2_EXTENDED.
6. PCRE2_INFO_HASBACKSLASHC makes it possible for a client to find out if a pattern uses \C, and --never-backslash-C makes it possible to compile a version of PCRE2 in which the use of \C is always forbidden.
7. A limit to the length of pattern that can be handled can now be set by calling pcre2_set_max_pattern_length().
8. When matching an unanchored pattern, a match can be required to begin within a given number of code units after the start of the subject by calling pcre2_set_offset_limit().
9. The pcre2test program has been extended to test new facilities, and it can now run the tests when LF on its own is not a valid newline sequence.
10. The RunTest script has also been updated to enable more tests to be run.
11. There have been some minor performance enhancements.
10.20 03 Jul 2015 18:45 minor feature:
1. Callouts with string arguments and the pcre2_callout_enumerate() function have been implemented.
2. The PCRE2_NEVER_BACKSLASH_C option, which locks out the use of \C, is added.
3. The PCRE2_ALT_CIRCUMFLEX option lets ^ match after a newline at the end of a subject in multiline mode.
4. The way named subpatterns are handled has been refactored. The previous approach had several bugs.
5. The handling of \c in EBCDIC environments has been changed to conform to the perlebcdic document. This is an incompatible change.
6. Bugs have been mended, many of them discovered by fuzzers.
10.10 29 Apr 2015 05:05 minor bugfix:
1. Serialization and de-serialization functions have been added to the API, making it possible to save and restore sets of compiled patterns, though restoration must be done in the same environment that was used for compilation.
2. The (*NO_JIT) feature has been added; this makes it possible for a pattern creator to specify that JIT is not to be used.
3. A number of bugs have been fixed. In particular, bugs that caused building on Windows using CMake to fail have been mended.
8.37 16 Apr 2015 23:05 minor bugfix: This is a bug-fix release. Note that this library (now called PCRE1) is now being maintained for bug fixes only. New projects are advised to use the new PCRE2 libraries.
8.36 27 Sep 2014 14:11 minor bugfix: This is primarily a bug-fix release. However, in addition, the Unicode data tables have been updated to Unicode 7.0.0.