Fast, free-threaded Python bindings for `PCRE2` with a stable `stdlib.re`-compatible API. ⚡

- 04/13/2026 0.3.0: Lower-overhead public `Match` objects, faster hot-path `match()`/`search()`/`fullmatch()`/`findall()`, and tighter free-threaded execution. ⚡
- 03/22/2026 0.2.15: Python 3.15 `re` compatibility (`prefixmatch`, `NOFLAG`) ✅
- 03/21/2026 0.2.14: Python 3.14 compatibility 🐍
- 03/02/2026 0.2.11: Auto-detect `Visual Studio` in Windows environments during install and compile. 🪟
- 02/24/2026 0.2.10: Allow a `Visual Studio` (VS) compiler version check override via an environment variable. 🧰
- 12/15/2025 0.2.8: Fixed multi-arch Linux OS compatibility when both x86_64 and i386 `pcre2` libraries are installed. 🐧
- 10/20/2025 0.2.4: Removed the dependency on a system `python3-dev` package. `Python.h` will be downloaded optimistically from python.org when needed. 📦
- 10/12/2025 0.2.3: 🤗 Full `GIL=0` compliance for Python >= 3.13T. Reduced cache thread contention. Improved performance across all APIs. Expanded CI test coverage. FreeBSD, Solaris, and Windows compatibility validated.
- 10/09/2025 0.1.0: 🎉 First release. Thread-safe, with auto JIT, auto pattern caching, and optimistic linking to the system library for fast installs.

PyPcre pairs Python's familiar re-compatible API with the real PCRE2 engine. You keep the ergonomics of the standard library while gaining a more capable regex engine, optional JIT, explicit threading support, and a binding designed and tested for free-threaded Python. 🧠⚡

- 🧬 Full power of PCRE2: PyPcre uses the real `PCRE2` engine, so you get native compile options, semantics, JIT, and upstream tuning.
- 🔥 More expressive regex syntax: `PCRE2` supports constructs beyond stdlib `re`, including atomic groups `(?>...)`, possessive quantifiers `++`, branch-reset groups `(?|...)`, richer lookarounds, and backtracking control verbs like `(*SKIP)(*FAIL)`.
- 🧵 Thread-safe into `nogil`: PyPcre is built for `PYTHON_GIL=0`, with CI coverage, lock-aware caches, reusable match/JIT resources, and `parallel_map()` for multi-subject fan-out.
- ⚡ Fast on real workloads: `PCRE2` JIT plus cached compiled patterns lets PyPcre match or beat `re` and `regex` on many common scans, especially multiline searches, lookaround-heavy patterns, and free-threaded execution.
- 🛡️ Safer operational story: PyPcre prefers the system `libpcre2-8` shared library so normal OS package updates can bring security and bug-fix benefits without a bundled fork.
- ✅ Validated thoroughly: the project runs API tests, fuzz tests, memory-safety checks, local `valgrind` leak checks, and `massif` heap profiles. Recent local profiling found `0` definite leaks and `0` possible leaks in both the public API and raw binding paths.

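
One way to see the syntax gap for yourself is to ask each engine whether it will compile a given construct. The sketch below is illustrative only: `accepts` and the `candidates` table are hypothetical names, and it assumes `pcre.error` is available as the compatibility alias for `pcre.PcreError`. If PyPcre is not installed, only the stdlib column is printed.

```python
import re

try:
    import pcre  # PyPcre, if it is installed
except ImportError:
    pcre = None


def accepts(engine, pattern):
    """Return True if engine.compile() accepts the pattern."""
    try:
        engine.compile(pattern)
    except engine.error:
        return False
    return True


# Two PCRE2 constructs that the stdlib engine rejects at compile time.
candidates = {
    "branch-reset group": r"(?|(a)|(b))",
    "backtracking verbs": r"foo(*SKIP)(*FAIL)|\w+",
}

for label, pattern in candidates.items():
    line = f"{label}: re={accepts(re, pattern)}"
    if pcre is not None:
        line += f", pcre={accepts(pcre, pattern)}"
    print(line)
```
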

| Area | PyPcre | stdlib `re` | `regex` |
|---|---|---|---|
| Engine | Full PCRE2 ✅ | CPython stdlib engine | Separate engine, not PCRE2 |
| PCRE2 syntax and flags | Full access ✅ | No | No |
| Syntax power | Very rich ✅ | More limited | Rich, but different from PCRE2 |
| JIT execution | PCRE2 JIT ✅ | No | No |
| `re`-compatible API surface | Stable and familiar ✅ | Native | Similar, but not the main goal |
| Free-threaded support | Built and tested for `PYTHON_GIL=0` ✅ | No explicit PyPcre-style layer | Not a project focus here |
| Built-in threaded subject fan-out | `parallel_map()` ✅ | No | No |
| System library updates | Uses system `libpcre2-8` by default ✅ | N/A | N/A |

Measured on a Python 3.14.3 free-threaded build on x86_64 Linux with compiled-pattern reuse. Times are best-of-5; lower is better.

| Workload | Operation | PyPcre | `re` | `regex` | PyPcre edge |
|---|---|---|---|---|---|
| First `ERROR` line in a multiline log buffer | `search` | 3.68 ms | 51.72 ms | 5.67 ms | 14.0x vs `re`, 1.54x vs `regex` |
| Extract only `WARN` / `ERROR` lines | `findall` | 6.41 ms | 91.84 ms | 91.14 ms | 14.3x vs `re`, 14.2x vs `regex` |
| Per-line full-name extraction | `findall` | 22.28 ms | 172.38 ms | 218.29 ms | 7.74x vs `re`, 9.80x vs `regex` |
| Lookbehind + negative-lookahead extraction | `findall` | 50.23 ms | 53.35 ms | 57.03 ms | 1.06x vs `re`, 1.14x vs `regex` |
| UUID extraction | `findall` | 77.49 ms | 83.19 ms | 134.87 ms | 1.07x vs `re`, 1.74x vs `regex` |
| Boundary-aware token scan | `findall` | 127.76 ms | 128.03 ms | 146.37 ms | effectively tied with `re`, 1.15x vs `regex` |

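
For readers who want to take a comparable measurement on their own data, a best-of-5 timing with a reused compiled pattern might look like the sketch below. The log buffer and pattern are made-up stand-ins, not the project's benchmark inputs, and the code falls back to the standard library when PyPcre is not installed. It assumes `pcre.MULTILINE` is exposed as a module-level alias.

```python
import timeit

try:
    import pcre as engine  # PyPcre, when available
except ImportError:
    import re as engine    # standard library fallback

# Hypothetical workload: one ERROR line at the end of a long INFO log.
log = "\n".join(f"INFO line {i}" for i in range(20000)) + "\nERROR disk full\n"

# Compile once and reuse, as in the table above.
pattern = engine.compile(r"^ERROR.*$", flags=engine.MULTILINE)


def scan():
    return pattern.search(log)


# Five repeats of ten searches each; report the best repeat per search.
runs = timeit.repeat(scan, number=10, repeat=5)
best_ms = min(runs) / 10 * 1000
print(f"best of 5: {best_ms:.3f} ms per search")
```

Taking the minimum rather than the mean reduces the influence of scheduler noise, which is why best-of-N is a common convention for micro-benchmarks.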
Measured in the same environment with 8 threads sharing one compiled pattern. Times are best-of-3; lower is better.

| Workload | Threads | PyPcre | `re` | `regex` | PyPcre edge |
|---|---|---|---|---|---|
| First `ERROR` line in a multiline log buffer | 8 | 25.34 ms | 38.83 ms | 40.34 ms | 1.53x vs `re`, 1.59x vs `regex` |
| Extract only `WARN` / `ERROR` lines | 8 | 28.58 ms | 65.54 ms | 73.55 ms | 2.29x vs `re`, 2.57x vs `regex` |
| Per-line full-name extraction | 8 | 31.68 ms | 123.44 ms | 164.80 ms | 3.90x vs `re`, 5.20x vs `regex` |

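
The shape of the threaded scenario, eight workers sharing a single compiled pattern, can be sketched with the standard library's `ThreadPoolExecutor`. This is not the project's benchmark harness and it does not use `parallel_map()`; the log chunks are synthetic, and the code falls back to stdlib `re` if PyPcre is missing.

```python
from concurrent.futures import ThreadPoolExecutor

try:
    import pcre as engine  # PyPcre, when available
except ImportError:
    import re as engine    # standard library fallback

# One compiled pattern shared by every worker thread.
pattern = engine.compile(r"\b(?:WARN|ERROR)\b.*")


def count_hits(chunk):
    """Count WARN/ERROR lines in one chunk of log text."""
    return len(pattern.findall(chunk))


# Eight synthetic log chunks; roughly every seventh line is an ERROR.
chunks = [
    "\n".join(
        f"{'ERROR' if (i + k) % 7 == 0 else 'INFO'} event {i}"
        for i in range(1000)
    )
    for k in range(8)
]

with ThreadPoolExecutor(max_workers=8) as pool:
    counts = list(pool.map(count_hits, chunks))

print(counts)
```

On a GIL-enabled interpreter the workers mostly take turns, so any speedup from this layout should only be expected on a free-threaded build.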
PyPcre is the stronger all-around choice when you want more than the baseline: full PCRE2 features, more expressive syntax, JIT, explicit free-threaded support, and a stable re-compatible API surface. It keeps Python ergonomics while giving you a substantially more capable engine. 🚀

    pip install PyPcre

By default, the package links against the system `libpcre2-8` shared library for fast installs and to inherit OS security updates. See Building for manual build details.
Supported platforms: Linux, macOS, Windows, WSL, and FreeBSD.
If you already use the standard library `re`, migration is often just an import swap:

    import pcre as re

The high-level API stays close to the standard library, so most existing `re` code can move over with little or no rewriting.
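
If a codebase has to keep working where PyPcre is not installed, the swap can be made optional. The sketch below is one possible arrangement rather than a recommendation from the project; the sample string and pattern are illustrative.

```python
try:
    import pcre as re  # PyPcre, when available
except ImportError:
    import re          # standard library fallback

text = "retries=3 timeout=30"

# The same calls work with either engine.
match = re.search(r"(?P<key>\w+)=(?P<value>\d+)", text)
print(match.group("key"), match.group("value"))
print(re.findall(r"\d+", text))
```

Code written this way should stick to syntax both engines accept; PCRE2-only constructs would fail on the fallback path.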

    from pcre import compile, findall, match, search, Flag

    if match(r"(?P<word>\w+)", "hello world"):
        print("found word")

    pattern = compile(rb"\d+", flags=Flag.MULTILINE)
    numbers = pattern.findall(b"line 1\nline 22")

- Module helpers: `prefixmatch`, `match`, `search`, `fullmatch`, `finditer`, `findall`, `split`, `sub`, `subn`, `compile`, `escape`, `purge`, and `parallel_map`.
- `compile()` returns a `Pattern` object with the familiar matching helpers plus `split()`, `sub()`, and `subn()`.
- `Pattern` exposes `.pattern`, `.flags`, `.jit`, `.groupindex`, and `.groups` for introspection.
- `Match` objects expose the usual `group()`, `groups()`, `groupdict()`, `start()`, `end()`, `span()`, and `expand()` methods, along with `.re`, `.string`, `.pos`, `.endpos`, `.lastindex`, `.lastgroup`, and `.regs`.
- Flags are available through `pcre.Flag` and familiar aliases such as `IGNORECASE`, `MULTILINE`, `DOTALL`, `VERBOSE`, `ASCII`, `UNICODE`, and `NOFLAG`.
- Errors are raised as `pcre.PcreError`; `error` and `PatternError` are kept as compatibility aliases.

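
A short tour of the introspection surface listed above, restricted to attributes that stdlib `re` also provides so that it runs with either engine. The address string is made up, and the PyPcre-specific `.jit` attribute is deliberately left out because it has no stdlib counterpart.

```python
try:
    import pcre as engine  # PyPcre, when available
except ImportError:
    import re as engine    # standard library fallback

pattern = engine.compile(r"(?P<user>\w+)@(?P<host>\w+)")
m = pattern.search("contact: alice@example")

# Pattern-level introspection.
print(pattern.pattern, pattern.groups, dict(pattern.groupindex))

# Match-level accessors.
print(m.group("user"), m.groupdict())
print(m.span(), m.start("host"), m.lastgroup)

# Template expansion using named groups.
print(m.expand(r"\g<host>/\g<user>"))
```
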
Compiled patterns:

    from pcre import compile, Flag

    # CASELESS lets the lowercase "user:" prefix match "User:" in the subject.
    pattern = compile(r"user:\s*(?P<name>[a-z]+)", flags=Flag.CASELESS)
    match = pattern.search("User: alice")
    print(match.group("name"))  # alice

Substitution:

    from pcre import sub

    result = sub(r"\d+", "#", "room 101")
    print(result)  # room #

Bytes:

    from pcre import compile

    pattern = compile(br"\w+")
    print(pattern.findall(b"ab cd"))  # [b'ab', b'cd']

- Module-level helpers and the `Pattern` class follow the same call shapes as the standard library `re` module, including `pos`, `endpos`, and `flags` behavior.
- Python 3.15's `prefixmatch()` alias is available at both the module level and on compiled `Pattern` objects, and `re.NOFLAG` is re-exported as the zero-value compatibility alias.
- `Pattern` mirrors `re.Pattern` attributes like `.pattern`, `.groupindex`, and `.groups`, while `Match` objects surface the familiar `.re`, `.string`, `.pos`, `.endpos`, `.lastindex`, `.lastgroup`, `.regs`, and `.expand()` API.
- Substitution helpers enforce the same type rules as the standard library `re` module: string patterns require string replacements, byte patterns require bytes-like replacements, and callable replacements receive the wrapped `Match`.
- `compile()` accepts native `Flag` values as well as compatible `re.RegexFlag` members from the standard library. Supported stdlib flags map 1:1 to PCRE2 options (`IGNORECASE→CASELESS`, `MULTILINE→MULTILINE`, `DOTALL→DOTALL`, `VERBOSE→EXTENDED`); passing unsupported stdlib flags raises a compatibility `ValueError` to prevent silent divergences.
- `pcre.escape()` delegates directly to `re.escape` for byte and text patterns so escaping semantics remain identical.
- String patterns enable Unicode behavior by default. Byte patterns do not.

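
The 1:1 flag mapping and the "reject anything else" rule can be pictured as a small lookup. The sketch below is a stand-alone illustration, not PyPcre's implementation: `map_stdlib_flags` and `_STDLIB_TO_PCRE2` are hypothetical names, and the PCRE2 options are represented as plain strings.

```python
import re

# Mapping named in the text: stdlib flag -> PCRE2 option name.
_STDLIB_TO_PCRE2 = {
    re.IGNORECASE: "CASELESS",
    re.MULTILINE: "MULTILINE",
    re.DOTALL: "DOTALL",
    re.VERBOSE: "EXTENDED",
}


def map_stdlib_flags(flags):
    """Translate supported stdlib flags; raise ValueError for the rest."""
    remaining = int(flags)
    names = []
    for stdlib_flag, pcre2_name in _STDLIB_TO_PCRE2.items():
        if remaining & int(stdlib_flag):
            names.append(pcre2_name)
            remaining &= ~int(stdlib_flag)
    if remaining:
        # Failing loudly avoids silently different matching behavior.
        raise ValueError(f"unsupported stdlib flag bits: {remaining:#x}")
    return names


print(map_stdlib_flags(re.IGNORECASE | re.VERBOSE))

try:
    map_stdlib_flags(re.MULTILINE | re.DEBUG)
except ValueError as exc:
    print("rejected:", exc)
```
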
The `regex` package interprets `\uXXXX` and `\UXXXXXXXX` escapes as Unicode code points, while PCRE2 expects hexadecimal escapes to use the `\x{...}` form. Enable `Flag.COMPAT_UNICODE_ESCAPE` to translate those escapes automatically when compiling patterns:

    from pcre import compile, Flag

    pattern = compile(r"\U0001F600", flags=Flag.COMPAT_UNICODE_ESCAPE)
    assert pattern.pattern == r"\x{0001F600}"

Set the default behavior globally with `pcre.configure(compat_regex=True)` so that subsequent calls to `compile()` and the module-level helpers apply the conversion without repeating the flag.
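
To make the translation concrete, here is a minimal pure-Python sketch of the kind of rewrite described above. It is an illustration under stated assumptions, not PyPcre's actual code: `translate_unicode_escapes` is a hypothetical helper, and it does not try to distinguish an escaped backslash that happens to precede a `u` or `U`.

```python
import re

# Matches \uXXXX (4 hex digits) or \UXXXXXXXX (8 hex digits) in pattern text.
_UNICODE_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})")


def translate_unicode_escapes(pattern):
    """Rewrite \\uXXXX and \\UXXXXXXXX escapes into the \\x{...} form."""
    def to_braced_hex(m):
        digits = m.group(1) or m.group(2)
        return "\\x{" + digits + "}"

    return _UNICODE_ESCAPE.sub(to_braced_hex, pattern)


print(translate_unicode_escapes(r"\U0001F600"))
print(translate_unicode_escapes(r"caf\u00e9"))
```
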

- Unsupported stdlib flags such as `re.DEBUG`, `re.LOCALE`, and `re.ASCII` raise `ValueError`. If you want ASCII-style behavior, use `pcre.ASCII` or `Flag.NO_UTF | Flag.NO_UCP`.
- Replacement types must match the subject type: text patterns use `str` replacements, while byte patterns use bytes-like replacements.
- If you are porting patterns from the third-party `regex` package, check `\u` and `\U` escapes first. That is the most common compatibility gap.
- Most users do not need to tune caching, JIT, or threading. The defaults are intended to work well out of the box.
- `pcre.configure(jit=False)` disables JIT globally.
- `Flag.JIT` and `Flag.NO_JIT` let you override that per pattern.
- `pcre.set_cache_limit()`, `pcre.get_cache_limit()`, and `pcre.clear_cache()` control the high-level compile cache.
- `pcre.configure_threads()`, `pcre.configure_thread_pool()`, `shutdown_thread_pool()`, `Flag.THREADS`, and `Flag.NO_THREADS` are available if you want to opt into or restrict threaded execution.

The extension links against an existing `libpcre2-8` installation. Install the development headers for your platform before building, for example `apt install libpcre2-dev` on Debian/Ubuntu, `dnf install pcre2-devel` on Fedora/RHEL derivatives, or `brew install pcre2` on macOS.
If the headers or library live in a non-standard location, you can export one or more of the following environment variables prior to invoking the build (`pip install .`, `python -m build`, etc.):

- `PYPCRE_ROOT`
- `PYPCRE_INCLUDE_DIR`
- `PYPCRE_LIBRARY_DIR`
- `PYPCRE_LIBRARY_PATH` (pathsep-separated directories or explicit library files to prioritize when resolving `libpcre2-8`)
- `PYPCRE_LIBRARIES`
- `PYPCRE_CFLAGS`
- `PYPCRE_LDFLAGS`

If you would rather force a source build, set `PYPCRE_BUILD_FROM_SOURCE=1` before installing.

When `pkg-config` is available, the build automatically picks up the required include and link flags via `pkg-config --cflags --libs libpcre2-8`. Without `pkg-config`, the build script scans common installation prefixes for Linux distributions (Debian, Ubuntu, Fedora/RHEL/CentOS, openSUSE, Alpine), FreeBSD, and macOS (including Homebrew) to locate the headers and libraries.
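
To check what a `pkg-config` lookup would return on a given machine, the query can be reproduced from Python. This is a diagnostic sketch, not the project's build script; `pkg_config_flags` is a hypothetical helper that returns `None` when `pkg-config` is missing or does not know the package.

```python
import shlex
import shutil
import subprocess


def pkg_config_flags(package="libpcre2-8"):
    """Return compiler and linker flags from pkg-config, or None."""
    if shutil.which("pkg-config") is None:
        return None
    result = subprocess.run(
        ["pkg-config", "--cflags", "--libs", package],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None
    # Split the single output line into individual flags.
    return shlex.split(result.stdout)


flags = pkg_config_flags()
if flags is None:
    print("pkg-config lookup unavailable; a prefix scan would be needed")
else:
    print(flags)
```
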
If your system ships `libpcre2-8` under `/usr` but you also maintain a manually built copy under `/usr/local`, export `PYPCRE_LIBRARY_PATH` (and, if needed, a matching `PYPCRE_INCLUDE_DIR`) so the build links against the desired location.
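
When the build is driven from a script, the same variables can be assembled programmatically. The sketch below only constructs the environment and does not run the build; `build_env` is a hypothetical helper, and the `/usr/local` paths are example values rather than required locations.

```python
import os


def build_env(library_paths, include_dir=None, base_env=None):
    """Return an environment dict that prioritizes the given PCRE2 copy."""
    env = dict(os.environ if base_env is None else base_env)
    # PYPCRE_LIBRARY_PATH takes pathsep-separated directories or files.
    env["PYPCRE_LIBRARY_PATH"] = os.pathsep.join(library_paths)
    if include_dir is not None:
        env["PYPCRE_INCLUDE_DIR"] = include_dir
    return env


env = build_env(
    ["/usr/local/lib"],
    include_dir="/usr/local/include",
    base_env={},
)
print(env)
```

The resulting dictionary could then be passed as `env=` to `subprocess.run` when invoking `pip install .` from the project directory.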
