pcre2

mirror of https://github.com/PCRE2Project/pcre2.git synced 2025-10-17 07:04:13 +08:00

Author	SHA1	Message	Date
Nicholas Wilson	abc24458b8	Small update to pcre2compat description of braced quantifiers	2025-06-02 08:15:38 +00:00
Nicholas Wilson	e62c0e0916	Re-apply "Use standard CMake constructs to export the targets. (#260 )" (#739 ) Additionally, I have attempted to clean up some CMake issues to make the package's build interface cleaner, in particular, avoiding polluting the parent directory's include path with our config.h file (if PCRE2 is being included as a subdirectory). This re-adds changes from Theodore's commit: `def175f4a9` and partially reverts changes from Carlo's commit: `92d56a1f7c` --------- Co-authored-by: Theodore Tsirpanis <teo@tsirpanis.gr>	2025-04-08 17:37:19 +01:00
Nicholas Wilson	a73417315a	Add documentation for subroutine return values (#738 )	2025-03-28 14:53:39 +00:00
github-actions[bot]	2e03e32333	Sync autogenerated files #noupdate	2025-03-24 13:30:18 +00:00
Nicholas Wilson	eb3bd3cf14	New pcre2_next_match() API to simplify pcre2demo, test, and substitute (#733 ) * The primary purpose of pcre2_next_match() is to make it much easier for PCRE2 clients to iterate over matches, without needing an advanced knowledge of regular expressions. * Secondly, we can simplify our own code by merging the three duplicate implementations of the /g global match behaviour: pcre2demo, pcre2_substitute, and pcre2test. * Thirdly, as I look closely at the issue, I can improve the documentation. * Fourthly, I would like to actually simplify the logic, removing a complex loop which makes several match attempts, swallows duplicate matches, and more. We can have identical behaviour with a simple retry using PCRE2_NOTEMPTY_ATSTART.	2025-03-24 13:29:52 +00:00
Nicholas Wilson	f63b5d2658	Add a little additional documentation on how to emulate pcre2_substitute's loop (#735 ) We won't implement more advanced/alternative global replacement strategies, but we can at least write a few sentences explaining how to do it in application code.	2025-03-24 10:08:12 +00:00
Nicholas Wilson	990d53f192	Add linker scripts with symbol versioning (#721 ) Both the Autoconf and CMake build systems are updated to detect linker support for symbol versioning. Currently, Linux, Solaris, and FreeBSD are tested and working. Windows (COFF) and macOS (Mach-O) have no symbol versioning. There is an Autoconf/CMake flag to opt out of the versioning behaviour.	2025-03-18 08:55:38 +00:00
Nicholas Wilson	b3ecb621bd	Remove the old WORKSPACE.bazel file (#732 )	2025-03-17 20:24:33 +00:00
github-actions[bot]	773486b4b5	Sync autogenerated files #noupdate	2025-02-28 22:29:19 +00:00
Nicholas Wilson	a792b72210	Add /i option to pcre2demo.c Co-authored-by: Greg Minshall <minshall@umich.edu>	2025-02-28 21:09:46 +00:00
Nicholas Wilson	b79ee1dea5	Rename files which are #included (#708 ) We have four files which have .c extensions, but which are actually #included rather than treated as their own compilation unit. This goes against conventions - Autotools, CMake, and Bazel all assume that the .h/.c distinction indicates which files are compilation units. pcre2_jit_match.c -> _inc.h pcre2_jit_misc.c -> _inc.h pcre2_printint.c -> _inc.h pcre2_ucptables.c -> _inc.h	2025-02-27 06:57:44 +00:00
github-actions[bot]	3e68381dae	Sync autogenerated files #noupdate	2025-02-26 22:29:20 +00:00
Nicholas Wilson	6e1da609f4	Update 132html to use <h2> and <h3>	2025-02-26 22:27:06 +00:00
Nicholas Wilson	500c68b986	Add testing for malloc() failures (#697 ) An additional testing argument, `-malloc` is added to pcre2test and to RunTest. The ManyConfig tests run this now in CI. We exercise each malloc failure in the core code by counting how many mallocs are done, then repeating compilation and matching with a failure on each successive malloc.	2025-02-23 09:51:32 +00:00
Nicholas Wilson	fb3b380abb	Another batch of very small typos & issues (#707 )	2025-02-22 12:31:53 +00:00
Nicholas Wilson	ce42cfac5c	Fix two typos in pcre2api, plus some other minor issues (#703 )	2025-02-19 19:15:38 +00:00
Joshua Rogers	fc04890d63	Fix documentation pcre2test.1 (#701 ) The error -47 corresponds to PCRE2_ERROR_MATCHLIMIT not PCRE2_ERROR_NOMEMORY.	2025-02-18 18:07:49 +00:00
Zoltan Herczeg	861a8aae41	Improve named group handling (#700 ) Add a simple hash code for group names to improve search speed. Ignore duplicates when group names are searched. Improve finding of duplicates (they have the same name pointer). Improve creating name table (duplicates are handled in one step). Create a new file for name management.	2025-02-18 18:04:14 +01:00
Nicholas Wilson	db3b532aa0	Improve RunTest to continue after a test failure (#696 ) This makes it easier to see all the failures at once, rather than having to fix one at a time. The output is now grouped into directories.	2025-02-15 11:50:25 +00:00
Nicholas Wilson	0d0ac3aa0f	Update EBCDIC support to support testing on normal ASCII systems (#656 ) The pcre2test utility needs quite a few changes to accommodate this. It is simpler to add a new mode to it, than to make it fully EBCDIC-native. On an ASCII system, pcre2test performs ASCII I/O, but translates the input when passing it to the fully-EBCDIC-supporting library.	2025-02-12 22:31:00 +00:00
github-actions[bot]	191a4fd073	Sync autogenerated files #noupdate	2025-02-07 09:15:02 +00:00
Lucas Trzesniewski	b52de60d67	Fix typo (#690 )	2025-02-07 09:13:37 +00:00
Nicholas Wilson	2aa7681fb5	Update with my release procedure (#684 )	2025-02-05 09:53:59 +00:00
Nicholas Wilson	1fffb0d44e	Updates to the README and some documentation (#681 )	2025-02-01 15:50:20 +00:00
github-actions[bot]	eb8737f4f7	Sync autogenerated files #noupdate	2025-01-24 11:45:16 +00:00
MatthewVernon	0d579d3568	pcre2grep.1 - fix warning about undefined macro 0 (#673 ) Debian's "lintian" picked this up - line 950 in the man page starts with a ' which is how you start a roff request. You can reproduce the warning thus: ``` LC_ALL=C.UTF-8 MANROFFSEQ='' MANWIDTH=80 \ man --warnings -E UTF-8 -l -Tutf8 -Z doc/pcre2grep.1 >/dev/null ``` The fix is to add a zero-width space (`\&`) to the start of the relevant line (indeed `groff_man(7)` suggests exactly this use for \&). --------- Co-authored-by: Matthew Vernon <matthew@debian.org>	2025-01-24 11:44:47 +00:00
github-actions[bot]	2ec34b9099	Sync autogenerated files	2025-01-12 15:30:09 +00:00
Nicholas Wilson	e02e52804c	Update release number to 10.46-DEV #noupdate (#667 )	2025-01-12 15:28:37 +00:00
Nicholas Wilson	f724b6117b	Declutter one cmake file (#662 )	2025-01-11 10:29:49 +00:00
Nicholas Wilson	236853194f	Update modification dates #noupdate	2024-12-27 00:49:58 +00:00
Nicholas Wilson	64613feb6d	Update modified dates #noupdate	2024-12-26 23:50:55 +00:00
Nicholas Wilson	23b4df750b	Completely redo the substitute-case-callout work (#638 ) Fixes #564 The previous API was not extensible to handle multi-character case rules. It required a fair bit of reworking in order to accommodate this. I had to delay the casing transformations to be done later, by buffering up the string to transform, and then allowing the callback to do an in-place transformation on the entire input to be transformed.	2024-12-26 23:46:21 +00:00
Nicholas Wilson	af03ceaf97	Update ChangeLog and NEWS for 10.45 (#643 )	2024-12-26 15:12:15 +00:00
Nicholas Wilson	09c07ac7ab	Small improvement to combination of substitution callout + overflow (#637 ) I reckon that callers are assuming that when you use the PCRE2_SUBSTITUTE_OVERFLOW_LENGTH option, it will calculate the entire memory requirement in one go. Just two calls should be sufficient (rather than needing to loop with a gradually-increasing buffer size). However, with a substitution callout this is not true. If you call once with PCRE2_SUBSTITUTE_OVERFLOW_LENGTH, the buffer length returned might still not be sufficient for the second call to succeed. This is because the callout might not be called the first time, but the second time it will be called and can affect control flow, by requiring even more buffer to be used. This occurs even if the callout is completely stateless, idempotent and well-behaved. This fix ensures that when we skip a callout (due to overflow), we still request enough buffer size for either option that the callout might return.	2024-12-19 10:46:03 +00:00
Nicholas Wilson	f15bdd334d	Update all man page dates #noupdate (#634 )	2024-12-18 14:12:58 +00:00
Nicholas Wilson	f0819ca7c5	Update references to maintainers in the README (#633 )	2024-12-18 13:54:08 +00:00
Nicholas Wilson	ac528f2d26	Details on new maintainership (#603 ) * Add details on new maintainership * Remove checked-in autoconf outputs * Sync & cleanup files with Detrail * Add CI job for ensuring PrepareRelease is run * Add Ubuntu-20.04 autoconf runner * Make CMake installed files match autoconf * Update acknowledgements	2024-12-11 09:53:59 +00:00
Nicholas Wilson	aee5e9a97e	Fix null-dereference bug in pcre2_substitute (#618 ) Avoid one crash introduced with recent changes to substitute code as well as clarify what the expected offset value should be when overflowing the provided buffer. --------- Co-authored-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>	2024-12-10 14:27:06 +00:00
Nicholas Wilson	0f22e67e7c	Auto-format and minimal cleanup to CMake (#592 ) I haven't tackled any controversial steps in this PR - simply tidying the formatting. I have used the `gersemi` tool, which simply "does its thing". I have additionally renamed a few variables to match standard casing conventions (but I am aware that some lowercased variables are used, for example in package-config files, and have left those alone).	2024-12-07 19:31:05 +00:00
Nicholas Wilson	e8a5cd749e	Add folding and simplification for OP_ECLASS (#586 ) Fixes #537	2024-12-04 15:03:23 +00:00
Philip Hazel	55fda7f384	Update EBCDIC documentation; in pcre2pattern move it all into a separate section.	2024-11-27 17:28:11 +00:00
Nicholas Wilson	e36d0dd2d6	Tiny documentation/comment fixes (#585 )	2024-11-27 09:22:09 +00:00
Nicholas Wilson	98fd117282	Hunt for references to "PCRE" (#584 )	2024-11-27 09:21:50 +00:00
Philip Hazel	adab4b69d8	Expand documentation and error messages for extended character classes	2024-11-26 16:00:01 +00:00
Zoltan Herczeg	1cb968d116	Move character matching code into pcre2_jit_char_inc.h (#569 ) Useful for eclass jit implementation.	2024-11-26 12:27:39 +01:00
Nicholas Wilson	e0d4eee05e	Implement Perl extended character classes (#553 ) Fixes #536	2024-11-15 15:55:10 +01:00
Nicholas Wilson	fc38d9e784	Implement ALT_EXTENDED_CLASS flag (#523 ) * Move some existing character class code into pcre2_compile_class.c * Add a new flag PCRE2_ALT_EXTENDED_CLASS to change the behaviour of parsing [...] character classes, to emit new META codes, and new OP_ECLASS codes for nested character classes with operators * Document the behaviour relative to the UTS#18 standard * No JIT support; it falls back to the interpreter. DFA is supported.	2024-10-30 11:33:29 +01:00
Carlo Marcelo Arenas Belón	ef11bee735	pcre2_jit_compile: avoid potential wraparound if framesize <= 0 (#531 ) Change the minimum framesize value to match what the code can support, while at it, refactor some of the conditionals used so that extracting the framesize is more reliable (as the assert is polymorphic) and update other seemingly unrelated bits	2024-10-21 15:05:07 +01:00
Carlo Marcelo Arenas Belón	1e09555d69	perltest: add support for hex modifier (#529 ) * pcre2test: tighten \N{U+hh...} support When \N{U+hh...} was added it was meant to support all unicode characters that can be encoded by pcre2test and Perl, but its use outside what is officially considered valid can be confusing so print a warning for those cases. * perltest: add support for hex modifier The use of \xhh can be ambiguous when used together with the utf modifier, so allow for describing code points individually in the pattern using hex, with the same syntax that is already supported by pcre2test.	2024-10-17 16:42:31 +01:00
Carlo Marcelo Arenas Belón	03be4d2d7f	pcre2test: add support for \N{U+hh...} escapes in subject (#528 ) When providing escaped values in the subject, the syntax can be ambiguous, so add support for a new escape that is always meant to refer to a Unicode character and that is already supported by the library in utf mode. While at it, refactor the code to support octal escapes and fix bugs with overlong numbers, as well as to simplify the logic that decides if an escape is encoded as a code unit or as a Unicode character, that could require multiple code units.	2024-10-16 15:23:57 +01:00


576 Commits
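
Commit 500c68b986 describes testing malloc() failures by counting how many allocations an operation makes, then repeating it with a failure injected on each successive allocation. The following is a minimal, self-contained sketch of that general technique, not the pcre2test implementation. All names (`test_malloc`, `fail_at`, `operation_under_test`, `exercise_all_malloc_failures`) are hypothetical, and `operation_under_test` is a stand-in for "compile and match".

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical fault-injection state; these names are illustrative only. */
static int malloc_calls; /* allocations attempted during the current run */
static int fail_at;      /* 1-based index of the call to fail; 0 = never fail */

/* Allocator wrapper that fails exactly one chosen call per run. */
static void *test_malloc(size_t size)
{
    malloc_calls++;
    if (fail_at != 0 && malloc_calls == fail_at) return NULL;
    return malloc(size);
}

/* Stand-in for the code under test: makes three allocations and must
   release everything on failure. Returns 0 on success, -1 on out of memory. */
static int operation_under_test(void)
{
    char *a = NULL, *b = NULL, *c = NULL;
    int rc = -1;

    if ((a = test_malloc(16)) == NULL) goto done;
    if ((b = test_malloc(32)) == NULL) goto done;
    if ((c = test_malloc(64)) == NULL) goto done;
    memset(a, 0, 16);
    memset(b, 0, 32);
    memset(c, 0, 64);
    rc = 0;

done:
    free(c); /* free(NULL) is a no-op, so partial failure is safe */
    free(b);
    free(a);
    return rc;
}

/* Pass 1 counts the allocations in a clean run. Pass 2 repeats the operation
   once per allocation, failing the i-th call each time. Returns the number of
   injected failures that were reported cleanly, or -1 on unexpected behaviour. */
static int exercise_all_malloc_failures(void)
{
    int total, i, handled = 0;

    fail_at = 0;
    malloc_calls = 0;
    if (operation_under_test() != 0) return -1;
    total = malloc_calls;
    assert(total > 0);

    for (i = 1; i <= total; i++) {
        fail_at = i;
        malloc_calls = 0;
        if (operation_under_test() != -1) return -1;
        handled++;
    }

    fail_at = 0; /* leave the allocator in its non-failing state */
    return handled;
}
```

Calling `exercise_all_malloc_failures()` from a test driver runs the operation once cleanly and then once per allocation. Running such a driver under a leak checker would additionally show whether each failure path frees what it allocated.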