pcre2

mirror of https://github.com/PCRE2Project/pcre2.git synced 2025-10-19 19:44:08 +08:00

Author	SHA1	Message	Date
Isaac Oscar Gariano	0288afe12c	Print `pcre2test` output in colour. (#811 ) * Add a --colour flag to pcre2test to colourise the output. * Comments from the inputfile are in grey (but not those entered in interactively) * All other input is in green * Messages related to PCRE2 api errors are in magenta * Messages related to errors with using pcre2test itself are in red * Timing and memory usage information is in blue * Normal output is in your terminal's default foreground colour --------- Co-authored-by: Nicholas Wilson <nicholas@nicholaswilson.me.uk>	2025-10-12 09:52:34 +01:00
Nicholas Wilson	ed69a3a70b	Add pcre2_substitute checks to enforce pattern, subject, offset and options haven't changed (#807 ) * Check for pattern/subject/offset/option changes when using PCRE2_SUBSTITUTE_MATCHED. * Return PCRE2_ERROR_DFA_UFUNC if using PCRE2_SUBSTITUTE_MATCHED after a call to pcre2_dfa_match(). * Add new error codes to pcre2_substitute when using PCRE2_SUBSTITUTE_MATCHED. * Change the behaviour of the matching methods so that the match_data fields are populated on all matches with "(rc >= 0 || rc==NO_MATCH || rc==PARTIAL)". We previously ensured that every call to a match method guarantees to set the rc field on the match_data. * Add modifiers to pcre2test to better exercise these pcre2_substitute conditions --------- Co-authored-by: Isaac Oscar Gariano <IsaacOscar@live.com.au>	2025-10-03 09:53:00 +01:00
Nicholas Wilson	07e1dea8fa	Chip away at more coverage gaps (#799 )	2025-09-26 07:23:58 +01:00
Nicholas Wilson	cbfe089624	Automatic update of doc files #noupdate	2025-08-28 10:39:03 +00:00
Nicholas Wilson	eb3bd3cf14	New pcre2_next_match() API to simplify pcre2demo, test, and substitute (#733 ) * The primary purpose of pcre2_next_match() is to make it much easier for PCRE2 clients to iterate over matches, without needing an advanced knowledge of regular expressions. * Secondly, we can simplify our own code by merging the three duplicate implementations of the /g global match behaviour: pcre2demo, pcre2_substitute, and pcre2test. * Thirdly, as I look closely at the issue, I can improve the documentation. * Fourthly, I would like to actually simplify the logic, removing a complex loop which makes several match attempts, swallows duplicate matches, and more. We can have identical behaviour with a simple retry using PCRE2_NOTEMPTY_ATSTART.	2025-03-24 13:29:52 +00:00
Nicholas Wilson	500c68b986	Add testing for malloc() failures (#697 ) An additional testing argument, `-malloc` is added to pcre2test and to RunTest. The ManyConfig tests run this now in CI. We exercise each malloc failure in the core code by counting how many mallocs are done, then repeating compilation and matching with a failure on each successive malloc.	2025-02-23 09:51:32 +00:00
Nicholas Wilson	fb3b380abb	Another batch of very small typos & issues (#707 )	2025-02-22 12:31:53 +00:00
Joshua Rogers	fc04890d63	Fix documentation pcre2test.1 (#701 ) The error -47 corresponds to PCRE2_ERROR_MATCHLIMIT not PCRE2_ERROR_NOMEMORY.	2025-02-18 18:07:49 +00:00
Nicholas Wilson	0d0ac3aa0f	Update EBCDIC support to support testing on normal ASCII systems (#656 ) The pcre2test utility needs quite a few changes to accommodate this. It is simpler to add a new mode to it, than to make it fully EBCDIC-native. On an ASCII system, pcre2test performs ASCII I/O, but translates the input when passing it to the fully-EBCDIC-supporting library.	2025-02-12 22:31:00 +00:00
Nicholas Wilson	e02e52804c	Update release number to 10.46-DEV #noupdate (#667 )	2025-01-12 15:28:37 +00:00
Nicholas Wilson	64613feb6d	Update modified dates #noupdate	2024-12-26 23:50:55 +00:00
Nicholas Wilson	23b4df750b	Completely redo the substitute-case-callout work (#638 ) Fixes #564 The previous API was not extensible to handle multi-character case rules. It required a fair bit of reworking in order to accommodate this. I had to delay the casing transformations to be done later, by buffering up the string to transform, and then allowing the callback to do an in-place transformation on the entire input to be transformed.	2024-12-26 23:46:21 +00:00
Nicholas Wilson	f15bdd334d	Update all man page dates #noupdate (#634 )	2024-12-18 14:12:58 +00:00
Nicholas Wilson	e8a5cd749e	Add folding and simplification for OP_ECLASS (#586 ) Fixes #537	2024-12-04 15:03:23 +00:00
Philip Hazel	55fda7f384	Update EBCDIC documentation; in pcre2pattern move it all into a separate section.	2024-11-27 17:28:11 +00:00
Nicholas Wilson	98fd117282	Hunt for references to "PCRE" (#584 )	2024-11-27 09:21:50 +00:00
Nicholas Wilson	fc38d9e784	Implement ALT_EXTENDED_CLASS flag (#523 ) * Move some existing character class code into pcre2_compile_class.c * Add a new flag PCRE2_ALT_EXTENDED_CLASS to change the behaviour of parsing [...] character classes, to emit new META codes, and new OP_ECLASS codes for nested character classes with operators * Document the behaviour relative to the UTS#18 standard * No JIT support; it falls back to the interpreter. DFA is supported.	2024-10-30 11:33:29 +01:00
Carlo Marcelo Arenas Belón	1e09555d69	perltest: add support for hex modifier (#529 ) * pcre2test: tighten \N{U+hh...} support When \N{U+hh...} was added it was meant to support all unicode characters that can be encoded by pcre2test and Perl, but its use outside what is officially considered valid can be confusing so print a warning for those cases. * perltest: add support for hex modifier The use of \xhh can be ambiguous when used together with the utf modifier, so allow for describing code points individually in the pattern using hex, with the same syntax that is already supported by pcre2test.	2024-10-17 16:42:31 +01:00
Carlo Marcelo Arenas Belón	03be4d2d7f	pcre2test: add support for \N{U+hh...} escapes in subject (#528 ) When providing escaped values in the subject, the syntax can be ambiguous, so add support for a new escape that is always meant to refer to a Unicode character and that is already supported by the library in utf mode. While at it, refactor the code to support octal escapes and fix bugs with overlong numbers, as well as to simplify the logic that decides if an escape is encoded as a code unit or as a Unicode character, that could require multiple code units.	2024-10-16 15:23:57 +01:00
Nicholas Wilson	b72cc97186	Add support for Turkish I casefolding (#521 ) New flag: PCRE2_EXTRA_TURKISH_CASING, and pre-pattern flag (TURKISH_CASING). Also added a pre-pattern flag (CASELESS_RESTRICT) for this existing flag.	2024-10-14 17:00:06 +01:00
Philip Hazel	60fd745ebc	Minor documentation updates	2024-10-04 17:21:33 +01:00
Nicholas Wilson	9503e68b7c	Add substitute case callout function (#512 ) * Add substitute case callout function * Fix foolish misunderstanding * Fix trivial build error * Fix non-Unicode tests	2024-10-04 16:57:58 +01:00
Carlo Marcelo Arenas Belón	c0d86f7d21	pcre2test: tighten \x{...} parsing in subject (#504 ) Even though it is documented that invalid escapes will be reported, the code would fall back in that case and result in a NUL being generated whenever an incomplete \x{ escape was being parsed. Refactor the code to report the error instead and fix the logic used for overlong numbers so that the truncation doesn't result in an unexpected value being used. There was an old (from PCRE 4.0) test that was affected but which is no longer relevant, because it could only be triggered with invalid UTF (which isn't supported), and that was therefore removed as a result. Additionally, it was found that the same syntax error was affecting perltest so correct that as well by reporting syntax errors in the subject lines. While at it update related documentation for Perl's compatibility.	2024-10-02 12:13:37 +01:00
Nicholas Wilson	32f03ad588	Add option to disable callouts (#499 ) * Add option to disable callouts * Fix pcre2grep issue, and docs * Add pcre2test docs	2024-10-02 12:00:02 +01:00
Alex Dowad	b868b411e2	Add new API function pcre2_set_optimization() for controlling enabled optimizations (#471 ) It is anticipated that over time, more and more optimizations will be added to PCRE2, and we want to be able to switch optimizations off/on, both for testing purposes and to be able to work around bugs in a released library version. The number of free bits left in the compile options word is very small. Hence, we will start putting all optimization enable/disable flags in a separate word. To switch these off/on, the new API function pcre2_set_optimization() will be used. The values which can be passed to pcre2_set_optimization() are different from the internal flag bit values. The values accepted by pcre2_set_optimization() are contiguous integers, so there is no danger of ever running out of them. This means in the future, the internal representation can be changed at any time without breaking backwards compatibility. Further, the 'directives' passed to pcre2_set_optimization() are not restricted to control a single, specific optimization. As an example, passing PCRE2_OPTIMIZATION_FULL will turn on all optimizations supported by whatever version of PCRE2 the client program happens to be linked with. Co-authored-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Co-authored-by: Zoltan Herczeg <hzmester@freemail.hu>	2024-09-21 14:14:32 +01:00
Philip Hazel	f964982eec	Add documentation for PCRE2_EXTRA_BS0 and PCRE2_EXTRA_PYTHON_OCTAL	2024-09-21 10:17:10 +01:00
Alex Dowad	64137d23e9	Add missing 'expand' modifier to list in pcre2test manpage (#458 )	2024-09-04 11:48:53 +01:00
Philip Hazel	4249b67c7f	Document JIT allocation test feature and add to pcre2test	2024-07-24 14:53:21 +01:00
Philip Hazel	05aafb2e30	Implement pcre2_set_max_pattern_compiled_length() and set this limit in the fuzzer	2024-04-24 09:32:25 +01:00
Philip Hazel	7d59ddebb1	Implement PCRE2_DISABLE_RECURSELOOP_CHECK	2024-01-27 15:54:07 +00:00
Philip Hazel	d71e89b6ea	Check documentation for double-word typos	2024-01-19 16:48:53 +00:00
Philip Hazel	80053ba153	Documentation and tests update	2023-09-20 13:26:10 +01:00
Carlo Marcelo Arenas Belón	14e0c41be1	admin: update ChangeLog and config.h for recent changes (#286 )	2023-08-16 16:56:34 +01:00
Philip Hazel	5974a84364	Update documentation for variable-length lookbehinds	2023-08-11 18:38:20 +01:00
Philip Hazel	7cc9d63fd9	Update pcre2test documentation	2023-07-17 17:41:26 +01:00
Carlo Marcelo Arenas Belón	29d65e0cd3	pcre2test: print library bitwidth in banner for usability (#227 ) While at it update related documentation and missed changes in Changelog	2023-04-15 15:33:57 +01:00
Carlo Marcelo Arenas Belón	64549346f0	avoid inconsistency between \d and [:digit:] when using /a (#223 ) Since `a608946` (Additional PCRE2_EXTRA_ASCII_xxx code, 2023-02-01) PCRE2_EXTRA_ASCII_BSD could be used to restrict \d to ASCII causing the following inconsistent behaviour in UCP mode. PCRE2 version 10.43-DEV 2023-01-15 re> /\d/utf,ucp,ascii_bsd data> ٣ No match data> re> /[[:digit:]]/utf,ucp,ascii_bsd data> ٣ 0: \x{663} It has been suggested[1] that the change to match \p{Nd} when Unicode is enabled for [:digit:] might have been unintentional and a bug, as [:digit:] should be able to be POSIX compatible, so add a new flag PCRE2_EXTRA_ASCII_DIGIT to avoid changing its definition in UCP mode. [1] https://lore.kernel.org/git/CANgJU+U+xXsh9psd0z5Xjr+Se5QgdKkjQ7LUQ-PdUULSN3n4+g@mail.gmail.com/	2023-04-09 12:29:46 +01:00
Philip Hazel	6bf8045997	Documentation update for new PCRE2_EXTRA caseless and ASCII options	2023-02-04 17:19:56 +00:00
Philip Hazel	d73a949ec1	Refactor heapframe_size code in pcre2test and update documentation for heap frame information	2023-01-18 17:57:07 +00:00
Carlo Marcelo Arenas Belón	c80c6338ad	add pcre2_get_match_data_heapframes_size() (#191 ) Since PCRE2 10.41, the match data contains a pointer to a vector of frames allocated in the heap and that are used by pcre2_match() when doing non JIT matches. There is though, no outside visibility on the size of it, and therefore the memory it uses is locked away until match_data itself is freed. Add an API that allows getting that value, so an application could decide based on its own experienced memory pressure to keep reusing that match_data or not. While at it, update the documentation of other related functions for clarity.	2023-01-17 15:26:27 +00:00
Edward Betts	db53e4007d	Correct spelling mistakes (#143 )	2022-08-19 08:56:03 +01:00
Philip Hazel	d90fb23878	Refactor match_data() to always use the heap instead of having an initial frames vector on the stack; some consequential adjustments needed.	2022-07-27 17:44:55 +01:00
Philip Hazel	3103b8f20a	Final file tidies for 10.40	2022-04-15 16:57:57 +01:00
Philip Hazel	bf35c0518c	Add -LP and -LS (list properties, list scripts) features to pcre2test.	2022-01-12 15:01:14 +00:00
Philip Hazel	cb854a912e	Add options for NULL pointers to pcre2test.	2021-11-28 16:22:24 +00:00
Carlo Marcelo Arenas Belón	587b94277b	doc: formatting/typo fixes to documentation (#47 ) * doc: fix incorrect use of JOIN and typo Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> * doc: reformat of pcre2_substitute to align options includes some rewording to fit better in an 80 char wide troff output. Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> * doc: update names to pcre2	2021-11-27 16:27:49 +00:00
Philip Hazel	8f3e11a355	Doc file tidies for 10.38-RC1	2021-08-31 17:14:42 +01:00
Philip Hazel	21c26698b3	Lock out \K in lookaround assertions by default, but provide an option to re-enable the old behaviour, just in case.	2021-08-30 16:57:44 +01:00
Philip Hazel	5ff1daffa0	Clarify delimiter handling in pcre2test documentation.	2021-08-28 12:46:50 +01:00
Philip.Hazel	cd45050ee4	Final file tidies for 10.37-RC1	2021-04-28 16:44:51 +00:00


173 Commits