Mirror of https://github.com/PCRE2Project/pcre2.git (synced 2025-10-22 07:31:15 +08:00)
Documentation for added interpretation in replacement strings (PR #483)
@@ -93,6 +93,8 @@ Perl.

18. Merged PR473, which implements Python-style backrefs in substitutions.

19. Merged PR483, which adds \g<n> and $<name> to replacement strings.


Version 10.44 07-June-2024
--------------------------
@@ -3680,14 +3680,18 @@ character (backslash is treated as literal). The following forms are always
recognized:
<pre>
$$ insert a dollar character
$<n> or ${<n>} insert the contents of group <n>
$n or ${n} insert the contents of group <i>n</i>
$*MARK or ${*MARK} insert a control verb name
</pre>
Either a group number or a group name can be given for <n>. Curly brackets are
required only if the following character would be interpreted as part of the
number or name. The number may be zero to include the entire matched string.
For example, if the pattern a(b)c is matched with "=abc=" and the replacement
string "+$1$0$1+", the result is "=+babcb+=".
Either a group number or a group name can be given for <i>n</i>, for example $2 or
$NAME. Curly brackets are required only if the following character would be
interpreted as part of the number or name. The number may be zero to include
the entire matched string. For example, if the pattern a(b)c is matched with
"=abc=" and the replacement string "+$1$0$1+", the result is "=+babcb+=".
</P>
<P>
The JavaScript form $<name>, where the angle brackets are part of the syntax,
is also recognized for group names, but not for group numbers or *MARK.
</P>
<P>
$*MARK inserts the name from the last encountered backtracking control verb on
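As a reading aid for the hunk above, the basic dollar forms can be modelled in a few lines of Python. This is only a sketch built on Python's `re` module, not PCRE2's implementation: the helper names `expand` and `substitute` are hypothetical, `$*MARK` is not modelled, and an unset group is simply treated as empty, which is a simplifying assumption.

```python
import re

# One token of the replacement string: $$, ${n}, $<name>, or $n.
_TOKEN = re.compile(
    r"\$(?:(?P<dollar>\$)"
    r"|\{(?P<braced>\w+)\}"
    r"|<(?P<angle>[A-Za-z_]\w*)>"
    r"|(?P<bare>\d+|[A-Za-z_]\w*))"
)


def expand(replacement, match):
    """Expand the dollar forms in `replacement` against a Python match object."""
    def one(tok):
        if tok.group("dollar"):
            return "$"
        ref = tok.group("braced") or tok.group("angle") or tok.group("bare")
        key = int(ref) if ref.isdigit() else ref
        # Assumption of this sketch: an unset group expands to an empty string.
        return match.group(key) or ""
    return _TOKEN.sub(one, replacement)


def substitute(pattern, replacement, subject):
    """Replace only the first match (no PCRE2_SUBSTITUTE_GLOBAL equivalent)."""
    return re.sub(pattern, lambda m: expand(replacement, m), subject, count=1)


print(substitute(r"a(b)c", "+$1$0$1+", "=abc="))                   # =+babcb+=
print(substitute(r"(?P<word>\w+)", "${word}s cost $$5", "apple"))  # apples cost $5
```

The first call reproduces the example quoted in the documentation text; the second shows why curly brackets are needed when a letter follows the group name.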
@@ -3757,6 +3761,11 @@ in a pattern, which in Perl has some ambiguities. Details are given in the
page.
</P>
<P>
The Python form \g<n>, where the angle brackets are part of the syntax and <i>n</i>
is either a group name or number, is recognized as an alternative way of
inserting the contents of a group, for example \g<3>.
</P>
<P>
There are also four escape sequences for forcing the case of inserted letters.
Case forcing applies to all inserted characters, including those from capture
groups and letters within \Q...\E quoted sequences. The insertion mechanism
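The \g<n> form described in this hunk is the one used by Python's own `re` module, so the behaviour it is modelled on can be seen in plain Python. The sketch below exercises Python rather than PCRE2, and the pattern and subject strings are made up for illustration.

```python
import re

subject = "2024-09-20"
pattern = r"(?P<year>\d{4})-(\d{2})-(\d{2})"

# \g<3> inserts a group by number; \g<year> inserts a group by name.
by_number = re.sub(pattern, r"\g<3>/\g<2>/\g<1>", subject)
by_name = re.sub(pattern, r"day \g<3> of \g<year>", subject)

print(by_number)  # 20/09/2024
print(by_name)    # day 20 of 2024
```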
@@ -3794,16 +3803,16 @@ The second effect of setting PCRE2_SUBSTITUTE_EXTENDED is to add more
flexibility to capture group substitution. The syntax is similar to that used
by Bash:
<pre>
${<n>:-<string>}
${<n>:+<string1>:<string2>}
${n:-string}
${n:+string1:string2}
</pre>
As before, <n> may be a group number or a name. The first form specifies a
default value. If group <n> is set, its value is inserted; if not, <string> is
As before, <i>n</i> may be a group number or a name. The first form specifies a
default value. If group <i>n</i> is set, its value is inserted; if not, the string is
expanded and the result inserted. The second form specifies strings that are
expanded and inserted when group <n> is set or unset, respectively. The first
expanded and inserted when group <i>n</i> is set or unset, respectively. The first
form is just a convenient shorthand for
<pre>
${<n>:+${<n>}:<string>}
${n:+${n}:string}
</pre>
Backslash can be used to escape colons and closing curly brackets in the
replacement strings. A change of the case forcing state within a replacement
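The set/unset logic of the two Bash-like forms in this hunk can be illustrated with a small Python model. It is a sketch of the documented semantics only, not PCRE2 code: the function name `conditional` and its parameters are hypothetical, and the strings passed in are assumed to be already expanded.

```python
import re


def conditional(match, group, if_set=None, if_unset=""):
    """Model ${n:+if_set:if_unset}; with if_set=None it models ${n:-if_unset}."""
    value = match.group(group)
    if value is None:              # the group did not take part in the match
        return if_unset
    return value if if_set is None else if_set


with_b = re.search(r"a(b)?c", "abc")     # group 1 is set to "b"
without_b = re.search(r"a(b)?c", "ac")   # group 1 is unset

print(conditional(with_b, 1, if_unset="none"))      # b     like ${1:-none}
print(conditional(without_b, 1, if_unset="none"))   # none
print(conditional(with_b, 1, "yes", "no"))          # yes   like ${1:+yes:no}
print(conditional(without_b, 1, "yes", "no"))       # no
```

Passing the group's own value as `if_set` gives the same result as the default-value form, which is the shorthand equivalence stated in the text.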
@@ -4205,7 +4214,7 @@ Cambridge, England.
</P>
<br><a name="SEC43" href="#TOC1">REVISION</a><br>
<P>
Last updated: 17 September 2024
Last updated: 20 September 2024
<br>
Copyright © 1997-2024 University of Cambridge.
<br>
@@ -43,16 +43,21 @@ please consult the man page, in case the conversion went wrong.
<li><a name="TOC28" href="#SEC28">CONDITIONAL PATTERNS</a>
<li><a name="TOC29" href="#SEC29">BACKTRACKING CONTROL</a>
<li><a name="TOC30" href="#SEC30">CALLOUTS</a>
<li><a name="TOC31" href="#SEC31">SEE ALSO</a>
<li><a name="TOC32" href="#SEC32">AUTHOR</a>
<li><a name="TOC33" href="#SEC33">REVISION</a>
<li><a name="TOC31" href="#SEC31">REPLACEMENT STRINGS</a>
<li><a name="TOC32" href="#SEC32">SEE ALSO</a>
<li><a name="TOC33" href="#SEC33">AUTHOR</a>
<li><a name="TOC34" href="#SEC34">REVISION</a>
</ul>
<br><a name="SEC1" href="#TOC1">PCRE2 REGULAR EXPRESSION SYNTAX SUMMARY</a><br>
<P>
The full syntax and semantics of the regular expressions that are supported by
PCRE2 are described in the
The full syntax and semantics of the regular expression patterns that are
supported by PCRE2 are described in the
<a href="pcre2pattern.html"><b>pcre2pattern</b></a>
documentation. This document contains a quick-reference summary of the syntax.
documentation. This document contains a quick-reference summary of the pattern
syntax followed by the syntax of replacement strings in the substitution function.
The full description of the latter is in the
<a href="pcre2api.html"><b>pcre2api</b></a>
documentation.
</P>
<br><a name="SEC2" href="#TOC1">QUOTING</a><br>
<P>
@@ -634,12 +639,46 @@ The allowed string delimiters are ` ' " ^ % # $ (which are the same for the
start and the end), and the starting delimiter { matched with the ending
delimiter }. To encode the ending delimiter within the string, double it.
</P>
<br><a name="SEC31" href="#TOC1">SEE ALSO</a><br>
<br><a name="SEC31" href="#TOC1">REPLACEMENT STRINGS</a><br>
<P>
If the PCRE2_SUBSTITUTE_LITERAL option is set, a replacement string for
<b>pcre2_substitute()</b> is not interpreted. Otherwise, by default, the only
special character is the dollar character in one of the following forms:
<pre>
$$ insert a dollar character
$n or ${n} insert the contents of group <i>n</i> (name or number)
$<name> insert the contents of named group
$*MARK or ${*MARK} insert a control verb name
</pre>
If PCRE2_SUBSTITUTE_EXTENDED is set, there is additional interpretation:
</P>
<P>
1. Backslash is an escape character, and the forms described in "ESCAPED
CHARACTERS" above are recognized. Also:
<pre>
\Q...\E can be used to suppress interpretation
\l force the next character to lower case
\u force the next character to upper case
\L force subsequent characters to lower case
\U force subsequent characters to upper case
\u\L force next character to upper case, then all lower
\l\U force next character to lower case, then all upper
\E end \L or \U case forcing
</pre>
2. Capture substitution supports the following additional forms:
<pre>
${n:-string} default for unset group
${n:+string1:string2} values for set/unset group
</pre>
The substitution strings themselves are expanded. Backslash can be used to
escape colons and closing curly brackets.
</P>
<br><a name="SEC32" href="#TOC1">SEE ALSO</a><br>
<P>
<b>pcre2pattern</b>(3), <b>pcre2api</b>(3), <b>pcre2callout</b>(3),
<b>pcre2matching</b>(3), <b>pcre2</b>(3).
</P>
<br><a name="SEC32" href="#TOC1">AUTHOR</a><br>
<br><a name="SEC33" href="#TOC1">AUTHOR</a><br>
<P>
Philip Hazel
<br>
@@ -648,9 +687,9 @@ Retired from University Computing Service
Cambridge, England.
<br>
</P>
<br><a name="SEC33" href="#TOC1">REVISION</a><br>
<br><a name="SEC34" href="#TOC1">REVISION</a><br>
<P>
Last updated: 17 September 2024
Last updated: 20 September 2024
<br>
Copyright © 1997-2024 University of Cambridge.
<br>
@@ -3550,15 +3550,19 @@ CREATING A NEW STRING WITH SUBSTITUTIONS
eral). The following forms are always recognized:

$$ insert a dollar character
$<n> or ${<n>} insert the contents of group <n>
$n or ${n} insert the contents of group n
$*MARK or ${*MARK} insert a control verb name

Either a group number or a group name can be given for <n>. Curly
brackets are required only if the following character would be inter-
preted as part of the number or name. The number may be zero to include
the entire matched string. For example, if the pattern a(b)c is
matched with "=abc=" and the replacement string "+$1$0$1+", the result
is "=+babcb+=".
Either a group number or a group name can be given for n, for example
$2 or $NAME. Curly brackets are required only if the following charac-
ter would be interpreted as part of the number or name. The number may
be zero to include the entire matched string. For example, if the pat-
tern a(b)c is matched with "=abc=" and the replacement string
"+$1$0$1+", the result is "=+babcb+=".

The JavaScript form $<name>, where the angle brackets are part of the
syntax, is also recognized for group names, but not for group numbers
or *MARK.

$*MARK inserts the name from the last encountered backtracking control
verb on the matching path that has a name. (*MARK) must always include
@@ -3622,6 +3626,10 @@ CREATING A NEW STRING WITH SUBSTITUTIONS
same as in a pattern, which in Perl has some ambiguities. Details are
given in the pcre2pattern page.

The Python form \g<n>, where the angle brackets are part of the syntax
and n is either a group name or number, is recognized as an alternative
way of inserting the contents of a group, for example \g<3>.

There are also four escape sequences for forcing the case of inserted
letters. Case forcing applies to all inserted characters, including
those from capture groups and letters within \Q...\E quoted sequences.
@@ -3657,17 +3665,16 @@ CREATING A NEW STRING WITH SUBSTITUTIONS
flexibility to capture group substitution. The syntax is similar to
that used by Bash:

${<n>:-<string>}
${<n>:+<string1>:<string2>}
${n:-string}
${n:+string1:string2}

As before, <n> may be a group number or a name. The first form speci-
fies a default value. If group <n> is set, its value is inserted; if
not, <string> is expanded and the result inserted. The second form
specifies strings that are expanded and inserted when group <n> is set
or unset, respectively. The first form is just a convenient shorthand
for
As before, n may be a group number or a name. The first form specifies
a default value. If group n is set, its value is inserted; if not, the
string is expanded and the result inserted. The second form specifies
strings that are expanded and inserted when group n is set or unset,
respectively. The first form is just a convenient shorthand for

${<n>:+${<n>}:<string>}
${n:+${n}:string}

Backslash can be used to escape colons and closing curly brackets in
the replacement strings. A change of the case forcing state within a
@@ -4035,11 +4042,11 @@ AUTHOR

REVISION

Last updated: 17 September 2024
Last updated: 20 September 2024
Copyright (c) 1997-2024 University of Cambridge.


PCRE2 10.45 17 September 2024 PCRE2API(3)
PCRE2 10.45 20 September 2024 PCRE2API(3)
------------------------------------------------------------------------------


@@ -11069,9 +11076,11 @@ NAME

PCRE2 REGULAR EXPRESSION SYNTAX SUMMARY

The full syntax and semantics of the regular expressions that are sup-
ported by PCRE2 are described in the pcre2pattern documentation. This
document contains a quick-reference summary of the syntax.
The full syntax and semantics of the regular expression patterns that
are supported by PCRE2 are described in the pcre2pattern documentation.
This document contains a quick-reference summary of the pattern syntax
followed by the syntax of replacement strings in the substitution function.
The full description of the latter is in the pcre2api documentation.


QUOTING
@@ -11645,6 +11654,42 @@ CALLOUTS
double it.


REPLACEMENT STRINGS

If the PCRE2_SUBSTITUTE_LITERAL option is set, a replacement string for
pcre2_substitute() is not interpreted. Otherwise, by default, the only
special character is the dollar character in one of the following
forms:

$$ insert a dollar character
$n or ${n} insert the contents of group n (name or number)
$<name> insert the contents of named group
$*MARK or ${*MARK} insert a control verb name

If PCRE2_SUBSTITUTE_EXTENDED is set, there is additional interpreta-
tion:

1. Backslash is an escape character, and the forms described in "ES-
CAPED CHARACTERS" above are recognized. Also:

\Q...\E can be used to suppress interpretation
\l force the next character to lower case
\u force the next character to upper case
\L force subsequent characters to lower case
\U force subsequent characters to upper case
\u\L force next character to upper case, then all lower
\l\U force next character to lower case, then all upper
\E end \L or \U case forcing

2. Capture substitution supports the following additional forms:

${n:-string} default for unset group
${n:+string1:string2} values for set/unset group

The substitution strings themselves are expanded. Backslash can be used
to escape colons and closing curly brackets.


SEE ALSO

pcre2pattern(3), pcre2api(3), pcre2callout(3), pcre2matching(3),
@@ -11660,11 +11705,11 @@ AUTHOR

REVISION

Last updated: 17 September 2024
Last updated: 20 September 2024
Copyright (c) 1997-2024 University of Cambridge.


PCRE2 10.45 17 September 2024 PCRE2SYNTAX(3)
PCRE2 10.45 20 September 2024 PCRE2SYNTAX(3)
------------------------------------------------------------------------------


@@ -1,4 +1,4 @@
.TH PCRE2API 3 "17 September 2024" "PCRE2 10.45"
.TH PCRE2API 3 "20 September 2024" "PCRE2 10.45"
.SH NAME
PCRE2 - Perl-compatible regular expressions (revised API)
.sp
@@ -3684,14 +3684,17 @@ character (backslash is treated as literal). The following forms are always
recognized:
.sp
$$ insert a dollar character
$<n> or ${<n>} insert the contents of group <n>
$n or ${n} insert the contents of group \fIn\fP
$*MARK or ${*MARK} insert a control verb name
.sp
Either a group number or a group name can be given for <n>. Curly brackets are
required only if the following character would be interpreted as part of the
number or name. The number may be zero to include the entire matched string.
For example, if the pattern a(b)c is matched with "=abc=" and the replacement
string "+$1$0$1+", the result is "=+babcb+=".
Either a group number or a group name can be given for \fIn\fP, for example $2 or
$NAME. Curly brackets are required only if the following character would be
interpreted as part of the number or name. The number may be zero to include
the entire matched string. For example, if the pattern a(b)c is matched with
"=abc=" and the replacement string "+$1$0$1+", the result is "=+babcb+=".
.P
The JavaScript form $<name>, where the angle brackets are part of the syntax,
is also recognized for group names, but not for group numbers or *MARK.
.P
$*MARK inserts the name from the last encountered backtracking control verb on
the matching path that has a name. (*MARK) must always include a name, but the
@@ -3755,6 +3758,10 @@ in a pattern, which in Perl has some ambiguities. Details are given in the
.\"
page.
.P
The Python form \eg<n>, where the angle brackets are part of the syntax and \fIn\fP
is either a group name or number, is recognized as an alternative way of
inserting the contents of a group, for example \eg<3>.
.P
There are also four escape sequences for forcing the case of inserted letters.
Case forcing applies to all inserted characters, including those from capture
groups and letters within \eQ...\eE quoted sequences. The insertion mechanism
@@ -3788,16 +3795,16 @@ The second effect of setting PCRE2_SUBSTITUTE_EXTENDED is to add more
flexibility to capture group substitution. The syntax is similar to that used
by Bash:
.sp
${<n>:-<string>}
${<n>:+<string1>:<string2>}
${n:-string}
${n:+string1:string2}
.sp
As before, <n> may be a group number or a name. The first form specifies a
default value. If group <n> is set, its value is inserted; if not, <string> is
As before, \fIn\fP may be a group number or a name. The first form specifies a
default value. If group \fIn\fP is set, its value is inserted; if not, the string is
expanded and the result inserted. The second form specifies strings that are
expanded and inserted when group <n> is set or unset, respectively. The first
expanded and inserted when group \fIn\fP is set or unset, respectively. The first
form is just a convenient shorthand for
.sp
${<n>:+${<n>}:<string>}
${n:+${n}:string}
.sp
Backslash can be used to escape colons and closing curly brackets in the
replacement strings. A change of the case forcing state within a replacement
@@ -4208,6 +4215,6 @@ Cambridge, England.
.rs
.sp
.nf
Last updated: 17 September 2024
Last updated: 20 September 2024
Copyright (c) 1997-2024 University of Cambridge.
.fi
@@ -1,4 +1,4 @@
.TH PCRE2DEMO 3 "17 September 2024" "PCRE2 10.44"
.TH PCRE2DEMO 3 "20 September 2024" "PCRE2 10.44"
.\"AUTOMATICALLY GENERATED BY PrepareRelease - do not EDIT!
.SH NAME
PCRE2DEMO - A demonstration C program for PCRE2
@@ -1,16 +1,21 @@
.TH PCRE2SYNTAX 3 "17 September 2024" "PCRE2 10.45"
.TH PCRE2SYNTAX 3 "20 September 2024" "PCRE2 10.45"
.SH NAME
PCRE2 - Perl-compatible regular expressions (revised API)
.SH "PCRE2 REGULAR EXPRESSION SYNTAX SUMMARY"
.rs
.sp
The full syntax and semantics of the regular expressions that are supported by
PCRE2 are described in the
The full syntax and semantics of the regular expression patterns that are
supported by PCRE2 are described in the
.\" HREF
\fBpcre2pattern\fP
.\"
documentation. This document contains a quick-reference summary of the syntax.
.
documentation. This document contains a quick-reference summary of the pattern
syntax followed by the syntax of replacement strings in the substitution function.
The full description of the latter is in the
.\" HREF
\fBpcre2api\fP
.\"
documentation.
.
.SH "QUOTING"
.rs
@@ -618,6 +623,41 @@ start and the end), and the starting delimiter { matched with the ending
delimiter }. To encode the ending delimiter within the string, double it.
.
.
.SH "REPLACEMENT STRINGS"
.rs
.sp
If the PCRE2_SUBSTITUTE_LITERAL option is set, a replacement string for
\fBpcre2_substitute()\fP is not interpreted. Otherwise, by default, the only
special character is the dollar character in one of the following forms:
.sp
$$ insert a dollar character
$n or ${n} insert the contents of group \fIn\fP (name or number)
$<name> insert the contents of named group
$*MARK or ${*MARK} insert a control verb name
.sp
If PCRE2_SUBSTITUTE_EXTENDED is set, there is additional interpretation:
.P
1. Backslash is an escape character, and the forms described in "ESCAPED
CHARACTERS" above are recognized. Also:
.sp
\eQ...\eE can be used to suppress interpretation
\el force the next character to lower case
\eu force the next character to upper case
\eL force subsequent characters to lower case
\eU force subsequent characters to upper case
\eu\eL force next character to upper case, then all lower
\el\eU force next character to lower case, then all upper
\eE end \eL or \eU case forcing
.sp
2. Capture substitution supports the following additional forms:
.sp
${n:-string} default for unset group
${n:+string1:string2} values for set/unset group
.sp
The substitution strings themselves are expanded. Backslash can be used to
escape colons and closing curly brackets.
.
.
.SH "SEE ALSO"
.rs
.sp
@@ -639,6 +679,6 @@ Cambridge, England.
.rs
.sp
.nf
Last updated: 17 September 2024
Last updated: 20 September 2024
Copyright (c) 1997-2024 University of Cambridge.
.fi