annotate doc/regex.texi @ 13553:8fc3314fe460

Document not_eol and remove mention of regex.c.
author Reuben Thomas <rrt@sc3d.org>
date Sat, 14 Aug 2010 16:40:16 +0100
parents bb0ceefd22dc
children 3a3b9d29af1b
rev   line source
13533
ca70a11e70e2 Integrate the regex documentation.
Bruno Haible <bruno@clisp.org>
parents: 13532
diff changeset
1 @node Overview
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2 @chapter Overview
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
3
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
4 A @dfn{regular expression} (or @dfn{regexp}, or @dfn{pattern}) is a text
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
5 string that describes some (mathematical) set of strings. A regexp
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
6 @var{r} @dfn{matches} a string @var{s} if @var{s} is in the set of
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
7 strings described by @var{r}.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
8
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
9 Using the Regex library, you can:
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
10
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
11 @itemize @bullet
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
12
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
13 @item
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
14 see if a string matches a specified pattern as a whole, and
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
15
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
16 @item
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
17 search within a string for a substring matching a specified pattern.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
18
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
19 @end itemize
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
20
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
21 Some regular expressions match only one string, i.e., the set they
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
22 describe has only one member. For example, the regular expression
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
23 @samp{foo} matches the string @samp{foo} and no others. Other regular
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
24 expressions match more than one string, i.e., the set they describe has
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
25 more than one member. For example, the regular expression @samp{f*}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
26 matches the set of strings made up of any number (including zero) of
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
27 @samp{f}s. As you can see, some characters in regular expressions match
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
28 themselves (such as @samp{f}) and some don't (such as @samp{*}); the
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
29 ones that don't match themselves instead let you specify patterns that
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
30 describe many different strings.
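The distinction between matching a string as a whole and searching within it can be sketched with the POSIX interface declared in @file{regex.h}. This is a minimal illustration, not the library's own example: the helper names `search_anywhere` and `match_whole` are hypothetical, and whole-string matching is approximated by wrapping the pattern in `^(...)$` under the POSIX extended syntax.

```c
#include <regex.h>
#include <stdbool.h>
#include <stdio.h>

/* Search: true if some substring of s matches the POSIX extended pattern. */
static bool search_anywhere(const char *pattern, const char *s)
{
    regex_t re;
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return false;               /* invalid pattern */
    bool found = regexec(&re, s, 0, NULL, 0) == 0;
    regfree(&re);
    return found;
}

/* Match as a whole: wrap the pattern as ^(pattern)$ so that a match
   must span all of s.  (Hypothetical helper for this sketch.) */
static bool match_whole(const char *pattern, const char *s)
{
    char anchored[256];
    int n = snprintf(anchored, sizeof anchored, "^(%s)$", pattern);
    if (n < 0 || (size_t)n >= sizeof anchored)
        return false;               /* pattern too long for this sketch */
    return search_anywhere(anchored, s);
}
```

With these helpers, `match_whole("foo", "foo")` succeeds and `match_whole("foo", "food")` fails, while `search_anywhere("foo", "seafood")` succeeds. `match_whole("f*", s)` succeeds for the empty string and for `"fff"`, but not for `"fo"`.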
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
31
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
32 To either match or search for a regular expression with the Regex
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
33 library functions, you must first compile it with a Regex pattern
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
34 compiling function. A @dfn{compiled pattern} is a regular expression
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
35 converted to the internal format used by the library functions. Once
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
36 you've compiled a pattern, you can use it for matching or searching any
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
37 number of times.
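The compile-once, use-many-times pattern described above might look like the following sketch. It uses the POSIX functions `regcomp`, `regexec`, and `regfree`; the function name `count_matching` is an assumption made for illustration.

```c
#include <regex.h>
#include <stddef.h>

/* Compile the pattern once, then search each of the n strings with the
   same compiled pattern.  Returns how many strings contain a match, or
   -1 if the pattern fails to compile. */
static int count_matching(const char *pattern,
                          const char *const *strings, size_t n)
{
    regex_t re;
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    int count = 0;
    for (size_t i = 0; i < n; i++)
        if (regexec(&re, strings[i], 0, NULL, 0) == 0)
            count++;

    regfree(&re);                   /* release the compiled pattern */
    return count;
}
```

The compiled pattern is reused for every string, so the cost of compilation is paid once regardless of how many strings are searched.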
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
38
13553
8fc3314fe460 Document not_eol and remove mention of regex.c.
Reuben Thomas <rrt@sc3d.org>
parents: 13549
diff changeset
39 To use the Regex library, include @file{regex.h}.
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
40 @pindex regex.h
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
41 Regex provides three groups of functions with which you can operate on
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
42 regular expressions. One group---the @sc{gnu} group---is more powerful
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
43 but not completely compatible with the other two, namely the @sc{posix}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
44 and Berkeley @sc{unix} groups; its interface was designed specifically
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
45 for @sc{gnu}. The other groups have the same interfaces as do the
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
46 regular expression functions in @sc{posix} and Berkeley
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
47 @sc{unix}.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
48
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
49 We wrote this chapter with programmers in mind, not users of
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
50 programs---such as Emacs---that use Regex. We describe the Regex
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
51 library in its entirety, not how to write regular expressions that a
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
52 particular program understands.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
53
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
54
13533
ca70a11e70e2 Integrate the regex documentation.
Bruno Haible <bruno@clisp.org>
parents: 13532
diff changeset
55 @node Regular Expression Syntax
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
56 @chapter Regular Expression Syntax
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
57
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
58 @cindex regular expressions, syntax of
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
59 @cindex syntax of regular expressions
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
60
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
61 @dfn{Characters} are things you can type. @dfn{Operators} are things in
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
62 a regular expression that match one or more characters. You compose
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
63 regular expressions from operators, which in turn you specify using one
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
64 or more characters.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
65
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
66 Most characters represent what we call the match-self operator, i.e.,
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
67 they match themselves; we call these characters @dfn{ordinary}. Other
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
68 characters represent either all or parts of fancier operators; e.g.,
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
69 @samp{.} represents what we call the match-any-character operator
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
70 (which, no surprise, matches (almost) any character); we call these
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
71 characters @dfn{special}. Two different things determine what
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
72 characters represent what operators:
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
73
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
74 @enumerate
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
75 @item
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
76 the regular expression syntax your program has told the Regex library to
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
77 recognize, and
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
78
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
79 @item
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
80 the context of the character in the regular expression.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
81 @end enumerate
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
82
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
83 In the following sections, we describe these things in more detail.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
84
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
85 @menu
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
86 * Syntax Bits::
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
87 * Predefined Syntaxes::
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
88 * Collating Elements vs. Characters::
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
89 * The Backslash Character::
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
90 @end menu
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
91
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
92
13533
ca70a11e70e2 Integrate the regex documentation.
Bruno Haible <bruno@clisp.org>
parents: 13532
diff changeset
93 @node Syntax Bits
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
94 @section Syntax Bits
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
95
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
96 @cindex syntax bits
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
97
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
98 In any particular syntax for regular expressions, some characters are
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
99 always special, others are sometimes special, and others are never
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
100 special. The particular syntax that Regex recognizes for a given
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
101 regular expression depends on the value in the @code{syntax} field of
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
102 the pattern buffer of that regular expression.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
103
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
104 You get a pattern buffer by compiling a regular expression. @xref{GNU
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
105 Pattern Buffers}, and @ref{POSIX Pattern Buffers}, for more information
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
106 on pattern buffers. @xref{GNU Regular Expression Compiling}, @ref{POSIX
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
107 Regular Expression Compiling}, and @ref{BSD Regular Expression
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
108 Compiling}, for more information on compiling.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
109
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
110 Regex considers the value of the @code{syntax} field to be a collection
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
111 of bits; we refer to these bits as @dfn{syntax bits}. In most cases,
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
112 they affect what characters represent what operators. We describe the
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
113 meanings of the operators to which we refer in @ref{Common Operators},
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
114 @ref{GNU Operators}, and @ref{GNU Emacs Operators}.
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
115
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
116 For reference, here is the complete list of syntax bits, in alphabetical
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
117 order:
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
118
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
119 @table @code
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
120
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
121 @cnindex RE_BACKSLASH_ESCAPE_IN_LISTS
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
122 @item RE_BACKSLASH_ESCAPE_IN_LISTS
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
123 If this bit is set, then @samp{\} inside a list (@pxref{List Operators})
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
124 quotes (makes ordinary, if it's special) the following character; if
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
125 this bit isn't set, then @samp{\} is an ordinary character inside lists.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
126 (@xref{The Backslash Character}, for what @samp{\} does outside of lists.)
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
127
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
128 @cnindex RE_BK_PLUS_QM
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
129 @item RE_BK_PLUS_QM
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
130 If this bit is set, then @samp{\+} represents the match-one-or-more
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
131 operator and @samp{\?} represents the match-zero-or-one operator; if
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
132 this bit isn't set, then @samp{+} represents the match-one-or-more
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
133 operator and @samp{?} represents the match-zero-or-one operator. This
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
134 bit is irrelevant if @code{RE_LIMITED_OPS} is set.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
135
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
136 @cnindex RE_CHAR_CLASSES
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
137 @item RE_CHAR_CLASSES
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
138 If this bit is set, then you can use character classes in lists; if this
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
139 bit isn't set, then you can't.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
140
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
141 @cnindex RE_CONTEXT_INDEP_ANCHORS
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
142 @item RE_CONTEXT_INDEP_ANCHORS
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
143 If this bit is set, then @samp{^} and @samp{$} are special anywhere outside
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
144 a list; if this bit isn't set, then these characters are special only in
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
145 certain contexts. @xref{Match-beginning-of-line Operator}, and
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
146 @ref{Match-end-of-line Operator}.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
147
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
148 @cnindex RE_CONTEXT_INDEP_OPS
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
149 @item RE_CONTEXT_INDEP_OPS
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
150 If this bit is set, then certain characters are special anywhere outside
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
151 a list; if this bit isn't set, then those characters are special only in
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
152 some contexts and are ordinary elsewhere. Specifically, if this bit
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
153 isn't set then @samp{*}, and (if the syntax bit @code{RE_LIMITED_OPS}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
154 isn't set) @samp{+} and @samp{?} (or @samp{\+} and @samp{\?}, depending
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
155 on the syntax bit @code{RE_BK_PLUS_QM}) represent repetition operators
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
156 only if they're not first in a regular expression or just after an
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
157 open-group or alternation operator. The same holds for @samp{@{} (or
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
158 @samp{\@{}, depending on the syntax bit @code{RE_NO_BK_BRACES}) if
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
159 it is the beginning of a valid interval and the syntax bit
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
160 @code{RE_INTERVALS} is set.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
161
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
162 @cnindex RE_CONTEXT_INVALID_OPS
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
163 @item RE_CONTEXT_INVALID_OPS
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
164 If this bit is set, then repetition and alternation operators can't be
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
165 in certain positions within a regular expression. Specifically, the
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
166 regular expression is invalid if it has:
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
167
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
168 @itemize @bullet
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
169
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
170 @item
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
171 a repetition operator first in the regular expression or just after a
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
172 match-beginning-of-line, open-group, or alternation operator; or
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
173
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
174 @item
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
175 an alternation operator first or last in the regular expression, just
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
176 before a match-end-of-line operator, or just after an alternation or
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
177 open-group operator.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
178
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
179 @end itemize
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
180
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
181 If this bit isn't set, then you can put the characters representing the
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
182 repetition and alternation operators anywhere in a regular expression.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
183 Whether or not they will in fact be operators in certain positions
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
184 depends on other syntax bits.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
185
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
186 @cnindex RE_DOT_NEWLINE
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
187 @item RE_DOT_NEWLINE
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
188 If this bit is set, then the match-any-character operator matches
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
189 a newline; if this bit isn't set, then it doesn't.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
190
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
191 @cnindex RE_DOT_NOT_NULL
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
192 @item RE_DOT_NOT_NULL
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
193 If this bit is set, then the match-any-character operator doesn't match
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
194 a null character; if this bit isn't set, then it does.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
195
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
196 @cnindex RE_INTERVALS
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
197 @item RE_INTERVALS
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
198 If this bit is set, then Regex recognizes interval operators; if this bit
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
199 isn't set, then it doesn't.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
200
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
201 @cnindex RE_LIMITED_OPS
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
202 @item RE_LIMITED_OPS
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
203 If this bit is set, then Regex doesn't recognize the match-one-or-more,
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
204 match-zero-or-one or alternation operators; if this bit isn't set, then
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
205 it does.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
206
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
207 @cnindex RE_NEWLINE_ALT
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
208 @item RE_NEWLINE_ALT
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
209 If this bit is set, then newline represents the alternation operator; if
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
210 this bit isn't set, then newline is ordinary.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
211
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
212 @cnindex RE_NO_BK_BRACES
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
213 @item RE_NO_BK_BRACES
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
214 If this bit is set, then @samp{@{} represents the open-interval operator
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
215 and @samp{@}} represents the close-interval operator; if this bit isn't
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
216 set, then @samp{\@{} represents the open-interval operator and
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
217 @samp{\@}} represents the close-interval operator. This bit is relevant
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
218 only if @code{RE_INTERVALS} is set.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
219
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
220 @cnindex RE_NO_BK_PARENS
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
221 @item RE_NO_BK_PARENS
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
222 If this bit is set, then @samp{(} represents the open-group operator and
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
223 @samp{)} represents the close-group operator; if this bit isn't set, then
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
224 @samp{\(} represents the open-group operator and @samp{\)} represents
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
225 the close-group operator.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
226
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
227 @cnindex RE_NO_BK_REFS
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
228 @item RE_NO_BK_REFS
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
229 If this bit is set, then Regex doesn't recognize @samp{\}@var{digit} as
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
230 the back reference operator; if this bit isn't set, then it does.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
231
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
232 @cnindex RE_NO_BK_VBAR
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
233 @item RE_NO_BK_VBAR
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
234 If this bit is set, then @samp{|} represents the alternation operator;
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
235 if this bit isn't set, then @samp{\|} represents the alternation
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
236 operator. This bit is irrelevant if @code{RE_LIMITED_OPS} is set.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
237
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
238 @cnindex RE_NO_EMPTY_RANGES
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
239 @item RE_NO_EMPTY_RANGES
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
240 If this bit is set, then a regular expression with a range whose ending
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
241 point collates lower than its starting point is invalid; if this bit
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
242 isn't set, then Regex considers such a range to be empty.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
243
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
244 @cnindex RE_UNMATCHED_RIGHT_PAREN_ORD
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
245 @item RE_UNMATCHED_RIGHT_PAREN_ORD
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
246 If this bit is set and a close-group operator has no matching open-group
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
247 operator, then Regex treats what would otherwise be that close-group
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
248 operator (based on how @code{RE_NO_BK_PARENS} is set) as matching @samp{)}.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
249
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
250 @end table
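
The effect of a single syntax bit can be checked with a small program.
The following is only a sketch, assuming the @sc{gnu} interface that
glibc's @file{regex.h} exposes when @code{_GNU_SOURCE} is defined
(@code{re_set_syntax}, @code{re_compile_pattern}, @code{re_match});
the helper @code{match_len} is a hypothetical name introduced for
illustration and is not part of Regex.  It compiles a pattern under a
given set of syntax bits and reports how many characters match at the
start of a string.

```c
#define _GNU_SOURCE   /* exposes the GNU re_* interface in glibc's <regex.h> */
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Hypothetical helper (not part of Regex): compile PATTERN under SYNTAX
   and return the length of the match anchored at the start of TEXT.
   Returns -1 if there is no match or if the pattern fails to compile.  */
static int
match_len (reg_syntax_t syntax, const char *pattern, const char *text)
{
  struct re_pattern_buffer buf;
  const char *err;
  int len;

  assert (pattern != NULL && text != NULL);
  memset (&buf, 0, sizeof buf);  /* no buffer, fastmap or translate table */
  re_set_syntax (syntax);        /* the compiler reads this global setting */
  err = re_compile_pattern (pattern, strlen (pattern), &buf);
  if (err != NULL)
    return -1;
  len = (int) re_match (&buf, text, (regoff_t) strlen (text), 0, NULL);
  regfree (&buf);
  return len;
}

/* The pattern \(a\)\1 written as a C string literal.  */
static const char backref_pattern[] = "\\(a\\)\\1";
```

With no bits set, @code{match_len (0, backref_pattern, "aa")} should
report a match of length 2, because @samp{\1} refers back to the group.
With @code{RE_NO_BK_REFS} set, the same pattern should instead match
@samp{a1}, since @samp{\1} is no longer an operator and the backslash
is ignored.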
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
251
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
252
13533
ca70a11e70e2 Integrate the regex documentation.
Bruno Haible <bruno@clisp.org>
parents: 13532
diff changeset
253 @node Predefined Syntaxes
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
254 @section Predefined Syntaxes
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
255
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
256 If you're programming with Regex, you can set a pattern buffer's
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
257 (@pxref{GNU Pattern Buffers}, and @ref{POSIX Pattern Buffers})
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
258 @code{syntax} field either to an arbitrary combination of syntax bits
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
259 (@pxref{Syntax Bits}) or else to one of the configurations defined by Regex.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
260 These configurations define the syntaxes used by certain
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
261 programs---@sc{gnu} Emacs,
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
262 @cindex Emacs
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
263 @sc{posix} Awk,
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
264 @cindex POSIX Awk
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
265 traditional Awk,
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
266 @cindex Awk
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
267 Grep,
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
268 @cindex Grep
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
269 @cindex Egrep
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
270 Egrep---in addition to syntaxes for @sc{posix} basic and extended
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
271 regular expressions.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
272
13549
bb0ceefd22dc avoid some overlong lines from posix urls, etc.
Karl Berry <karl@freefriends.org>
parents: 13537
diff changeset
273 The predefined syntaxes---taken directly from @file{regex.h}---are:
bb0ceefd22dc avoid some overlong lines from posix urls, etc.
Karl Berry <karl@freefriends.org>
parents: 13537
diff changeset
274
bb0ceefd22dc avoid some overlong lines from posix urls, etc.
Karl Berry <karl@freefriends.org>
parents: 13537
diff changeset
275 @smallexample
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
276 #define RE_SYNTAX_EMACS 0
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
277
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
278 #define RE_SYNTAX_AWK \
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
279 (RE_BACKSLASH_ESCAPE_IN_LISTS | RE_DOT_NOT_NULL \
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
280 | RE_NO_BK_PARENS | RE_NO_BK_REFS \
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
281 | RE_NO_BK_VBAR | RE_NO_EMPTY_RANGES \
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
282 | RE_UNMATCHED_RIGHT_PAREN_ORD)
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
283
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
284 #define RE_SYNTAX_POSIX_AWK \
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
285 (RE_SYNTAX_POSIX_EXTENDED | RE_BACKSLASH_ESCAPE_IN_LISTS)
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
286
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
287 #define RE_SYNTAX_GREP \
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
288 (RE_BK_PLUS_QM | RE_CHAR_CLASSES \
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
289 | RE_HAT_LISTS_NOT_NEWLINE | RE_INTERVALS \
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
290 | RE_NEWLINE_ALT)
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
291
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
292 #define RE_SYNTAX_EGREP \
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
293 (RE_CHAR_CLASSES | RE_CONTEXT_INDEP_ANCHORS \
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
294 | RE_CONTEXT_INDEP_OPS | RE_HAT_LISTS_NOT_NEWLINE \
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
295 | RE_NEWLINE_ALT | RE_NO_BK_PARENS \
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
296 | RE_NO_BK_VBAR)
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
297
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
298 #define RE_SYNTAX_POSIX_EGREP \
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
299 (RE_SYNTAX_EGREP | RE_INTERVALS | RE_NO_BK_BRACES)
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
300
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
301 /* P1003.2/D11.2, section 4.20.7.1, lines 5078ff. */
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
302 #define RE_SYNTAX_ED RE_SYNTAX_POSIX_BASIC
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
303
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
304 #define RE_SYNTAX_SED RE_SYNTAX_POSIX_BASIC
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
305
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
306 /* Syntax bits common to both basic and extended POSIX regex syntax. */
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
307 #define _RE_SYNTAX_POSIX_COMMON \
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
308 (RE_CHAR_CLASSES | RE_DOT_NEWLINE | RE_DOT_NOT_NULL \
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
309 | RE_INTERVALS | RE_NO_EMPTY_RANGES)
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
310
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
311 #define RE_SYNTAX_POSIX_BASIC \
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
312 (_RE_SYNTAX_POSIX_COMMON | RE_BK_PLUS_QM)
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
313
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
314 /* Differs from ..._POSIX_BASIC only in that RE_BK_PLUS_QM becomes
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
315 RE_LIMITED_OPS, i.e., \? \+ \| are not recognized. Actually, this
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
316 isn't minimal, since other operators, such as \`, aren't disabled. */
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
317 #define RE_SYNTAX_POSIX_MINIMAL_BASIC \
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
318 (_RE_SYNTAX_POSIX_COMMON | RE_LIMITED_OPS)
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
319
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
320 #define RE_SYNTAX_POSIX_EXTENDED \
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
321 (_RE_SYNTAX_POSIX_COMMON | RE_CONTEXT_INDEP_ANCHORS \
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
322 | RE_CONTEXT_INDEP_OPS | RE_NO_BK_BRACES \
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
323 | RE_NO_BK_PARENS | RE_NO_BK_VBAR \
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
324 | RE_UNMATCHED_RIGHT_PAREN_ORD)
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
325
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
326 /* Differs from ..._POSIX_EXTENDED in that RE_CONTEXT_INVALID_OPS
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
327 replaces RE_CONTEXT_INDEP_OPS and RE_NO_BK_REFS is added. */
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
328 #define RE_SYNTAX_POSIX_MINIMAL_EXTENDED \
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
329 (_RE_SYNTAX_POSIX_COMMON | RE_CONTEXT_INDEP_ANCHORS \
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
330 | RE_CONTEXT_INVALID_OPS | RE_NO_BK_BRACES \
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
331 | RE_NO_BK_PARENS | RE_NO_BK_REFS \
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
332 | RE_NO_BK_VBAR | RE_UNMATCHED_RIGHT_PAREN_ORD)
13549
bb0ceefd22dc avoid some overlong lines from posix urls, etc.
Karl Berry <karl@freefriends.org>
parents: 13537
diff changeset
333 @end smallexample
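
To see how two predefined syntaxes differ in practice, the same pattern
can be compiled under each of them.  This is a minimal sketch, assuming
glibc's @file{regex.h} with @code{_GNU_SOURCE} defined; note that with
this interface the syntax is selected by calling @code{re_set_syntax}
before @code{re_compile_pattern}, rather than by assigning the
@code{syntax} field directly.  The helper @code{match_len} is a
hypothetical name used only for this illustration.

```c
#define _GNU_SOURCE   /* exposes the GNU re_* interface in glibc's <regex.h> */
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Hypothetical helper (not part of Regex): compile PATTERN under SYNTAX
   and return the length of the match anchored at the start of TEXT.
   Returns -1 if there is no match or if the pattern fails to compile.  */
static int
match_len (reg_syntax_t syntax, const char *pattern, const char *text)
{
  struct re_pattern_buffer buf;
  const char *err;
  int len;

  assert (pattern != NULL && text != NULL);
  memset (&buf, 0, sizeof buf);  /* no buffer, fastmap or translate table */
  re_set_syntax (syntax);        /* the compiler reads this global setting */
  err = re_compile_pattern (pattern, strlen (pattern), &buf);
  if (err != NULL)
    return -1;
  len = (int) re_match (&buf, text, (regoff_t) strlen (text), 0, NULL);
  regfree (&buf);
  return len;
}

/* One pattern, read two ways depending on the syntax in effect.  */
static const char group_pattern[] = "a(b|c)";
```

Under @code{RE_SYNTAX_POSIX_EXTENDED}, which sets
@code{RE_NO_BK_PARENS} and @code{RE_NO_BK_VBAR}, the pattern groups and
alternates, so it should match @samp{ac} with length 2.  Under
@code{RE_SYNTAX_POSIX_BASIC}, where those bits are not set, @samp{(},
@samp{|} and @samp{)} are ordinary, so the pattern should match only
the six-character string @samp{a(b|c)}.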
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
334
13533
ca70a11e70e2 Integrate the regex documentation.
Bruno Haible <bruno@clisp.org>
parents: 13532
diff changeset
335 @node Collating Elements vs. Characters
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
336 @section Collating Elements vs.@: Characters
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
337
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
338 @sc{posix} generalizes the notion of a character to that of a
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
339 collating element. It defines a @dfn{collating element} to be ``a
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
340 sequence of one or more bytes defined in the current collating sequence
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
341 as a unit of collation.''
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
342
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
343 This generalizes the notion of a character in
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
344 two ways. First, a single character can map into two or more collating
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
345 elements. For example, the German
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
346 @tex
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
347 `\ss'
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
348 @end tex
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
349 @ifinfo
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
350 ``es-zet''
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
351 @end ifinfo
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
352 collates as the collating element @samp{s} followed by another collating
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
353 element @samp{s}. Second, two or more characters can map into one
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
354 collating element. For example, the Spanish @samp{ll} collates after
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
355 @samp{l} and before @samp{m}.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
356
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
357 Since @sc{posix}'s ``collating element'' preserves the essential idea of
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
358 a ``character,'' we use the latter, more familiar, term in this document.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
359
13533
ca70a11e70e2 Integrate the regex documentation.
Bruno Haible <bruno@clisp.org>
parents: 13532
diff changeset
360 @node The Backslash Character
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
361 @section The Backslash Character
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
362
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
363 @cindex \
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
364 The @samp{\} character has one of four different meanings, depending on
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
365 the context in which you use it and what syntax bits are set
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
366 (@pxref{Syntax Bits}). It can: 1) stand for itself, 2) quote the next
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
367 character, 3) introduce an operator, or 4) do nothing.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
368
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
369 @enumerate
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
370 @item
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
371 It stands for itself inside a list
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
372 (@pxref{List Operators}) if the syntax bit
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
373 @code{RE_BACKSLASH_ESCAPE_IN_LISTS} is not set. For example, @samp{[\]}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
374 would match @samp{\}.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
375
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
376 @item
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
377 It quotes (makes ordinary, if it's special) the next character when you
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
378 use it either:
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
379
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
380 @itemize @bullet
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
381 @item
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
382 outside a list,@footnote{Sometimes
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
383 you don't have to explicitly quote special characters to make
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
384 them ordinary. For instance, most characters lose any special meaning
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
385 inside a list (@pxref{List Operators}). In addition, if the syntax bits
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
386 @code{RE_CONTEXT_INVALID_OPS} and @code{RE_CONTEXT_INDEP_OPS}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
387 aren't set, then (for historical reasons) the matcher considers special
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
388 characters ordinary if they are in contexts where the operations they
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
389 represent make no sense; for example, the match-zero-or-more
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
390 operator (represented by @samp{*}) matches itself in the regular
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
391 expression @samp{*foo} because there is no preceding expression on which
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
392 it can operate. It is poor practice, however, to depend on this
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
393 behavior; if you want a special character to be ordinary outside a list,
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
394 it's better to always quote it, regardless.} or
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
395
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
396 @item
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
397 inside a list and the syntax bit @code{RE_BACKSLASH_ESCAPE_IN_LISTS} is set.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
398
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
399 @end itemize
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
400
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
401 @item
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
402 It introduces an operator when followed by certain ordinary
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
403 characters---sometimes only when certain syntax bits are set. See the
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
404 cases @code{RE_BK_PLUS_QM}, @code{RE_NO_BK_BRACES}, @code{RE_NO_BK_VBAR},
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
405 @code{RE_NO_BK_PARENS}, @code{RE_NO_BK_REFS} in @ref{Syntax Bits}. Also:
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
406
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
407 @itemize @bullet
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
408 @item
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
409 @samp{\b} represents the match-word-boundary operator
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
410 (@pxref{Match-word-boundary Operator}).
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
411
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
412 @item
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
413 @samp{\B} represents the match-within-word operator
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
414 (@pxref{Match-within-word Operator}).
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
415
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
416 @item
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
417 @samp{\<} represents the match-beginning-of-word operator @*
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
418 (@pxref{Match-beginning-of-word Operator}).
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
419
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
420 @item
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
421 @samp{\>} represents the match-end-of-word operator
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
422 (@pxref{Match-end-of-word Operator}).
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
423
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
424 @item
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
425 @samp{\w} represents the match-word-constituent operator
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
426 (@pxref{Match-word-constituent Operator}).
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
427
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
428 @item
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
429 @samp{\W} represents the match-non-word-constituent operator
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
430 (@pxref{Match-non-word-constituent Operator}).
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
431
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
432 @item
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
433 @samp{\`} represents the match-beginning-of-buffer
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
434 operator and @samp{\'} represents the match-end-of-buffer operator
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
435 (@pxref{Buffer Operators}).
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
436
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
437 @item
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
438 If Regex was compiled with the C preprocessor symbol @code{emacs}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
439 defined, then @samp{\s@var{class}} represents the match-syntactic-class
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
440 operator and @samp{\S@var{class}} represents the
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
441 match-not-syntactic-class operator (@pxref{Syntactic Class Operators}).
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
442
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
443 @end itemize
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
444
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
445 @item
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
446 In all other cases, Regex ignores @samp{\}. For example,
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
447 @samp{\n} matches @samp{n}.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
448
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
449 @end enumerate
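
Several of the cases above can be exercised directly.  The following is
a sketch only, assuming glibc's @file{regex.h} with @code{_GNU_SOURCE}
defined; the helper @code{match_len} is a hypothetical name introduced
for illustration.  Remember that each backslash in a pattern must be
doubled inside a C string literal.

```c
#define _GNU_SOURCE   /* exposes the GNU re_* interface in glibc's <regex.h> */
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Hypothetical helper (not part of Regex): compile PATTERN under SYNTAX
   and return the length of the match anchored at the start of TEXT.
   Returns -1 if there is no match or if the pattern fails to compile.  */
static int
match_len (reg_syntax_t syntax, const char *pattern, const char *text)
{
  struct re_pattern_buffer buf;
  const char *err;
  int len;

  assert (pattern != NULL && text != NULL);
  memset (&buf, 0, sizeof buf);  /* no buffer, fastmap or translate table */
  re_set_syntax (syntax);        /* the compiler reads this global setting */
  err = re_compile_pattern (pattern, strlen (pattern), &buf);
  if (err != NULL)
    return -1;
  len = (int) re_match (&buf, text, (regoff_t) strlen (text), 0, NULL);
  regfree (&buf);
  return len;
}

static const char list_pattern[]    = "[\\n]";  /* the regexp [\n]  */
static const char quoted_star[]     = "\\*";    /* the regexp \*    */
static const char ignored_bkslash[] = "\\n";    /* the regexp \n    */
static const char one_backslash[]   = "\\";     /* the text \       */
```

Without @code{RE_BACKSLASH_ESCAPE_IN_LISTS}, the list @samp{[\n]}
contains both @samp{\} and @samp{n}, so it should match a lone
backslash (case 1).  With the bit set, the backslash quotes the
@samp{n}, so the list should match @samp{n} but not @samp{\} (case 2).
Outside a list, @samp{\*} should match a literal @samp{*} (case 2), and
@samp{\n} should match @samp{n} because the backslash is ignored
(case 4).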
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
450
13533
ca70a11e70e2 Integrate the regex documentation.
Bruno Haible <bruno@clisp.org>
parents: 13532
diff changeset
451 @node Common Operators
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
452 @chapter Common Operators
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
453
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
454 You compose regular expressions from operators. In the following
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
455 sections, we describe the regular expression operators specified by
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
456 @sc{posix}; @sc{gnu} also uses these. Most operators have more than one
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
457 representation as characters. @xref{Regular Expression Syntax}, for
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
458 what characters represent what operators under what circumstances.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
459
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
460 For most operators that can be represented in two ways, one
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
461 representation is a single character and the other is that character
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
462 preceded by @samp{\}. For example, either @samp{(} or @samp{\(}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
463 represents the open-group operator. Which one does depends on the
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
464 setting of a syntax bit, in this case @code{RE_NO_BK_PARENS}. Why is
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
465 this so? Historical reasons dictate some of the varying
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
466 representations, while @sc{posix} dictates others.
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
467
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
468 Finally, almost all characters lose any special meaning inside a list
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
469 (@pxref{List Operators}).

@menu
* Match-self Operator::            Ordinary characters.
* Match-any-character Operator::   .
* Concatenation Operator::         Juxtaposition.
* Repetition Operators::           * + ? @{@}
* Alternation Operator::           |
* List Operators::                 [...] [^...]
* Grouping Operators::             (...)
* Back-reference Operator::        \digit
* Anchoring Operators::            ^ $
@end menu

@node Match-self Operator
@section The Match-self Operator (@var{ordinary character})

This operator matches the character itself. All ordinary characters
(@pxref{Regular Expression Syntax}) represent this operator. For
example, @samp{f} is always an ordinary character, so the regular
expression @samp{f} matches only the string @samp{f}. In
particular, it does @emph{not} match the string @samp{ff}.

@node Match-any-character Operator
@section The Match-any-character Operator (@code{.})

@cindex @samp{.}

This operator matches any single printing or nonprinting character,
except that it won't match a:

@table @asis
@item newline
if the syntax bit @code{RE_DOT_NEWLINE} isn't set.

@item null
if the syntax bit @code{RE_DOT_NOT_NULL} is set.

@end table

The @samp{.} (period) character represents this operator. For example,
@samp{a.b} matches any three-character string beginning with @samp{a}
and ending with @samp{b}.
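
The effect of @code{RE_DOT_NEWLINE} can be sketched in C as follows.
This is an illustration only: it assumes the GNU C Library's
@file{regex.h} with @code{_GNU_SOURCE} defined, and the names
@code{matches_whole} and @code{dot_demo} are hypothetical. The sketch
starts from @code{RE_SYNTAX_POSIX_EXTENDED}, which sets
@code{RE_DOT_NEWLINE}, and then clears that one bit.

```c
#define _GNU_SOURCE   /* assumption: glibc, which guards the GNU API with this */
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Hypothetical helper: 1 if PATTERN matches all of TEXT under SYNTAX,
   0 if it does not, -1 if PATTERN does not compile. */
static int matches_whole(const char *pattern, const char *text,
                         reg_syntax_t syntax)
{
    struct re_pattern_buffer buf;
    int length, matched;

    assert(pattern != NULL && text != NULL);
    memset(&buf, 0, sizeof buf);
    re_set_syntax(syntax);
    if (re_compile_pattern(pattern, strlen(pattern), &buf) != NULL)
        return -1;
    length = (int) strlen(text);
    matched = re_match(&buf, text, length, 0, NULL);
    regfree(&buf);
    return matched == length;
}

/* Returns 1 when '.' behaves as described for both settings of the bit. */
static int dot_demo(void)
{
    reg_syntax_t with_nl = RE_SYNTAX_POSIX_EXTENDED;      /* RE_DOT_NEWLINE set */
    reg_syntax_t without_nl = with_nl & ~RE_DOT_NEWLINE;  /* same syntax, bit cleared */

    return matches_whole("a.b", "axb", with_nl) == 1
        && matches_whole("a.b", "ab", with_nl) == 0       /* '.' needs one character */
        && matches_whole("a.b", "a\nb", with_nl) == 1     /* newline allowed */
        && matches_whole("a.b", "a\nb", without_nl) == 0; /* newline refused */
}
```

The null-character case is left out because the helper measures its
input with @code{strlen}; testing it would need explicit lengths.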

@node Concatenation Operator
@section The Concatenation Operator

This operator concatenates two regular expressions @var{a} and @var{b}.
No character represents this operator; you simply put @var{b} after
@var{a}. The result is a regular expression that will match a string if
@var{a} matches its first part and @var{b} matches the rest. For
example, @samp{xy} (two match-self operators) matches @samp{xy}.

@node Repetition Operators
@section Repetition Operators

Repetition operators repeat the preceding regular expression a specified
number of times.

@menu
* Match-zero-or-more Operator::   *
* Match-one-or-more Operator::    +
* Match-zero-or-one Operator::    ?
* Interval Operators::            @{@}
@end menu

@node Match-zero-or-more Operator
@subsection The Match-zero-or-more Operator (@code{*})

@cindex @samp{*}

This operator repeats the smallest possible preceding regular expression
as many times as necessary (including zero) to match the pattern.
@samp{*} represents this operator. For example, @samp{o*}
matches any string made up of zero or more @samp{o}s. Since this
operator operates on the smallest preceding regular expression,
@samp{fo*} has a repeating @samp{o}, not a repeating @samp{fo}. So,
@samp{fo*} matches @samp{f}, @samp{fo}, @samp{foo}, and so on.

Since the match-zero-or-more operator is a suffix operator, it has
nothing to operate on when no regular expression precedes it. This is
the case when it:

@itemize @bullet
@item
is first in a regular expression, or

@item
follows a match-beginning-of-line, open-group, or alternation
operator.

@end itemize

@noindent
Three different things can happen in these cases:

@enumerate
@item
If the syntax bit @code{RE_CONTEXT_INVALID_OPS} is set, then the
regular expression is invalid.

@item
If @code{RE_CONTEXT_INVALID_OPS} isn't set, but
@code{RE_CONTEXT_INDEP_OPS} is, then @samp{*} represents the
match-zero-or-more operator (which then operates on the empty string).

@item
Otherwise, @samp{*} is ordinary.

@end enumerate
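
The three cases can be probed with a small C sketch. It is only an
illustration, assuming the GNU C Library's @file{regex.h} with
@code{_GNU_SOURCE} defined; @code{matches_whole} and
@code{leading_star_demo} are hypothetical names. It relies on
@code{RE_SYNTAX_POSIX_EXTENDED} setting both
@code{RE_CONTEXT_INVALID_OPS} and @code{RE_CONTEXT_INDEP_OPS}, and on
@code{RE_SYNTAX_POSIX_BASIC} setting neither. Other implementations of
the library may behave differently.

```c
#define _GNU_SOURCE   /* assumption: glibc, which guards the GNU API with this */
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Hypothetical helper: 1 if PATTERN matches all of TEXT under SYNTAX,
   0 if it does not, -1 if PATTERN does not compile. */
static int matches_whole(const char *pattern, const char *text,
                         reg_syntax_t syntax)
{
    struct re_pattern_buffer buf;
    int length, matched;

    assert(pattern != NULL && text != NULL);
    memset(&buf, 0, sizeof buf);
    re_set_syntax(syntax);
    if (re_compile_pattern(pattern, strlen(pattern), &buf) != NULL)
        return -1;
    length = (int) strlen(text);
    matched = re_match(&buf, text, length, 0, NULL);
    regfree(&buf);
    return matched == length;
}

/* Returns 1 when a leading '*' is handled as in the three cases above. */
static int leading_star_demo(void)
{
    reg_syntax_t invalid_ops = RE_SYNTAX_POSIX_EXTENDED;
    reg_syntax_t indep_ops = RE_SYNTAX_POSIX_EXTENDED & ~RE_CONTEXT_INVALID_OPS;
    reg_syntax_t neither = RE_SYNTAX_POSIX_BASIC;

    return matches_whole("*a", "a", invalid_ops) == -1  /* case 1: pattern rejected */
        && matches_whole("*a", "a", indep_ops) == 1     /* case 2: '*' is an operator */
        && matches_whole("*a", "*a", neither) == 1;     /* case 3: '*' is ordinary */
}
```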

@cindex backtracking
The matcher processes a match-zero-or-more operator by first matching as
many repetitions of the smallest preceding regular expression as it can.
Then it continues to match the rest of the pattern.

If it can't match the rest of the pattern, it backtracks (as many times
as necessary), each time discarding one of the matches until it can
either match the entire pattern or be certain that it cannot get a
match. For example, when matching @samp{ca*ar} against @samp{caaar},
the matcher first matches all three @samp{a}s of the string with the
@samp{a*} of the regular expression. However, it cannot then match the
final @samp{ar} of the regular expression against the final @samp{r} of
the string. So it backtracks, discarding the match of the last @samp{a}
in the string. It can then match the remaining @samp{ar}.
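
The outcome of this process can be observed by wrapping the repeated
part in a group and asking how much text the group ended up with. The
sketch below is an illustration only: it assumes the GNU C Library's
@file{regex.h} with @code{_GNU_SOURCE} defined and the
@code{RE_SYNTAX_POSIX_EXTENDED} syntax, and @code{group1_length} is a
hypothetical helper. It shows the final result of matching, not the
intermediate steps the matcher takes.

```c
#define _GNU_SOURCE   /* assumption: glibc, which guards the GNU API with this */
#include <assert.h>
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: length of the text captured by group 1 when PATTERN
   matches at the start of TEXT, or -1 if there is no match or no group. */
static int group1_length(const char *pattern, const char *text)
{
    struct re_pattern_buffer buf;
    struct re_registers regs;
    int result = -1;

    assert(pattern != NULL && text != NULL);
    memset(&buf, 0, sizeof buf);
    memset(&regs, 0, sizeof regs);
    re_set_syntax(RE_SYNTAX_POSIX_EXTENDED);
    if (re_compile_pattern(pattern, strlen(pattern), &buf) != NULL)
        return -1;
    if (re_match(&buf, text, (int) strlen(text), 0, &regs) >= 0
        && buf.re_nsub >= 1)
        result = (int) (regs.end[1] - regs.start[1]);
    free(regs.start);                 /* allocated by re_match on success */
    free(regs.end);
    regfree(&buf);
    return result;
}
```

With these assumptions, @code{group1_length("c(a*)ar", "caaar")} should
yield 2: the group gives up the last @samp{a} so that the trailing
@samp{ar} can match. For @samp{c(a*)r} against the same string, the
group keeps all three.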


@node Match-one-or-more Operator
@subsection The Match-one-or-more Operator (@code{+} or @code{\+})

@cindex @samp{+}

If the syntax bit @code{RE_LIMITED_OPS} is set, then Regex doesn't recognize
this operator. Otherwise, if the syntax bit @code{RE_BK_PLUS_QM} isn't
set, then @samp{+} represents this operator; if it is, then @samp{\+}
does.

This operator is similar to the match-zero-or-more operator except that
it repeats the preceding regular expression at least once;
@pxref{Match-zero-or-more Operator}, for what it operates on, how some
syntax bits affect it, and how Regex backtracks to match it.

For example, if @samp{+} represents the match-one-or-more operator,
then @samp{ca+r} matches @samp{car} and @samp{caaaar}, among others,
but not @samp{cr}.
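
A short C sketch of both spellings follows. It is an illustration
only, assuming the GNU C Library's @file{regex.h} with
@code{_GNU_SOURCE} defined; @code{matches_whole} and @code{plus_demo}
are hypothetical names. @code{RE_SYNTAX_POSIX_EXTENDED} leaves
@code{RE_BK_PLUS_QM} unset, so @samp{+} is the operator there, while
@code{RE_SYNTAX_POSIX_BASIC} sets it, so @samp{\+} is.

```c
#define _GNU_SOURCE   /* assumption: glibc, which guards the GNU API with this */
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Hypothetical helper: 1 if PATTERN matches all of TEXT under SYNTAX,
   0 if it does not, -1 if PATTERN does not compile. */
static int matches_whole(const char *pattern, const char *text,
                         reg_syntax_t syntax)
{
    struct re_pattern_buffer buf;
    int length, matched;

    assert(pattern != NULL && text != NULL);
    memset(&buf, 0, sizeof buf);
    re_set_syntax(syntax);
    if (re_compile_pattern(pattern, strlen(pattern), &buf) != NULL)
        return -1;
    length = (int) strlen(text);
    matched = re_match(&buf, text, length, 0, NULL);
    regfree(&buf);
    return matched == length;
}

/* Returns 1 when '+' and '\+' behave as the RE_BK_PLUS_QM bit predicts. */
static int plus_demo(void)
{
    return matches_whole("ca+r", "car", RE_SYNTAX_POSIX_EXTENDED) == 1
        && matches_whole("ca+r", "caaaar", RE_SYNTAX_POSIX_EXTENDED) == 1
        && matches_whole("ca+r", "cr", RE_SYNTAX_POSIX_EXTENDED) == 0   /* needs one 'a' */
        && matches_whole("ca\\+r", "caar", RE_SYNTAX_POSIX_BASIC) == 1  /* \+ is the operator */
        && matches_whole("ca+r", "ca+r", RE_SYNTAX_POSIX_BASIC) == 1;   /* bare + is ordinary */
}
```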

@node Match-zero-or-one Operator
@subsection The Match-zero-or-one Operator (@code{?} or @code{\?})
@cindex @samp{?}

If the syntax bit @code{RE_LIMITED_OPS} is set, then Regex doesn't
recognize this operator. Otherwise, if the syntax bit
@code{RE_BK_PLUS_QM} isn't set, then @samp{?} represents this operator;
if it is, then @samp{\?} does.

This operator is similar to the match-zero-or-more operator except that
it repeats the preceding regular expression once or not at all;
@pxref{Match-zero-or-more Operator}, to see what it operates on, how
some syntax bits affect it, and how Regex backtracks to match it.

For example, if @samp{?} represents the match-zero-or-one operator,
then @samp{ca?r} matches both @samp{car} and @samp{cr}, but nothing
else.
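
The following C sketch illustrates the operator and the effect of
@code{RE_LIMITED_OPS}. It is an illustration only, assuming the GNU C
Library's @file{regex.h} with @code{_GNU_SOURCE} defined;
@code{matches_whole} and @code{qmark_demo} are hypothetical names. The
second syntax is @code{RE_SYNTAX_POSIX_EXTENDED} with
@code{RE_LIMITED_OPS} added, a combination chosen here purely for
demonstration.

```c
#define _GNU_SOURCE   /* assumption: glibc, which guards the GNU API with this */
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Hypothetical helper: 1 if PATTERN matches all of TEXT under SYNTAX,
   0 if it does not, -1 if PATTERN does not compile. */
static int matches_whole(const char *pattern, const char *text,
                         reg_syntax_t syntax)
{
    struct re_pattern_buffer buf;
    int length, matched;

    assert(pattern != NULL && text != NULL);
    memset(&buf, 0, sizeof buf);
    re_set_syntax(syntax);
    if (re_compile_pattern(pattern, strlen(pattern), &buf) != NULL)
        return -1;
    length = (int) strlen(text);
    matched = re_match(&buf, text, length, 0, NULL);
    regfree(&buf);
    return matched == length;
}

/* Returns 1 when '?' behaves as described with and without RE_LIMITED_OPS. */
static int qmark_demo(void)
{
    reg_syntax_t full = RE_SYNTAX_POSIX_EXTENDED;
    reg_syntax_t limited = RE_SYNTAX_POSIX_EXTENDED | RE_LIMITED_OPS;

    return matches_whole("ca?r", "car", full) == 1
        && matches_whole("ca?r", "cr", full) == 1
        && matches_whole("ca?r", "caar", full) == 0      /* at most one 'a' */
        && matches_whole("ca?r", "ca?r", limited) == 1   /* '?' not recognized: ordinary */
        && matches_whole("ca?r", "cr", limited) == 0;
}
```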

@node Interval Operators
@subsection Interval Operators (@code{@{} @dots{} @code{@}} or @code{\@{} @dots{} @code{\@}})

@cindex interval expression
@cindex @samp{@{}
@cindex @samp{@}}
@cindex @samp{\@{}
@cindex @samp{\@}}

If the syntax bit @code{RE_INTERVALS} is set, then Regex recognizes
@dfn{interval expressions}. They repeat the smallest possible preceding
regular expression a specified number of times.

If the syntax bit @code{RE_NO_BK_BRACES} is set, @samp{@{} represents
the @dfn{open-interval operator} and @samp{@}} represents the
@dfn{close-interval operator}; otherwise, @samp{\@{} and @samp{\@}} do.

Specifically, if @samp{@{} and @samp{@}} represent the
open-interval and close-interval operators, then:

@table @code
@item @{@var{count}@}
matches exactly @var{count} occurrences of the preceding regular
expression.

@item @{@var{min},@}
matches @var{min} or more occurrences of the preceding regular
expression.

@item @{@var{min},@var{max}@}
matches at least @var{min} but no more than @var{max} occurrences of
the preceding regular expression.

@end table
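
The three forms can be exercised with a short C sketch. It is an
illustration only, assuming the GNU C Library's @file{regex.h} with
@code{_GNU_SOURCE} defined; @code{matches_whole} and
@code{interval_demo} are hypothetical names. Both predefined syntaxes
used here set @code{RE_INTERVALS}; @code{RE_SYNTAX_POSIX_EXTENDED} also
sets @code{RE_NO_BK_BRACES}, and @code{RE_SYNTAX_POSIX_BASIC} does not.

```c
#define _GNU_SOURCE   /* assumption: glibc, which guards the GNU API with this */
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Hypothetical helper: 1 if PATTERN matches all of TEXT under SYNTAX,
   0 if it does not, -1 if PATTERN does not compile. */
static int matches_whole(const char *pattern, const char *text,
                         reg_syntax_t syntax)
{
    struct re_pattern_buffer buf;
    int length, matched;

    assert(pattern != NULL && text != NULL);
    memset(&buf, 0, sizeof buf);
    re_set_syntax(syntax);
    if (re_compile_pattern(pattern, strlen(pattern), &buf) != NULL)
        return -1;
    length = (int) strlen(text);
    matched = re_match(&buf, text, length, 0, NULL);
    regfree(&buf);
    return matched == length;
}

/* Returns 1 when each interval form accepts and rejects the expected strings. */
static int interval_demo(void)
{
    reg_syntax_t ext = RE_SYNTAX_POSIX_EXTENDED;

    return matches_whole("a{2}", "aa", ext) == 1        /* exactly two */
        && matches_whole("a{2}", "aaa", ext) == 0
        && matches_whole("a{2,}", "aaaa", ext) == 1     /* two or more */
        && matches_whole("a{2,}", "a", ext) == 0
        && matches_whole("a{2,3}", "aaa", ext) == 1     /* two to three */
        && matches_whole("a{2,3}", "aaaa", ext) == 0
        && matches_whole("a\\{2,3\\}", "aa", RE_SYNTAX_POSIX_BASIC) == 1;
}
```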

The interval expression (but not necessarily the regular expression that
contains it) is invalid if:

@itemize @bullet
@item
@var{min} is greater than @var{max}, or

@item
any of @var{count}, @var{min}, or @var{max} is outside the range
zero to @code{RE_DUP_MAX} (a symbol that @file{regex.h}
defines).

@end itemize

If the interval expression is invalid and the syntax bit
@code{RE_NO_BK_BRACES} is set, then Regex considers all the
characters in the would-be interval to be ordinary. If that bit
isn't set, then the regular expression is invalid.

If the interval expression is valid but there is no preceding regular
expression on which to operate, then if the syntax bit
@code{RE_CONTEXT_INVALID_OPS} is set, the regular expression is invalid.
If that bit isn't set, then Regex considers all the characters in the
would-be interval (other than backslashes, which it ignores) to be
ordinary.


@node Alternation Operator
@section The Alternation Operator (@code{|} or @code{\|})

@kindex |
@kindex \|
@cindex alternation operator
@cindex or operator

If the syntax bit @code{RE_LIMITED_OPS} is set, then Regex doesn't
recognize this operator. Otherwise, if the syntax bit
@code{RE_NO_BK_VBAR} is set, then @samp{|} represents this operator;
otherwise, @samp{\|} does.

Alternatives match one of a choice of regular expressions:
if you put the character(s) representing the alternation operator between
any two regular expressions @var{a} and @var{b}, the result matches
the union of the strings that @var{a} and @var{b} match. For
example, supposing that @samp{|} is the alternation operator, then
@samp{foo|bar|quux} would match any of @samp{foo}, @samp{bar} or
@samp{quux}.
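
A brief C sketch of alternation under both spellings follows. It is an
illustration only, assuming the GNU C Library's @file{regex.h} with
@code{_GNU_SOURCE} defined; @code{matches_whole} and
@code{alternation_demo} are hypothetical names.
@code{RE_SYNTAX_POSIX_EXTENDED} sets @code{RE_NO_BK_VBAR}, and
@code{RE_SYNTAX_POSIX_BASIC} does not.

```c
#define _GNU_SOURCE   /* assumption: glibc, which guards the GNU API with this */
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Hypothetical helper: 1 if PATTERN matches all of TEXT under SYNTAX,
   0 if it does not, -1 if PATTERN does not compile. */
static int matches_whole(const char *pattern, const char *text,
                         reg_syntax_t syntax)
{
    struct re_pattern_buffer buf;
    int length, matched;

    assert(pattern != NULL && text != NULL);
    memset(&buf, 0, sizeof buf);
    re_set_syntax(syntax);
    if (re_compile_pattern(pattern, strlen(pattern), &buf) != NULL)
        return -1;
    length = (int) strlen(text);
    matched = re_match(&buf, text, length, 0, NULL);
    regfree(&buf);
    return matched == length;
}

/* Returns 1 when '|' and '\|' behave as the RE_NO_BK_VBAR bit predicts. */
static int alternation_demo(void)
{
    reg_syntax_t ext = RE_SYNTAX_POSIX_EXTENDED;
    reg_syntax_t basic = RE_SYNTAX_POSIX_BASIC;

    return matches_whole("foo|bar|quux", "foo", ext) == 1
        && matches_whole("foo|bar|quux", "bar", ext) == 1
        && matches_whole("foo|bar|quux", "quux", ext) == 1
        && matches_whole("foo|bar|quux", "baz", ext) == 0     /* not in the union */
        && matches_whole("foo\\|bar", "bar", basic) == 1      /* \| is the operator */
        && matches_whole("foo|bar", "foo|bar", basic) == 1;   /* bare | is ordinary */
}
```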

@ignore
@c Nobody needs to disallow empty alternatives any more.
If the syntax bit @code{RE_NO_EMPTY_ALTS} is set, then if either of the regular
expressions @var{a} or @var{b} is empty, the
regular expression is invalid. More precisely, if this syntax bit is
set, then the alternation operator can't:

@itemize @bullet
@item
be first or last in a regular expression;

@item
follow either another alternation operator or an open-group operator
(@pxref{Grouping Operators}); or

@item
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
732 precede a close-group operator.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
733
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
734 @end itemize
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
735
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
736 @noindent
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
737 For example, supposing @samp{(} and @samp{)} represent the open and
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
738 close-group operators, then @samp{|foo}, @samp{foo|}, @samp{foo||bar},
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
739 @samp{foo(|bar)}, and @samp{(foo|)bar} would all be invalid.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
740 @end ignore
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
741
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
742 The alternation operator operates on the @emph{largest} possible
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
743 surrounding regular expressions. (Put another way, it has the lowest
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
744 precedence of any regular expression operator.)
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
745 Thus, the only way you can
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
746 delimit its arguments is to use grouping. For example, if @samp{(} and
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
747 @samp{)} are the open and close-group operators, then @samp{fo(o|b)ar}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
748 would match either @samp{fooar} or @samp{fobar}. (@samp{foo|bar} would
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
749 match @samp{foo} or @samp{bar}.)
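
The precedence rule can be tried out with a minimal sketch, assuming the POSIX interface (@code{regcomp} and @code{regexec} with @code{REG_EXTENDED}), where @samp{|} is the alternation operator and @samp{(} and @samp{)} are the grouping operators. The helper @code{matches_whole} is a hypothetical name used only for illustration; it is not part of Regex.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: return 1 if the POSIX extended regexp PATTERN
   matches all of TEXT, 0 if it does not, and -1 if PATTERN fails to
   compile.  */
static int
matches_whole (const char *pattern, const char *text)
{
  regex_t re;
  regmatch_t m;
  int result;

  assert (pattern != NULL && text != NULL);
  if (regcomp (&re, pattern, REG_EXTENDED) != 0)
    return -1;
  /* A whole-string match starts at offset 0 and ends at the
     terminating null byte.  */
  result = (regexec (&re, text, 1, &m, 0) == 0
            && m.rm_so == 0
            && text[m.rm_eo] == '\0');
  regfree (&re);
  return result;
}
```

With this helper, @samp{fo(o|b)ar} should accept @samp{fooar} and @samp{fobar} but not @samp{foo}, whereas @samp{foo|bar} should accept @samp{foo} and @samp{bar} but not @samp{fooar}.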
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
750
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
751 @cindex backtracking
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
752 The matcher usually tries all combinations of alternatives so as to
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
753 match the longest possible string. For example, when matching
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
754 @samp{(fooq|foo)*(qbarquux|bar)} against @samp{fooqbarquux}, it cannot
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
755 take, say, the first (``depth-first'') combination it could match, since
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
756 then it would be content to match just @samp{fooqbar}.
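
The example above can be checked with a short sketch, again assuming the POSIX interface with @code{REG_EXTENDED} rather than the GNU-specific functions. The helper @code{match_length} is a hypothetical name chosen for this illustration.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: return the length of the leftmost match of the
   POSIX extended regexp PATTERN in TEXT, or -1 if there is no match
   or PATTERN fails to compile.  */
static long
match_length (const char *pattern, const char *text)
{
  regex_t re;
  regmatch_t m;
  long length = -1;

  assert (pattern != NULL && text != NULL);
  if (regcomp (&re, pattern, REG_EXTENDED) != 0)
    return -1;
  /* Asking for one match register makes regexec report the offsets
     of the overall match.  */
  if (regexec (&re, text, 1, &m, 0) == 0)
    length = (long) (m.rm_eo - m.rm_so);
  regfree (&re);
  return length;
}
```

A matcher that follows the leftmost-longest rule should report a length of 11 for @samp{(fooq|foo)*(qbarquux|bar)} against @samp{fooqbarquux}, that is, the whole string, and not the 7 characters of @samp{fooqbar}.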
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
757
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
758 @comment xx something about leftmost-longest
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
759
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
760
13533
ca70a11e70e2 Integrate the regex documentation.
Bruno Haible <bruno@clisp.org>
parents: 13532
diff changeset
761 @node List Operators
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
762 @section List Operators (@code{[} @dots{} @code{]} and @code{[^} @dots{} @code{]})
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
763
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
764 @cindex matching list
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
765 @cindex @samp{[}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
766 @cindex @samp{]}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
767 @cindex @samp{^}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
768 @cindex @samp{-}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
769 @cindex @samp{\}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
770 @cindex @samp{[^}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
771 @cindex nonmatching list
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
772 @cindex matching newline
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
773 @cindex bracket expression
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
774
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
775 A @dfn{list}, also called a @dfn{bracket expression}, is a set of one or
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
776 more items. An @dfn{item} is a character,
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
777 @ignore
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
778 (These get added when they get implemented.)
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
779 a collating symbol, an equivalence class expression,
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
780 @end ignore
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
781 a character class expression, or a range expression. The syntax bits
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
782 affect which kinds of items you can put in a list. We explain the last
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
783 two items in subsections below. Empty lists are invalid.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
784
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
785 A @dfn{matching list} matches a single character represented by one of
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
786 the list items. You form a matching list by enclosing one or more items
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
787 within an @dfn{open-matching-list operator} (represented by @samp{[})
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
788 and a @dfn{close-list operator} (represented by @samp{]}).
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
789
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
790 For example, @samp{[ab]} matches either @samp{a} or @samp{b}.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
791 @samp{[ad]*} matches the empty string and any string composed of just
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
792 @samp{a}s and @samp{d}s in any order. Regex considers invalid a regular
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
793 expression with a @samp{[} but no matching
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
794 @samp{]}.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
795
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
796 @dfn{Nonmatching lists} are similar to matching lists except that they
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
797 match a single character @emph{not} represented by one of the list
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
798 items. You use an @dfn{open-nonmatching-list operator} (represented by
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
799 @samp{[^}@footnote{Regex therefore doesn't consider the @samp{^} to be
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
800 the first character in the list. If you put a @samp{^} character first
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
801 in (what you think is) a matching list, you'll turn it into a
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
802 nonmatching list.}) instead of an open-matching-list operator to start a
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
803 nonmatching list.
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
804
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
805 For example, @samp{[^ab]} matches any character except @samp{a} or
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
806 @samp{b}.
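
Matching and nonmatching lists can be exercised with a small sketch, assuming the POSIX interface (@code{regcomp} and @code{regexec}). Both helper names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: return 1 if PATTERN compiles as a POSIX
   extended regexp, 0 otherwise.  */
static int
pattern_is_valid (const char *pattern)
{
  regex_t re;

  assert (pattern != NULL);
  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return 0;
  regfree (&re);
  return 1;
}

/* Hypothetical helper: return 1 if PATTERN matches all of TEXT, 0 if
   it does not, and -1 if PATTERN fails to compile.  */
static int
matches_whole (const char *pattern, const char *text)
{
  regex_t re;
  regmatch_t m;
  int result;

  assert (pattern != NULL && text != NULL);
  if (regcomp (&re, pattern, REG_EXTENDED) != 0)
    return -1;
  result = (regexec (&re, text, 1, &m, 0) == 0
            && m.rm_so == 0
            && text[m.rm_eo] == '\0');
  regfree (&re);
  return result;
}
```

Here @samp{[ad]*} should accept both the empty string and @samp{adda}, @samp{[^ab]} should accept @samp{c} but reject @samp{a}, and @samp{[ab} (no closing @samp{]}) should fail to compile.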
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
807
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
808 If the @code{posix_newline} field in the pattern buffer (@pxref{GNU
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
809 Pattern Buffers}) is set, then nonmatching lists do not match a newline.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
810
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
811 Most characters lose any special meaning inside a list. The special
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
812 characters inside a list follow.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
813
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
814 @table @samp
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
815 @item ]
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
816 ends the list if it's not the first list item. So, if you want to make
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
817 the @samp{]} character a list item, you must put it first.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
818
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
819 @item \
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
820 quotes the next character if the syntax bit @code{RE_BACKSLASH_ESCAPE_IN_LISTS} is
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
821 set.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
822
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
823 @ignore
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
824 Put these in if they get implemented.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
825
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
826 @item [.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
827 represents the open-collating-symbol operator (@pxref{Collating Symbol
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
828 Operators}).
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
829
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
830 @item .]
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
831 represents the close-collating-symbol operator.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
832
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
833 @item [=
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
834 represents the open-equivalence-class operator (@pxref{Equivalence Class
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
835 Operators}).
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
836
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
837 @item =]
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
838 represents the close-equivalence-class operator.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
839
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
840 @end ignore
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
841
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
842 @item [:
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
843 represents the open-character-class operator (@pxref{Character Class
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
844 Operators}) if the syntax bit @code{RE_CHAR_CLASSES} is set and what
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
845 follows is a valid character class expression.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
846
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
847 @item :]
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
848 represents the close-character-class operator if the syntax bit
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
849 @code{RE_CHAR_CLASSES} is set and what precedes it is an
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
850 open-character-class operator followed by a valid character class name.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
851
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
852 @item -
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
853 represents the range operator (@pxref{Range Operator}) if it's
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
854 not first or last in a list or the ending point of a range.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
855
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
856 @end table
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
857
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
858 @noindent
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
859 All other characters are ordinary. For example, @samp{[.*]} matches
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
860 @samp{.} and @samp{*}.
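
The placement rules for @samp{]} and @samp{-} can be checked with a minimal sketch, assuming the POSIX interface with @code{REG_EXTENDED}. The helper @code{matches_whole} is a hypothetical name, not a Regex function.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: return 1 if the POSIX extended regexp PATTERN
   matches all of TEXT, 0 if it does not, and -1 if PATTERN fails to
   compile.  */
static int
matches_whole (const char *pattern, const char *text)
{
  regex_t re;
  regmatch_t m;
  int result;

  assert (pattern != NULL && text != NULL);
  if (regcomp (&re, pattern, REG_EXTENDED) != 0)
    return -1;
  result = (regexec (&re, text, 1, &m, 0) == 0
            && m.rm_so == 0
            && text[m.rm_eo] == '\0');
  regfree (&re);
  return result;
}

/* Patterns worth trying with the helper:
     "[]a]"   ']' comes first, so it is a list item
     "[.*]"   '.' and '*' are ordinary inside a list
     "[a-]"   '-' comes last, so it is a list item
     "[a-c]"  '-' is the range operator here  */
```

For instance, @samp{[]a]} should accept @samp{]}, @samp{[a-]} should accept @samp{-}, and @samp{[a-c]} should accept @samp{b} but reject @samp{-}.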
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
861
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
862 @menu
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
863 * Character Class Operators:: [:class:]
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
864 * Range Operator:: start-end
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
865 @end menu
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
866
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
867 @ignore
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
868 (If collating symbols and equivalence class expressions get implemented,
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
869 then add this.)
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
870
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
871 node Collating Symbol Operators
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
872 subsubsection Collating Symbol Operators (@code{[.} @dots{} @code{.]})
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
873
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
874 If the syntax bit @code{XX} is set, then you can represent
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
875 collating symbols inside lists. You form a @dfn{collating symbol} by
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
876 putting a collating element between an @dfn{open-collating-symbol
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
877 operator} and a @dfn{close-collating-symbol operator}. @samp{[.}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
878 represents the open-collating-symbol operator and @samp{.]} represents
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
879 the close-collating-symbol operator. For example, if @samp{ll} is a
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
880 collating element, then @samp{[[.ll.]]} would match @samp{ll}.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
881
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
882 node Equivalence Class Operators
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
883 subsubsection Equivalence Class Operators (@code{[=} @dots{} @code{=]})
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
884 @cindex equivalence class expression in regex
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
885 @cindex @samp{[=} in regex
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
886 @cindex @samp{=]} in regex
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
887
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
888 If the syntax bit @code{XX} is set, then Regex recognizes equivalence class
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
889 expressions inside lists. An @dfn{equivalence class expression} is a set
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
890 of collating elements which all belong to the same equivalence class.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
891 You form an equivalence class expression by putting a collating
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
892 element between an @dfn{open-equivalence-class operator} and a
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
893 @dfn{close-equivalence-class operator}. @samp{[=} represents the
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
894 open-equivalence-class operator and @samp{=]} represents the
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
895 close-equivalence-class operator. For example, if @samp{a} and @samp{A}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
896 were an equivalence class, then both @samp{[[=a=]]} and @samp{[[=A=]]}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
897 would match both @samp{a} and @samp{A}. If the collating element in an
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
898 equivalence class expression isn't part of an equivalence class, then
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
899 the matcher considers the equivalence class expression to be a collating
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
900 symbol.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
901
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
902 @end ignore
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
903
13533
ca70a11e70e2 Integrate the regex documentation.
Bruno Haible <bruno@clisp.org>
parents: 13532
diff changeset
904 @node Character Class Operators
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
905 @subsection Character Class Operators (@code{[:} @dots{} @code{:]})
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
906
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
907 @cindex character classes
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
908 @cindex @samp{[:} in regex
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
909 @cindex @samp{:]} in regex
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
910
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
911 If the syntax bit @code{RE_CHAR_CLASSES} is set, then Regex
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
912 recognizes character class expressions inside lists. A @dfn{character
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
913 class expression} matches one character from a given class. You form a
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
914 character class expression by putting a character class name between an
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
915 @dfn{open-character-class operator} (represented by @samp{[:}) and a
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
916 @dfn{close-character-class operator} (represented by @samp{:]}). The
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
917 character class names and their meanings are:
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
918
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
919 @table @code
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
920
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
921 @item alnum
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
922 letters and digits
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
923
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
924 @item alpha
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
925 letters
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
926
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
927 @item blank
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
928 system-dependent; for @sc{gnu}, a space or tab
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
929
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
930 @item cntrl
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
931 control characters (in the @sc{ascii} encoding, code 0177 and codes
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
932 less than 040)
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
933
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
934 @item digit
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
935 digits
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
936
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
937 @item graph
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
938 same as @code{print}, except that it omits space
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
939
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
940 @item lower
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
941 lowercase letters
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
942
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
943 @item print
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
944 printable characters (in the @sc{ascii} encoding, space
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
945 through tilde, that is, codes 040 through 0176)
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
946
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
947 @item punct
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
948 printable characters that are neither space nor alphanumeric
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
949
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
950 @item space
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
951 space, tab, carriage return, newline, vertical tab, and form feed
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
952
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
953 @item upper
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
954 uppercase letters
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
955
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
956 @item xdigit
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
957 hexadecimal digits: @code{0}--@code{9}, @code{a}--@code{f}, @code{A}--@code{F}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
958
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
959 @end table
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
960
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
961 @noindent
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
962 These correspond to the definitions in the C library's @file{<ctype.h>}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
963 facility. For example, @samp{[:alpha:]} corresponds to the standard
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
964 facility @code{isalpha}. Regex recognizes character class expressions
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
965 only inside of lists; so @samp{[[:alpha:]]} matches any letter, but
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
966 @samp{[:alpha:]} on its own is parsed as an ordinary list that matches any
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
967 one of the characters @samp{:}, @samp{a}, @samp{l}, @samp{p}, and @samp{h}.
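
As a rough illustration of character class expressions inside lists, the following sketch uses the POSIX @code{regcomp} and @code{regexec} interface rather than the GNU-specific functions. The helper name @code{contains_match} is hypothetical and not part of Regex, and the sketch assumes the default C locale.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Return 1 if `text` contains a match for the POSIX extended regular
   expression `pattern`, 0 if it does not, and -1 if the pattern fails
   to compile.  (Hypothetical helper, for illustration only.)  */
int
contains_match (const char *pattern, const char *text)
{
  regex_t re;
  int rc;

  assert (pattern != NULL && text != NULL);
  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;
  rc = regexec (&re, text, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}
```

For instance, @code{contains_match ("^[[:alpha:]]+$", "Regex")} should report a match, while the same pattern applied to @code{"r2d2"} should not, because the digits are not in the @code{alpha} class.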
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
968
13533
ca70a11e70e2 Integrate the regex documentation.
Bruno Haible <bruno@clisp.org>
parents: 13532
diff changeset
969 @node Range Operator
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
970 @subsection The Range Operator (@code{-})
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
971
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
972 Regex recognizes @dfn{range expressions} inside a list. They represent
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
973 those characters
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
974 that fall between two elements in the current collating sequence. You
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
975 form a range expression by putting a @dfn{range operator} between two
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
976 @ignore
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
977 (If these get implemented, then substitute this for ``characters.'')
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
978 of any of the following: characters, collating elements, collating symbols,
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
979 and equivalence class expressions. The starting point of the range and
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
980 the ending point of the range don't have to be the same kind of item,
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
981 e.g., the starting point could be a collating element and the ending
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
982 point could be an equivalence class expression. If a range's ending
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
983 point is an equivalence class, then all the collating elements in that
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
984 class will be in the range.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
985 @end ignore
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
986 characters.@footnote{You can't use a character class for the starting
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
987 or ending point of a range, since a character class is not a single
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
988 character.} @samp{-} represents the range operator. For example,
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
989 @samp{a-f} within a list represents all the characters from @samp{a}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
990 through @samp{f}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
991 inclusively.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
992
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
993 If the syntax bit @code{RE_NO_EMPTY_RANGES} is set, then if the range's
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
994 ending point collates less than its starting point, the range (and the
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
995 regular expression containing it) is invalid. For example, the regular
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
996 expression @samp{[z-a]} would be invalid. If this bit isn't set, then
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
997 Regex considers such a range to be empty.
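
The effect of rejecting empty ranges can be observed through the POSIX interface. This is a minimal sketch which assumes that @code{regcomp} uses a syntax in which reversed ranges are invalid, as the POSIX syntaxes in the GNU implementation are believed to do. The helper name @code{compile_status} is hypothetical.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Compile `pattern` as a POSIX extended regular expression and return
   regcomp's status code: 0 on success, a REG_* error code otherwise.
   (Hypothetical helper, for illustration only.)  */
int
compile_status (const char *pattern)
{
  regex_t re;
  int status;

  assert (pattern != NULL);
  status = regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB);
  if (status == 0)
    regfree (&re);   /* Only a successfully compiled pattern needs freeing.  */
  return status;
}
```

Under that assumption, @code{compile_status ("[a-z]")} returns zero, while @code{compile_status ("[z-a]")} returns a nonzero error code (typically @code{REG_ERANGE}).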
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
998
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
999 Since @samp{-} represents the range operator, if you want to make a
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1000 @samp{-} character itself
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1001 a list item, you must do one of the following:
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1002
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1003 @itemize @bullet
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1004 @item
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1005 Put the @samp{-} either first or last in the list.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1006
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1007 @item
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1008 Include a range whose starting point collates strictly lower than
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1009 @samp{-} and whose ending point collates equal or higher. Unless a
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1010 range is the first item in a list, a @samp{-} can't be its starting
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1011 point, but @emph{can} be its ending point. That is because Regex
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1012 considers @samp{-} to be the range operator unless it is preceded by
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1013 another @samp{-}. For example, in the @sc{ascii} encoding, @samp{)},
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1014 @samp{*}, @samp{+}, @samp{,}, @samp{-}, @samp{.}, and @samp{/} are
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1015 contiguous characters in the collating sequence. You might think that
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1016 @samp{[)-+--/]} has two ranges: @samp{)-+} and @samp{--/}. Rather, it
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1017 has the ranges @samp{)-+} and @samp{+--}, plus the character @samp{/}, so
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1018 it matches, e.g., @samp{,}, not @samp{.}.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1019
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1020 @item
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1021 Put a range whose starting point is @samp{-} first in the list.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1022
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1023 @end itemize
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1024
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1025 For example, @samp{[-a-z]} matches a lowercase letter or a hyphen (in
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1026 English, in @sc{ascii}).
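
The placements of @samp{-} listed above can be tried out with a small sketch built on the POSIX @code{regcomp} interface. The helper @code{char_in_list} is hypothetical; it wraps the given list items in @samp{^[} and @samp{]$} and tests a single character against the result. The C locale and the @sc{ascii} encoding are assumed.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <stdio.h>

/* Return 1 if the single character `c` matches the list [items],
   0 if it does not, and -1 if the list is too long or does not
   compile.  (Hypothetical helper, for illustration only.)  */
int
char_in_list (const char *items, char c)
{
  char pattern[64];
  char text[2] = { c, '\0' };
  regex_t re;
  int n, rc;

  assert (items != NULL);
  /* Build "^[items]$" so that exactly one character must match.  */
  n = snprintf (pattern, sizeof pattern, "^[%s]$", items);
  if (n < 0 || (size_t) n >= sizeof pattern)
    return -1;
  if (regcomp (&re, pattern, REG_NOSUB) != 0)
    return -1;
  rc = regexec (&re, text, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}
```

With this helper, @code{char_in_list ("-a-z", '-')} and @code{char_in_list ("a-z-", '-')} both accept the hyphen because it is first or last in the list. The list @code{"%--"} is a range ending in @samp{-}, and @code{"--/"} is a range starting with @samp{-} placed first in the list.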
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1027
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1028
13533
ca70a11e70e2 Integrate the regex documentation.
Bruno Haible <bruno@clisp.org>
parents: 13532
diff changeset
1029 @node Grouping Operators
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1030 @section Grouping Operators (@code{(} @dots{} @code{)} or @code{\(} @dots{} @code{\)})
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1031
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1032 @kindex (
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1033 @kindex )
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1034 @kindex \(
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1035 @kindex \)
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1036 @cindex grouping
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1037 @cindex subexpressions
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1038 @cindex parenthesizing
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1039
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1040 A @dfn{group}, also known as a @dfn{subexpression}, consists of an
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1041 @dfn{open-group operator}, any number of other operators, and a
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1042 @dfn{close-group operator}. Regex treats this sequence as a unit, just
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1043 as mathematics and programming languages treat a parenthesized
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1044 expression as a unit.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1045
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1046 Therefore, using @dfn{groups}, you can:
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1047
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1048 @itemize @bullet
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1049 @item
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1050 delimit the argument(s) to an alternation operator (@pxref{Alternation
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1051 Operator}) or a repetition operator (@pxref{Repetition
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1052 Operators}).
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1053
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
1054 @item
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1055 keep track of the indices of the substring that matched a given group.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1056 @xref{Using Registers}, for a precise explanation.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1057 This lets you:
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1058
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1059 @itemize @bullet
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1060 @item
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1061 use the back-reference operator (@pxref{Back-reference Operator}).
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1062
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
1063 @item
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1064 use registers (@pxref{Using Registers}).
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1065
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1066 @end itemize
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1067
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1068 @end itemize
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1069
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1070 If the syntax bit @code{RE_NO_BK_PARENS} is set, then @samp{(} represents
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1071 the open-group operator and @samp{)} represents the
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1072 close-group operator; otherwise, @samp{\(} and @samp{\)} do.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1073
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1074 If the syntax bit @code{RE_UNMATCHED_RIGHT_PAREN_ORD} is set and a
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1075 close-group operator has no matching open-group operator, then Regex
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1076 considers it to match @samp{)}.
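
The two spellings of the grouping operators, and the way a group records the indices of the substring it matched, can be sketched with the POSIX interface. Here POSIX basic syntax stands in for the case where @code{RE_NO_BK_PARENS} is not set and POSIX extended syntax for the case where it is. The @code{regmatch_t} array plays a role similar to registers. The helper name @code{group_span} is hypothetical.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Match `text` against `pattern` and store the byte offsets of group
   number `group` (0 is the whole match) in *start and *end.
   `cflags` is passed to regcomp and must not include REG_NOSUB:
   0 selects basic syntax, where groups are written \( ... \);
   REG_EXTENDED selects extended syntax, where they are written ( ... ).
   Return 1 on success, 0 if there is no match or the group did not
   participate, and -1 on a compile error or bad group number.
   (Hypothetical helper, for illustration only.)  */
int
group_span (const char *pattern, int cflags, const char *text,
            size_t group, long *start, long *end)
{
  enum { MAX_GROUPS = 10 };
  regmatch_t m[MAX_GROUPS];
  regex_t re;
  int rc;

  assert (pattern != NULL && text != NULL && start != NULL && end != NULL);
  if (group >= MAX_GROUPS || regcomp (&re, pattern, cflags) != 0)
    return -1;
  rc = regexec (&re, text, MAX_GROUPS, m, 0);
  regfree (&re);
  if (rc != 0 || m[group].rm_so == -1)
    return 0;
  *start = (long) m[group].rm_so;
  *end = (long) m[group].rm_eo;
  return 1;
}
```

Matching @code{"key=42"} against the extended pattern @code{"([a-z]+)=([0-9]+)"} or the basic pattern @code{"\\([a-z]*\\)=\\([0-9]*\\)"} yields the same group offsets: group 1 spans bytes 0 to 3 and group 2 spans bytes 4 to 6.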
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1077
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1078
13533
ca70a11e70e2 Integrate the regex documentation.
Bruno Haible <bruno@clisp.org>
parents: 13532
diff changeset
1079 @node Back-reference Operator
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1080 @section The Back-reference Operator (@code{\}@var{digit})
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1081
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1082 @cindex back references
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1083
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1084 If the syntax bit @code{RE_NO_BK_REFS} isn't set, then Regex recognizes
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1085 back references. A back reference matches a specified preceding group.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1086 The back reference operator is represented by @samp{\@var{digit}}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1087 anywhere after the end of a regular expression's @w{@var{digit}-th}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1088 group (@pxref{Grouping Operators}).
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1089
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1090 @var{digit} must be between @samp{1} and @samp{9}. The matcher assigns
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1091 numbers 1 through 9 to the first nine groups it encounters. By using
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1092 one of @samp{\1} through @samp{\9} after the corresponding group's
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1093 close-group operator, you can match a substring identical to the
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1094 one that the group does.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1095
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1096 Back references match according to the following (in all examples below,
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1097 @samp{(} represents the open-group, @samp{)} the close-group, @samp{@{}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1098 the open-interval and @samp{@}} the close-interval operator):
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1099
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1100 @itemize @bullet
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1101 @item
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1102 If the group matches a substring, the back reference matches an
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1103 identical substring. For example, @samp{(a)\1} matches @samp{aa} and
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1104 @samp{(bana)na\1bo\1} matches @samp{bananabanabobana}. Likewise,
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1105 @samp{(.*)\1} matches any (newline-free if the syntax bit
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1106 @code{RE_DOT_NEWLINE} isn't set) string that is composed of two
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1107 identical halves; the @samp{(.*)} matches the first half and the
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1108 @samp{\1} matches the second half.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1109
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1110 @item
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1111 If the group matches more than once (as it might if followed
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1112 by, e.g., a repetition operator), then the back reference matches the
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1113 substring the group @emph{last} matched. For example,
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1114 @samp{((a*)b)*\1\2} matches @samp{aabababa}; first @w{group 1} (the
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1115 outer one) matches @samp{aab} and @w{group 2} (the inner one) matches
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1116 @samp{aa}. Then @w{group 1} matches @samp{ab} and @w{group 2} matches
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1117 @samp{a}. So, @samp{\1} matches @samp{ab} and @samp{\2} matches
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1118 @samp{a}.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1119
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1120 @item
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1121 If the group doesn't participate in a match, i.e., it is part of an
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1122 alternative not taken or a repetition operator allows zero repetitions
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1123 of it, then the back reference makes the whole match fail. For example,
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1124 @samp{(one()|two())-and-(three\2|four\3)} matches @samp{one-and-three}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1125 and @samp{two-and-four}, but not @samp{one-and-four} or
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1126 @samp{two-and-three}. To illustrate, if the pattern matches
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1127 @samp{one-and-}, then its @w{group 2} matches the empty string and its
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1128 @w{group 3} doesn't participate in the match. So, if it then matches
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1129 @samp{four}, then when it tries to back reference @w{group 3}---which it
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1130 will attempt to do because @samp{\3} follows the @samp{four}---the match
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1131 will fail because @w{group 3} didn't participate in the match.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1132
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1133 @end itemize
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1134
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1135 You can use a back reference as an argument to a repetition operator. For
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1136 example, @samp{(a(b))\2*} matches @samp{a} followed by one or more
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1137 @samp{b}s. Similarly, @samp{(a(b))\2@{3@}} matches @samp{abbbb}.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1138
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1139 If there is no preceding @w{@var{digit}-th} subexpression, the regular
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1140 expression is invalid.
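
Some of the back-reference examples above can be reproduced with POSIX basic syntax, in which groups are written @samp{\(} @dots{} @samp{\)} and back references are available. This is a sketch only: the helper name @code{bre_matches} is hypothetical, and the patterns are anchored with @samp{^} and @samp{$} so that the whole string must match.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Return 1 if `text` contains a match for the POSIX basic regular
   expression `pattern`, 0 if it does not, and -1 if the pattern is
   invalid (for example, a back reference to a group that does not
   precede it).  (Hypothetical helper, for illustration only.)  */
int
bre_matches (const char *pattern, const char *text)
{
  regex_t re;
  int rc;

  assert (pattern != NULL && text != NULL);
  if (regcomp (&re, pattern, REG_NOSUB) != 0)
    return -1;
  rc = regexec (&re, text, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}
```

For example, @code{bre_matches ("^\\(.*\\)\\1$", "abcabc")} should succeed because the string consists of two identical halves, whereas @code{"abcabd"} should be rejected. A pattern such as @code{"\\(a\\)\\2"}, which refers to a nonexistent second group, should fail to compile.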
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1141
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1142
13533
ca70a11e70e2 Integrate the regex documentation.
Bruno Haible <bruno@clisp.org>
parents: 13532
diff changeset
1143 @node Anchoring Operators
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
1144 @section Anchoring Operators
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1145
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1146 @cindex anchoring
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1147 @cindex regexp anchoring
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1148
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1149 These operators can constrain a pattern to match only at the beginning or
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1150 end of the entire string or at the beginning or end of a line.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1151
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1152 @menu
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1153 * Match-beginning-of-line Operator:: ^
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1154 * Match-end-of-line Operator:: $
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1155 @end menu
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1156
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1157
13533
ca70a11e70e2 Integrate the regex documentation.
Bruno Haible <bruno@clisp.org>
parents: 13532
diff changeset
1158 @node Match-beginning-of-line Operator
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1159 @subsection The Match-beginning-of-line Operator (@code{^})
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1160
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1161 @kindex ^
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1162 @cindex beginning-of-line operator
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1163 @cindex anchors
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1164
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1165 This operator can match the empty string either at the beginning of the
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1166 string or after a newline character. Thus, it is said to @dfn{anchor}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1167 the pattern to the beginning of a line.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1168
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1169 In the following cases, @samp{^} represents this operator. (Otherwise,
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1170 @samp{^} is ordinary.)
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1171
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1172 @itemize @bullet
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1173
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1174 @item
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1175 It (the @samp{^}) is first in the pattern, as in @samp{^foo}.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1176
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1177 @cnindex RE_CONTEXT_INDEP_ANCHORS @r{(and @samp{^})}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1178 @item
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1179 The syntax bit @code{RE_CONTEXT_INDEP_ANCHORS} is set, and it is outside
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1180 a bracket expression.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1181
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1182 @cindex open-group operator and @samp{^}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1183 @cindex alternation operator and @samp{^}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1184 @item
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1185 It follows an open-group or alternation operator, as in @samp{a\(^b\)}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1186 and @samp{a\|^b}. @xref{Grouping Operators}, and @ref{Alternation
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1187 Operator}.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1188
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1189 @end itemize
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1190
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1191 These rules imply that some valid patterns containing @samp{^} cannot be
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1192 matched; for example, @samp{foo^bar} if @code{RE_CONTEXT_INDEP_ANCHORS}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1193 is set.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1194
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1195 @vindex not_bol @r{field in pattern buffer}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1196 If the @code{not_bol} field is set in the pattern buffer (@pxref{GNU
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1197 Pattern Buffers}), then @samp{^} fails to match at the beginning of the
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1198 string. @xref{POSIX Matching}, for when you might find this useful.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1199
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1200 @vindex newline_anchor @r{field in pattern buffer}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1201 If the @code{newline_anchor} field is not set in the pattern buffer, then
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1202 @samp{^} fails to match after a newline. This is useful when you do not
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1203 regard the string to be matched as broken into lines.
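
The behaviour of @samp{^} at the start of the string and after a newline can be observed through the @sc{posix} interface. The following is a minimal sketch, not part of the library: the helper name @code{anchor_matches} is hypothetical, and it relies only on the standard @code{regcomp} flag @code{REG_NEWLINE} (anchors also match at newlines) and the @code{regexec} flag @code{REG_NOTBOL} (@samp{^} does not match at the start of the string).

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: return 1 if PATTERN matches somewhere in TEXT,
   0 if it does not, and -1 if PATTERN fails to compile.
   CFLAGS is passed to regcomp, EFLAGS to regexec.  */
int
anchor_matches (const char *pattern, const char *text, int cflags, int eflags)
{
  regex_t re;
  int found;

  /* REG_NOSUB: only a yes/no answer is wanted, not match offsets.  */
  if (regcomp (&re, pattern, cflags | REG_NOSUB) != 0)
    return -1;
  found = regexec (&re, text, 0, NULL, eflags) == 0;
  regfree (&re);
  return found;
}
```

With this helper, @code{anchor_matches ("^bar", "foo\nbar", 0, 0)} returns 0 and the same call with @code{REG_NEWLINE} as the third argument returns 1; @code{anchor_matches ("^foo", "foo", 0, REG_NOTBOL)} returns 0.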
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1204
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1205
13533
ca70a11e70e2 Integrate the regex documentation.
Bruno Haible <bruno@clisp.org>
parents: 13532
diff changeset
1206 @node Match-end-of-line Operator
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1207 @subsection The Match-end-of-line Operator (@code{$})
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1208
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1209 @kindex $
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1210 @cindex end-of-line operator
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1211 @cindex anchors
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1212
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1213 This operator can match the empty string either at the end of
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1214 the string or before a newline character in the string. Thus, it is
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1215 said to @dfn{anchor} the pattern to the end of a line.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1216
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1217 It is always represented by @samp{$}. For example, @samp{foo$} usually
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1218 matches @samp{foo} and the first three characters of
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1219 @samp{foo\nbar}.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1220
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1221 Its interaction with the syntax bits and pattern buffer fields is
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1222 exactly the dual of @samp{^}'s; see the previous section. (That is,
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1223 ``beginning'' becomes ``end'', ``next'' becomes ``previous'', and
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1224 ``after'' becomes ``before''.)
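
As an illustration of the @samp{foo$} example, here is a small sketch using the @sc{posix} interface. The helper name @code{match_span} is an assumption made for this example; @code{REG_NEWLINE} asks for anchors to match at newlines, and @code{REG_NOTEOL} tells @code{regexec} that the end of the string is not the end of a line.

```c
#include <regex.h>

/* Hypothetical helper: search TEXT for PATTERN.  On success store the
   start and end offsets of the match in *START and *END and return 1;
   return 0 if there is no match and -1 if PATTERN fails to compile.  */
int
match_span (const char *pattern, const char *text, int cflags, int eflags,
            int *start, int *end)
{
  regex_t re;
  regmatch_t m[1];
  int rc;

  if (regcomp (&re, pattern, cflags) != 0)
    return -1;
  rc = regexec (&re, text, 1, m, eflags);
  regfree (&re);
  if (rc != 0)
    return 0;
  *start = (int) m[0].rm_so;
  *end = (int) m[0].rm_eo;
  return 1;
}
```

Called with @samp{foo$}, the text @samp{foo\nbar} and @code{REG_NEWLINE}, the helper reports the span from offset 0 to offset 3, i.e., the first three characters. Without @code{REG_NEWLINE} it reports no match.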
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1225
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1226
13533
ca70a11e70e2 Integrate the regex documentation.
Bruno Haible <bruno@clisp.org>
parents: 13532
diff changeset
1227 @node GNU Operators
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1228 @chapter GNU Operators
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1229
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1230 Following are operators that @sc{gnu} defines (and @sc{posix} doesn't).
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1231
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1232 @menu
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1233 * Word Operators::
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1234 * Buffer Operators::
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1235 @end menu
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1236
13533
ca70a11e70e2 Integrate the regex documentation.
Bruno Haible <bruno@clisp.org>
parents: 13532
diff changeset
1237 @node Word Operators
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1238 @section Word Operators
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1239
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1240 The operators in this section require Regex to recognize parts of words.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1241 Regex uses a syntax table to determine whether or not a character is
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1242 part of a word, i.e., whether or not it is @dfn{word-constituent}.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1243
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1244 @menu
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1245 * Non-Emacs Syntax Tables::
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1246 * Match-word-boundary Operator:: \b
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1247 * Match-within-word Operator:: \B
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1248 * Match-beginning-of-word Operator:: \<
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1249 * Match-end-of-word Operator:: \>
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1250 * Match-word-constituent Operator:: \w
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1251 * Match-non-word-constituent Operator:: \W
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1252 @end menu
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1253
13533
ca70a11e70e2 Integrate the regex documentation.
Bruno Haible <bruno@clisp.org>
parents: 13532
diff changeset
1254 @node Non-Emacs Syntax Tables
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
1255 @subsection Non-Emacs Syntax Tables
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1256
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1257 A @dfn{syntax table} is an array indexed by the characters in your
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1258 character set. With 8-bit characters, therefore, a syntax table
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1259 has 256 elements. Regex always uses a @code{char *} variable
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1260 @code{re_syntax_table} as its syntax table. In some cases, it
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1261 initializes this variable and in others it expects you to initialize it.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1262
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1263 @itemize @bullet
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1264 @item
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1265 If Regex is compiled with the preprocessor symbols @code{emacs} and
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1266 @code{SYNTAX_TABLE} both undefined, then Regex allocates
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1267 @code{re_syntax_table} and initializes an element @var{i} either to
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1268 @code{Sword} (which it defines) if @var{i} is a letter, digit, or
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1269 @samp{_}, or to zero if it's not.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1270
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1271 @item
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1272 If Regex is compiled with @code{emacs} undefined but @code{SYNTAX_TABLE}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1273 defined, then Regex expects you to define a @code{char *} variable
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1274 @code{re_syntax_table} to be a valid syntax table.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1275
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1276 @item
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1277 @xref{Emacs Syntax Tables}, for what happens when Regex is compiled with
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1278 the preprocessor symbol @code{emacs} defined.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1279
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1280 @end itemize
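
The default initialization described in the first item can be pictured with the following sketch. It is not the library's code: the names @code{init_word_table}, @code{is_word_constituent}, @code{TABLE_SIZE} and @code{SWORD} are hypothetical, and @code{isalnum} (in the C locale) is assumed as the test for letters and digits.

```c
#include <ctype.h>
#include <stddef.h>

#define TABLE_SIZE 256   /* one entry per unsigned char value */
#define SWORD 1          /* hypothetical stand-in for Regex's Sword */

/* Mark letters, digits and '_' as SWORD; everything else is zero.  */
void
init_word_table (char table[TABLE_SIZE])
{
  size_t i;

  for (i = 0; i < TABLE_SIZE; i++)
    table[i] = (isalnum ((int) i) || i == '_') ? SWORD : 0;
}

/* Nonzero if C is word-constituent according to TABLE.  */
int
is_word_constituent (const char *table, char c)
{
  /* Index through unsigned char so that negative chars stay in range.  */
  return table[(unsigned char) c] == SWORD;
}
```

A program that supplies its own table would fill the array differently, for instance by also marking @samp{-} as word-constituent.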
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1281
13533
ca70a11e70e2 Integrate the regex documentation.
Bruno Haible <bruno@clisp.org>
parents: 13532
diff changeset
1282 @node Match-word-boundary Operator
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1283 @subsection The Match-word-boundary Operator (@code{\b})
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1284
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1285 @cindex @samp{\b}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1286 @cindex word boundaries, matching
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1287
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1288 This operator (represented by @samp{\b}) matches the empty string at
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1289 either the beginning or the end of a word. For example, @samp{\brat\b}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1290 matches the separate word @samp{rat}.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1291
13533
ca70a11e70e2 Integrate the regex documentation.
Bruno Haible <bruno@clisp.org>
parents: 13532
diff changeset
1292 @node Match-within-word Operator
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1293 @subsection The Match-within-word Operator (@code{\B})
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1294
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1295 @cindex @samp{\B}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1296
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1297 This operator (represented by @samp{\B}) matches the empty string within
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1298 a word. For example, @samp{c\Brat\Be} matches @samp{crate}, but
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1299 @samp{dirty \Brat} doesn't match @samp{dirty rat}.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1300
13533
ca70a11e70e2 Integrate the regex documentation.
Bruno Haible <bruno@clisp.org>
parents: 13532
diff changeset
1301 @node Match-beginning-of-word Operator
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1302 @subsection The Match-beginning-of-word Operator (@code{\<})
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1303
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1304 @cindex @samp{\<}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1305
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1306 This operator (represented by @samp{\<}) matches the empty string at the
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1307 beginning of a word.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1308
13533
ca70a11e70e2 Integrate the regex documentation.
Bruno Haible <bruno@clisp.org>
parents: 13532
diff changeset
1309 @node Match-end-of-word Operator
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1310 @subsection The Match-end-of-word Operator (@code{\>})
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1311
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1312 @cindex @samp{\>}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1313
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1314 This operator (represented by @samp{\>}) matches the empty string at the
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1315 end of a word.
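
The positions at which @samp{\<}, @samp{\>} and @samp{\b} match can be described in terms of the characters on either side of the position. The sketch below illustrates that idea only; it does not use the library, all function names are hypothetical, and it assumes that word-constituent means a letter, a digit, or @samp{_}.

```c
#include <ctype.h>
#include <stddef.h>

/* Assumption: word-constituent means letter, digit or '_'.  */
static int
wordc (char c)
{
  return isalnum ((unsigned char) c) || c == '_';
}

/* Nonzero if offset POS in TEXT (of length LEN) is at the beginning
   of a word, as for \<.  */
int
at_word_start (const char *text, size_t len, size_t pos)
{
  int before = pos > 0 && wordc (text[pos - 1]);
  int after = pos < len && wordc (text[pos]);
  return !before && after;
}

/* Nonzero if offset POS is at the end of a word, as for \>.  */
int
at_word_end (const char *text, size_t len, size_t pos)
{
  int before = pos > 0 && wordc (text[pos - 1]);
  int after = pos < len && wordc (text[pos]);
  return before && !after;
}

/* Nonzero if offset POS is at either end of a word, as for \b.  */
int
at_word_boundary (const char *text, size_t len, size_t pos)
{
  return at_word_start (text, len, pos) || at_word_end (text, len, pos);
}
```

In the string @samp{a rat!}, offset 2 is the beginning of the word @samp{rat} and offset 5 is its end, while offset 3 lies inside the word and is neither.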
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1316
13533
ca70a11e70e2 Integrate the regex documentation.
Bruno Haible <bruno@clisp.org>
parents: 13532
diff changeset
1317 @node Match-word-constituent Operator
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1318 @subsection The Match-word-constituent Operator (@code{\w})
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1319
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1320 @cindex @samp{\w}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1321
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1322 This operator (represented by @samp{\w}) matches any word-constituent
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1323 character.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1324
13533
ca70a11e70e2 Integrate the regex documentation.
Bruno Haible <bruno@clisp.org>
parents: 13532
diff changeset
1325 @node Match-non-word-constituent Operator
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1326 @subsection The Match-non-word-constituent Operator (@code{\W})
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1327
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1328 @cindex @samp{\W}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1329
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1330 This operator (represented by @samp{\W}) matches any character that is
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1331 not word-constituent.
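
When the syntax table marks exactly the letters, the digits and @samp{_} as word-constituent, @samp{\w} accepts the same characters as the bracket expression @samp{[[:alnum:]_]} and @samp{\W} the same as @samp{[^[:alnum:]_]}. The following sketch relies on that assumption and uses only the portable bracket expression with the @sc{posix} functions; the helper name @code{first_word_span} is hypothetical.

```c
#include <regex.h>

/* Hypothetical helper: find the first run of word-constituent
   characters in TEXT.  Store its start and end offsets in *START and
   *END and return 1, or return 0 if TEXT contains no such run
   (or if the pattern cannot be compiled).  */
int
first_word_span (const char *text, int *start, int *end)
{
  regex_t re;
  regmatch_t m[1];
  int rc;

  /* Bracket-expression counterpart of one or more \w characters.  */
  if (regcomp (&re, "[[:alnum:]_]+", REG_EXTENDED) != 0)
    return 0;
  rc = regexec (&re, text, 1, m, 0);
  regfree (&re);
  if (rc != 0)
    return 0;
  *start = (int) m[0].rm_so;
  *end = (int) m[0].rm_eo;
  return 1;
}
```

For the text @samp{  hello_1 world} (two leading spaces) the reported span runs from offset 2 to offset 9, covering @samp{hello_1}.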
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1332
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1333
13533
ca70a11e70e2 Integrate the regex documentation.
Bruno Haible <bruno@clisp.org>
parents: 13532
diff changeset
1334 @node Buffer Operators
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
1335 @section Buffer Operators
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1336
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1337 Following are operators which work on buffers. In Emacs, a @dfn{buffer}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1338 is, naturally, an Emacs buffer. For other programs, Regex considers the
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1339 entire string to be matched as the buffer.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1340
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1341 @menu
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1342 * Match-beginning-of-buffer Operator:: \`
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1343 * Match-end-of-buffer Operator:: \'
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1344 @end menu
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1345
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1346
13533
ca70a11e70e2 Integrate the regex documentation.
Bruno Haible <bruno@clisp.org>
parents: 13532
diff changeset
1347 @node Match-beginning-of-buffer Operator
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1348 @subsection The Match-beginning-of-buffer Operator (@code{\`})
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1349
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1350 @cindex @samp{\`}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1351
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1352 This operator (represented by @samp{\`}) matches the empty string at the
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1353 beginning of the buffer.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1354
13533
ca70a11e70e2 Integrate the regex documentation.
Bruno Haible <bruno@clisp.org>
parents: 13532
diff changeset
1355 @node Match-end-of-buffer Operator
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1356 @subsection The Match-end-of-buffer Operator (@code{\'})
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1357
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1358 @cindex @samp{\'}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1359
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1360 This operator (represented by @samp{\'}) matches the empty string at the
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1361 end of the buffer.
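
The difference between the buffer operators and the line anchors shows up when the string contains a newline. The sketch below assumes the @sc{gnu} implementation of the @sc{posix} functions, which is expected to accept @samp{\`} and @samp{\'} in patterns given to @code{regcomp}; other implementations may reject or reinterpret them. The helper name @code{search_offset} is hypothetical.

```c
#include <regex.h>

/* Hypothetical helper: return the offset at which PATTERN first
   matches in TEXT, -1 if it does not match, or -2 if it fails to
   compile.  REG_NEWLINE is always passed, so ^ and $ also match at
   newlines; the buffer operators are unaffected by it.  */
int
search_offset (const char *pattern, const char *text)
{
  regex_t re;
  regmatch_t m[1];
  int rc;

  if (regcomp (&re, pattern, REG_NEWLINE) != 0)
    return -2;
  rc = regexec (&re, text, 1, m, 0);
  regfree (&re);
  return rc == 0 ? (int) m[0].rm_so : -1;
}
```

Under these assumptions, searching @samp{foo\nbar} for @samp{^bar} finds a match at offset 4, whereas @samp{\`bar} finds none because @samp{bar} is not at the beginning of the buffer. Likewise @samp{foo$} matches at offset 0, but @samp{foo\'} does not match.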
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1362
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1363
13533
ca70a11e70e2 Integrate the regex documentation.
Bruno Haible <bruno@clisp.org>
parents: 13532
diff changeset
1364 @node GNU Emacs Operators
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1365 @chapter GNU Emacs Operators
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1366
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1367 Following are operators that @sc{gnu} defines (and @sc{posix} doesn't)
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1368 that you can use only when Regex is compiled with the preprocessor
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
1369 symbol @code{emacs} defined.
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1370
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1371 @menu
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1372 * Syntactic Class Operators::
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1373 @end menu
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1374
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1375
13533
ca70a11e70e2 Integrate the regex documentation.
Bruno Haible <bruno@clisp.org>
parents: 13532
diff changeset
1376 @node Syntactic Class Operators
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1377 @section Syntactic Class Operators
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1378
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1379 The operators in this section require Regex to recognize the syntactic
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1380 classes of characters. Regex uses a syntax table to determine this.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1381
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1382 @menu
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1383 * Emacs Syntax Tables::
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1384 * Match-syntactic-class Operator:: \sCLASS
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1385 * Match-not-syntactic-class Operator:: \SCLASS
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1386 @end menu
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1387
13533
ca70a11e70e2 Integrate the regex documentation.
Bruno Haible <bruno@clisp.org>
parents: 13532
diff changeset
1388 @node Emacs Syntax Tables
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1389 @subsection Emacs Syntax Tables
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1390
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1391 A @dfn{syntax table} is an array indexed by the characters in your
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1392 character set. With 8-bit characters, therefore, a syntax table
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1393 has 256 elements.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1394
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1395 If Regex is compiled with the preprocessor symbol @code{emacs} defined,
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1396 then Regex expects you to define and initialize the variable
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1397 @code{re_syntax_table} to be an Emacs syntax table. Emacs' syntax
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1398 tables are more complicated than Regex's own (@pxref{Non-Emacs Syntax
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1399 Tables}). @xref{Syntax, , Syntax, emacs, The GNU Emacs User's Manual},
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1400 for a description of Emacs' syntax tables.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1401
13533
ca70a11e70e2 Integrate the regex documentation.
Bruno Haible <bruno@clisp.org>
parents: 13532
diff changeset
1402 @node Match-syntactic-class Operator
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1403 @subsection The Match-syntactic-class Operator (@code{\s}@var{class})
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1404
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1405 @cindex @samp{\s}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1406
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1407 This operator matches any character whose syntactic class is represented
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1408 by a specified character. @samp{\s@var{class}} represents this operator
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1409 where @var{class} is the character representing the syntactic class you
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1410 want. For example, @samp{w} represents the syntactic
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1411 class of word-constituent characters, so @samp{\sw} matches any
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1412 word-constituent character.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1413
13533
ca70a11e70e2 Integrate the regex documentation.
Bruno Haible <bruno@clisp.org>
parents: 13532
diff changeset
1414 @node Match-not-syntactic-class Operator
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1415 @subsection The Match-not-syntactic-class Operator (@code{\S}@var{class})
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1416
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1417 @cindex @samp{\S}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1418
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1419 This operator is similar to the match-syntactic-class operator except
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1420 that it matches any character whose syntactic class is @emph{not}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1421 represented by the specified character. @samp{\S@var{class}} represents
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1422 this operator. For example, @samp{w} represents the syntactic class of
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1423 word-constituent characters, so @samp{\Sw} matches any character that is
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1424 not word-constituent.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1425
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1426
13533
ca70a11e70e2 Integrate the regex documentation.
Bruno Haible <bruno@clisp.org>
parents: 13532
diff changeset
1427 @node What Gets Matched?
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1428 @chapter What Gets Matched?
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1429
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1430 Regex usually matches strings according to the ``leftmost longest''
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1431 rule; that is, it chooses the longest of the leftmost matches. This
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1432 does not mean that for a regular expression containing subexpressions
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1433 it simply chooses the longest match for each subexpression, left to
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1434 right; the overall match must also be the longest possible one.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1435
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1436 For example, @samp{(ac*)(c*d[ac]*)\1} matches @samp{acdacaaa}, not
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1437 @samp{acdac}, as it would if it were to choose the longest match for the
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1438 first subexpression.
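The ``leftmost longest'' rule can be observed with a small C sketch. It uses the @sc{posix} interface (@code{regcomp} and @code{regexec}) only because that is the shortest way to obtain match offsets; the helper name @code{match_span} is hypothetical and not part of Regex.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: find the first match of PATTERN (POSIX extended
   syntax) in STRING and store its bounds in *START and *END.
   Returns 0 if there is a match, -1 otherwise.  */
static int
match_span (const char *pattern, const char *string, int *start, int *end)
{
  regex_t re;
  regmatch_t m[1];
  int rc;

  assert (pattern != NULL && string != NULL);
  assert (start != NULL && end != NULL);

  if (regcomp (&re, pattern, REG_EXTENDED) != 0)
    return -1;

  /* Ask only for the bounds of the overall match (register 0).  */
  rc = regexec (&re, string, 1, m, 0);
  regfree (&re);
  if (rc != 0)
    return -1;

  *start = (int) m[0].rm_so;
  *end = (int) m[0].rm_eo;
  return 0;
}
```

With this helper, @samp{a|ab} searched in @samp{xab} should report the span 1 to 3 (the longer alternative wins at the leftmost position), while @samp{ab|bcccc} searched in @samp{abcccc} should report 0 to 2: the leftmost match is chosen even though a longer match starts one character later.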
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1439
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1440
13533
ca70a11e70e2 Integrate the regex documentation.
Bruno Haible <bruno@clisp.org>
parents: 13532
diff changeset
1441 @node Programming with Regex
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1442 @chapter Programming with Regex
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1443
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1444 Here we describe how you use the Regex data structures and functions in
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1445 C programs. Regex has three interfaces: one designed for @sc{gnu}, one
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1446 compatible with @sc{posix} and one compatible with Berkeley @sc{unix}.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1447
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1448 @menu
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1449 * GNU Regex Functions::
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1450 * POSIX Regex Functions::
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1451 * BSD Regex Functions::
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1452 @end menu
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1453
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1454
13533
ca70a11e70e2 Integrate the regex documentation.
Bruno Haible <bruno@clisp.org>
parents: 13532
diff changeset
1455 @node GNU Regex Functions
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1456 @section GNU Regex Functions
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1457
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1458 If you're writing code that doesn't need to be compatible with either
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1459 @sc{posix} or Berkeley @sc{unix}, you can use these functions. They
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1460 provide more options than the other interfaces.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1461
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1462 @menu
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1463 * GNU Pattern Buffers:: The re_pattern_buffer type.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1464 * GNU Regular Expression Compiling:: re_compile_pattern ()
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1465 * GNU Matching:: re_match ()
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1466 * GNU Searching:: re_search ()
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1467 * Matching/Searching with Split Data:: re_match_2 (), re_search_2 ()
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1468 * Searching with Fastmaps:: re_compile_fastmap ()
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1469 * GNU Translate Tables:: The `translate' field.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1470 * Using Registers:: The re_registers type and related fns.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1471 * Freeing GNU Pattern Buffers:: regfree ()
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1472 @end menu
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1473
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1474
13533
ca70a11e70e2 Integrate the regex documentation.
Bruno Haible <bruno@clisp.org>
parents: 13532
diff changeset
1475 @node GNU Pattern Buffers
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1476 @subsection GNU Pattern Buffers
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1477
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1478 @cindex pattern buffer, definition of
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1479 @tindex re_pattern_buffer @r{definition}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1480 @tindex struct re_pattern_buffer @r{definition}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1481
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1482 To compile, match, or search for a given regular expression, you must
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1483 supply a pattern buffer. A @dfn{pattern buffer} holds one compiled
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1484 regular expression.@footnote{Regular expressions are also referred to as
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1485 ``patterns,'' hence the name ``pattern buffer.''}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1486
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1487 You can have several different pattern buffers simultaneously, each
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1488 holding a compiled pattern for a different regular expression.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1489
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1490 @file{regex.h} defines the pattern buffer @code{struct} as follows:
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1491
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1492 @example
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1493 /* Space that holds the compiled pattern. It is declared as
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1494 `unsigned char *' because its elements are
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1495 sometimes used as array indexes. */
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1496 unsigned char *buffer;
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1497
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1498 /* Number of bytes to which `buffer' points. */
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1499 unsigned long allocated;
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1500
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1501 /* Number of bytes actually used in `buffer'. */
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
1502 unsigned long used;
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1503
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1504 /* Syntax setting with which the pattern was compiled. */
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1505 reg_syntax_t syntax;
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1506
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1507 /* Pointer to a fastmap, if any, otherwise zero. re_search uses
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1508 the fastmap, if there is one, to skip over impossible
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1509 starting points for matches. */
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1510 char *fastmap;
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1511
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1512 /* Either a translate table to apply to all characters before
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1513 comparing them, or zero for no translation. The translation
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1514 is applied to a pattern when it is compiled and to a string
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1515 when it is matched. */
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1516 char *translate;
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1517
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1518 /* Number of subexpressions found by the compiler. */
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1519 size_t re_nsub;
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1520
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1521 /* Zero if this pattern cannot match the empty string, one else.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1522 Well, in truth it's used only in `re_search_2', to see
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1523 whether or not we should use the fastmap, so we don't set
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1524 this absolutely perfectly; see `re_compile_fastmap' (the
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1525 `duplicate' case). */
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1526 unsigned can_be_null : 1;
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1527
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1528 /* If REGS_UNALLOCATED, allocate space in the `regs' structure
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1529 for `max (RE_NREGS, re_nsub + 1)' groups.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1530 If REGS_REALLOCATE, reallocate space if necessary.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1531 If REGS_FIXED, use what's there. */
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1532 #define REGS_UNALLOCATED 0
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1533 #define REGS_REALLOCATE 1
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1534 #define REGS_FIXED 2
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1535 unsigned regs_allocated : 2;
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1536
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1537 /* Set to zero when `regex_compile' compiles a pattern; set to one
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1538 by `re_compile_fastmap' if it updates the fastmap. */
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1539 unsigned fastmap_accurate : 1;
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1540
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1541 /* If set, `re_match_2' does not return information about
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1542 subexpressions. */
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1543 unsigned no_sub : 1;
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1544
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1545 /* If set, a beginning-of-line anchor doesn't match at the
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
1546 beginning of the string. */
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1547 unsigned not_bol : 1;
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1548
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1549 /* Similarly for an end-of-line anchor. */
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1550 unsigned not_eol : 1;
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1551
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1552 /* If true, an anchor at a newline matches. */
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1553 unsigned newline_anchor : 1;
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1554
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1555 @end example
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1556
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1557
13533
ca70a11e70e2 Integrate the regex documentation.
Bruno Haible <bruno@clisp.org>
parents: 13532
diff changeset
1558 @node GNU Regular Expression Compiling
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1559 @subsection GNU Regular Expression Compiling
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1560
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1561 In @sc{gnu}, you can both match and search for a given regular
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1562 expression. To do either, you must first compile it in a pattern buffer
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1563 (@pxref{GNU Pattern Buffers}).
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1564
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1565 @cindex syntax initialization
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1566 @vindex re_syntax_options @r{initialization}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1567 Regular expressions match according to the syntax with which they were
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1568 compiled; with @sc{gnu}, you indicate what syntax you want by setting
13553
8fc3314fe460 Document not_eol and remove mention of regex.c.
Reuben Thomas <rrt@sc3d.org>
parents: 13549
diff changeset
1569 the variable @code{re_syntax_options} (declared in @file{regex.h})
8fc3314fe460 Document not_eol and remove mention of regex.c.
Reuben Thomas <rrt@sc3d.org>
parents: 13549
diff changeset
1570 before calling the compiling function, @code{re_compile_pattern} (see
8fc3314fe460 Document not_eol and remove mention of regex.c.
Reuben Thomas <rrt@sc3d.org>
parents: 13549
diff changeset
1571 below). @xref{Syntax Bits}, and @ref{Predefined Syntaxes}.
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1572
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1573 You can change the value of @code{re_syntax_options} at any time.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1574 Usually, however, you set its value once and then never change it.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1575
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1576 @cindex pattern buffer initialization
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1577 @code{re_compile_pattern} takes a pattern buffer as an argument. You
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1578 must initialize the following fields:
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1579
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1580 @table @code
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1581
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1584 @item translate
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1585 @vindex translate @r{initialization}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1586 Initialize this to point to a translate table if you want one, or to
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1587 zero if you don't. We explain translate tables in @ref{GNU Translate
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1588 Tables}.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1589
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1590 @item fastmap
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1591 @vindex fastmap @r{initialization}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1592 Initialize this to nonzero if you want a fastmap, or to zero if you
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1593 don't.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1594
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1595 @item buffer
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1596 @itemx allocated
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1597 @vindex buffer @r{initialization}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1598 @vindex allocated @r{initialization}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1599 @findex malloc
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1600 If you want @code{re_compile_pattern} to allocate memory for the
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1601 compiled pattern, set both of these to zero. If you have an existing
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1602 block of memory (allocated with @code{malloc}) you want Regex to use,
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1603 set @code{buffer} to its address and @code{allocated} to its size (in
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1604 bytes).
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1605
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1606 @code{re_compile_pattern} uses @code{realloc} to extend the space for
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1607 the compiled pattern as necessary.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1608
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1609 @end table
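The initialization described above might be sketched as follows. This is only an illustration under stated assumptions: the helper @code{init_pattern_buffer} is hypothetical, the field names are visible under these names when @code{_GNU_SOURCE} is defined before including the GNU C Library's @file{regex.h}, and the fastmap size of 256 bytes (one per possible character value) is an assumption not stated in this section.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of Regex: prepare BUF for a later call
   to re_compile_pattern.  Returns 0 on success, or -1 if a fastmap was
   requested but could not be allocated.  */
static int
init_pattern_buffer (struct re_pattern_buffer *buf, int want_fastmap)
{
  assert (buf != NULL);

  /* Zero every field.  `buffer' and `allocated' become zero, so the
     compiler allocates space for the pattern itself; `translate'
     becomes zero, meaning no translate table; `fastmap' becomes zero,
     meaning no fastmap.  */
  memset (buf, 0, sizeof *buf);

  if (want_fastmap)
    {
      /* Assumption: one byte per possible character value.  */
      buf->fastmap = malloc (256);
      if (buf->fastmap == NULL)
        return -1;
    }

  return 0;
}
```

Zeroing the whole structure first is a design choice of this sketch: it leaves no field holding an indeterminate value, and only the fields you care about need to be set afterwards.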
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1610
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1611 To compile a regular expression into a pattern buffer, use:
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1612
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1613 @findex re_compile_pattern
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1614 @example
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
1615 const char *
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
1616 re_compile_pattern (const char *@var{regex}, const int @var{regex_size},
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1617 struct re_pattern_buffer *@var{pattern_buffer})
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1618 @end example
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1619
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1620 @noindent
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1621 @var{regex} is the regular expression's address, @var{regex_size} is its
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1622 length, and @var{pattern_buffer} is the pattern buffer's address.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1623
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1624 If @code{re_compile_pattern} successfully compiles the regular
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1625 expression, it returns zero and sets @code{*@var{pattern_buffer}} to the
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1626 compiled pattern. It sets the pattern buffer's fields as follows:
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1627
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1628 @table @code
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1629 @item buffer
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1630 @vindex buffer @r{field, set by @code{re_compile_pattern}}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1631 to the compiled pattern.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1632
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1633 @item used
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1634 @vindex used @r{field, set by @code{re_compile_pattern}}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1635 to the number of bytes the compiled pattern in @code{buffer} occupies.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1636
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1637 @item syntax
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1638 @vindex syntax @r{field, set by @code{re_compile_pattern}}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1639 to the current value of @code{re_syntax_options}.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1640
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1641 @item re_nsub
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1642 @vindex re_nsub @r{field, set by @code{re_compile_pattern}}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1643 to the number of subexpressions in @var{regex}.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1644
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1645 @item fastmap_accurate
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1646 @vindex fastmap_accurate @r{field, set by @code{re_compile_pattern}}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1647 to zero on the theory that the pattern you're compiling is different
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1648 from the one previously compiled into @code{buffer}; in that case (since
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
1649 you can't make a fastmap without a compiled pattern),
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1650 @code{fastmap} would either contain an incompatible fastmap, or nothing
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1651 at all.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1652
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1653 @c xx what else?
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1654 @end table
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1655
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1656 If @code{re_compile_pattern} can't compile @var{regex}, it returns an
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1657 error string corresponding to one of the errors listed in @ref{POSIX
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1658 Regular Expression Compiling}.
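As a hedged illustration of the steps above, the following sketch sets @code{re_syntax_options}, compiles a pattern, and passes the result of @code{re_compile_pattern} back to the caller. The helper name @code{compile_pattern} is hypothetical. The sketch assumes the GNU C Library's @file{regex.h} with @code{_GNU_SOURCE} defined, where the function returns @code{const char *} and takes the length as a @code{size_t}.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Hypothetical helper: compile PATTERN into BUF using the syntax bits
   in SYNTAX.  Returns zero (a null pointer) on success, or the error
   string produced by re_compile_pattern on failure.  */
static const char *
compile_pattern (struct re_pattern_buffer *buf, const char *pattern,
                 reg_syntax_t syntax)
{
  assert (buf != NULL && pattern != NULL);

  /* No preallocated space, no fastmap, no translate table.  */
  memset (buf, 0, sizeof *buf);

  /* The syntax must be chosen before the pattern is compiled.  */
  re_syntax_options = syntax;

  return re_compile_pattern (pattern, strlen (pattern), buf);
}
```

After a successful call such as @code{compile_pattern (&buf, "(a)(b*)c", RE_SYNTAX_POSIX_EXTENDED)}, the fields @code{buf.re_nsub} and @code{buf.syntax} can be inspected as described in the table above. A malformed pattern such as @samp{(a} should yield a non-null error string instead.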
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1659
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1660
13533
ca70a11e70e2 Integrate the regex documentation.
Bruno Haible <bruno@clisp.org>
parents: 13532
diff changeset
1661 @node GNU Matching
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
1662 @subsection GNU Matching
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1663
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1664 @cindex matching with GNU functions
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1665
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1666 Matching the @sc{gnu} way means trying to match as much of a string as
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1667 possible starting at a position you specify within it. Once you've compiled
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1668 a pattern into a pattern buffer (@pxref{GNU Regular Expression
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1669 Compiling}), you can ask the matcher to match that pattern against a
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1670 string using:
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1671
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1672 @findex re_match
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1673 @example
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1674 int
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
1675 re_match (struct re_pattern_buffer *@var{pattern_buffer},
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
1676 const char *@var{string}, const int @var{size},
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1677 const int @var{start}, struct re_registers *@var{regs})
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1678 @end example
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1679
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1680 @noindent
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1681 @var{pattern_buffer} is the address of a pattern buffer containing a
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1682 compiled pattern. @var{string} is the string you want to match; it can
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1683 contain newline and null characters. @var{size} is the length of that
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1684 string. @var{start} is the string index at which you want to
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1685 begin matching; the first character of @var{string} is at index zero.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1686 @xref{Using Registers}, for an explanation of @var{regs}; you can safely
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1687 pass zero.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1688
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1689 @code{re_match} matches the regular expression in @var{pattern_buffer}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1690 against the string @var{string} according to the syntax in
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1691 @var{pattern_buffer}'s @code{syntax} field. (@xref{GNU Regular
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1692 Expression Compiling}, for how to set it.) The function returns
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1693 @math{-1} if the compiled pattern does not match any part of
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1694 @var{string} and @math{-2} if an internal error happens; otherwise, it
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1695 returns how many (possibly zero) characters of @var{string} the pattern
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1696 matched.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1697
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1698 An example: suppose @var{pattern_buffer} points to a pattern buffer
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1699 containing the compiled pattern for @samp{a*}, and @var{string} points
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1700 to @samp{aaaaab} (whereupon @var{size} should be 6). Then if @var{start}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1701 is 2, @code{re_match} returns 3, i.e., @samp{a*} would have matched the
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1702 last three @samp{a}s in @var{string}. If @var{start} is 0,
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1703 @code{re_match} returns 5, i.e., @samp{a*} would have matched all the
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1704 @samp{a}s in @var{string}. If @var{start} is either 5 or 6, it returns
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1705 zero.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1706
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1707 If @var{start} is not between zero and @var{size}, then
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1708 @code{re_match} returns @math{-1}.
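The @samp{a*} example above can be reproduced with a short program. This is only a sketch under stated assumptions: it assumes the GNU C Library, where the @sc{gnu} interface in @file{regex.h} is visible only when @code{_GNU_SOURCE} is defined, and where @code{re_compile_pattern} takes its syntax from @code{re_set_syntax}. The helper name @code{match_at} and its @math{-3} return value for a compile failure are hypothetical, not part of Regex.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE 1   /* assumption: glibc hides the GNU API without this */
#endif
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Hypothetical helper: compile PATTERN, then try to match it against
   TEXT beginning exactly at index START.  Returns what re_match
   returns (-1 for no match, -2 for an internal error, otherwise the
   number of characters matched), or -3 if the pattern fails to
   compile.  */
static int
match_at (const char *pattern, const char *text, int start)
{
  struct re_pattern_buffer buf;
  int result;

  assert (pattern != NULL && text != NULL);

  /* A zeroed buffer has no fastmap, no translate table, and no
     preallocated storage, so the compiler allocates what it needs.  */
  memset (&buf, 0, sizeof buf);

  re_set_syntax (RE_SYNTAX_POSIX_EXTENDED);
  if (re_compile_pattern (pattern, strlen (pattern), &buf) != NULL)
    return -3;

  /* Passing zero for REGS: we only want the match length.  */
  result = re_match (&buf, text, (int) strlen (text), start, NULL);

  regfree (&buf);
  return result;
}
```

Called as @code{match_at ("a*", "aaaaab", 2)}, the helper should give the value 3 described above. It recompiles the pattern on every call to stay self-contained; a real program would compile once and reuse the pattern buffer.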
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1709
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1710
13533
ca70a11e70e2 Integrate the regex documentation.
Bruno Haible <bruno@clisp.org>
parents: 13532
diff changeset
1711 @node GNU Searching
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
1712 @subsection GNU Searching
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1713
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1714 @cindex searching with GNU functions
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1715
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1716 @dfn{Searching} means trying to match starting at successive positions
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1717 within a string. The function @code{re_search} does this.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1718
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1719 Before calling @code{re_search}, you must compile your regular
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1720 expression. @xref{GNU Regular Expression Compiling}.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1721
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1722 Here is the function declaration:
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1723
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1724 @findex re_search
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1725 @example
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
1726 int
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
1727 re_search (struct re_pattern_buffer *@var{pattern_buffer},
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
1728 const char *@var{string}, const int @var{size},
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
1729 const int @var{start}, const int @var{range},
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1730 struct re_registers *@var{regs})
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1731 @end example
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1732
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1733 @noindent
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1734 @vindex start @r{argument to @code{re_search}}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1735 @vindex range @r{argument to @code{re_search}}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1736 whose arguments are the same as those to @code{re_match} (@pxref{GNU
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1737 Matching}) except that the two arguments @var{start} and @var{range}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1738 replace @code{re_match}'s argument @var{start}.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1739
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1740 If @var{range} is positive, then @code{re_search} attempts a match
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1741 starting first at index @var{start}, then at @math{@var{start} + 1} if
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1742 that fails, and so on, up to @math{@var{start} + @var{range}}; if
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1743 @var{range} is negative, then it attempts a match starting first at
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1744 index @var{start}, then at @math{@var{start} - 1} if that fails, and so
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
1745 on.
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1746
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1747 If @var{start} is not between zero and @var{size}, then @code{re_search}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1748 returns @math{-1}. When @var{range} is positive, @code{re_search}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1749 adjusts @var{range} so that @math{@var{start} + @var{range} - 1} is
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1750 between zero and @var{size}, if necessary; that way it won't search
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1751 outside of @var{string}. Similarly, when @var{range} is negative,
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1752 @code{re_search} adjusts @var{range} so that @math{@var{start} +
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1753 @var{range} + 1} is between zero and @var{size}, if necessary.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1754
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1755 If the @code{fastmap} field of @var{pattern_buffer} is zero,
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1756 @code{re_search} matches starting at consecutive positions; otherwise,
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1757 it uses @code{fastmap} to make the search more efficient.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1758 @xref{Searching with Fastmaps}.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1759
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1760 If no match is found, @code{re_search} returns @math{-1}. If
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1761 a match is found, it returns the index where the match began. If an
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1762 internal error happens, it returns @math{-2}.
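To illustrate how the sign of @var{range} selects the search direction, here is a minimal sketch. It assumes the GNU C Library (hence @code{_GNU_SOURCE}); the helper name @code{search_range} and its @math{-3} return value for a compile failure are hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE 1   /* assumption: glibc hides the GNU API without this */
#endif
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Hypothetical helper: compile PATTERN and search TEXT, trying start
   positions START, START + 1, ... when RANGE is positive, or START,
   START - 1, ... when RANGE is negative.  Returns what re_search
   returns (the index where the match began, -1, or -2), or -3 if the
   pattern fails to compile.  */
static int
search_range (const char *pattern, const char *text, int start, int range)
{
  struct re_pattern_buffer buf;
  int result;

  assert (pattern != NULL && text != NULL);

  memset (&buf, 0, sizeof buf);   /* no fastmap, no translate table */
  re_set_syntax (RE_SYNTAX_POSIX_EXTENDED);
  if (re_compile_pattern (pattern, strlen (pattern), &buf) != NULL)
    return -3;

  result = re_search (&buf, text, (int) strlen (text), start, range, NULL);

  regfree (&buf);
  return result;
}
```

With the text @samp{xxabxxab}, a forward search for @samp{ab} from index 0 should report the first occurrence, while a backward search from index 8 with a @var{range} of @math{-8} should report the last one.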
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1763
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1764
13533
ca70a11e70e2 Integrate the regex documentation.
Bruno Haible <bruno@clisp.org>
parents: 13532
diff changeset
1765 @node Matching/Searching with Split Data
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1766 @subsection Matching and Searching with Split Data
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1767
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1768 Using the functions @code{re_match_2} and @code{re_search_2}, you can
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
1769 match or search in data that is divided into two strings.
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1770
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1771 The function:
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1772
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1773 @findex re_match_2
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1774 @example
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1775 int
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
1776 re_match_2 (struct re_pattern_buffer *@var{buffer},
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
1777 const char *@var{string1}, const int @var{size1},
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
1778 const char *@var{string2}, const int @var{size2},
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
1779 const int @var{start},
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
1780 struct re_registers *@var{regs},
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1781 const int @var{stop})
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1782 @end example
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1783
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1784 @noindent
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1785 is similar to @code{re_match} (@pxref{GNU Matching}) except that you
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1786 pass @emph{two} data strings and sizes, and an index @var{stop} beyond
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1787 which you don't want the matcher to try matching. As with
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1788 @code{re_match}, if it succeeds, @code{re_match_2} returns how many
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1789 characters of the combined data it matched. Regard @var{string1} and
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1790 @var{string2} as concatenated when you set the arguments @var{start} and
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1791 @var{stop} and use the contents of @var{regs}; @code{re_match_2} never
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
1792 returns a value larger than @math{@var{size1} + @var{size2}}.
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1793
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1794 The function:
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1795
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1796 @findex re_search_2
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1797 @example
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1798 int
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
1799 re_search_2 (struct re_pattern_buffer *@var{buffer},
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
1800 const char *@var{string1}, const int @var{size1},
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
1801 const char *@var{string2}, const int @var{size2},
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
1802 const int @var{start}, const int @var{range},
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
1803 struct re_registers *@var{regs},
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1804 const int @var{stop})
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1805 @end example
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1806
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1807 @noindent
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1808 is similarly related to @code{re_search}.
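As a sketch of searching split data, the following assumes the GNU C Library (hence @code{_GNU_SOURCE}). The helper name @code{search_split} and its @math{-3} return value for a compile failure are hypothetical. It searches the whole of the two pieces, so both @var{range} and @var{stop} are set to the combined size.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE 1   /* assumption: glibc hides the GNU API without this */
#endif
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Hypothetical helper: search for PATTERN in the data formed by PART1
   followed by PART2, without the caller joining them.  Returns what
   re_search_2 returns (an index into the concatenated data, -1, or
   -2), or -3 if the pattern fails to compile.  */
static int
search_split (const char *pattern, const char *part1, const char *part2)
{
  struct re_pattern_buffer buf;
  int size1, size2, total, result;

  assert (pattern != NULL && part1 != NULL && part2 != NULL);

  size1 = (int) strlen (part1);
  size2 = (int) strlen (part2);
  total = size1 + size2;

  memset (&buf, 0, sizeof buf);
  re_set_syntax (RE_SYNTAX_POSIX_EXTENDED);
  if (re_compile_pattern (pattern, strlen (pattern), &buf) != NULL)
    return -3;

  /* START, RANGE and STOP all index the concatenation of the parts.  */
  result = re_search_2 (&buf, part1, size1, part2, size2,
                        0, total, NULL, total);

  regfree (&buf);
  return result;
}
```

For instance, searching for @samp{lo wo} in the pieces @samp{hello} and @samp{ world} should find a match that straddles the boundary, reported as index 3 of the combined data.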
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1809
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1810
13533
ca70a11e70e2 Integrate the regex documentation.
Bruno Haible <bruno@clisp.org>
parents: 13532
diff changeset
1811 @node Searching with Fastmaps
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1812 @subsection Searching with Fastmaps
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1813
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1814 @cindex fastmaps
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1815 If you're searching through a long string, you should use a fastmap.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1816 Without one, the searcher tries to match at consecutive positions in the
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1817 string. Generally, most of the characters in the string could not start
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1818 a match. It takes much longer to try matching at a given position in the
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1819 string than it does to check in a table whether or not the character at
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1820 that position could start a match. A @dfn{fastmap} is such a table.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1821
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1822 More specifically, a fastmap is an array indexed by the characters in
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1823 your character set. Under the @sc{ascii} encoding, therefore, a fastmap
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1824 has 256 elements. If you want the searcher to use a fastmap with a
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1825 given pattern buffer, you must allocate the array and assign the array's
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1826 address to the pattern buffer's @code{fastmap} field. You can either
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1827 compile the fastmap yourself or have @code{re_search} do it for you;
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1828 when @code{fastmap} is nonzero, it automatically compiles a fastmap the
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
1829 first time you search using a particular compiled pattern.
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1830
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1831 To compile a fastmap yourself, use:
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1832
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1833 @findex re_compile_fastmap
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1834 @example
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1835 int
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1836 re_compile_fastmap (struct re_pattern_buffer *@var{pattern_buffer})
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1837 @end example
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1838
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1839 @noindent
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1840 @var{pattern_buffer} is the address of a pattern buffer. If the
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1841 character @var{c} could start a match for the pattern,
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1842 @code{re_compile_fastmap} makes
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1843 @code{@var{pattern_buffer}->fastmap[@var{c}]} nonzero. It returns
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1844 @math{0} if it can compile a fastmap and @math{-2} if there is an
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1845 internal error. For example, if @samp{|} is the alternation operator
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1846 and @var{pattern_buffer} holds the compiled pattern for @samp{a|b}, then
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1847 @code{re_compile_fastmap} sets @code{fastmap['a']} and
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1848 @code{fastmap['b']} (and no others).
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1849
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1850 @code{re_search} uses a fastmap as it moves along in the string: it
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1851 checks the string's characters until it finds one that's in the fastmap.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1852 Then it tries matching at that character. If the match fails, it
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1853 repeats the process. So, by using a fastmap, @code{re_search} doesn't
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1854 waste time trying to match at positions in the string that couldn't
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1855 start a match.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1856
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1857 If you don't want @code{re_search} to use a fastmap,
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1858 store zero in the @code{fastmap} field of the pattern buffer before
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1859 calling @code{re_search}.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1860
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1861 Once you've initialized a pattern buffer's @code{fastmap} field, you
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1862 need never do so again---even if you compile a new pattern in
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1863 it---provided the way the field is set still reflects whether or not you
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1864 want a fastmap. @code{re_search} will still either do nothing if
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1865 @code{fastmap} is null or, if it isn't, compile a new fastmap for the
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1866 new pattern.
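The steps for compiling a fastmap yourself can be sketched as follows. The sketch assumes the GNU C Library (hence @code{_GNU_SOURCE}) and a single-byte locale such as the default @samp{C} locale. The helper name @code{fastmap_allows} is hypothetical. The array comes from @code{malloc} because, in the GNU C Library, @code{regfree} also frees the @code{fastmap} field.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE 1   /* assumption: glibc hides the GNU API without this */
#endif
#include <assert.h>
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: report whether character C could start a match
   for PATTERN, according to the compiled fastmap.  Returns 1 or 0, or
   -1 if allocation or compilation fails.  */
static int
fastmap_allows (const char *pattern, unsigned char c)
{
  struct re_pattern_buffer buf;
  int allowed;

  assert (pattern != NULL);

  memset (&buf, 0, sizeof buf);

  /* One element per possible character value.  */
  buf.fastmap = malloc (256);
  if (buf.fastmap == NULL)
    return -1;

  re_set_syntax (RE_SYNTAX_POSIX_EXTENDED);   /* here `|' is alternation */
  if (re_compile_pattern (pattern, strlen (pattern), &buf) != NULL
      || re_compile_fastmap (&buf) != 0)
    {
      regfree (&buf);   /* also releases the fastmap array */
      return -1;
    }

  allowed = buf.fastmap[c] != 0;

  regfree (&buf);
  return allowed;
}
```

For the pattern @samp{a|b}, the helper should report 1 for @samp{a} and @samp{b} and 0 for any other character, matching the description of @code{re_compile_fastmap} above.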
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1867
13533
ca70a11e70e2 Integrate the regex documentation.
Bruno Haible <bruno@clisp.org>
parents: 13532
diff changeset
1868 @node GNU Translate Tables
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1869 @subsection GNU Translate Tables
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1870
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1871 If you set the @code{translate} field of a pattern buffer to a translate
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1872 table, then the @sc{gnu} Regex functions to which you've passed that
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1873 pattern buffer use it to apply a simple transformation
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1874 to all the regular expression and string characters at which they look.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1875
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1876 A @dfn{translate table} is an array indexed by the characters in your
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1877 character set. Under the @sc{ascii} encoding, therefore, a translate
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1878 table has 256 elements. The array's elements are also characters in
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1879 your character set. When the Regex functions see a character @var{c},
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1880 they use @code{translate[@var{c}]} in its place, with one exception: the
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1881 character after a @samp{\} is not translated. (This ensures that the
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1882 operators, e.g., @samp{\B} and @samp{\b}, are always distinguishable.)
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1883
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1884 For example, a table that maps all lowercase letters to the
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1885 corresponding uppercase ones would cause the matcher to ignore
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1886 differences in case.@footnote{A table that maps all uppercase letters to
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1887 the corresponding lowercase ones would work just as well for this
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1888 purpose.} Such a table would map all characters except lowercase letters
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1889 to themselves, and lowercase letters to the corresponding uppercase
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1890 ones. Under the @sc{ascii} encoding, here's how you could initialize
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1891 such a table (we'll call it @code{case_fold}):
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1892
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1893 @example
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1894 for (i = 0; i < 256; i++)
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1895 case_fold[i] = i;
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1896 for (i = 'a'; i <= 'z'; i++)
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1897 case_fold[i] = i - ('a' - 'A');
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1898 @end example
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1899
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1900 You tell Regex to use a translate table on a given pattern buffer by
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1901 assigning that table's address to the @code{translate} field of that
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1902 buffer. If you don't want Regex to do any translation, put zero into
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1903 this field. You'll get unpredictable results if you change the table's contents
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1904 at any time between compiling the pattern buffer, compiling its fastmap, and
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1905 matching or searching with the pattern buffer.
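Putting the pieces together, here is a minimal sketch of a case-insensitive search using a @code{case_fold} table. It assumes the GNU C Library (hence @code{_GNU_SOURCE}) and the @sc{ascii} encoding. The helper name @code{search_folded} and its @math{-3} return value for a setup failure are hypothetical. The table is allocated with @code{malloc} because, in the GNU C Library, @code{regfree} also frees the @code{translate} field.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE 1   /* assumption: glibc hides the GNU API without this */
#endif
#include <assert.h>
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: search TEXT for PATTERN.  If FOLD is nonzero,
   install a translate table mapping lowercase ASCII letters to
   uppercase, so the search ignores case.  Returns what re_search
   returns, or -3 if allocation or compilation fails.  */
static int
search_folded (const char *pattern, const char *text, int fold)
{
  struct re_pattern_buffer buf;
  int i, size, result;

  assert (pattern != NULL && text != NULL);

  memset (&buf, 0, sizeof buf);

  if (fold)
    {
      unsigned char *case_fold = malloc (256);
      if (case_fold == NULL)
        return -3;
      for (i = 0; i < 256; i++)
        case_fold[i] = (unsigned char) i;
      for (i = 'a'; i <= 'z'; i++)
        case_fold[i] = (unsigned char) (i - ('a' - 'A'));
      /* Install the table before compiling: it is applied to the
         pattern as well as to the string being searched.  */
      buf.translate = case_fold;
    }

  re_set_syntax (RE_SYNTAX_POSIX_EXTENDED);
  if (re_compile_pattern (pattern, strlen (pattern), &buf) != NULL)
    {
      regfree (&buf);   /* also releases the translate table */
      return -3;
    }

  size = (int) strlen (text);
  result = re_search (&buf, text, size, 0, size, NULL);

  regfree (&buf);
  return result;
}
```

With folding enabled, searching @samp{say hello} for @samp{HELLO} should succeed at index 4; with folding disabled, the same search should return @math{-1}.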
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1906
13533
ca70a11e70e2 Integrate the regex documentation.
Bruno Haible <bruno@clisp.org>
parents: 13532
diff changeset
1907 @node Using Registers
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1908 @subsection Using Registers
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1909
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1910 A group in a regular expression can match a (possibly empty) substring
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1911 of the string that regular expression as a whole matched. The matcher
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1912 remembers the beginning and end of the substring matched by
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1913 each group.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1914
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1915 To find out what they matched, pass a nonzero @var{regs} argument to a
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1916 @sc{gnu} matching or searching function (@pxref{GNU Matching} and
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1917 @ref{GNU Searching}), i.e., the address of a structure of this type, as
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1918 defined in @file{regex.h}:
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1919
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1920 @c We don't bother to include this directly from regex.h,
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1921 @c since it changes so rarely.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1922 @example
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1923 @tindex re_registers
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1924 @vindex num_regs @r{in @code{struct re_registers}}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1925 @vindex start @r{in @code{struct re_registers}}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1926 @vindex end @r{in @code{struct re_registers}}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1927 struct re_registers
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1928 @{
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1929 unsigned num_regs;
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1930 regoff_t *start;
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1931 regoff_t *end;
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1932 @};
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1933 @end example
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1934
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1935 Except for (possibly) the @var{num_regs}'th element (see below), the
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1936 @var{i}th element of the @code{start} and @code{end} arrays records
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1937 information about the @var{i}th group in the pattern. (They're declared
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1938 as C pointers, but this is only because not all C compilers accept
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1939 zero-length arrays; conceptually, it is simplest to think of them as
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1940 arrays.)
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1941
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1942 The @code{start} and @code{end} arrays are allocated in various ways,
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1943 depending on the value of the @code{regs_allocated}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1944 @vindex regs_allocated
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1945 field in the pattern buffer passed to the matcher.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1946
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1947 The simplest and perhaps most useful is to let the matcher (re)allocate
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1948 enough space to record information for all the groups in the regular
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1949 expression. If @code{regs_allocated} is @code{REGS_UNALLOCATED},
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1950 @vindex REGS_UNALLOCATED
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1951 the matcher allocates @math{1 + @var{re_nsub}} elements (@code{re_nsub} is another field in the
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1952 pattern buffer; @pxref{GNU Pattern Buffers}). The extra element is set
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1953 to @math{-1}, and the matcher sets @code{regs_allocated} to @code{REGS_REALLOCATE}.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1954 @vindex REGS_REALLOCATE
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1955 Then on subsequent calls with the same pattern buffer and @var{regs}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1956 arguments, the matcher reallocates more space if necessary.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1957
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1958 It would perhaps be more logical to make the @code{regs_allocated} field
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1959 part of the @code{re_registers} structure, instead of part of the
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1960 pattern buffer. But in that case the caller would be forced to
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1961 initialize the structure before passing it. Much existing code doesn't
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1962 do this initialization, and it's arguably better to avoid it anyway.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1963
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1964 @code{re_compile_pattern} sets @code{regs_allocated} to
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1965 @code{REGS_UNALLOCATED},
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1966 so if you use the GNU regular expression
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1967 functions, you get this behavior by default.
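The default allocation behavior might be used as in the following sketch, which assumes glibc's @file{regex.h} with @code{_GNU_SOURCE} and the default (Emacs-style) syntax. The function name @code{count_matches} is hypothetical. The same @code{regs} structure is passed to every search, so the matcher allocates the arrays on the first call and reuses them afterwards; the caller frees them at the end.

```c
#define _GNU_SOURCE            /* expose the GNU interface in glibc's <regex.h> */
#include <assert.h>
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Count the non-overlapping matches of PATTERN in TEXT.
   Returns -1 if the pattern does not compile.  */
int
count_matches (const char *pattern, const char *text)
{
  struct re_pattern_buffer buf;
  struct re_registers regs;
  int len, pos = 0, count = 0;

  assert (pattern != NULL && text != NULL);
  memset (&buf, 0, sizeof buf);
  memset (&regs, 0, sizeof regs);   /* so free() is safe if nothing matches */
  len = (int) strlen (text);

  if (re_compile_pattern (pattern, strlen (pattern), &buf) != NULL)
    return -1;

  /* re_compile_pattern left regs_allocated at REGS_UNALLOCATED, so the
     first successful search allocates regs.start and regs.end.  */
  while (pos <= len)
    {
      int at = re_search (&buf, text, len, pos, len - pos, &regs);
      if (at < 0)
        break;
      count++;
      /* Resume after this match; step one further after an empty match.  */
      pos = regs.end[0] > at ? (int) regs.end[0] : at + 1;
    }

  free (regs.start);           /* arrays allocated by the matcher */
  free (regs.end);
  regfree (&buf);
  return count;
}
```

For instance, @code{count_matches ("ab*", "ab a abbb xb")} finds @samp{ab}, @samp{a}, and @samp{abbb}.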
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1968
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1969 @c xx document re_set_registers
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1970
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1971 @sc{posix}, on the other hand, requires a different interface: the
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1972 caller is supposed to pass in a fixed-length array which the matcher
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
1973 fills. Therefore, if @code{regs_allocated} is @code{REGS_FIXED},
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1974 @vindex REGS_FIXED
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1975 the matcher simply fills that array.
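A caller-supplied, fixed-length register array could be set up roughly as follows. This sketch assumes glibc's @file{regex.h} with @code{_GNU_SOURCE} and uses @code{re_set_registers}, which that header declares but this manual does not yet describe. The names @code{match_fixed} and @code{NREGS} are hypothetical, and @code{RE_SYNTAX_POSIX_EXTENDED} is selected so that @samp{(} and @samp{)} act as group operators.

```c
#define _GNU_SOURCE            /* expose the GNU interface in glibc's <regex.h> */
#include <assert.h>
#include <regex.h>
#include <string.h>

#define NREGS 4                /* illustrative capacity chosen for this sketch */

/* Match PATTERN at the start of TEXT, recording group offsets in the
   caller's arrays STARTS and ENDS (NREGS elements each).  Returns the
   length of the match, -1 if there is none, or -2 on error.  */
int
match_fixed (const char *pattern, const char *text,
             regoff_t starts[NREGS], regoff_t ends[NREGS])
{
  struct re_pattern_buffer buf;
  struct re_registers regs;

  assert (pattern != NULL && text != NULL);
  memset (&buf, 0, sizeof buf);

  reg_syntax_t old = re_set_syntax (RE_SYNTAX_POSIX_EXTENDED);
  const char *err = re_compile_pattern (pattern, strlen (pattern), &buf);
  re_set_syntax (old);         /* restore the previous global syntax */
  if (err != NULL)
    return -2;

  /* Hand the arrays to the matcher; this marks the buffer REGS_FIXED,
     so the matcher fills them and never allocates or frees them.  */
  re_set_registers (&buf, &regs, NREGS, starts, ends);

  int result = re_match (&buf, text, (int) strlen (text), 0, &regs);
  regfree (&buf);
  return result;
}
```

Because the arrays belong to the caller, nothing needs to be freed besides the pattern buffer itself.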
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1976
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1977 The following examples illustrate the information recorded in the
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1978 @code{re_registers} structure. (In all of them, @samp{(} represents the
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1979 open-group and @samp{)} the close-group operator. The first character
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1980 in the string @var{string} is at index 0.)
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1981
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1982 @c xx i'm not sure this is all true anymore.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1983
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1984 @itemize @bullet
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1985
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
1986 @item
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1987 If the regular expression has an @w{@var{i}-th}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1988 group not contained within another group that matches a
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1989 substring of @var{string}, then the function sets
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1990 @code{@w{@var{regs}->}start[@var{i}]} to the index in @var{string} where
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1991 the substring matched by the @w{@var{i}-th} group begins, and
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1992 @code{@w{@var{regs}->}end[@var{i}]} to the index just beyond that
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1993 substring's end. The function sets @code{@w{@var{regs}->}start[0]} and
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1994 @code{@w{@var{regs}->}end[0]} to analogous information about the entire
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1995 pattern.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1996
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1997 For example, when you match @samp{((a)(b))} against @samp{ab}, you get:
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1998
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
1999 @itemize
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2000 @item
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
2001 0 in @code{@w{@var{regs}->}start[0]} and 2 in @code{@w{@var{regs}->}end[0]}
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2002
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2003 @item
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
2004 0 in @code{@w{@var{regs}->}start[1]} and 2 in @code{@w{@var{regs}->}end[1]}
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2005
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2006 @item
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
2007 0 in @code{@w{@var{regs}->}start[2]} and 1 in @code{@w{@var{regs}->}end[2]}
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2008
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2009 @item
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
2010 1 in @code{@w{@var{regs}->}start[3]} and 2 in @code{@w{@var{regs}->}end[3]}
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2011 @end itemize
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2012
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2013 @item
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2014 If a group matches more than once (as it might if followed by,
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2015 e.g., a repetition operator), then the function reports the information
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2016 about what the group @emph{last} matched.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2017
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2018 For example, when you match the pattern @samp{(a)*} against the string
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2019 @samp{aa}, you get:
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2020
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2021 @itemize
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2022 @item
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
2023 0 in @code{@w{@var{regs}->}start[0]} and 2 in @code{@w{@var{regs}->}end[0]}
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2024
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2025 @item
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
2026 1 in @code{@w{@var{regs}->}start[1]} and 2 in @code{@w{@var{regs}->}end[1]}
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2027 @end itemize
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2028
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2029 @item
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2030 If the @w{@var{i}-th} group does not participate in a
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2031 successful match, e.g., it is an alternative not taken or a
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2032 repetition operator allows zero repetitions of it, then the function
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2033 sets @code{@w{@var{regs}->}start[@var{i}]} and
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2034 @code{@w{@var{regs}->}end[@var{i}]} to @math{-1}.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2035
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2036 For example, when you match the pattern @samp{(a)*b} against
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2037 the string @samp{b}, you get:
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2038
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2039 @itemize
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2040 @item
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
2041 0 in @code{@w{@var{regs}->}start[0]} and 1 in @code{@w{@var{regs}->}end[0]}
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2042
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2043 @item
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
2044 @math{-1} in @code{@w{@var{regs}->}start[1]} and @math{-1} in @code{@w{@var{regs}->}end[1]}
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2045 @end itemize
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2046
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2047 @item
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2048 If the @w{@var{i}-th} group matches a zero-length string, then the
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2049 function sets @code{@w{@var{regs}->}start[@var{i}]} and
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2050 @code{@w{@var{regs}->}end[@var{i}]} to the index just beyond that
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
2051 zero-length string.
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2052
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2053 For example, when you match the pattern @samp{(a*)b} against the string
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2054 @samp{b}, you get:
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2055
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2056 @itemize
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2057 @item
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
2058 0 in @code{@w{@var{regs}->}start[0]} and 1 in @code{@w{@var{regs}->}end[0]}
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2059
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2060 @item
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
2061 0 in @code{@w{@var{regs}->}start[1]} and 0 in @code{@w{@var{regs}->}end[1]}
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2062 @end itemize
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2063
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2064 @ignore
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2065 The function sets @code{@w{@var{regs}->}start[0]} and
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2066 @code{@w{@var{regs}->}end[0]} to analogous information about the entire
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2067 pattern.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2068
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2069 For example, when you match the pattern @samp{(a*)} against the empty
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2070 string, you get:
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2071
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2072 @itemize
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2073 @item
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
2074 0 in @code{@w{@var{regs}->}start[0]} and 0 in @code{@w{@var{regs}->}end[0]}
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2075
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2076 @item
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
2077 0 in @code{@w{@var{regs}->}start[1]} and 0 in @code{@w{@var{regs}->}end[1]}
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2078 @end itemize
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2079 @end ignore
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2080
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2081 @item
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
2082 If an @w{@var{i}-th} group contains a @w{@var{j}-th} group
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2083 in turn not contained within any other group within group @var{i} and
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2084 the function reports a match of the @w{@var{i}-th} group, then it
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2085 records in @code{@w{@var{regs}->}start[@var{j}]} and
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2086 @code{@w{@var{regs}->}end[@var{j}]} the last match (if it matched) of
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2087 the @w{@var{j}-th} group.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2088
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2089 For example, when you match the pattern @samp{((a*)b)*} against the
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2090 string @samp{abb}, @w{group 2} last matches the empty string, so you
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2091 get the offsets of that empty match:
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2092
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2093 @itemize
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2094 @item
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
2095 0 in @code{@w{@var{regs}->}start[0]} and 3 in @code{@w{@var{regs}->}end[0]}
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2096
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2097 @item
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
2098 2 in @code{@w{@var{regs}->}start[1]} and 3 in @code{@w{@var{regs}->}end[1]}
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2099
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2100 @item
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
2101 2 in @code{@w{@var{regs}->}start[2]} and 2 in @code{@w{@var{regs}->}end[2]}
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2102 @end itemize
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2103
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2104 When you match the pattern @samp{((a)*b)*} against the string
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2105 @samp{abb}, @w{group 2} doesn't participate in the last match, so you
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2106 get:
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2107
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2108 @itemize
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2109 @item
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
2110 0 in @code{@w{@var{regs}->}start[0]} and 3 in @code{@w{@var{regs}->}end[0]}
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2111
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2112 @item
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
2113 2 in @code{@w{@var{regs}->}start[1]} and 3 in @code{@w{@var{regs}->}end[1]}
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2114
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2115 @item
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
2116 0 in @code{@w{@var{regs}->}start[2]} and 1 in @code{@w{@var{regs}->}end[2]}
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2117 @end itemize
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2118
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2119 @item
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2120 If an @w{@var{i}-th} group contains a @w{@var{j}-th} group
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2121 in turn not contained within any other group within group @var{i}
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
2122 and the function sets
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
2123 @code{@w{@var{regs}->}start[@var{i}]} and
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2124 @code{@w{@var{regs}->}end[@var{i}]} to @math{-1}, then it also sets
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2125 @code{@w{@var{regs}->}start[@var{j}]} and
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2126 @code{@w{@var{regs}->}end[@var{j}]} to @math{-1}.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2127
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2128 For example, when you match the pattern @samp{((a)*b)*c} against the
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2129 string @samp{c}, you get:
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2130
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2131 @itemize
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2132 @item
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
2133 0 in @code{@w{@var{regs}->}start[0]} and 1 in @code{@w{@var{regs}->}end[0]}
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2134
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2135 @item
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
2136 @math{-1} in @code{@w{@var{regs}->}start[1]} and @math{-1} in @code{@w{@var{regs}->}end[1]}
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2137
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2138 @item
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
2139 @math{-1} in @code{@w{@var{regs}->}start[2]} and @math{-1} in @code{@w{@var{regs}->}end[2]}
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2140 @end itemize
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2141
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2142 @end itemize
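Some of the simpler cases above can be checked with a small helper along these lines. It is only a sketch, assuming glibc's @file{regex.h} with @code{_GNU_SOURCE}; the name @code{group_span} is hypothetical, and @code{RE_SYNTAX_POSIX_EXTENDED} is selected so that @samp{(} and @samp{)} are the group operators, as in the examples.

```c
#define _GNU_SOURCE            /* expose the GNU interface in glibc's <regex.h> */
#include <assert.h>
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Match PATTERN at the start of TEXT and store the offsets recorded for
   group GROUP in *START and *END.  Returns 0 on success, -1 otherwise.  */
int
group_span (const char *pattern, const char *text, unsigned group,
            int *start, int *end)
{
  struct re_pattern_buffer buf;
  struct re_registers regs;
  int status = -1;

  assert (pattern != NULL && text != NULL);
  memset (&buf, 0, sizeof buf);
  memset (&regs, 0, sizeof regs);   /* so free() is safe if nothing matches */

  reg_syntax_t old = re_set_syntax (RE_SYNTAX_POSIX_EXTENDED);
  const char *err = re_compile_pattern (pattern, strlen (pattern), &buf);
  re_set_syntax (old);         /* restore the previous global syntax */
  if (err != NULL)
    return -1;

  if (re_match (&buf, text, (int) strlen (text), 0, &regs) >= 0
      && group < regs.num_regs)
    {
      *start = (int) regs.start[group];
      *end = (int) regs.end[group];
      status = 0;
    }

  free (regs.start);           /* arrays allocated by the matcher */
  free (regs.end);
  regfree (&buf);
  return status;
}
```

For example, asking for group 1 of @samp{(a)*b} matched against @samp{b} should yield @math{-1} for both offsets, while group 1 of @samp{(a*)b} against the same string should yield 0 and 0.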
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2143
13533
ca70a11e70e2 Integrate the regex documentation.
Bruno Haible <bruno@clisp.org>
parents: 13532
diff changeset
2144 @node Freeing GNU Pattern Buffers
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2145 @subsection Freeing GNU Pattern Buffers
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2146
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2147 To free any allocated fields of a pattern buffer, you can use the
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2148 @sc{posix} function described in @ref{Freeing POSIX Pattern Buffers},
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2149 since the type @code{regex_t}---the type for @sc{posix} pattern
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2150 buffers---is equivalent to the type @code{re_pattern_buffer}. After
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2151 freeing a pattern buffer, you need to again compile a regular expression
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2152 in it (@pxref{GNU Regular Expression Compiling}) before passing it to
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2153 a matching or searching function.
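
As a minimal sketch of the free-then-recompile rule, the function below
frees a buffer and compiles into it again before matching.  It assumes the
@sc{posix} entry points @code{regcomp}, @code{regexec} and @code{regfree}
stand in for the @sc{gnu} compiling and searching functions; the name
@code{recompile_after_free} and its parameters are hypothetical.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: compile FIRST into a buffer, free it, then reuse
   the same buffer for SECOND and search SUBJECT with it.  Returns 0 on a
   match, REG_NOMATCH on no match, or a regcomp error code.  */
int
recompile_after_free (const char *first, const char *second,
                      const char *subject)
{
  regex_t buf;
  int rc = regcomp (&buf, first, REG_NOSUB);
  if (rc != 0)
    return rc;          /* no compiled pattern to free on failure */

  regfree (&buf);       /* buf no longer holds a usable pattern */

  /* The buffer must be compiled again before any matching call.  */
  rc = regcomp (&buf, second, REG_NOSUB);
  if (rc != 0)
    return rc;

  rc = regexec (&buf, subject, 0, NULL, 0);
  regfree (&buf);
  return rc;
}
```

Calling @code{regexec} between the first @code{regfree} and the second
@code{regcomp} would be an error, which is why the sketch never does so.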
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2154
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2155
13533
ca70a11e70e2 Integrate the regex documentation.
Bruno Haible <bruno@clisp.org>
parents: 13532
diff changeset
2156 @node POSIX Regex Functions
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2157 @section POSIX Regex Functions
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2158
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2159 If you're writing code that has to be @sc{posix} compatible, you'll need
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2160 to use these functions. Their interfaces are as specified by @sc{posix},
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2161 draft 1003.2/D11.2.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2162
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2163 @menu
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2164 * POSIX Pattern Buffers:: The regex_t type.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2165 * POSIX Regular Expression Compiling:: regcomp ()
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2166 * POSIX Matching:: regexec ()
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2167 * Reporting Errors:: regerror ()
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2168 * Using Byte Offsets:: The regmatch_t type.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2169 * Freeing POSIX Pattern Buffers:: regfree ()
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2170 @end menu
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2171
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2172
13533
ca70a11e70e2 Integrate the regex documentation.
Bruno Haible <bruno@clisp.org>
parents: 13532
diff changeset
2173 @node POSIX Pattern Buffers
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2174 @subsection POSIX Pattern Buffers
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2175
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2176 To compile or match a given regular expression the @sc{posix} way, you
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2177 must supply a pattern buffer exactly the way you do for @sc{gnu}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2178 (@pxref{GNU Pattern Buffers}). @sc{posix} pattern buffers have type
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2179 @code{regex_t}, which is equivalent to the @sc{gnu} pattern buffer
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2180 type @code{re_pattern_buffer}.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2181
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2182
13533
ca70a11e70e2 Integrate the regex documentation.
Bruno Haible <bruno@clisp.org>
parents: 13532
diff changeset
2183 @node POSIX Regular Expression Compiling
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2184 @subsection POSIX Regular Expression Compiling
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2185
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2186 With @sc{posix}, you can only search for a given regular expression; you
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2187 can't match it. To do this, you must first compile it in a
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2188 pattern buffer, using @code{regcomp}.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2189
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2190 @ignore
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2191 Before calling @code{regcomp}, you must initialize this pattern buffer
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2192 as you do for @sc{gnu} (@pxref{GNU Regular Expression Compiling}). See
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2193 below, however, for how to choose a syntax with which to compile.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2194 @end ignore
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2195
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2196 To compile a pattern buffer, use:
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2197
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2198 @findex regcomp
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2199 @example
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2200 int
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2201 regcomp (regex_t *@var{preg}, const char *@var{regex}, int @var{cflags})
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2202 @end example
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2203
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2204 @noindent
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2205 @var{preg} is the initialized pattern buffer's address, @var{regex} is
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2206 the regular expression's address, and @var{cflags} is the compilation
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2207 flags, which Regex considers as a collection of bits. Here are the
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2208 valid bits, as defined in @file{regex.h}:
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2209
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2210 @table @code
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2211
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2212 @item REG_EXTENDED
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2213 @vindex REG_EXTENDED
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2214 says to use @sc{posix} Extended Regular Expression syntax; if this isn't
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2215 set, it says to use @sc{posix} Basic Regular Expression syntax.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2216 @code{regcomp} sets @var{preg}'s @code{syntax} field accordingly.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2217
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2218 @item REG_ICASE
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2219 @vindex REG_ICASE
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2220 @cindex ignoring case
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2221 says to ignore case; @code{regcomp} sets @var{preg}'s @code{translate}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2222 field to a translate table which ignores case, replacing anything you've
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2223 put there before.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2224
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2225 @item REG_NOSUB
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2226 @vindex REG_NOSUB
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2227 says to set @var{preg}'s @code{no_sub} field; @pxref{POSIX Matching},
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2228 for what this means.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2229
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2230 @item REG_NEWLINE
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2231 @vindex REG_NEWLINE
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2232 says that a:
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2233
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2234 @itemize @bullet
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2235
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2236 @item
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2237 match-any-character operator (@pxref{Match-any-character
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2238 Operator}) doesn't match a newline.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2239
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2240 @item
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2241 nonmatching list not containing a newline (@pxref{List
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2242 Operators}) doesn't match a newline.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2243
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2244 @item
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2245 match-beginning-of-line operator (@pxref{Match-beginning-of-line
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2246 Operator}) matches the empty string immediately after a newline,
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2247 regardless of how @code{REG_NOTBOL} is set (@pxref{POSIX Matching}, for
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2248 an explanation of @code{REG_NOTBOL}).
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2249
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2250 @item
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2251 match-end-of-line operator (@pxref{Match-end-of-line
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2252 Operator}) matches the empty string immediately before a newline,
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2253 regardless of how @code{REG_NOTEOL} is set (@pxref{POSIX Matching},
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2254 for an explanation of @code{REG_NOTEOL}).
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2255
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2256 @end itemize
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2257
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2258 @end table
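
The effect of @code{REG_NEWLINE} on the match-any-character and anchor
operators can be observed with a small sketch such as the following.  The
helper name @code{found_with_flags} is hypothetical, and the sketch uses
@code{regexec} (@pxref{POSIX Matching}) only to report whether a match was
found.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: return 1 if PATTERN (basic syntax) is found in
   STRING when compiled with CFLAGS, 0 if it is not found, and -1 if
   PATTERN fails to compile.  */
int
found_with_flags (const char *pattern, const char *string, int cflags)
{
  regex_t preg;
  if (regcomp (&preg, pattern, cflags | REG_NOSUB) != 0)
    return -1;
  int rc = regexec (&preg, string, 0, NULL, 0);
  regfree (&preg);
  return rc == 0;
}
```

For the two-line string @code{"a\nb"}, the pattern @samp{a.b} is found
without @code{REG_NEWLINE} but not with it, while @samp{^b} and @samp{a$}
are found only when @code{REG_NEWLINE} is set.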
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2259
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2260 If @code{regcomp} successfully compiles the regular expression, it
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2261 returns zero and sets @code{*@var{preg}} to the compiled
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2262 pattern. Except for @code{syntax} (which it sets as explained above), it
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2263 also sets the same fields the same way as does the @sc{gnu} compiling
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2264 function (@pxref{GNU Regular Expression Compiling}).
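
One field that a successful compilation fills in is @code{re_nsub}, the
number of parenthesized groups.  The sketch below reads it to show how
@code{REG_EXTENDED} changes the way the same pattern text is interpreted.
The helper name @code{count_groups} is hypothetical.

```c
#include <regex.h>

/* Hypothetical helper: return the number of parenthesized groups that
   regcomp found in PATTERN under CFLAGS, or -1 if PATTERN does not
   compile.  */
long
count_groups (const char *pattern, int cflags)
{
  regex_t preg;
  if (regcomp (&preg, pattern, cflags) != 0)
    return -1;                       /* nonzero means an error code */
  long n = (long) preg.re_nsub;      /* set by a successful regcomp */
  regfree (&preg);
  return n;
}
```

With @code{REG_EXTENDED}, the text @samp{(a)(b)} contains two groups; in
basic syntax the parentheses are ordinary characters, so the count is
zero, and the groups are written @samp{\(a\)\(b\)} instead.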
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2265
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2266 If @code{regcomp} can't compile the regular expression, it returns one
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2267 of the error codes listed here. (Except when noted differently, the
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2268 syntax in all examples below is basic regular expression syntax.)
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2269
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2270 @table @code
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2271
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2272 @comment repetitions
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2273 @item REG_BADRPT
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2274 For example, the consecutive repetition operators @samp{**} in
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2275 @samp{a**} are invalid. As another example, if the syntax is extended
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2276 regular expression syntax, then the repetition operator @samp{*} with
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2277 nothing on which to operate in @samp{*} is invalid.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2278
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2279 @item REG_BADBR
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2280 For example, the @var{count} @samp{-1} in @samp{a\@{-1} is invalid.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2281
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2282 @item REG_EBRACE
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2283 For example, @samp{a\@{1} is missing a close-interval operator.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2284
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2285 @comment lists
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2286 @item REG_EBRACK
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2287 For example, @samp{[a} is missing a close-list operator.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2288
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2289 @item REG_ERANGE
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2290 For example, the range ending point @samp{z} that collates lower than
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2291 does its starting point @samp{a} in @samp{[z-a]} is invalid. Also, the
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2292 range with the character class @samp{[:alpha:]} as its starting point in
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2293 @samp{[[:alpha:]-|]} is invalid.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2294
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2295 @item REG_ECTYPE
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2296 For example, the character class name @samp{foo} in @samp{[[:foo:]]} is
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2297 invalid.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2298
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2299 @comment groups
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2300 @item REG_EPAREN
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2301 For example, @samp{a\)} is missing an open-group operator and @samp{\(a}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2302 is missing a close-group operator.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2303
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2304 @item REG_ESUBREG
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2305 For example, the back reference @samp{\2} that refers to a nonexistent
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2306 subexpression in @samp{\(a\)\2} is invalid.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2307
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2308 @comment unfinished business
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2309
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2310 @item REG_EEND
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2311 Returned when a regular expression causes no other more specific error.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2312
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2313 @item REG_EESCAPE
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2314 For example, the trailing backslash @samp{\} in @samp{a\} is invalid, as is the
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2315 one in @samp{\}.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2316
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2317 @comment kitchen sink
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2318 @item REG_BADPAT
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2319 For example, in the extended regular expression syntax, the empty group
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2320 @samp{()} in @samp{a()b} is invalid.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2321
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2322 @comment internal
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2323 @item REG_ESIZE
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2324 Returned when a regular expression needs a pattern buffer larger than
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2325 65536 bytes.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2326
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2327 @item REG_ESPACE
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2328 Returned when a regular expression causes Regex to run out of memory.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2329
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2330 @end table
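
A caller typically checks the return value of @code{regcomp} and reports
the failure.  The following is a minimal sketch, assuming a fixed
128-byte message buffer is large enough; the helper name
@code{try_compile} is hypothetical, and @code{regerror} is the
@sc{posix} function that turns a code into a message
(@pxref{Reporting Errors}).

```c
#include <regex.h>
#include <stdio.h>

/* Hypothetical helper: try to compile PATTERN with CFLAGS.  On failure,
   print the message for the error code to stderr.  Returns the value
   that regcomp returned (zero on success).  */
int
try_compile (const char *pattern, int cflags)
{
  regex_t preg;
  int rc = regcomp (&preg, pattern, cflags);
  if (rc == 0)
    {
      regfree (&preg);
      return 0;
    }

  char msg[128];   /* assumed large enough; longer text is truncated */
  regerror (rc, &preg, msg, sizeof msg);
  fprintf (stderr, "cannot compile \"%s\": %s\n", pattern, msg);
  return rc;
}
```

Patterns such as @samp{[a}, @samp{\(a} and @samp{a\} make this function
return a nonzero code.  Which code a given implementation reports for a
given pattern can vary, so compare the result against the constants above
rather than against literal numbers.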
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2331
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2332
13533
ca70a11e70e2 Integrate the regex documentation.
Bruno Haible <bruno@clisp.org>
parents: 13532
diff changeset
2333 @node POSIX Matching
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
2334 @subsection POSIX Matching
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2335
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2336 Matching the @sc{posix} way means trying to match a null-terminated
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2337 string starting at its first character. Once you've compiled a pattern
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2338 into a pattern buffer (@pxref{POSIX Regular Expression Compiling}), you
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2339 can ask the matcher to match that pattern against a string using:
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2340
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2341 @findex regexec
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2342 @example
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2343 int
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
2344 regexec (const regex_t *@var{preg}, const char *@var{string},
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2345 size_t @var{nmatch}, regmatch_t @var{pmatch}[], int @var{eflags})
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2346 @end example
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2347
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2348 @noindent
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2349 @var{preg} is the address of a pattern buffer for a compiled pattern.
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
2350 @var{string} is the string you want to match.
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2351
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2352 @xref{Using Byte Offsets}, for an explanation of @var{pmatch}. If you
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2353 pass zero for @var{nmatch} or you compiled @var{preg} with the
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2354 compilation flag @code{REG_NOSUB} set, then @code{regexec} will ignore
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2355 @var{pmatch}; otherwise, you must allocate it to have at least
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2356 @var{nmatch} elements. @code{regexec} will record @var{nmatch} byte
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2357 offsets in @var{pmatch}, and set to @math{-1} any unused elements up to
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2358 @code{@var{pmatch}[@var{nmatch} - 1]}.
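
The following sketch shows one way to collect the offsets.  It assumes
the standard @code{regmatch_t} members @code{rm_so} and @code{rm_eo}
(@pxref{Using Byte Offsets}) and compiles with @code{REG_EXTENDED}; the
helper name @code{search_with_offsets} is hypothetical.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: search STRING for the extended regular expression
   PATTERN and let regexec fill PMATCH[0] through PMATCH[NMATCH - 1].
   Returns regexec's result, or regcomp's error code if PATTERN does not
   compile.  */
int
search_with_offsets (const char *pattern, const char *string,
                     size_t nmatch, regmatch_t pmatch[])
{
  regex_t preg;
  int rc = regcomp (&preg, pattern, REG_EXTENDED);
  if (rc != 0)
    return rc;
  rc = regexec (&preg, string, nmatch, pmatch, 0);
  regfree (&preg);
  return rc;
}
```

Searching @samp{xac} for @samp{a(b)?(c)} with a four-element array gives
offsets 1 and 3 for the whole match in element 0 and offsets 2 and 3 for
the second group in element 2.  Element 1 (a group that took no part in
the match) and element 3 (no such group) both hold @math{-1}.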
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2359
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2360 @var{eflags} specifies @dfn{execution flags}---namely, the two bits
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2361 @code{REG_NOTBOL} and @code{REG_NOTEOL} (defined in @file{regex.h}). If
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2362 you set @code{REG_NOTBOL}, then the match-beginning-of-line operator
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2363 (@pxref{Match-beginning-of-line Operator}) always fails to match.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2364 This lets you match against pieces of a line, as you would need to if,
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2365 say, searching for repeated instances of a given pattern in a line; it
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2366 would work correctly for patterns both with and without
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2367 match-beginning-of-line operators. @code{REG_NOTEOL} works analogously
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2368 for the match-end-of-line operator (@pxref{Match-end-of-line
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2369 Operator}); it exists for symmetry.
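
The repeated-search use of @code{REG_NOTBOL} described above can be
sketched as follows.  This is an illustration under stated assumptions,
not the library's own method: the helper name @code{count_in_line} is
hypothetical, the pattern is compiled in basic syntax, and an empty match
is stepped over one byte at a time.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: count non-overlapping occurrences of the basic
   regular expression PATTERN in LINE.  Returns -1 if PATTERN does not
   compile.  */
int
count_in_line (const char *pattern, const char *line)
{
  regex_t preg;
  regmatch_t m[1];
  int count = 0;
  int eflags = 0;               /* the first search starts at the true
                                   beginning of the line */
  if (regcomp (&preg, pattern, 0) != 0)
    return -1;

  const char *p = line;
  while (regexec (&preg, p, 1, m, eflags) == 0)
    {
      count++;
      if (p[m[0].rm_eo] == '\0')
        break;                  /* the match reached the end of LINE */
      /* Resume after the match; step one byte past an empty match at
         the start so the loop always makes progress.  */
      p += (m[0].rm_eo > 0) ? m[0].rm_eo : 1;
      eflags = REG_NOTBOL;      /* P is now in the middle of the line */
    }

  regfree (&preg);
  return count;
}
```

With @code{REG_NOTBOL} set on every search after the first, the pattern
@samp{^a} is counted once in @samp{aaa}, whereas @samp{a} is counted
three times.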
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2370
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2371 @code{regexec} tries to find a match for @var{preg} in @var{string}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2372 according to the syntax in @var{preg}'s @code{syntax} field.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2373 (@xref{POSIX Regular Expression Compiling}, for how to set it.) The
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2374 function returns zero if the compiled pattern matches @var{string} and
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2375 @code{REG_NOMATCH} (defined in @file{regex.h}) if it doesn't.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2376
13533
ca70a11e70e2 Integrate the regex documentation.
Bruno Haible <bruno@clisp.org>
parents: 13532
diff changeset
2377 @node Reporting Errors
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2378 @subsection Reporting Errors
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2379
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2380 If either @code{regcomp} or @code{regexec} fails, it returns a nonzero
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2381 error code, the possibilities for which are defined in @file{regex.h}.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2382 @xref{POSIX Regular Expression Compiling}, and @ref{POSIX Matching}, for
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2383 what these codes mean. To get an error string corresponding to these
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2384 codes, you can use:
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2385
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2386 @findex regerror
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2387 @example
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2388 size_t
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2389 regerror (int @var{errcode},
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2390 const regex_t *@var{preg},
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2391 char *@var{errbuf},
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2392 size_t @var{errbuf_size})
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2393 @end example
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2394
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2395 @noindent
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2396 @var{errcode} is an error code, @var{preg} is the address of the pattern
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2397 buffer which provoked the error, @var{errbuf} is the error buffer, and
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2398 @var{errbuf_size} is @var{errbuf}'s size.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2399
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2400 @code{regerror} returns the size in bytes of the error string
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2401 corresponding to @var{errcode} (including its terminating null). If
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2402 @var{errbuf} and @var{errbuf_size} are nonzero, it also returns in
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2403 @var{errbuf} the first @math{@var{errbuf_size} - 1} characters of the
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
2404 error string, followed by a null.
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2405 @var{errbuf_size} must be a nonnegative number less than or equal to the
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2406 size in bytes of @var{errbuf}.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2407
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2408 You can call @code{regerror} with a null @var{errbuf} and a zero
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2409 @var{errbuf_size} to determine how large @var{errbuf} need be to
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2410 accommodate @code{regerror}'s error string.
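The two-call idiom described above can be sketched as follows. This is a minimal illustration, not part of the library: the helper name @code{compile_error_message} and its calling convention are assumptions made for the example. It asks @code{regerror} for the required size first, then allocates a buffer of exactly that size and fetches the message.

```c
#include <regex.h>
#include <stdlib.h>

/* Hypothetical helper (not part of Regex): try to compile PATTERN.
   On failure, return a malloc'd copy of the regerror message, which
   the caller must free.  Return NULL if compilation succeeds or if
   memory runs out.  */
char *
compile_error_message (const char *pattern, int cflags)
{
  regex_t preg;
  int errcode = regcomp (&preg, pattern, cflags);

  if (errcode == 0)
    {
      /* No error to report; release the compiled pattern.  */
      regfree (&preg);
      return NULL;
    }

  /* First call: a null buffer and a zero size only report how many
     bytes the message needs, including its terminating null.  */
  size_t needed = regerror (errcode, &preg, NULL, 0);

  char *errbuf = malloc (needed);
  if (errbuf != NULL)
    /* Second call: copy the message into the buffer.  */
    regerror (errcode, &preg, errbuf, needed);

  return errbuf;
}
```

A caller might pass a malformed pattern such as @code{"[a"} (an unterminated bracket expression), print the returned string, and then free it.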
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2411
13533
ca70a11e70e2 Integrate the regex documentation.
Bruno Haible <bruno@clisp.org>
parents: 13532
diff changeset
2412 @node Using Byte Offsets
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2413 @subsection Using Byte Offsets
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2414
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2415 In @sc{posix}, variables of type @code{regmatch_t} hold information
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2416 analogous, but not identical, to @sc{gnu}'s registers (@pxref{Using
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2417 Registers}). To get information about registers in @sc{posix}, pass to
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2418 @code{regexec} a nonzero @var{pmatch} of type @code{regmatch_t *}, i.e.,
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2419 the address of an array of structures of this type, defined in
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2420 @file{regex.h}:
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2421
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2422 @tindex regmatch_t
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2423 @example
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2424 typedef struct
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2425 @{
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2426 regoff_t rm_so;
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2427 regoff_t rm_eo;
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2428 @} regmatch_t;
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2429 @end example
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2430
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2431 When reading in @ref{Using Registers}, about how the matching function
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2432 stores the information into the registers, substitute @var{pmatch} for
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2433 @var{regs}, @code{@w{@var{pmatch}[@var{i}].}rm_so} for
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2434 @code{@w{@var{regs}->}start[@var{i}]} and
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2435 @code{@w{@var{pmatch}[@var{i}].}rm_eo} for
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2436 @code{@w{@var{regs}->}end[@var{i}]}.
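As an illustration of reading byte offsets, here is a minimal sketch. The helper name @code{match_offsets} is an assumption made for this example; only @code{regcomp}, @code{regexec}, @code{regfree} and the @code{regmatch_t} fields come from the interface described here. Element 0 of @var{pmatch} describes the whole match, and element @var{i} describes the @var{i}-th parenthesized group.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper (not part of Regex): compile PATTERN as a POSIX
   extended regexp, search STRING for it, and store up to NMATCH pairs
   of byte offsets in PMATCH.  Return 0 on a match; otherwise return
   the nonzero code from regcomp or regexec.  */
int
match_offsets (const char *pattern, const char *string,
               size_t nmatch, regmatch_t pmatch[])
{
  regex_t preg;
  int err = regcomp (&preg, pattern, REG_EXTENDED);

  if (err != 0)
    return err;

  /* On success, pmatch[i].rm_so is the offset of the first byte of
     match I and pmatch[i].rm_eo is the offset just past its end.  */
  err = regexec (&preg, string, nmatch, pmatch, 0);

  regfree (&preg);
  return err;
}
```

For instance, searching @code{"aabbcd"} for @code{"b+(c)"} with an array of two elements yields offsets 2 and 5 for the whole match and 4 and 5 for the group.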
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2437
13533
ca70a11e70e2 Integrate the regex documentation.
Bruno Haible <bruno@clisp.org>
parents: 13532
diff changeset
2438 @node Freeing POSIX Pattern Buffers
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2439 @subsection Freeing POSIX Pattern Buffers
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2440
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2441 To free any allocated fields of a pattern buffer, use:
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2442
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2443 @findex regfree
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2444 @example
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
2445 void
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2446 regfree (regex_t *@var{preg})
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2447 @end example
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2448
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2449 @noindent
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2450 @var{preg} is the pattern buffer whose allocated fields you want freed.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2451 @code{regfree} also sets @var{preg}'s @code{allocated} and @code{used}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2452 fields to zero. After freeing a pattern buffer, you need to again
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2453 compile a regular expression in it (@pxref{POSIX Regular Expression
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2454 Compiling}) before passing it to the matching function (@pxref{POSIX
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2455 Matching}).
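The compile, match, free cycle can be sketched as follows. The helper @code{matches_either} is hypothetical and exists only for this example; it reuses a single pattern buffer for two regular expressions, freeing it after each use and compiling again before the next match.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper (not part of Regex): return 1 if STRING matches
   either of two POSIX extended regexps, and 0 otherwise.  One pattern
   buffer is reused for both regexps.  */
int
matches_either (const char *first, const char *second, const char *string)
{
  const char *patterns[2] = { first, second };
  regex_t preg;
  int found = 0;

  for (int i = 0; i < 2 && !found; i++)
    {
      /* Skip a pattern that fails to compile.  */
      if (regcomp (&preg, patterns[i], REG_EXTENDED | REG_NOSUB) != 0)
        continue;

      found = regexec (&preg, string, 0, NULL, 0) == 0;

      /* Release the buffer's allocated fields.  It must be compiled
         again before it is passed to regexec.  */
      regfree (&preg);
    }

  return found;
}
```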
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2456
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2457
13533
ca70a11e70e2 Integrate the regex documentation.
Bruno Haible <bruno@clisp.org>
parents: 13532
diff changeset
2458 @node BSD Regex Functions
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2459 @section BSD Regex Functions
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2460
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2461 If you're writing code that has to be Berkeley @sc{unix} compatible,
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2462 you'll need to use these functions, whose interfaces are the same as those
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
2463 in Berkeley @sc{unix}.
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2464
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2465 @menu
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2466 * BSD Regular Expression Compiling:: re_comp ()
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2467 * BSD Searching:: re_exec ()
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2468 @end menu
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2469
13533
ca70a11e70e2 Integrate the regex documentation.
Bruno Haible <bruno@clisp.org>
parents: 13532
diff changeset
2470 @node BSD Regular Expression Compiling
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2471 @subsection BSD Regular Expression Compiling
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2472
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2473 With Berkeley @sc{unix}, you can only search for a given regular
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2474 expression; you can't match one. To search for it, you must first
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2475 compile it. Before you compile it, you must indicate the regular
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
2476 expression syntax you want it compiled according to by setting the
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2477 variable @code{re_syntax_options} (declared in @file{regex.h}) to some
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2478 syntax (@pxref{Regular Expression Syntax}).
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2479
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2480 To compile a regular expression, use:
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2481
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2482 @findex re_comp
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2483 @example
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2484 char *
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2485 re_comp (char *@var{regex})
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2486 @end example
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2487
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2488 @noindent
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2489 @var{regex} is the address of a null-terminated regular expression.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2490 @code{re_comp} uses an internal pattern buffer, so you can use only the
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2491 most recently compiled pattern buffer. This means that if you want to
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2492 use a given regular expression that you've already compiled---but it
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2493 isn't the latest one you've compiled---you'll have to recompile it. If
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2494 you call @code{re_comp} with a null pointer (@emph{not} the empty
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2495 string) as the argument, it doesn't change the contents of the pattern
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2496 buffer.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2497
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2498 If @code{re_comp} successfully compiles the regular expression, it
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2499 returns a null pointer. If it can't compile the regular expression, it returns
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2500 an error string. @code{re_comp}'s error messages are identical to those
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2501 of @code{re_compile_pattern} (@pxref{GNU Regular Expression
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2502 Compiling}).
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2503
13533
ca70a11e70e2 Integrate the regex documentation.
Bruno Haible <bruno@clisp.org>
parents: 13532
diff changeset
2504 @node BSD Searching
13532
b0bea693e638 Whitespace cleanup.
Bruno Haible <bruno@clisp.org>
parents: 13531
diff changeset
2505 @subsection BSD Searching
13531
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2506
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2507 Searching the Berkeley @sc{unix} way means searching in a string
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2508 starting at its first character and trying successive positions within
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2509 it to find a match. Once you've compiled a pattern using @code{re_comp}
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2510 (@pxref{BSD Regular Expression Compiling}), you can ask Regex
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2511 to search for that pattern in a string using:
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2512
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2513 @findex re_exec
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2514 @example
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2515 int
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2516 re_exec (char *@var{string})
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2517 @end example
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2518
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2519 @noindent
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2520 @var{string} is the address of the null-terminated string in which you
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2521 want to search.
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2522
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2523 @code{re_exec} returns either 1 for success or 0 for failure. It
de7ebb2f1530 Add regex documentation.
Bruno Haible <bruno@clisp.org>
parents:
diff changeset
2524 automatically uses a @sc{gnu} fastmap (@pxref{Searching with Fastmaps}).