octave-kai/gnulib-hg: doc/regex.texi annotate

annotate doc/regex.texi @ 13553:8fc3314fe460

Document not_eol and remove mention of regex.c.

author	Reuben Thomas <rrt@sc3d.org>
date	Sat, 14 Aug 2010 16:40:16 +0100
parents	bb0ceefd22dc
children	3a3b9d29af1b

rev	line source
13533 ca70a11e70e2 Integrate the regex documentation. Bruno Haible <bruno@clisp.org> parents: 13532 diff changeset	1 @node Overview
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2 @chapter Overview
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	3
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	4 A @dfn{regular expression} (or @dfn{regexp}, or @dfn{pattern}) is a text
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	5 string that describes some (mathematical) set of strings. A regexp
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	6 @var{r} @dfn{matches} a string @var{s} if @var{s} is in the set of
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	7 strings described by @var{r}.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	8
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	9 Using the Regex library, you can:
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	10
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	11 @itemize @bullet
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	12
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	13 @item
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	14 see if a string matches a specified pattern as a whole, and
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	15
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	16 @item
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	17 search within a string for a substring matching a specified pattern.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	18
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	19 @end itemize
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	20
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	21 Some regular expressions match only one string, i.e., the set they
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	22 describe has only one member. For example, the regular expression
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	23 @samp{foo} matches the string @samp{foo} and no others. Other regular
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	24 expressions match more than one string, i.e., the set they describe has
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	25 more than one member. For example, the regular expression @samp{f*}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	26 matches the set of strings made up of any number (including zero) of
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	27 @samp{f}s. As you can see, some characters in regular expressions match
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	28 themselves (such as @samp{f}) and some don't (such as @samp{*}); the
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	29 ones that don't match themselves instead let you specify patterns that
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	30 describe many different strings.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	31
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	32 To either match or search for a regular expression with the Regex
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	33 library functions, you must first compile it with a Regex pattern
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	34 compiling function. A @dfn{compiled pattern} is a regular expression
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	35 converted to the internal format used by the library functions. Once
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	36 you've compiled a pattern, you can use it for matching or searching any
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	37 number of times.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	38
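The compile-once, use-many-times flow described above can be sketched with the @sc{gnu} functions as follows. This is a minimal sketch that assumes the GNU C Library's @file{regex.h} (which exposes the @sc{gnu} interface only when @code{_GNU_SOURCE} is defined); the helper names @code{compile_pattern}, @code{matches_whole} and @code{search_start} are hypothetical and not part of Regex.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* assumption: glibc hides the GNU interface without this */
#endif
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Hypothetical helper: compile PATTERN into BUF using syntax value 0
   (RE_SYNTAX_EMACS).  Returns 0 on success, -1 if PATTERN is invalid.  */
int
compile_pattern (struct re_pattern_buffer *buf, const char *pattern)
{
  assert (buf != NULL && pattern != NULL);
  memset (buf, 0, sizeof *buf);   /* start with no buffer, fastmap or translate table */
  reg_syntax_t old = re_set_syntax (RE_SYNTAX_EMACS);
  const char *err = re_compile_pattern (pattern, strlen (pattern), buf);
  re_set_syntax (old);            /* restore the previous global syntax */
  return err == NULL ? 0 : -1;
}

/* Return 1 if the compiled pattern matches all of S, else 0.  */
int
matches_whole (struct re_pattern_buffer *buf, const char *s)
{
  regoff_t len = (regoff_t) strlen (s);
  return re_match (buf, s, len, 0, NULL) == len;
}

/* Return the index in S where the first match starts, -1 if there is
   no match, or -2 if the library reports an internal error.  */
int
search_start (struct re_pattern_buffer *buf, const char *s)
{
  regoff_t len = (regoff_t) strlen (s);
  return (int) re_search (buf, s, len, 0, len, NULL);
}
```

After one call to @code{compile_pattern} with @samp{foo}, the same buffer can be passed to @code{matches_whole} and @code{search_start} repeatedly; release it with @code{regfree} when it is no longer needed.
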
13553 8fc3314fe460 Document not_eol and remove mention of regex.c. Reuben Thomas <rrt@sc3d.org> parents: 13549 diff changeset	39 The Regex library is used by including @file{regex.h}.
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	40 @pindex regex.h
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	41 Regex provides three groups of functions with which you can operate on
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	42 regular expressions. One group---the @sc{gnu} group---is more powerful
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	43 but not completely compatible with the other two, namely the @sc{posix}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	44 and Berkeley @sc{unix} groups; its interface was designed specifically
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	45 for @sc{gnu}. The other groups have the same interfaces as do the
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	46 regular expression functions in @sc{posix} and Berkeley
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	47 @sc{unix}.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	48
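For comparison with the @sc{gnu} group, here is a minimal sketch of the @sc{posix} group, which needs only the standard @code{regcomp}, @code{regexec} and @code{regfree} functions. The wrapper name @code{posix_contains} is hypothetical, and the choice of @code{REG_EXTENDED} is an assumption made for the example.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical wrapper: return 1 if S contains a substring matching the
   POSIX extended regular expression PATTERN, 0 if it does not, and -1
   if PATTERN cannot be compiled.  */
int
posix_contains (const char *pattern, const char *s)
{
  regex_t re;
  assert (pattern != NULL && s != NULL);
  /* REG_NOSUB: only a yes/no answer is wanted, not match positions.  */
  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;
  int rc = regexec (&re, s, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}
```
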
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	49 We wrote this chapter with programmers in mind, not users of
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	50 programs---such as Emacs---that use Regex. We describe the Regex
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	51 library in its entirety, not how to write regular expressions that a
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	52 particular program understands.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	53
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	54
13533 ca70a11e70e2 Integrate the regex documentation. Bruno Haible <bruno@clisp.org> parents: 13532 diff changeset	55 @node Regular Expression Syntax
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	56 @chapter Regular Expression Syntax
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	57
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	58 @cindex regular expressions, syntax of
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	59 @cindex syntax of regular expressions
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	60
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	61 @dfn{Characters} are things you can type. @dfn{Operators} are things in
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	62 a regular expression that match one or more characters. You compose
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	63 regular expressions from operators, which in turn you specify using one
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	64 or more characters.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	65
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	66 Most characters represent what we call the match-self operator, i.e.,
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	67 they match themselves; we call these characters @dfn{ordinary}. Other
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	68 characters represent either all or parts of fancier operators; e.g.,
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	69 @samp{.} represents what we call the match-any-character operator
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	70 (which, no surprise, matches (almost) any character); we call these
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	71 characters @dfn{special}. Two different things determine what
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	72 characters represent what operators:
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	73
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	74 @enumerate
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	75 @item
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	76 the regular expression syntax your program has told the Regex library to
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	77 recognize, and
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	78
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	79 @item
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	80 the context of the character in the regular expression.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	81 @end enumerate
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	82
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	83 In the following sections, we describe these things in more detail.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	84
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	85 @menu
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	86 * Syntax Bits::
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	87 * Predefined Syntaxes::
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	88 * Collating Elements vs. Characters::
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	89 * The Backslash Character::
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	90 @end menu
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	91
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	92
13533 ca70a11e70e2 Integrate the regex documentation. Bruno Haible <bruno@clisp.org> parents: 13532 diff changeset	93 @node Syntax Bits
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	94 @section Syntax Bits
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	95
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	96 @cindex syntax bits
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	97
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	98 In any particular syntax for regular expressions, some characters are
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	99 always special, others are sometimes special, and others are never
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	100 special. The particular syntax that Regex recognizes for a given
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	101 regular expression depends on the value in the @code{syntax} field of
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	102 the pattern buffer of that regular expression.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	103
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	104 You get a pattern buffer by compiling a regular expression. @xref{GNU
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	105 Pattern Buffers}, and @ref{POSIX Pattern Buffers}, for more information
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	106 on pattern buffers. @xref{GNU Regular Expression Compiling}, @ref{POSIX
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	107 Regular Expression Compiling}, and @ref{BSD Regular Expression
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	108 Compiling}, for more information on compiling.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	109
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	110 Regex considers the value of the @code{syntax} field to be a collection
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	111 of bits; we refer to these bits as @dfn{syntax bits}. In most cases,
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	112 they affect what characters represent what operators. We describe the
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	113 meanings of the operators to which we refer in @ref{Common Operators},
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	114 @ref{GNU Operators}, and @ref{GNU Emacs Operators}.
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	115
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	116 For reference, here is the complete list of syntax bits, in alphabetical
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	117 order:
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	118
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	119 @table @code
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	120
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	121 @cnindex RE_BACKSLASH_ESCAPE_IN_LISTS
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	122 @item RE_BACKSLASH_ESCAPE_IN_LISTS
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	123 If this bit is set, then @samp{\} inside a list (@pxref{List Operators})
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	124 quotes (makes ordinary, if it's special) the following character; if
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	125 this bit isn't set, then @samp{\} is an ordinary character inside lists.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	126 (@xref{The Backslash Character}, for what @samp{\} does outside of lists.)
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	127
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	128 @cnindex RE_BK_PLUS_QM
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	129 @item RE_BK_PLUS_QM
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	130 If this bit is set, then @samp{\+} represents the match-one-or-more
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	131 operator and @samp{\?} represents the match-zero-or-one operator; if
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	132 this bit isn't set, then @samp{+} represents the match-one-or-more
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	133 operator and @samp{?} represents the match-zero-or-one operator. This
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	134 bit is irrelevant if @code{RE_LIMITED_OPS} is set.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	135
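The effect of @code{RE_BK_PLUS_QM} and @code{RE_LIMITED_OPS} can be sketched with a small test helper. This assumes the GNU C Library's @file{regex.h} with @code{_GNU_SOURCE} defined; @code{match_length} is a hypothetical helper, not a Regex function.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* assumption: glibc hides the GNU interface without this */
#endif
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Hypothetical helper: compile PATTERN under SYNTAX and return how many
   characters of S it matches starting at index 0 (-1 for no match,
   -2 for an invalid pattern or an internal error).  */
int
match_length (reg_syntax_t syntax, const char *pattern, const char *s)
{
  struct re_pattern_buffer buf;
  assert (pattern != NULL && s != NULL);
  memset (&buf, 0, sizeof buf);
  reg_syntax_t old = re_set_syntax (syntax);
  const char *err = re_compile_pattern (pattern, strlen (pattern), &buf);
  re_set_syntax (old);   /* restore the previous global syntax */
  int n = err != NULL
          ? -2
          : (int) re_match (&buf, s, (regoff_t) strlen (s), 0, NULL);
  regfree (&buf);
  return n;
}
```

With @code{RE_BK_PLUS_QM} set, the C string @code{"a\\+"} should match all of @samp{aaa}, while @code{"a+"} should match only the literal text @samp{a+}. With no bits set, @code{"a+"} is the repetition, and with @code{RE_LIMITED_OPS} set, @samp{+} is ordinary again.
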
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	136 @cnindex RE_CHAR_CLASSES
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	137 @item RE_CHAR_CLASSES
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	138 If this bit is set, then you can use character classes in lists; if this
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	139 bit isn't set, then you can't.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	140
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	141 @cnindex RE_CONTEXT_INDEP_ANCHORS
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	142 @item RE_CONTEXT_INDEP_ANCHORS
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	143 If this bit is set, then @samp{^} and @samp{$} are special anywhere outside
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	144 a list; if this bit isn't set, then these characters are special only in
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	145 certain contexts. @xref{Match-beginning-of-line Operator}, and
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	146 @ref{Match-end-of-line Operator}.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	147
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	148 @cnindex RE_CONTEXT_INDEP_OPS
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	149 @item RE_CONTEXT_INDEP_OPS
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	150 If this bit is set, then certain characters are special anywhere outside
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	151 a list; if this bit isn't set, then those characters are special only in
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	152 some contexts and are ordinary elsewhere. Specifically, if this bit
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	153 isn't set then @samp{*}, and (if the syntax bit @code{RE_LIMITED_OPS}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	154 isn't set) @samp{+} and @samp{?} (or @samp{\+} and @samp{\?}, depending
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	155 on the syntax bit @code{RE_BK_PLUS_QM}) represent repetition operators
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	156 only if they're not first in a regular expression or just after an
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	157 open-group or alternation operator. The same holds for @samp{@{} (or
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	158 @samp{\@{}, depending on the syntax bit @code{RE_NO_BK_BRACES}) if
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	159 it is the beginning of a valid interval and the syntax bit
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	160 @code{RE_INTERVALS} is set.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	161
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	162 @cnindex RE_CONTEXT_INVALID_OPS
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	163 @item RE_CONTEXT_INVALID_OPS
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	164 If this bit is set, then repetition and alternation operators can't be
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	165 in certain positions within a regular expression. Specifically, the
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	166 regular expression is invalid if it has:
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	167
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	168 @itemize @bullet
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	169
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	170 @item
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	171 a repetition operator first in the regular expression or just after a
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	172 match-beginning-of-line, open-group, or alternation operator; or
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	173
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	174 @item
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	175 an alternation operator first or last in the regular expression, just
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	176 before a match-end-of-line operator, or just after an alternation or
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	177 open-group operator.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	178
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	179 @end itemize
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	180
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	181 If this bit isn't set, then you can put the characters representing the
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	182 repetition and alternation operators anywhere in a regular expression.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	183 Whether or not they will in fact be operators in certain positions
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	184 depends on other syntax bits.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	185
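The context rules for a leading @samp{*} can be sketched as follows. This assumes the GNU C Library's @file{regex.h} with @code{_GNU_SOURCE} defined; @code{match_length} is a hypothetical helper, and the results noted after the block reflect how the author of this sketch expects that implementation to behave.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* assumption: glibc hides the GNU interface without this */
#endif
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Hypothetical helper: compile PATTERN under SYNTAX and return how many
   characters of S it matches starting at index 0 (-1 for no match,
   -2 for an invalid pattern or an internal error).  */
int
match_length (reg_syntax_t syntax, const char *pattern, const char *s)
{
  struct re_pattern_buffer buf;
  assert (pattern != NULL && s != NULL);
  memset (&buf, 0, sizeof buf);
  reg_syntax_t old = re_set_syntax (syntax);
  const char *err = re_compile_pattern (pattern, strlen (pattern), &buf);
  re_set_syntax (old);   /* restore the previous global syntax */
  int n = err != NULL
          ? -2
          : (int) re_match (&buf, s, (regoff_t) strlen (s), 0, NULL);
  regfree (&buf);
  return n;
}
```

With neither @code{RE_CONTEXT_INDEP_OPS} nor @code{RE_CONTEXT_INVALID_OPS} set, the pattern @samp{*a} has an ordinary leading @samp{*} and should match the text @samp{*a}. With @code{RE_CONTEXT_INVALID_OPS} set, the same pattern should be rejected at compile time, which the helper reports as -2.
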
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	186 @cnindex RE_DOT_NEWLINE
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	187 @item RE_DOT_NEWLINE
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	188 If this bit is set, then the match-any-character operator matches
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	189 a newline; if this bit isn't set, then it doesn't.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	190
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	191 @cnindex RE_DOT_NOT_NULL
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	192 @item RE_DOT_NOT_NULL
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	193 If this bit is set, then the match-any-character operator doesn't match
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	194 a null character; if this bit isn't set, then it does.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	195
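The two bits that restrict the match-any-character operator can be sketched like this. The sketch assumes the GNU C Library's @file{regex.h} with @code{_GNU_SOURCE} defined; @code{match_bytes} is a hypothetical helper that takes an explicit length so the subject may contain a null character.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* assumption: glibc hides the GNU interface without this */
#endif
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Hypothetical helper: compile PATTERN under SYNTAX and return how many
   of the LEN bytes at S it matches starting at index 0 (-1 for no match,
   -2 for an invalid pattern or an internal error).  */
int
match_bytes (reg_syntax_t syntax, const char *pattern,
             const char *s, size_t len)
{
  struct re_pattern_buffer buf;
  assert (pattern != NULL && s != NULL);
  memset (&buf, 0, sizeof buf);
  reg_syntax_t old = re_set_syntax (syntax);
  const char *err = re_compile_pattern (pattern, strlen (pattern), &buf);
  re_set_syntax (old);   /* restore the previous global syntax */
  int n = err != NULL
          ? -2
          : (int) re_match (&buf, s, (regoff_t) len, 0, NULL);
  regfree (&buf);
  return n;
}
```

Under these assumptions, @samp{a.b} should match the three bytes @samp{a}, newline, @samp{b} only when @code{RE_DOT_NEWLINE} is set, and should match @samp{a}, null, @samp{b} only when @code{RE_DOT_NOT_NULL} is not set.
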
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	196 @cnindex RE_INTERVALS
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	197 @item RE_INTERVALS
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	198 If this bit is set, then Regex recognizes interval operators; if this bit
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	199 isn't set, then it doesn't.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	200
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	201 @cnindex RE_LIMITED_OPS
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	202 @item RE_LIMITED_OPS
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	203 If this bit is set, then Regex doesn't recognize the match-one-or-more,
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	204 match-zero-or-one or alternation operators; if this bit isn't set, then
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	205 it does.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	206
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	207 @cnindex RE_NEWLINE_ALT
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	208 @item RE_NEWLINE_ALT
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	209 If this bit is set, then newline represents the alternation operator; if
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	210 this bit isn't set, then newline is ordinary.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	211
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	212 @cnindex RE_NO_BK_BRACES
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	213 @item RE_NO_BK_BRACES
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	214 If this bit is set, then @samp{@{} represents the open-interval operator
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	215 and @samp{@}} represents the close-interval operator; if this bit isn't
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	216 set, then @samp{\@{} represents the open-interval operator and
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	217 @samp{\@}} represents the close-interval operator. This bit is relevant
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	218 only if @code{RE_INTERVALS} is set.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	219
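How @code{RE_INTERVALS} and @code{RE_NO_BK_BRACES} combine can be sketched as follows, assuming the GNU C Library's @file{regex.h} with @code{_GNU_SOURCE} defined; @code{match_length} is a hypothetical helper, not a Regex function.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* assumption: glibc hides the GNU interface without this */
#endif
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Hypothetical helper: compile PATTERN under SYNTAX and return how many
   characters of S it matches starting at index 0 (-1 for no match,
   -2 for an invalid pattern or an internal error).  */
int
match_length (reg_syntax_t syntax, const char *pattern, const char *s)
{
  struct re_pattern_buffer buf;
  assert (pattern != NULL && s != NULL);
  memset (&buf, 0, sizeof buf);
  reg_syntax_t old = re_set_syntax (syntax);
  const char *err = re_compile_pattern (pattern, strlen (pattern), &buf);
  re_set_syntax (old);   /* restore the previous global syntax */
  int n = err != NULL
          ? -2
          : (int) re_match (&buf, s, (regoff_t) strlen (s), 0, NULL);
  regfree (&buf);
  return n;
}
```

With only @code{RE_INTERVALS} set, the interval is written @samp{a\@{2\@}}; adding @code{RE_NO_BK_BRACES} makes it @samp{a@{2@}}. Without @code{RE_INTERVALS}, the braces in @samp{a@{2@}} are ordinary whether or not @code{RE_NO_BK_BRACES} is set.
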
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	220 @cnindex RE_NO_BK_PARENS
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	221 @item RE_NO_BK_PARENS
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	222 If this bit is set, then @samp{(} represents the open-group operator and
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	223 @samp{)} represents the close-group operator; if this bit isn't set, then
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	224 @samp{\(} represents the open-group operator and @samp{\)} represents
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	225 the close-group operator.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	226
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	227 @cnindex RE_NO_BK_REFS
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	228 @item RE_NO_BK_REFS
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	229 If this bit is set, then Regex doesn't recognize @samp{\}@var{digit} as
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	230 the back reference operator; if this bit isn't set, then it does.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	231
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	232 @cnindex RE_NO_BK_VBAR
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	233 @item RE_NO_BK_VBAR
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	234 If this bit is set, then @samp{|} represents the alternation operator;
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	235 if this bit isn't set, then @samp{\|} represents the alternation
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	236 operator. This bit is irrelevant if @code{RE_LIMITED_OPS} is set.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	237
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	238 @cnindex RE_NO_EMPTY_RANGES
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	239 @item RE_NO_EMPTY_RANGES
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	240 If this bit is set, then a regular expression with a range whose ending
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	241 point collates lower than its starting point is invalid; if this bit
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	242 isn't set, then Regex considers such a range to be empty.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	243
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	244 @cnindex RE_UNMATCHED_RIGHT_PAREN_ORD
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	245 @item RE_UNMATCHED_RIGHT_PAREN_ORD
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	246 If this bit is set and the regular expression has no matching open-group
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	247 operator, then Regex considers what would otherwise be a close-group
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	248 operator (based on how @code{RE_NO_BK_PARENS} is set) to match @samp{)}.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	249
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	250 @end table
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	251
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	252
13533 ca70a11e70e2 Integrate the regex documentation. Bruno Haible <bruno@clisp.org> parents: 13532 diff changeset	253 @node Predefined Syntaxes
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	254 @section Predefined Syntaxes
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	255
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	256 If you're programming with Regex, you can set a pattern buffer's
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	257 (@pxref{GNU Pattern Buffers}, and @ref{POSIX Pattern Buffers})
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	258 @code{syntax} field either to an arbitrary combination of syntax bits
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	259 (@pxref{Syntax Bits}) or else to one of the configurations defined by Regex.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	260 These configurations define the syntaxes used by certain
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	261 programs---@sc{gnu} Emacs,
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	262 @cindex Emacs
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	263 @sc{posix} Awk,
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	264 @cindex POSIX Awk
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	265 traditional Awk,
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	266 @cindex Awk
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	267 Grep,
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	268 @cindex Grep
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	269 @cindex Egrep
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	270 Egrep---in addition to syntaxes for @sc{posix} basic and extended
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	271 regular expressions.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	272
13549 bb0ceefd22dc avoid some overlong lines from posix urls, etc. Karl Berry <karl@freefriends.org> parents: 13537 diff changeset	273 The predefined syntaxes---taken directly from @file{regex.h}---are:
bb0ceefd22dc avoid some overlong lines from posix urls, etc. Karl Berry <karl@freefriends.org> parents: 13537 diff changeset	274
bb0ceefd22dc avoid some overlong lines from posix urls, etc. Karl Berry <karl@freefriends.org> parents: 13537 diff changeset	275 @smallexample
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	276 #define RE_SYNTAX_EMACS 0
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	277
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	278 #define RE_SYNTAX_AWK \
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	279 (RE_BACKSLASH_ESCAPE_IN_LISTS | RE_DOT_NOT_NULL \
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	280 | RE_NO_BK_PARENS | RE_NO_BK_REFS \
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	281 | RE_NO_BK_VBAR | RE_NO_EMPTY_RANGES \
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	282 | RE_UNMATCHED_RIGHT_PAREN_ORD)
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	283
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	284 #define RE_SYNTAX_POSIX_AWK \
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	285 (RE_SYNTAX_POSIX_EXTENDED | RE_BACKSLASH_ESCAPE_IN_LISTS)
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	286
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	287 #define RE_SYNTAX_GREP \
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	288 (RE_BK_PLUS_QM | RE_CHAR_CLASSES \
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	289 | RE_HAT_LISTS_NOT_NEWLINE | RE_INTERVALS \
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	290 | RE_NEWLINE_ALT)
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	291
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	292 #define RE_SYNTAX_EGREP \
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	293 (RE_CHAR_CLASSES | RE_CONTEXT_INDEP_ANCHORS \
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	294 | RE_CONTEXT_INDEP_OPS | RE_HAT_LISTS_NOT_NEWLINE \
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	295 | RE_NEWLINE_ALT | RE_NO_BK_PARENS \
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	296 | RE_NO_BK_VBAR)
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	297
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	298 #define RE_SYNTAX_POSIX_EGREP \
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	299 (RE_SYNTAX_EGREP | RE_INTERVALS | RE_NO_BK_BRACES)
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	300
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	301 /* P1003.2/D11.2, section 4.20.7.1, lines 5078ff. */
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	302 #define RE_SYNTAX_ED RE_SYNTAX_POSIX_BASIC
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	303
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	304 #define RE_SYNTAX_SED RE_SYNTAX_POSIX_BASIC
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	305
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	306 /* Syntax bits common to both basic and extended POSIX regex syntax. */
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	307 #define _RE_SYNTAX_POSIX_COMMON \
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	308 (RE_CHAR_CLASSES | RE_DOT_NEWLINE | RE_DOT_NOT_NULL \
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	309 | RE_INTERVALS | RE_NO_EMPTY_RANGES)
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	310
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	311 #define RE_SYNTAX_POSIX_BASIC \
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	312 (_RE_SYNTAX_POSIX_COMMON | RE_BK_PLUS_QM)
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	313
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	314 /* Differs from ..._POSIX_BASIC only in that RE_BK_PLUS_QM becomes
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	315 RE_LIMITED_OPS, i.e., \? \+ \| are not recognized. Actually, this
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	316 isn't minimal, since other operators, such as \`, aren't disabled. */
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	317 #define RE_SYNTAX_POSIX_MINIMAL_BASIC \
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	318 (_RE_SYNTAX_POSIX_COMMON | RE_LIMITED_OPS)
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	319
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	320 #define RE_SYNTAX_POSIX_EXTENDED \
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	321 (_RE_SYNTAX_POSIX_COMMON | RE_CONTEXT_INDEP_ANCHORS \
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	322 | RE_CONTEXT_INDEP_OPS | RE_NO_BK_BRACES \
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	323 | RE_NO_BK_PARENS | RE_NO_BK_VBAR \
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	324 | RE_UNMATCHED_RIGHT_PAREN_ORD)
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	325
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	326 /* Differs from ..._POSIX_EXTENDED in that RE_CONTEXT_INVALID_OPS
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	327 replaces RE_CONTEXT_INDEP_OPS and RE_NO_BK_REFS is added. */
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	328 #define RE_SYNTAX_POSIX_MINIMAL_EXTENDED \
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	329 (_RE_SYNTAX_POSIX_COMMON | RE_CONTEXT_INDEP_ANCHORS \
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	330 | RE_CONTEXT_INVALID_OPS | RE_NO_BK_BRACES \
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	331 | RE_NO_BK_PARENS | RE_NO_BK_REFS \
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	332 | RE_NO_BK_VBAR | RE_UNMATCHED_RIGHT_PAREN_ORD)
13549 bb0ceefd22dc avoid some overlong lines from posix urls, etc. Karl Berry <karl@freefriends.org> parents: 13537 diff changeset	333 @end smallexample
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	334
13533 ca70a11e70e2 Integrate the regex documentation. Bruno Haible <bruno@clisp.org> parents: 13532 diff changeset	335 @node Collating Elements vs. Characters
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	336 @section Collating Elements vs.@: Characters
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	337
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	338 @sc{posix} generalizes the notion of a character to that of a
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	339 collating element. It defines a @dfn{collating element} to be ``a
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	340 sequence of one or more bytes defined in the current collating sequence
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	341 as a unit of collation.''
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	342
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	343 This generalizes the notion of a character in
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	344 two ways. First, a single character can map into two or more collating
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	345 elements. For example, the German
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	346 @tex
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	347 `\ss'
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	348 @end tex
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	349 @ifinfo
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	350 ``eszett''
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	351 @end ifinfo
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	352 collates as the collating element @samp{s} followed by another collating
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	353 element @samp{s}. Second, two or more characters can map into one
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	354 collating element. For example, the Spanish @samp{ll} collates after
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	355 @samp{l} and before @samp{m}.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	356
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	357 Since @sc{posix}'s ``collating element'' preserves the essential idea of
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	358 a ``character,'' we use the latter, more familiar, term in this document.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	359
13533 ca70a11e70e2 Integrate the regex documentation. Bruno Haible <bruno@clisp.org> parents: 13532 diff changeset	360 @node The Backslash Character
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	361 @section The Backslash Character
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	362
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	363 @cindex \
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	364 The @samp{\} character has one of four different meanings, depending on
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	365 the context in which you use it and what syntax bits are set
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	366 (@pxref{Syntax Bits}). It can: 1) stand for itself, 2) quote the next
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	367 character, 3) introduce an operator, or 4) do nothing.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	368
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	369 @enumerate
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	370 @item
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	371 It stands for itself inside a list
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	372 (@pxref{List Operators}) if the syntax bit
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	373 @code{RE_BACKSLASH_ESCAPE_IN_LISTS} is not set. For example, @samp{[\]}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	374 would match @samp{\}.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	375
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	376 @item
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	377 It quotes (makes ordinary, if it's special) the next character when you
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	378 use it either:
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	379
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	380 @itemize @bullet
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	381 @item
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	382 outside a list,@footnote{Sometimes
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	383 you don't have to explicitly quote special characters to make
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	384 them ordinary. For instance, most characters lose any special meaning
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	385 inside a list (@pxref{List Operators}). In addition, if the syntax bits
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	386 @code{RE_CONTEXT_INVALID_OPS} and @code{RE_CONTEXT_INDEP_OPS}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	387 aren't set, then (for historical reasons) the matcher considers special
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	388 characters ordinary if they are in contexts where the operations they
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	389 represent make no sense; for example, the match-zero-or-more
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	390 operator (represented by @samp{*}) matches itself in the regular
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	391 expression @samp{*foo} because there is no preceding expression on which
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	392 it can operate. It is poor practice, however, to depend on this
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	393 behavior; if you want a special character to be ordinary outside a list,
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	394 it's better to always quote it, regardless.} or
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	395
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	396 @item
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	397 inside a list and the syntax bit @code{RE_BACKSLASH_ESCAPE_IN_LISTS} is set.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	398
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	399 @end itemize
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	400
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	401 @item
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	402 It introduces an operator when followed by certain ordinary
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	403 characters---sometimes only when certain syntax bits are set. See the
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	404 cases @code{RE_BK_PLUS_QM}, @code{RE_NO_BK_BRACES}, @code{RE_NO_BK_VBAR},
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	405 @code{RE_NO_BK_PARENS}, @code{RE_NO_BK_REFS} in @ref{Syntax Bits}. Also:
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	406
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	407 @itemize @bullet
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	408 @item
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	409 @samp{\b} represents the match-word-boundary operator
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	410 (@pxref{Match-word-boundary Operator}).
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	411
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	412 @item
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	413 @samp{\B} represents the match-within-word operator
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	414 (@pxref{Match-within-word Operator}).
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	415
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	416 @item
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	417 @samp{\<} represents the match-beginning-of-word operator @*
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	418 (@pxref{Match-beginning-of-word Operator}).
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	419
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	420 @item
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	421 @samp{\>} represents the match-end-of-word operator
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	422 (@pxref{Match-end-of-word Operator}).
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	423
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	424 @item
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	425 @samp{\w} represents the match-word-constituent operator
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	426 (@pxref{Match-word-constituent Operator}).
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	427
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	428 @item
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	429 @samp{\W} represents the match-non-word-constituent operator
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	430 (@pxref{Match-non-word-constituent Operator}).
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	431
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	432 @item
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	433 @samp{\`} represents the match-beginning-of-buffer
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	434 operator and @samp{\'} represents the match-end-of-buffer operator
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	435 (@pxref{Buffer Operators}).
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	436
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	437 @item
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	438 If Regex was compiled with the C preprocessor symbol @code{emacs}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	439 defined, then @samp{\s@var{class}} represents the match-syntactic-class
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	440 operator and @samp{\S@var{class}} represents the
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	441 match-not-syntactic-class operator (@pxref{Syntactic Class Operators}).
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	442
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	443 @end itemize
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	444
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	445 @item
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	446 In all other cases, Regex ignores @samp{\}. For example,
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	447 @samp{\n} matches @samp{n}.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	448
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	449 @end enumerate
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	450
13533 ca70a11e70e2 Integrate the regex documentation. Bruno Haible <bruno@clisp.org> parents: 13532 diff changeset	451 @node Common Operators
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	452 @chapter Common Operators
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	453
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	454 You compose regular expressions from operators. In the following
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	455 sections, we describe the regular expression operators specified by
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	456 @sc{posix}; @sc{gnu} also uses these. Most operators have more than one
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	457 representation as characters. @xref{Regular Expression Syntax}, for
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	458 what characters represent what operators under what circumstances.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	459
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	460 For most operators that can be represented in two ways, one
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	461 representation is a single character and the other is that character
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	462 preceded by @samp{\}. For example, either @samp{(} or @samp{\(}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	463 represents the open-group operator. Which one does depends on the
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	464 setting of a syntax bit, in this case @code{RE_NO_BK_PARENS}. Why is
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	465 this so? Historical reasons dictate some of the varying
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	466 representations, while @sc{posix} dictates others.
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	467
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	468 Finally, almost all characters lose any special meaning inside a list
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	469 (@pxref{List Operators}).
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	470
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	471 @menu
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	472 * Match-self Operator:: Ordinary characters.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	473 * Match-any-character Operator:: .
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	474 * Concatenation Operator:: Juxtaposition.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	475 * Repetition Operators:: * + ? @{@}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	476 * Alternation Operator:: |
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	477 * List Operators:: [...] [^...]
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	478 * Grouping Operators:: (...)
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	479 * Back-reference Operator:: \digit
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	480 * Anchoring Operators:: ^ $
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	481 @end menu
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	482
13533 ca70a11e70e2 Integrate the regex documentation. Bruno Haible <bruno@clisp.org> parents: 13532 diff changeset	483 @node Match-self Operator
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	484 @section The Match-self Operator (@var{ordinary character})
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	485
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	486 This operator matches the character itself. All ordinary characters
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	487 (@pxref{Regular Expression Syntax}) represent this operator. For
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	488 example, @samp{f} is always an ordinary character, so the regular
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	489 expression @samp{f} matches only the string @samp{f}. In
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	490 particular, it does @emph{not} match the string @samp{ff}.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	491
13533 ca70a11e70e2 Integrate the regex documentation. Bruno Haible <bruno@clisp.org> parents: 13532 diff changeset	492 @node Match-any-character Operator
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	493 @section The Match-any-character Operator (@code{.})
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	494
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	495 @cindex @samp{.}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	496
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	497 This operator matches any single character, whether printing or
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	498 nonprinting, with two exceptions. It does not match a:
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	499
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	500 @table @asis
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	501 @item newline
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	502 if the syntax bit @code{RE_DOT_NEWLINE} isn't set.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	503
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	504 @item null
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	505 if the syntax bit @code{RE_DOT_NOT_NULL} is set.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	506
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	507 @end table
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	508
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	509 The @samp{.} (period) character represents this operator. For example,
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	510 @samp{a.b} matches any three-character string beginning with @samp{a}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	511 and ending with @samp{b}.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	512
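As a minimal sketch, the @samp{a.b} example can be checked from C. The
code assumes the @sc{posix} @code{regcomp} and @code{regexec} interface
declared in @file{regex.h} rather than the syntax bits themselves, and
the helper names @code{matches_whole} and @code{check_dot_examples} are
hypothetical. @sc{posix} specifies that @samp{.} stops matching a
newline when @code{REG_NEWLINE} is passed to @code{regcomp}, which gives
the same effect as leaving @code{RE_DOT_NEWLINE} unset.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <string.h>

/* Return true if PATTERN, compiled by regcomp with CFLAGS, matches the
   whole of TEXT.  An invalid pattern counts as "no match".
   (Hypothetical helper, not part of the Regex library.)  */
bool
matches_whole (const char *pattern, const char *text, int cflags)
{
  regex_t re;
  regmatch_t m;
  bool ok;

  if (regcomp (&re, pattern, cflags) != 0)
    return false;
  /* regexec reports the leftmost, longest match; that match covers
     TEXT exactly when TEXT as a whole matches.  */
  ok = regexec (&re, text, 1, &m, 0) == 0
       && m.rm_so == 0
       && (size_t) m.rm_eo == strlen (text);
  regfree (&re);
  return ok;
}

/* Check the claims made about `a.b'; aborts if one does not hold.  */
void
check_dot_examples (void)
{
  assert (matches_whole ("a.b", "axb", 0));
  assert (matches_whole ("a.b", "a b", 0));
  assert (!matches_whole ("a.b", "ab", 0));    /* too short */
  assert (!matches_whole ("a.b", "axyb", 0));  /* too long */
  /* Whether `.' matches a newline depends on REG_NEWLINE.  */
  assert (matches_whole ("a.b", "a\nb", 0));
  assert (!matches_whole ("a.b", "a\nb", REG_NEWLINE));
}
```

Calling @code{check_dot_examples ()} returns silently when every
assertion holds.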
13533 ca70a11e70e2 Integrate the regex documentation. Bruno Haible <bruno@clisp.org> parents: 13532 diff changeset	513 @node Concatenation Operator
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	514 @section The Concatenation Operator
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	515
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	516 This operator concatenates two regular expressions @var{a} and @var{b}.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	517 No character represents this operator; you simply put @var{b} after
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	518 @var{a}. The result is a regular expression that will match a string if
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	519 @var{a} matches its first part and @var{b} matches the rest. For
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	520 example, @samp{xy} (two match-self operators) matches @samp{xy}.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	521
13533 ca70a11e70e2 Integrate the regex documentation. Bruno Haible <bruno@clisp.org> parents: 13532 diff changeset	522 @node Repetition Operators
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	523 @section Repetition Operators
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	524
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	525 Repetition operators repeat the preceding regular expression a specified
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	526 number of times.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	527
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	528 @menu
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	529 * Match-zero-or-more Operator:: *
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	530 * Match-one-or-more Operator:: +
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	531 * Match-zero-or-one Operator:: ?
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	532 * Interval Operators:: @{@}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	533 @end menu
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	534
13533 ca70a11e70e2 Integrate the regex documentation. Bruno Haible <bruno@clisp.org> parents: 13532 diff changeset	535 @node Match-zero-or-more Operator
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	536 @subsection The Match-zero-or-more Operator (@code{*})
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	537
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	538 @cindex @samp{*}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	539
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	540 This operator repeats the smallest possible preceding regular expression
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	541 as many times as necessary (including zero) to match the pattern.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	542 @samp{*} represents this operator. For example, @samp{o*}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	543 matches any string made up of zero or more @samp{o}s. Since this
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	544 operator operates on the smallest preceding regular expression,
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	545 @samp{fo*} has a repeating @samp{o}, not a repeating @samp{fo}. So,
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	546 @samp{fo*} matches @samp{f}, @samp{fo}, @samp{foo}, and so on.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	547
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	548 Since the match-zero-or-more operator is a suffix operator, it may be
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	549 useless as such when no regular expression precedes it. This is the
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	550 case when it:
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	551
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	552 @itemize @bullet
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	553 @item
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	554 is first in a regular expression, or
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	555
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	556 @item
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	557 follows a match-beginning-of-line, open-group, or alternation
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	558 operator.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	559
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	560 @end itemize
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	561
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	562 @noindent
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	563 Three different things can happen in these cases:
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	564
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	565 @enumerate
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	566 @item
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	567 If the syntax bit @code{RE_CONTEXT_INVALID_OPS} is set, then the
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	568 regular expression is invalid.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	569
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	570 @item
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	571 If @code{RE_CONTEXT_INVALID_OPS} isn't set, but
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	572 @code{RE_CONTEXT_INDEP_OPS} is, then @samp{*} represents the
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	573 match-zero-or-more operator (which then operates on the empty string).
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	574
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	575 @item
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	576 Otherwise, @samp{*} is ordinary.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	577
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	578 @end enumerate
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	579
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	580 @cindex backtracking
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	581 The matcher processes a match-zero-or-more operator by first matching as
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	582 many repetitions of the smallest preceding regular expression as it can.
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	583 Then it continues to match the rest of the pattern.
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	584
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	585 If it can't match the rest of the pattern, it backtracks (as many times
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	586 as necessary), each time discarding one of the matches until it can
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	587 either match the entire pattern or be certain that it cannot get a
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	588 match. For example, when matching @samp{ca*ar} against @samp{caaar},
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	589 the matcher first matches all three @samp{a}s of the string with the
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	590 @samp{a*} of the regular expression. However, it cannot then match the
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	591 final @samp{ar} of the regular expression against the final @samp{r} of
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	592 the string. So it backtracks, discarding the match of the last @samp{a}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	593 in the string. It can then match the remaining @samp{ar}.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	594
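The @samp{o*}, @samp{fo*} and @samp{ca*ar} examples can be checked with
a short C sketch. It assumes the @sc{posix} @code{regcomp} and
@code{regexec} interface from @file{regex.h} with the basic syntax; the
names @code{match_length_at_start} and @code{check_star_examples} are
hypothetical. Only the result of matching is visible through this
interface, so the sketch confirms what is matched, not the individual
backtracking steps.

```c
#include <assert.h>
#include <regex.h>

/* Return the length of the match of the POSIX basic regexp PATTERN
   that begins at the very start of TEXT, or -1 if no match begins
   there or PATTERN is invalid.  (Hypothetical helper.)  */
int
match_length_at_start (const char *pattern, const char *text)
{
  regex_t re;
  regmatch_t m;
  int len = -1;

  if (regcomp (&re, pattern, 0) != 0)
    return -1;
  if (regexec (&re, text, 1, &m, 0) == 0 && m.rm_so == 0)
    len = (int) m.rm_eo;
  regfree (&re);
  return len;
}

/* Check the claims made about `*'; aborts if one does not hold.  */
void
check_star_examples (void)
{
  assert (match_length_at_start ("o*", "") == 0);       /* zero repetitions */
  assert (match_length_at_start ("o*", "ooo") == 3);
  assert (match_length_at_start ("fo*", "f") == 1);
  assert (match_length_at_start ("fo*", "foo") == 3);
  assert (match_length_at_start ("fo*", "fofo") == 2);  /* `o' repeats, not `fo' */
  /* `a*' must give one `a' back so that the final `ar' can match.  */
  assert (match_length_at_start ("ca*ar", "caaar") == 5);
  /* In the basic syntax a leading `*' is an ordinary character.  */
  assert (match_length_at_start ("*o", "*o") == 2);
}
```

Calling @code{check_star_examples ()} returns silently when every
assertion holds.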
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	595
13533 ca70a11e70e2 Integrate the regex documentation. Bruno Haible <bruno@clisp.org> parents: 13532 diff changeset	596 @node Match-one-or-more Operator
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	597 @subsection The Match-one-or-more Operator (@code{+} or @code{\+})
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	598
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	599 @cindex @samp{+}
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	600
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	601 If the syntax bit @code{RE_LIMITED_OPS} is set, then Regex doesn't recognize
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	602 this operator. Otherwise, if the syntax bit @code{RE_BK_PLUS_QM} isn't
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	603 set, then @samp{+} represents this operator; if it is, then @samp{\+}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	604 does.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	605
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	606 This operator is similar to the match-zero-or-more operator except that
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	607 it repeats the preceding regular expression at least once;
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	608 @pxref{Match-zero-or-more Operator}, for what it operates on, how some
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	609 syntax bits affect it, and how Regex backtracks to match it.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	610
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	611 For example, if @samp{+} represents the match-one-or-more operator,
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	612 then @samp{ca+r} matches @samp{car} and @samp{caaaar}, among others,
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	613 but not @samp{cr}.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	614
13533 ca70a11e70e2 Integrate the regex documentation. Bruno Haible <bruno@clisp.org> parents: 13532 diff changeset	615 @node Match-zero-or-one Operator
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	616 @subsection The Match-zero-or-one Operator (@code{?} or @code{\?})
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	617 @cindex @samp{?}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	618
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	619 If the syntax bit @code{RE_LIMITED_OPS} is set, then Regex doesn't
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	620 recognize this operator. Otherwise, if the syntax bit
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	621 @code{RE_BK_PLUS_QM} isn't set, then @samp{?} represents this operator;
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	622 if it is, then @samp{\?} does.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	623
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	624 This operator is similar to the match-zero-or-more operator except that
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	625 it repeats the preceding regular expression once or not at all;
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	626 @pxref{Match-zero-or-more Operator}, to see what it operates on, how
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	627 some syntax bits affect it, and how Regex backtracks to match it.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	628
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	629 For example, if @samp{?} represents the match-zero-or-one operator,
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	630 then @samp{ca?r} matches both @samp{car} and @samp{cr}, but
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	631 nothing else.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	632
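The @samp{ca+r} and @samp{ca?r} examples for the match-one-or-more and
match-zero-or-one operators can be checked together. This is a sketch
under two assumptions: it uses the @sc{posix} @code{regcomp} and
@code{regexec} interface from @file{regex.h}, where @code{REG_EXTENDED}
selects a syntax in which unescaped @samp{+} and @samp{?} are operators
and the basic syntax treats them as ordinary characters; and the names
@code{matches_whole} and @code{check_plus_and_question_examples} are
hypothetical.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <string.h>

/* Return true if PATTERN, compiled by regcomp with CFLAGS, matches the
   whole of TEXT.  An invalid pattern counts as "no match".
   (Hypothetical helper.)  */
bool
matches_whole (const char *pattern, const char *text, int cflags)
{
  regex_t re;
  regmatch_t m;
  bool ok;

  if (regcomp (&re, pattern, cflags) != 0)
    return false;
  ok = regexec (&re, text, 1, &m, 0) == 0
       && m.rm_so == 0
       && (size_t) m.rm_eo == strlen (text);
  regfree (&re);
  return ok;
}

/* Check the claims made about `+' and `?'; aborts on failure.  */
void
check_plus_and_question_examples (void)
{
  /* Extended syntax: `+' and `?' are operators.  */
  assert (matches_whole ("ca+r", "car", REG_EXTENDED));
  assert (matches_whole ("ca+r", "caaaar", REG_EXTENDED));
  assert (!matches_whole ("ca+r", "cr", REG_EXTENDED));
  assert (matches_whole ("ca?r", "car", REG_EXTENDED));
  assert (matches_whole ("ca?r", "cr", REG_EXTENDED));
  assert (!matches_whole ("ca?r", "caar", REG_EXTENDED));
  /* Basic syntax: unescaped `+' and `?' match themselves.  */
  assert (matches_whole ("ca+r", "ca+r", 0));
  assert (!matches_whole ("ca+r", "car", 0));
  assert (matches_whole ("ca?r", "ca?r", 0));
}
```

Calling @code{check_plus_and_question_examples ()} returns silently
when every assertion holds.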
13533 ca70a11e70e2 Integrate the regex documentation. Bruno Haible <bruno@clisp.org> parents: 13532 diff changeset	633 @node Interval Operators
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	634 @subsection Interval Operators (@code{@{} @dots{} @code{@}} or @code{\@{} @dots{} @code{\@}})
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	635
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	636 @cindex interval expression
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	637 @cindex @samp{@{}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	638 @cindex @samp{@}}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	639 @cindex @samp{\@{}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	640 @cindex @samp{\@}}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	641
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	642 If the syntax bit @code{RE_INTERVALS} is set, then Regex recognizes
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	643 @dfn{interval expressions}. They repeat the smallest possible preceding
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	644 regular expression a specified number of times.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	645
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	646 If the syntax bit @code{RE_NO_BK_BRACES} is set, @samp{@{} represents
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	647 the @dfn{open-interval operator} and @samp{@}} represents the
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	648 @dfn{close-interval operator}; otherwise, @samp{\@{} and @samp{\@}} do.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	649
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	650 Specifically, if @samp{@{} and @samp{@}} represent the
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	651 open-interval and close-interval operators, then:
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	652
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	653 @table @code
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	654 @item @{@var{count}@}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	655 matches exactly @var{count} occurrences of the preceding regular
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	656 expression.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	657
13537 77dd6d58a96b erroneous commas inside @var Karl Berry <karl@freefriends.org> parents: 13533 diff changeset	658 @item @{@var{min},@}
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	659 matches @var{min} or more occurrences of the preceding regular
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	660 expression.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	661
13537 77dd6d58a96b erroneous commas inside @var Karl Berry <karl@freefriends.org> parents: 13533 diff changeset	662 @item @{@var{min},@var{max}@}
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	663 matches at least @var{min} but no more than @var{max} occurrences of
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	664 the preceding regular expression.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	665
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	666 @end table
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	667
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	668 The interval expression (but not necessarily the regular expression that
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	669 contains it) is invalid if:
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	670
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	671 @itemize @bullet
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	672 @item
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	673 @var{min} is greater than @var{max}, or
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	674
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	675 @item
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	676 any of @var{count}, @var{min}, or @var{max} is outside the range
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	677 zero to @code{RE_DUP_MAX} (a symbol that @file{regex.h}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	678 defines).
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	679
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	680 @end itemize
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	681
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	682 If the interval expression is invalid and the syntax bit
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	683 @code{RE_NO_BK_BRACES} is set, then Regex considers all the
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	684 characters in the would-be interval to be ordinary. If that bit
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	685 isn't set, then the regular expression is invalid.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	686
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	687 If the interval expression is valid but there is no preceding regular
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	688 expression on which to operate, then if the syntax bit
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	689 @code{RE_CONTEXT_INVALID_OPS} is set, the regular expression is invalid.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	690 If that bit isn't set, then Regex considers all the characters---other
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	691 than backslashes, which it ignores---in the would-be interval to be
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	692 ordinary.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	693
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	694
13533 ca70a11e70e2 Integrate the regex documentation. Bruno Haible <bruno@clisp.org> parents: 13532 diff changeset	695 @node Alternation Operator
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	696 @section The Alternation Operator (@code{|} or @code{\|})
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	697
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	698 @kindex |
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	699 @kindex \|
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	700 @cindex alternation operator
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	701 @cindex or operator
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	702
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	703 If the syntax bit @code{RE_LIMITED_OPS} is set, then Regex doesn't
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	704 recognize this operator. Otherwise, if the syntax bit
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	705 @code{RE_NO_BK_VBAR} is set, then @samp{|} represents this operator;
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	706 otherwise, @samp{\|} does.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	707
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	708 Alternatives match one of a choice of regular expressions:
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	709 if you put the character(s) representing the alternation operator between
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	710 any two regular expressions @var{a} and @var{b}, the result matches
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	711 the union of the strings that @var{a} and @var{b} match. For
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	712 example, supposing that @samp{|} is the alternation operator, then
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	713 @samp{foo|bar|quux} would match any of @samp{foo}, @samp{bar} or
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	714 @samp{quux}.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	715
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	716 @ignore
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	717 @c Nobody needs to disallow empty alternatives any more.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	718 If the syntax bit @code{RE_NO_EMPTY_ALTS} is set, then if either of the regular
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	719 expressions @var{a} or @var{b} is empty, the
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	720 regular expression is invalid. More precisely, if this syntax bit is
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	721 set, then the alternation operator can't:
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	722
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	723 @itemize @bullet
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	724 @item
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	725 be first or last in a regular expression;
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	726
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	727 @item
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	728 follow either another alternation operator or an open-group operator
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	729 (@pxref{Grouping Operators}); or
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	730
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	731 @item
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	732 precede a close-group operator.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	733
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	734 @end itemize
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	735
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	736 @noindent
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	737 For example, supposing @samp{(} and @samp{)} represent the open and
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	738 close-group operators, then @samp{\|foo}, @samp{foo\|}, @samp{foo\|\|bar},
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	739 @samp{foo(\|bar)}, and @samp{(foo\|)bar} would all be invalid.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	740 @end ignore
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	741
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	742 The alternation operator operates on the @emph{largest} possible
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	743 surrounding regular expressions. (Put another way, it has the lowest
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	744 precedence of any regular expression operator.)
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	745 Thus, the only way you can
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	746 delimit its arguments is to use grouping. For example, if @samp{(} and
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	747 @samp{)} are the open and close-group operators, then @samp{fo(o\|b)ar}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	748 would match either @samp{fooar} or @samp{fobar}. (@samp{foo\|bar} would
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	749 match @samp{foo} or @samp{bar}.)
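
The examples above can be checked with a short C sketch.  It assumes the
POSIX interface (@code{regcomp} and @code{regexec}) with
@code{REG_EXTENDED}, a syntax in which @samp{|}, @samp{(} and @samp{)}
are the alternation and grouping operators.  The helper
@code{matches_whole} is hypothetical, written only for this
illustration, and is not part of Regex.

```c
#include <assert.h>
#include <regex.h>
#include <stdio.h>

/* Return 1 if the whole of STR matches the extended regexp PAT,
   0 if it does not, and -1 if PAT fails to compile.
   Hypothetical helper for illustration only.  */
static int
matches_whole (const char *pat, const char *str)
{
  char anchored[256];
  regex_t re;
  int rc;

  assert (pat != NULL && str != NULL);
  /* Wrap PAT in ^( )$ so that only a match of the entire string counts.
     The extra group also keeps the anchors outside any alternation.  */
  snprintf (anchored, sizeof anchored, "^(%s)$", pat);
  if (regcomp (&re, anchored, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;
  rc = regexec (&re, str, 0, NULL, 0);
  regfree (&re);
  return rc == 0 ? 1 : 0;
}
```

With this helper, @code{matches_whole ("fo(o|b)ar", "fobar")} reports a
match while @code{matches_whole ("fo(o|b)ar", "foobar")} does not,
because the group limits the alternation to a single character position.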
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	750
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	751 @cindex backtracking
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	752 The matcher usually tries all combinations of alternatives so as to
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	753 match the longest possible string. For example, when matching
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	754 @samp{(fooq\|foo)*(qbarquux\|bar)} against @samp{fooqbarquux}, it cannot
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	755 take, say, the first (``depth-first'') combination it could match, since
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	756 then it would be content to match just @samp{fooqbar}.
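
The following C sketch reproduces this example.  It assumes the POSIX
interface with @code{REG_EXTENDED} (so the operators are written
@samp{|}, @samp{(} and @samp{)}) and a matcher that follows the POSIX
longest-match rule.  The function @code{match_length} is a hypothetical
helper, not part of Regex.

```c
#include <assert.h>
#include <regex.h>
#include <stdio.h>

/* Return the length of the match of the extended regexp PAT that
   begins at the start of STR, or -1 if there is no such match or
   PAT fails to compile.  Hypothetical helper for illustration only.  */
static int
match_length (const char *pat, const char *str)
{
  regex_t re;
  regmatch_t m[1];
  int len = -1;

  assert (pat != NULL && str != NULL);
  if (regcomp (&re, pat, REG_EXTENDED) != 0)
    return -1;
  /* m[0] receives the offsets of the overall match.  */
  if (regexec (&re, str, 1, m, 0) == 0 && m[0].rm_so == 0)
    len = (int) m[0].rm_eo;
  regfree (&re);
  return len;
}
```

Under those assumptions, matching
@samp{(fooq|foo)*(qbarquux|bar)} against @samp{fooqbarquux} should yield
the full length of the string rather than the length of @samp{fooqbar}.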
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	757
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	758 @comment xx something about leftmost-longest
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	759
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	760
13533 ca70a11e70e2 Integrate the regex documentation. Bruno Haible <bruno@clisp.org> parents: 13532 diff changeset	761 @node List Operators
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	762 @section List Operators (@code{[} @dots{} @code{]} and @code{[^} @dots{} @code{]})
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	763
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	764 @cindex matching list
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	765 @cindex @samp{[}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	766 @cindex @samp{]}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	767 @cindex @samp{^}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	768 @cindex @samp{-}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	769 @cindex @samp{\}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	770 @cindex @samp{[^}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	771 @cindex nonmatching list
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	772 @cindex matching newline
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	773 @cindex bracket expression
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	774
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	775 A @dfn{list}, also called a @dfn{bracket expression}, is a set of one or
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	776 more items. An @dfn{item} is a character,
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	777 @ignore
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	778 (These get added when they get implemented.)
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	779 a collating symbol, an equivalence class expression,
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	780 @end ignore
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	781 a character class expression, or a range expression. The syntax bits
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	782 affect which kinds of items you can put in a list. We explain the last
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	783 two items in subsections below. Empty lists are invalid.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	784
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	785 A @dfn{matching list} matches a single character represented by one of
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	786 the list items. You form a matching list by enclosing one or more items
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	787 within an @dfn{open-matching-list operator} (represented by @samp{[})
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	788 and a @dfn{close-list operator} (represented by @samp{]}).
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	789
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	790 For example, @samp{[ab]} matches either @samp{a} or @samp{b}.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	791 @samp{[ad]*} matches the empty string and any string composed of just
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	792 @samp{a}s and @samp{d}s in any order. Regex considers invalid a regular
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	793 expression with a @samp{[} but no matching
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	794 @samp{]}.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	795
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	796 @dfn{Nonmatching lists} are similar to matching lists except that they
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	797 match a single character @emph{not} represented by one of the list
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	798 items. You use an @dfn{open-nonmatching-list operator} (represented by
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	799 @samp{[^}@footnote{Regex therefore doesn't consider the @samp{^} to be
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	800 the first character in the list. If you put a @samp{^} character first
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	801 in (what you think is) a matching list, you'll turn it into a
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	802 nonmatching list.}) instead of an open-matching-list operator to start a
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	803 nonmatching list.
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	804
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	805 For example, @samp{[^ab]} matches any character except @samp{a} or
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	806 @samp{b}.
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	807
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	808 If the @code{posix_newline} field in the pattern buffer (@pxref{GNU
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	809 Pattern Buffers}) is set, then nonmatching lists do not match a newline.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	810
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	811 Most characters lose any special meaning inside a list. The special
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	812 characters inside a list follow.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	813
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	814 @table @samp
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	815 @item ]
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	816 ends the list if it's not the first list item. So, if you want to make
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	817 the @samp{]} character a list item, you must put it first.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	818
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	819 @item \
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	820 quotes the next character if the syntax bit @code{RE_BACKSLASH_ESCAPE_IN_LISTS} is
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	821 set.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	822
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	823 @ignore
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	824 Put these in if they get implemented.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	825
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	826 @item [.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	827 represents the open-collating-symbol operator (@pxref{Collating Symbol
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	828 Operators}).
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	829
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	830 @item .]
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	831 represents the close-collating-symbol operator.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	832
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	833 @item [=
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	834 represents the open-equivalence-class operator (@pxref{Equivalence Class
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	835 Operators}).
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	836
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	837 @item =]
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	838 represents the close-equivalence-class operator.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	839
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	840 @end ignore
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	841
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	842 @item [:
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	843 represents the open-character-class operator (@pxref{Character Class
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	844 Operators}) if the syntax bit @code{RE_CHAR_CLASSES} is set and what
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	845 follows is a valid character class expression.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	846
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	847 @item :]
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	848 represents the close-character-class operator if the syntax bit
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	849 @code{RE_CHAR_CLASSES} is set and what precedes it is an
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	850 open-character-class operator followed by a valid character class name.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	851
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	852 @item -
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	853 represents the range operator (@pxref{Range Operator}) if it's
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	854 not first or last in a list or the ending point of a range.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	855
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	856 @end table
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	857
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	858 @noindent
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	859 All other characters are ordinary. For example, @samp{[.*]} matches
b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	860 @samp{.} and @samp{*}.
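
The list rules in this section can be tried out with a short C sketch.
It assumes the POSIX interface with @code{REG_EXTENDED}; the function
@code{list_matches_char} is a hypothetical helper written for this
illustration and is not part of Regex.

```c
#include <assert.h>
#include <regex.h>
#include <stdio.h>

/* Return 1 if the one-character string made of C matches LIST (a list
   written with its brackets), 0 if it does not, and -1 if LIST fails
   to compile.  Hypothetical helper for illustration only.  */
static int
list_matches_char (const char *list, char c)
{
  char pat[64];
  char str[2] = { c, '\0' };
  regex_t re;
  int rc;

  assert (list != NULL && c != '\0');
  /* Anchor the list so that it must account for the whole string.  */
  snprintf (pat, sizeof pat, "^%s$", list);
  if (regcomp (&re, pat, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;
  rc = regexec (&re, str, 0, NULL, 0);
  regfree (&re);
  return rc == 0 ? 1 : 0;
}
```

For instance, @code{list_matches_char ("[]a]", ']')} tests a list whose
first item is @samp{]}, and @code{list_matches_char ("[ab", 'a')} shows
that a @samp{[} without a matching @samp{]} is rejected at compile time.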
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	861
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	862 @menu
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	863 * Character Class Operators:: [:class:]
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	864 * Range Operator:: start-end
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	865 @end menu
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	866
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	867 @ignore
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	868 (If collating symbols and equivalence class expressions get implemented,
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	869 then add this.)
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	870
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	871 node Collating Symbol Operators
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	872 subsubsection Collating Symbol Operators (@code{[.} @dots{} @code{.]})
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	873
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	874 If the syntax bit @code{XX} is set, then you can represent
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	875 collating symbols inside lists. You form a @dfn{collating symbol} by
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	876 putting a collating element between an @dfn{open-collating-symbol
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	877 operator} and a @dfn{close-collating-symbol operator}. @samp{[.}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	878 represents the open-collating-symbol operator and @samp{.]} represents
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	879 the close-collating-symbol operator. For example, if @samp{ll} is a
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	880 collating element, then @samp{[[.ll.]]} would match @samp{ll}.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	881
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	882 node Equivalence Class Operators
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	883 subsubsection Equivalence Class Operators (@code{[=} @dots{} @code{=]})
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	884 @cindex equivalence class expression in regex
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	885 @cindex @samp{[=} in regex
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	886 @cindex @samp{=]} in regex
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	887
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	888 If the syntax bit @code{XX} is set, then Regex recognizes equivalence class
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	889 expressions inside lists. An @dfn{equivalence class expression} is a set
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	890 of collating elements which all belong to the same equivalence class.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	891 You form an equivalence class expression by putting a collating
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	892 element between an @dfn{open-equivalence-class operator} and a
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	893 @dfn{close-equivalence-class operator}. @samp{[=} represents the
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	894 open-equivalence-class operator and @samp{=]} represents the
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	895 close-equivalence-class operator. For example, if @samp{a} and @samp{A}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	896 were an equivalence class, then both @samp{[[=a=]]} and @samp{[[=A=]]}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	897 would match both @samp{a} and @samp{A}. If the collating element in an
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	898 equivalence class expression isn't part of an equivalence class, then
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	899 the matcher considers the equivalence class expression to be a collating
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	900 symbol.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	901
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	902 @end ignore
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	903
13533 ca70a11e70e2 Integrate the regex documentation. Bruno Haible <bruno@clisp.org> parents: 13532 diff changeset	904 @node Character Class Operators
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	905 @subsection Character Class Operators (@code{[:} @dots{} @code{:]})
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	906
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	907 @cindex character classes
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	908 @cindex @samp{[:} in regex
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	909 @cindex @samp{:]} in regex
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	910
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	911 If the syntax bit @code{RE_CHAR_CLASSES} is set, then Regex
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	912 recognizes character class expressions inside lists. A @dfn{character
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	913 class expression} matches one character from a given class. You form a
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	914 character class expression by putting a character class name between an
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	915 @dfn{open-character-class operator} (represented by @samp{[:}) and a
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	916 @dfn{close-character-class operator} (represented by @samp{:]}). The
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	917 character class names and their meanings are:
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	918
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	919 @table @code
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	920
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	921 @item alnum
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	922 letters and digits
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	923
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	924 @item alpha
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	925 letters
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	926
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	927 @item blank
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	928 system-dependent; for @sc{gnu}, a space or tab
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	929
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	930 @item cntrl
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	931 control characters (in the @sc{ascii} encoding, code 0177 and codes
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	932 less than 040)
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	933
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	934 @item digit
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	935 digits
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	936
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	937 @item graph
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	938 same as @code{print} except omits space
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	939
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	940 @item lower
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	941 lowercase letters
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	942
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	943 @item print
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	944 printable characters (in the @sc{ascii} encoding, space through
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	945 tilde, that is, codes 040 through 0176)
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	946
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	947 @item punct
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	948 printable characters that are neither spaces nor alphanumeric
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	949
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	950 @item space
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	951 space, horizontal tab, carriage return, newline, vertical tab, and form feed
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	952
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	953 @item upper
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	954 uppercase letters
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	955
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	956 @item xdigit
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	957 hexadecimal digits: @code{0}--@code{9}, @code{a}--@code{f}, @code{A}--@code{F}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	958
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	959 @end table
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	960
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	961 @noindent
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	962 These correspond to the definitions in the C library's @file{<ctype.h>}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	963 facility. For example, @samp{[:alpha:]} corresponds to the standard
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	964 facility @code{isalpha}. Regex recognizes character class expressions
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	965 only inside of lists; so @samp{[[:alpha:]]} matches any letter, but
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	966 @samp{[:alpha:]} outside of a bracket expression and not followed by a
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	967 repetition operator matches just itself.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	968
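As an illustration of character class expressions inside lists, here is
a minimal sketch that uses the @sc{posix} interface (@code{regcomp} and
@code{regexec}) rather than the GNU-specific functions.  The helper name
@code{class_matches} is hypothetical and is not part of Regex; the sketch
also assumes the program runs in the default @samp{C} locale.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper, not part of Regex.  Return 1 if TEXT matches
   PATTERN (a POSIX basic regular expression), 0 if it does not, and
   -1 if PATTERN fails to compile.  */
int
class_matches (const char *pattern, const char *text)
{
  regex_t re;
  int rc;

  assert (pattern != NULL && text != NULL);

  /* REG_NOSUB: only a yes/no answer is needed, no match offsets.  */
  if (regcomp (&re, pattern, REG_NOSUB) != 0)
    return -1;

  rc = regexec (&re, text, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}
```

For instance, @code{class_matches ("^[[:alpha:]]$", "q")} should report a
match, while the same pattern applied to @code{"7"} should not.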
13533 ca70a11e70e2 Integrate the regex documentation. Bruno Haible <bruno@clisp.org> parents: 13532 diff changeset	969 @node Range Operator
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	970 @subsection The Range Operator (@code{-})
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	971
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	972 Regex recognizes @dfn{range expressions} inside a list. They represent
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	973 those characters
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	974 that fall between two elements in the current collating sequence. You
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	975 form a range expression by putting a @dfn{range operator} between two
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	976 @ignore
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	977 (If these get implemented, then substitute this for ``characters.'')
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	978 of any of the following: characters, collating elements, collating symbols,
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	979 and equivalence class expressions. The starting point of the range and
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	980 the ending point of the range don't have to be the same kind of item,
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	981 e.g., the starting point could be a collating element and the ending
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	982 point could be an equivalence class expression. If a range's ending
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	983 point is an equivalence class, then all the collating elements in that
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	984 class will be in the range.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	985 @end ignore
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	986 characters.@footnote{You can't use a character class for the starting
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	987 or ending point of a range, since a character class is not a single
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	988 character.} @samp{-} represents the range operator. For example,
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	989 @samp{a-f} within a list represents all the characters from @samp{a}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	990 through @samp{f}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	991 inclusive.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	992
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	993 If the syntax bit @code{RE_NO_EMPTY_RANGES} is set, then if the range's
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	994 ending point collates less than its starting point, the range (and the
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	995 regular expression containing it) is invalid. For example, the regular
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	996 expression @samp{[z-a]} would be invalid. If this bit isn't set, then
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	997 Regex considers such a range to be empty.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	998
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	999 Since @samp{-} represents the range operator, if you want to make a
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1000 @samp{-} character itself
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1001 a list item, you must do one of the following:
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1002
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1003 @itemize @bullet
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1004 @item
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1005 Put the @samp{-} either first or last in the list.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1006
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1007 @item
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1008 Include a range whose starting point collates strictly lower than
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1009 @samp{-} and whose ending point collates equal or higher. Unless a
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1010 range is the first item in a list, a @samp{-} can't be its starting
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1011 point, but @emph{can} be its ending point. That is because Regex
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1012 considers @samp{-} to be the range operator unless it is preceded by
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1013 another @samp{-}. For example, in the @sc{ascii} encoding, @samp{)},
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1014 @samp{*}, @samp{+}, @samp{,}, @samp{-}, @samp{.}, and @samp{/} are
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1015 contiguous characters in the collating sequence. You might think that
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1016 @samp{[)-+--/]} has two ranges: @samp{)-+} and @samp{--/}. Rather, it
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1017 has the ranges @samp{)-+} and @samp{+--}, plus the character @samp{/}, so
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1018 it matches, e.g., @samp{,}, not @samp{.}.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1019
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1020 @item
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1021 Put a range whose starting point is @samp{-} first in the list.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1022
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1023 @end itemize
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1024
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1025 For example, @samp{[-a-z]} matches a lowercase letter or a hyphen (in
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1026 English, in @sc{ascii}).
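The rules for ranges and for a literal @samp{-} can be tried out with
the following sketch.  It goes through the @sc{posix} interface, and the
helper names @code{range_compile_status} and @code{list_matches} are
hypothetical.  It assumes the @samp{C} locale, and it assumes that the
@sc{posix} basic syntax rejects reversed ranges (as it does in the GNU C
library, where that syntax behaves as if @code{RE_NO_EMPTY_RANGES} were
set); other implementations may differ.

```c
#include <regex.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: return the regcomp status for PATTERN
   (0 means the pattern is valid).  */
int
range_compile_status (const char *pattern)
{
  regex_t re;
  int status = regcomp (&re, pattern, REG_NOSUB);

  if (status == 0)
    regfree (&re);
  return status;
}

/* Hypothetical helper: return 1 if the whole of TEXT matches the list
   LIST (written with its brackets, e.g. "[a-f]"), 0 if it does not,
   and -1 if the pattern is too long or does not compile.  */
int
list_matches (const char *list, const char *text)
{
  char pattern[64];
  regex_t re;
  int n, rc;

  /* Anchor the list so that it must match all of TEXT.  */
  n = snprintf (pattern, sizeof pattern, "^%s$", list);
  if (n < 0 || (size_t) n >= sizeof pattern)
    return -1;

  if (regcomp (&re, pattern, REG_NOSUB) != 0)
    return -1;
  rc = regexec (&re, text, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}
```

With these helpers, @code{list_matches ("[-a-z]", "-")} and
@code{list_matches ("[a-z-]", "-")} should both report a match, because
the @samp{-} is first or last in the list.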
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1027
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1028
13533 ca70a11e70e2 Integrate the regex documentation. Bruno Haible <bruno@clisp.org> parents: 13532 diff changeset	1029 @node Grouping Operators
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1030 @section Grouping Operators (@code{(} @dots{} @code{)} or @code{\(} @dots{} @code{\)})
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1031
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1032 @kindex (
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1033 @kindex )
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1034 @kindex \(
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1035 @kindex \)
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1036 @cindex grouping
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1037 @cindex subexpressions
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1038 @cindex parenthesizing
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1039
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1040 A @dfn{group}, also known as a @dfn{subexpression}, consists of an
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1041 @dfn{open-group operator}, any number of other operators, and a
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1042 @dfn{close-group operator}. Regex treats this sequence as a unit, just
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1043 as mathematics and programming languages treat a parenthesized
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1044 expression as a unit.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1045
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1046 Therefore, using @dfn{groups}, you can:
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1047
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1048 @itemize @bullet
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1049 @item
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1050 delimit the argument(s) to an alternation operator (@pxref{Alternation
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1051 Operator}) or a repetition operator (@pxref{Repetition
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1052 Operators}).
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1053
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	1054 @item
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1055 keep track of the indices of the substring that matched a given group.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1056 @xref{Using Registers}, for a precise explanation.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1057 This lets you:
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1058
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1059 @itemize @bullet
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1060 @item
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1061 use the back-reference operator (@pxref{Back-reference Operator}).
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1062
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	1063 @item
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1064 use registers (@pxref{Using Registers}).
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1065
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1066 @end itemize
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1067
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1068 @end itemize
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1069
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1070 If the syntax bit @code{RE_NO_BK_PARENS} is set, then @samp{(} represents
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1071 the open-group operator and @samp{)} represents the
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1072 close-group operator; otherwise, @samp{\(} and @samp{\)} do.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1073
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1074 If the syntax bit @code{RE_UNMATCHED_RIGHT_PAREN_ORD} is set and a
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1075 close-group operator has no matching open-group operator, then Regex
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1076 considers it to match @samp{)}.
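The two spellings of the grouping operators, and the substring indices
recorded for a group, can be observed through the @sc{posix} interface.
The sketch below is only an illustration: @code{group_span} is a
hypothetical helper, and it relies on the assumption that the
@sc{posix} basic syntax uses @samp{\(} and @samp{\)} for groups while
@code{REG_EXTENDED} selects a syntax in which @samp{(} and @samp{)} are
the operators.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: match TEXT against PATTERN, compiled with
   CFLAGS, and store the start and end offsets of group number GROUP
   (0 is the whole match) in *START and *END.  Return 0 on success and
   -1 if the pattern is invalid, the match fails, or the group did not
   participate in the match.  */
int
group_span (const char *pattern, int cflags, const char *text,
            size_t group, int *start, int *end)
{
  regex_t re;
  regmatch_t m[10];
  int rc;

  if (group >= 10)
    return -1;
  if (regcomp (&re, pattern, cflags) != 0)
    return -1;

  rc = regexec (&re, text, 10, m, 0);
  regfree (&re);

  /* Unused entries are reported with an offset of -1.  */
  if (rc != 0 || m[group].rm_so == -1)
    return -1;

  *start = (int) m[group].rm_so;
  *end = (int) m[group].rm_eo;
  return 0;
}
```

Compiling @samp{(a)} without @code{REG_EXTENDED} treats the parentheses
as ordinary characters, so group 0 spans all of the text @samp{(a)};
with @code{REG_EXTENDED} the same pattern forms a group that matches only
the @samp{a}.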
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1077
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1078
13533 ca70a11e70e2 Integrate the regex documentation. Bruno Haible <bruno@clisp.org> parents: 13532 diff changeset	1079 @node Back-reference Operator
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1080 @section The Back-reference Operator (@code{\}@var{digit})
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1081
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1082 @cindex back references
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1083
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1084 If the syntax bit @code{RE_NO_BK_REF} isn't set, then Regex recognizes
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1085 back references. A back reference matches a specified preceding group.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1086 The back reference operator is represented by @samp{\@var{digit}}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1087 anywhere after the end of a regular expression's @w{@var{digit}-th}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1088 group (@pxref{Grouping Operators}).
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1089
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1090 @var{digit} must be between @samp{1} and @samp{9}. The matcher assigns
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1091 numbers 1 through 9 to the first nine groups it encounters. By using
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1092 one of @samp{\1} through @samp{\9} after the corresponding group's
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1093 close-group operator, you can match a substring identical to the
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1094 one that the group does.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1095
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1096 Back references match according to the following (in all examples below,
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1097 @samp{(} represents the open-group, @samp{)} the close-group, @samp{@{}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1098 the open-interval and @samp{@}} the close-interval operator):
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1099
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1100 @itemize @bullet
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1101 @item
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1102 If the group matches a substring, the back reference matches an
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1103 identical substring. For example, @samp{(a)\1} matches @samp{aa} and
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1104 @samp{(bana)na\1bo\1} matches @samp{bananabanabobana}. Likewise,
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1105 @samp{(.*)\1} matches any (newline-free if the syntax bit
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1106 @code{RE_DOT_NEWLINE} isn't set) string that is composed of two
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1107 identical halves; the @samp{(.*)} matches the first half and the
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1108 @samp{\1} matches the second half.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1109
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1110 @item
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1111 If the group matches more than once (as it might if followed
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1112 by, e.g., a repetition operator), then the back reference matches the
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1113 substring the group @emph{last} matched. For example,
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1114 @samp{((a*)b)*\1\2} matches @samp{aabababa}; first @w{group 1} (the
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1115 outer one) matches @samp{aab} and @w{group 2} (the inner one) matches
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1116 @samp{aa}. Then @w{group 1} matches @samp{ab} and @w{group 2} matches
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1117 @samp{a}. So, @samp{\1} matches @samp{ab} and @samp{\2} matches
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1118 @samp{a}.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1119
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1120 @item
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1121 If the group doesn't participate in a match, i.e., it is part of an
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1122 alternative not taken or a repetition operator allows zero repetitions
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1123 of it, then the back reference makes the whole match fail. For example,
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1124 @samp{(one()|two())-and-(three\2|four\3)} matches @samp{one-and-three}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1125 and @samp{two-and-four}, but not @samp{one-and-four} or
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1126 @samp{two-and-three}. To see why: if the pattern matches
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1127 @samp{one-and-}, then its @w{group 2} matches the empty string and its
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1128 @w{group 3} doesn't participate in the match. So, if it then matches
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1129 @samp{four}, then when it tries to back reference @w{group 3}---which it
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1130 will attempt to do because @samp{\3} follows the @samp{four}---the match
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1131 will fail because @w{group 3} didn't participate in the match.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1132
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1133 @end itemize
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1134
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1135 You can use a back reference as an argument to a repetition operator. For
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1136 example, @samp{(a(b))\2*} matches @samp{a} followed by one or more
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1137 @samp{b}s. Similarly, @samp{(a(b))\2@{3@}} matches @samp{abbbb}.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1138
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1139 If there is no preceding @w{@var{digit}-th} subexpression, the regular
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1140 expression is invalid.
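Some of the back-reference rules above can be checked with the
@sc{posix} basic syntax, in which groups are written @samp{\(} @dots{}
@samp{\)}.  This is a sketch only: @code{backref_matches} is a
hypothetical helper, and the behavior shown is what the @sc{posix}
interface is expected to provide, not a statement about how Regex
implements back references internally.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: return 1 if TEXT matches the POSIX basic
   regular expression PATTERN, 0 if it does not, and -1 if PATTERN is
   invalid (for example, because it refers to a group that does not
   precede the back reference).  */
int
backref_matches (const char *pattern, const char *text)
{
  regex_t re;
  regmatch_t m[10];
  int rc;

  if (regcomp (&re, pattern, 0) != 0)
    return -1;

  rc = regexec (&re, text, 10, m, 0);
  regfree (&re);
  return rc == 0;
}
```

For example, the pattern @samp{^\(.*\)\1$} should accept @samp{abcabc},
whose two halves are identical, and reject @samp{abcabd}; a pattern such
as @samp{\(a\)\2} should fail to compile because there is no second
group.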
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1141
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1142
13533 ca70a11e70e2 Integrate the regex documentation. Bruno Haible <bruno@clisp.org> parents: 13532 diff changeset	1143 @node Anchoring Operators
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	1144 @section Anchoring Operators
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1145
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1146 @cindex anchoring
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1147 @cindex regexp anchoring
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1148
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1149 These operators can constrain a pattern to match only at the beginning or
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1150 end of the entire string or at the beginning or end of a line.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1151
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1152 @menu
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1153 * Match-beginning-of-line Operator:: ^
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1154 * Match-end-of-line Operator:: $
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1155 @end menu
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1156
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1157
13533 ca70a11e70e2 Integrate the regex documentation. Bruno Haible <bruno@clisp.org> parents: 13532 diff changeset	1158 @node Match-beginning-of-line Operator
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1159 @subsection The Match-beginning-of-line Operator (@code{^})
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1160
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1161 @kindex ^
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1162 @cindex beginning-of-line operator
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1163 @cindex anchors
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1164
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1165 This operator can match the empty string either at the beginning of the
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1166 string or after a newline character. Thus, it is said to @dfn{anchor}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1167 the pattern to the beginning of a line.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1168
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1169 In the following cases, @samp{^} represents this operator. (Otherwise,
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1170 @samp{^} is ordinary.)
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1171
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1172 @itemize @bullet
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1173
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1174 @item
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1175 It (the @samp{^}) is first in the pattern, as in @samp{^foo}.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1176
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1177 @cnindex RE_CONTEXT_INDEP_ANCHORS @r{(and @samp{^})}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1178 @item
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1179 The syntax bit @code{RE_CONTEXT_INDEP_ANCHORS} is set, and it is outside
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1180 a bracket expression.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1181
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1182 @cindex open-group operator and @samp{^}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1183 @cindex alternation operator and @samp{^}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1184 @item
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1185 It follows an open-group or alternation operator, as in @samp{a\(^b\)}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1186 and @samp{a\|^b}. @xref{Grouping Operators}, and @ref{Alternation
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1187 Operator}.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1188
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1189 @end itemize
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1190
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1191 These rules imply that some valid patterns containing @samp{^} cannot be
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1192 matched; for example, @samp{foo^bar} if @code{RE_CONTEXT_INDEP_ANCHORS}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1193 is set.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1194
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1195 @vindex not_bol @r{field in pattern buffer}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1196 If the @code{not_bol} field is set in the pattern buffer (@pxref{GNU
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1197 Pattern Buffers}), then @samp{^} fails to match at the beginning of the
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1198 string. @xref{POSIX Matching}, for when you might find this useful.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1199
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1200 @vindex newline_anchor @r{field in pattern buffer}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1201 If the @code{newline_anchor} field is set in the pattern buffer, then
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1202 @samp{^} fails to match after a newline. This is useful when you do not
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1203 regard the string to be matched as broken into lines.
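As an illustration, the @sc{posix} interface exposes both settings as
flags: @code{REG_NEWLINE} at compile time decides whether @samp{^} may
match after a newline, and @code{REG_NOTBOL} at match time keeps
@samp{^} from matching at the start of the string. The following is a
minimal sketch; the helper name @code{matches} is invented for this
example and is not part of Regex.

```c
#include <regex.h>
#include <stddef.h>

/* Return 1 if PATTERN (basic syntax) matches somewhere in TEXT,
   0 if it does not, and -1 if the pattern fails to compile.
   `matches' is a hypothetical helper, not a Regex function.  */
int
matches (const char *pattern, const char *text, int cflags, int eflags)
{
  regex_t re;
  int found;

  /* REG_NOSUB: only a yes/no answer is wanted here.  */
  if (regcomp (&re, pattern, cflags | REG_NOSUB) != 0)
    return -1;
  found = regexec (&re, text, 0, NULL, eflags) == 0;
  regfree (&re);
  return found;
}
```

With this helper, @code{matches ("^bar", "foo\nbar", REG_NEWLINE, 0)}
finds a match after the newline, while the same call without
@code{REG_NEWLINE} does not. Passing @code{REG_NOTBOL} as the last
argument stops @samp{^foo} from matching at the start of
@samp{foo\nbar}.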


@node Match-end-of-line Operator
@subsection The Match-end-of-line Operator (@code{$})

@kindex $
@cindex end-of-line operator
@cindex anchors

This operator can match the empty string either at the end of
the string or before a newline character in the string. Thus, it is
said to @dfn{anchor} the pattern to the end of a line.

It is always represented by @samp{$}. For example, @samp{foo$} usually
matches @samp{foo} and the first three characters of
@samp{foo\nbar}.

Its interaction with the syntax bits and pattern buffer fields is
exactly the dual of @samp{^}'s; see the previous section. (That is,
``beginning'' becomes ``end'', ``next'' becomes ``previous'',
``after'' becomes ``before'', and @code{not_bol} becomes @code{not_eol}.)
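The @samp{foo$} example can be tried through the @sc{posix} interface,
where @code{REG_NEWLINE} lets @samp{$} match before a newline and
@code{REG_NOTEOL} keeps it from matching at the end of the string. This
is only a sketch; @code{match_end} is a name made up for it.

```c
#include <regex.h>

/* Return the offset just past the first match of PATTERN (basic
   syntax) in TEXT, or -1 if there is no match or the pattern does
   not compile.  `match_end' is a hypothetical helper.  */
int
match_end (const char *pattern, const char *text, int cflags, int eflags)
{
  regex_t re;
  regmatch_t m[1];
  int end = -1;

  if (regcomp (&re, pattern, cflags) != 0)
    return -1;
  if (regexec (&re, text, 1, m, eflags) == 0)
    end = (int) m[0].rm_eo;
  regfree (&re);
  return end;
}
```

Here @code{match_end ("foo$", "foo\nbar", REG_NEWLINE, 0)} reports a
match ending after the first three characters, and the same call with
@code{0} in place of @code{REG_NEWLINE} reports no match.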


@node GNU Operators
@chapter GNU Operators

Following are operators that @sc{gnu} defines (and @sc{posix} doesn't).

@menu
* Word Operators::
* Buffer Operators::
@end menu

@node Word Operators
@section Word Operators

The operators in this section require Regex to recognize parts of words.
Regex uses a syntax table to determine whether or not a character is
part of a word, i.e., whether or not it is @dfn{word-constituent}.

@menu
* Non-Emacs Syntax Tables::
* Match-word-boundary Operator:: \b
* Match-within-word Operator:: \B
* Match-beginning-of-word Operator:: \<
* Match-end-of-word Operator:: \>
* Match-word-constituent Operator:: \w
* Match-non-word-constituent Operator:: \W
@end menu

@node Non-Emacs Syntax Tables
@subsection Non-Emacs Syntax Tables

A @dfn{syntax table} is an array indexed by the characters in your
character set. In the @sc{ascii} encoding, therefore, a syntax table
has 256 elements. Regex always uses a @code{char *} variable
@code{re_syntax_table} as its syntax table. In some cases, it
initializes this variable and in others it expects you to initialize it.

@itemize @bullet
@item
If Regex is compiled with the preprocessor symbols @code{emacs} and
@code{SYNTAX_TABLE} both undefined, then Regex allocates
@code{re_syntax_table} and initializes an element @var{i} either to
@code{Sword} (which it defines) if @var{i} is a letter, number, or
@samp{_}, or to zero if it's not.

@item
If Regex is compiled with @code{emacs} undefined but @code{SYNTAX_TABLE}
defined, then Regex expects you to define a @code{char *} variable
@code{re_syntax_table} to be a valid syntax table.

@item
@xref{Emacs Syntax Tables}, for what happens when Regex is compiled with
the preprocessor symbol @code{emacs} defined.

@end itemize
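The first case in the list can be pictured with the sketch below. It is
not Regex's own code: @code{SWORD}, @code{example_syntax_table},
@code{init_example_syntax_table} and @code{is_word_constituent} are
names invented here, and the value chosen for @code{SWORD} is arbitrary.
It only shows a 256-element table in which letters, digits and
@samp{_} are marked as word-constituent.

```c
#include <ctype.h>

#define SWORD 1           /* hypothetical stand-in for Regex's Sword */
#define TABLE_SIZE 256    /* one entry per 8-bit character code */

static char example_syntax_table[TABLE_SIZE];

/* Mark letters, digits and '_' as word-constituent, everything
   else as zero.  */
void
init_example_syntax_table (void)
{
  int c;

  for (c = 0; c < TABLE_SIZE; c++)
    example_syntax_table[c] = (isalnum (c) || c == '_') ? SWORD : 0;
}

/* Look up one character; the unsigned char type keeps the index
   inside the table.  */
int
is_word_constituent (unsigned char c)
{
  return example_syntax_table[c] == SWORD;
}
```

After @code{init_example_syntax_table ()} has run,
@code{is_word_constituent ('_')} is true and
@code{is_word_constituent ('-')} is false.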

@node Match-word-boundary Operator
@subsection The Match-word-boundary Operator (@code{\b})

@cindex @samp{\b}
@cindex word boundaries, matching

This operator (represented by @samp{\b}) matches the empty string at
either the beginning or the end of a word. For example, @samp{\brat\b}
matches the separate word @samp{rat}.

@node Match-within-word Operator
@subsection The Match-within-word Operator (@code{\B})

@cindex @samp{\B}

This operator (represented by @samp{\B}) matches the empty string within
a word. For example, @samp{c\Brat\Be} matches @samp{crate}, but
@samp{dirty \Brat} doesn't match @samp{dirty rat}.
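The @samp{\b} and @samp{\B} examples can be checked with a short
program. This sketch assumes a @sc{gnu} implementation of
@code{regcomp} (for instance the one in the GNU C Library), since other
implementations need not accept these operators. The helper name
@code{find_start} is invented here.

```c
#include <regex.h>

/* Return the start offset of the first match of PATTERN (basic
   syntax) in TEXT, or -1 if nothing matches or the pattern is
   rejected.  `find_start' is a hypothetical helper.  */
int
find_start (const char *pattern, const char *text)
{
  regex_t re;
  regmatch_t m[1];
  int start = -1;

  if (regcomp (&re, pattern, 0) != 0)
    return -1;
  if (regexec (&re, text, 1, m, 0) == 0)
    start = (int) m[0].rm_so;
  regfree (&re);
  return start;
}
```

In C source each backslash is doubled, so @samp{\brat\b} is written
@code{"\\brat\\b"}. It is found in @samp{a rat ran} but not in
@samp{crate}, whereas @code{"c\\Brat\\Be"} is found in @samp{crate}.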

@node Match-beginning-of-word Operator
@subsection The Match-beginning-of-word Operator (@code{\<})

@cindex @samp{\<}

This operator (represented by @samp{\<}) matches the empty string at the
beginning of a word.

@node Match-end-of-word Operator
@subsection The Match-end-of-word Operator (@code{\>})

@cindex @samp{\>}

This operator (represented by @samp{\>}) matches the empty string at the
end of a word.

@node Match-word-constituent Operator
@subsection The Match-word-constituent Operator (@code{\w})

@cindex @samp{\w}

This operator (represented by @samp{\w}) matches any word-constituent
character.

@node Match-non-word-constituent Operator
@subsection The Match-non-word-constituent Operator (@code{\W})

@cindex @samp{\W}

This operator (represented by @samp{\W}) matches any character that is
not word-constituent.
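The operators @samp{\<}, @samp{\>}, @samp{\w} and @samp{\W} can be
combined as in the following sketch, which again assumes a @sc{gnu}
implementation of @code{regcomp}. The helper @code{copy_match} and the
sample patterns are made up for the illustration.

```c
#include <regex.h>
#include <string.h>

/* Copy the first match of PATTERN (basic syntax) in TEXT into BUF,
   truncating to SIZE - 1 bytes.  Return 1 on a match, 0 otherwise.
   `copy_match' is a hypothetical helper.  */
int
copy_match (const char *pattern, const char *text, char *buf, size_t size)
{
  regex_t re;
  regmatch_t m[1];
  int found = 0;

  if (size == 0 || regcomp (&re, pattern, 0) != 0)
    return 0;
  if (regexec (&re, text, 1, m, 0) == 0)
    {
      size_t len = (size_t) (m[0].rm_eo - m[0].rm_so);

      if (len > size - 1)
        len = size - 1;
      memcpy (buf, text + m[0].rm_so, len);
      buf[len] = '\0';
      found = 1;
    }
  regfree (&re);
  return found;
}
```

For example, @code{"\\<r\\w*"} picks @samp{rat} out of @samp{car rat},
because the @samp{r} in @samp{car} does not begin a word;
@code{"\\w*r\\>"} picks @samp{car} out of @samp{rat car}; and
@code{"\\W\\W*"} picks the comma and space out of @samp{foo, bar}.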


@node Buffer Operators
@section Buffer Operators

Following are operators which work on buffers. In Emacs, a @dfn{buffer}
is, naturally, an Emacs buffer. For other programs, Regex considers the
entire string to be matched as the buffer.

@menu
* Match-beginning-of-buffer Operator:: \`
* Match-end-of-buffer Operator:: \'
@end menu


@node Match-beginning-of-buffer Operator
@subsection The Match-beginning-of-buffer Operator (@code{\`})

@cindex @samp{\`}

This operator (represented by @samp{\`}) matches the empty string at the
beginning of the buffer.

@node Match-end-of-buffer Operator
@subsection The Match-end-of-buffer Operator (@code{\'})

@cindex @samp{\'}

This operator (represented by @samp{\'}) matches the empty string at the
end of the buffer.
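The difference between the buffer operators and the line anchors shows
up when anchors are allowed to match at newlines. The sketch below
compiles with @code{REG_NEWLINE} and assumes a @sc{gnu} implementation
of @code{regcomp}; the helper name @code{find_in_lines} is invented for
it.

```c
#include <regex.h>

/* Return the start offset of the first match of PATTERN (basic
   syntax) in TEXT, with anchors allowed to match at newlines, or -1
   if nothing matches.  `find_in_lines' is a hypothetical helper.  */
int
find_in_lines (const char *pattern, const char *text)
{
  regex_t re;
  regmatch_t m[1];
  int start = -1;

  if (regcomp (&re, pattern, REG_NEWLINE) != 0)
    return -1;
  if (regexec (&re, text, 1, m, 0) == 0)
    start = (int) m[0].rm_so;
  regfree (&re);
  return start;
}
```

Against the string @samp{foo\nbar}, @code{"^bar"} matches the second
line, but @code{"\\`bar"} does not match, because @samp{bar} is not at
the beginning of the buffer. Likewise @code{"foo$"} matches and
@code{"foo\\'"} does not.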


@node GNU Emacs Operators
@chapter GNU Emacs Operators

Following are operators that @sc{gnu} defines (and @sc{posix} doesn't)
that you can use only when Regex is compiled with the preprocessor
symbol @code{emacs} defined.

@menu
* Syntactic Class Operators::
@end menu


@node Syntactic Class Operators
@section Syntactic Class Operators

The operators in this section require Regex to recognize the syntactic
classes of characters. Regex uses a syntax table to determine this.

@menu
* Emacs Syntax Tables::
* Match-syntactic-class Operator:: \sCLASS
* Match-not-syntactic-class Operator:: \SCLASS
@end menu

@node Emacs Syntax Tables
@subsection Emacs Syntax Tables

A @dfn{syntax table} is an array indexed by the characters in your
character set. In the @sc{ascii} encoding, therefore, a syntax table
has 256 elements.

If Regex is compiled with the preprocessor symbol @code{emacs} defined,
then Regex expects you to define and initialize the variable
@code{re_syntax_table} to be an Emacs syntax table. Emacs' syntax
tables are more complicated than Regex's own (@pxref{Non-Emacs Syntax
Tables}). @xref{Syntax, , Syntax, emacs, The GNU Emacs User's Manual},
for a description of Emacs' syntax tables.

@node Match-syntactic-class Operator
@subsection The Match-syntactic-class Operator (@code{\s}@var{class})

@cindex @samp{\s}

This operator matches any character whose syntactic class is represented
by a specified character. @samp{\s@var{class}} represents this operator
where @var{class} is the character representing the syntactic class you
want. For example, @samp{w} represents the syntactic
class of word-constituent characters, so @samp{\sw} matches any
word-constituent character.

@node Match-not-syntactic-class Operator
@subsection The Match-not-syntactic-class Operator (@code{\S}@var{class})

@cindex @samp{\S}

This operator is similar to the match-syntactic-class operator except
that it matches any character whose syntactic class is @emph{not}
represented by the specified character. @samp{\S@var{class}} represents
this operator. For example, @samp{w} represents the syntactic class of
word-constituent characters, so @samp{\Sw} matches any character that is
not word-constituent.


@node What Gets Matched?
@chapter What Gets Matched?

Regex usually matches strings according to the ``leftmost longest''
rule; that is, it chooses the longest of the leftmost matches. For a
regular expression containing subexpressions, this does not mean that it
simply chooses the longest match for each subexpression, left to
right; the overall match must also be the longest possible one.

For example, @samp{(ac*)(c*d[ac]*)\1} matches @samp{acdacaaa}, not
@samp{acdac}, as it would if it were to choose the longest match for the
first subexpression.
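The same effect can be seen with a smaller pattern through the
@sc{posix} interface. The pattern @samp{(a|ab)(c|bcd)} and the helper
@code{match_groups} are chosen for this sketch only. Against
@samp{abcd}, taking the longest alternative @samp{ab} for the first
group would give the overall match @samp{abc}, whereas the leftmost
longest rule requires the overall match @samp{abcd}, which forces the
first group to be just @samp{a}.

```c
#include <regex.h>
#include <stddef.h>

/* Run the extended-syntax PATTERN against TEXT and store up to
   NMATCH match positions in M (entry 0 is the overall match).
   Return 1 on a match, 0 otherwise.  `match_groups' is a
   hypothetical helper.  */
int
match_groups (const char *pattern, const char *text,
              size_t nmatch, regmatch_t *m)
{
  regex_t re;
  int found;

  if (regcomp (&re, pattern, REG_EXTENDED) != 0)
    return 0;
  found = regexec (&re, text, nmatch, m, 0) == 0;
  regfree (&re);
  return found;
}
```

Called with room for three entries, the helper reports the overall
match in @code{m[0]}, the first group in @code{m[1]} and the second
group in @code{m[2]}, each as a pair of @code{rm_so} and @code{rm_eo}
offsets into the text.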


@node Programming with Regex
@chapter Programming with Regex

Here we describe how you use the Regex data structures and functions in
C programs. Regex has three interfaces: one designed for @sc{gnu}, one
compatible with @sc{posix} and one compatible with Berkeley @sc{unix}.

@menu
* GNU Regex Functions::
* POSIX Regex Functions::
* BSD Regex Functions::
@end menu


@node GNU Regex Functions
@section GNU Regex Functions

If you're writing code that doesn't need to be compatible with either
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1459 @sc{posix} or Berkeley @sc{unix}, you can use these functions. They
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1460 provide more options than the other interfaces.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1461
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1462 @menu
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1463 * GNU Pattern Buffers:: The re_pattern_buffer type.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1464 * GNU Regular Expression Compiling:: re_compile_pattern ()
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1465 * GNU Matching:: re_match ()
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1466 * GNU Searching:: re_search ()
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1467 * Matching/Searching with Split Data:: re_match_2 (), re_search_2 ()
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1468 * Searching with Fastmaps:: re_compile_fastmap ()
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1469 * GNU Translate Tables:: The `translate' field.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1470 * Using Registers:: The re_registers type and related fns.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1471 * Freeing GNU Pattern Buffers:: regfree ()
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1472 @end menu
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1473
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1474
13533 ca70a11e70e2 Integrate the regex documentation. Bruno Haible <bruno@clisp.org> parents: 13532 diff changeset	1475 @node GNU Pattern Buffers
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1476 @subsection GNU Pattern Buffers
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1477
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1478 @cindex pattern buffer, definition of
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1479 @tindex re_pattern_buffer @r{definition}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1480 @tindex struct re_pattern_buffer @r{definition}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1481
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1482 To compile, match, or search for a given regular expression, you must
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1483 supply a pattern buffer. A @dfn{pattern buffer} holds one compiled
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1484 regular expression.@footnote{Regular expressions are also referred to as
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1485 ``patterns,'' hence the name ``pattern buffer.''}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1486
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1487 You can have several different pattern buffers simultaneously, each
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1488 holding a compiled pattern for a different regular expression.
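
As an illustration of keeping two pattern buffers alive at once, here is a hedged sketch. It assumes a GNU-style @file{regex.h} that exposes the @sc{gnu} interface when @code{_GNU_SOURCE} is defined (as glibc does); the function name @code{classify} and the two patterns are invented for the example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Return 1 if TEXT consists only of digits, 2 if it consists only of
   lowercase letters, 0 otherwise, and -1 if a pattern fails to compile.  */
static int
classify (const char *text)
{
  struct re_pattern_buffer digits, letters;
  reg_syntax_t saved = re_syntax_options;
  regoff_t len;
  int result = 0;

  assert (text != NULL);
  len = (regoff_t) strlen (text);

  /* Zeroed buffers let the library allocate the compiled patterns.  */
  memset (&digits, 0, sizeof digits);
  memset (&letters, 0, sizeof letters);

  re_syntax_options = RE_SYNTAX_POSIX_EXTENDED;
  if (re_compile_pattern ("[0-9]+", 6, &digits) != NULL
      || re_compile_pattern ("[a-z]+", 6, &letters) != NULL)
    result = -1;
  re_syntax_options = saved;

  /* Each buffer holds its own compiled pattern; use whichever applies.  */
  if (result == 0 && re_match (&digits, text, len, 0, NULL) == len)
    result = 1;
  else if (result == 0 && re_match (&letters, text, len, 0, NULL) == len)
    result = 2;

  regfree (&digits);
  regfree (&letters);
  return result;
}
```

Neither buffer disturbs the other: compiling the second pattern leaves the first buffer's compiled form intact, so both can be matched against the same string.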
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1489
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1490 @file{regex.h} defines the pattern buffer @code{struct} as follows:
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1491
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1492 @example
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1493 /* Space that holds the compiled pattern. It is declared as
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1494 `unsigned char *' because its elements are
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1495 sometimes used as array indexes. */
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1496 unsigned char *buffer;
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1497
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1498 /* Number of bytes to which `buffer' points. */
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1499 unsigned long allocated;
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1500
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1501 /* Number of bytes actually used in `buffer'. */
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	1502 unsigned long used;
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1503
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1504 /* Syntax setting with which the pattern was compiled. */
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1505 reg_syntax_t syntax;
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1506
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1507 /* Pointer to a fastmap, if any, otherwise zero. re_search uses
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1508 the fastmap, if there is one, to skip over impossible
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1509 starting points for matches. */
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1510 char *fastmap;
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1511
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1512 /* Either a translate table to apply to all characters before
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1513 comparing them, or zero for no translation. The translation
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1514 is applied to a pattern when it is compiled and to a string
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1515 when it is matched. */
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1516 char *translate;
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1517
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1518 /* Number of subexpressions found by the compiler. */
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1519 size_t re_nsub;
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1520
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1521 /* Zero if this pattern cannot match the empty string, one else.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1522 Well, in truth it's used only in `re_search_2', to see
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1523 whether or not we should use the fastmap, so we don't set
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1524 this absolutely perfectly; see `re_compile_fastmap' (the
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1525 `duplicate' case). */
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1526 unsigned can_be_null : 1;
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1527
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1528 /* If REGS_UNALLOCATED, allocate space in the `regs' structure
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1529 for `max (RE_NREGS, re_nsub + 1)' groups.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1530 If REGS_REALLOCATE, reallocate space if necessary.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1531 If REGS_FIXED, use what's there. */
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1532 #define REGS_UNALLOCATED 0
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1533 #define REGS_REALLOCATE 1
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1534 #define REGS_FIXED 2
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1535 unsigned regs_allocated : 2;
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1536
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1537 /* Set to zero when `regex_compile' compiles a pattern; set to one
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1538 by `re_compile_fastmap' if it updates the fastmap. */
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1539 unsigned fastmap_accurate : 1;
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1540
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1541 /* If set, `re_match_2' does not return information about
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1542 subexpressions. */
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1543 unsigned no_sub : 1;
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1544
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1545 /* If set, a beginning-of-line anchor doesn't match at the
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	1546 beginning of the string. */
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1547 unsigned not_bol : 1;
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1548
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1549 /* Similarly for an end-of-line anchor. */
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1550 unsigned not_eol : 1;
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1551
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1552 /* If true, an anchor at a newline matches. */
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1553 unsigned newline_anchor : 1;
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1554
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1555 @end example
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1556
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1557
13533 ca70a11e70e2 Integrate the regex documentation. Bruno Haible <bruno@clisp.org> parents: 13532 diff changeset	1558 @node GNU Regular Expression Compiling
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1559 @subsection GNU Regular Expression Compiling
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1560
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1561 In @sc{gnu}, you can both match and search for a given regular
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1562 expression. To do either, you must first compile it in a pattern buffer
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1563 (@pxref{GNU Pattern Buffers}).
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1564
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1565 @cindex syntax initialization
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1566 @vindex re_syntax_options @r{initialization}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1567 Regular expressions match according to the syntax with which they were
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1568 compiled; with @sc{gnu}, you indicate what syntax you want by setting
13553 8fc3314fe460 Document not_eol and remove mention of regex.c. Reuben Thomas <rrt@sc3d.org> parents: 13549 diff changeset	1569 the variable @code{re_syntax_options} (declared in @file{regex.h})
8fc3314fe460 Document not_eol and remove mention of regex.c. Reuben Thomas <rrt@sc3d.org> parents: 13549 diff changeset	1570 before calling the compiling function, @code{re_compile_pattern} (see
8fc3314fe460 Document not_eol and remove mention of regex.c. Reuben Thomas <rrt@sc3d.org> parents: 13549 diff changeset	1571 below). @xref{Syntax Bits}, and @ref{Predefined Syntaxes}.
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1572
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1573 You can change the value of @code{re_syntax_options} at any time.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1574 Usually, however, you set its value once and then never change it.
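
To make the effect of @code{re_syntax_options} concrete, the sketch below compiles the same pattern under two predefined syntaxes. It is only an illustration, assuming glibc's @file{regex.h} with @code{_GNU_SOURCE} defined; the helper @code{match_len_with_syntax} and its @math{-3} return value are hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Compile PATTERN under SYNTAX and match it at the start of TEXT.
   Returns what re_match returns, or -3 if compilation fails.  */
static int
match_len_with_syntax (reg_syntax_t syntax, const char *pattern,
                       const char *text)
{
  struct re_pattern_buffer buf;
  reg_syntax_t saved = re_syntax_options;
  const char *err;
  int len = -3;

  assert (pattern != NULL && text != NULL);
  memset (&buf, 0, sizeof buf);

  /* The syntax in force at compile time is the one that counts.  */
  re_syntax_options = syntax;
  err = re_compile_pattern (pattern, strlen (pattern), &buf);
  re_syntax_options = saved;

  if (err == NULL)
    len = re_match (&buf, text, (regoff_t) strlen (text), 0, NULL);
  regfree (&buf);
  return len;
}
```

Under @code{RE_SYNTAX_POSIX_EXTENDED}, @samp{+} in @samp{a+} is a repetition operator. Under @code{RE_SYNTAX_POSIX_BASIC}, it is an ordinary character, so the same pattern matches only the two-character string @samp{a+}.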
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1575
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1576 @cindex pattern buffer initialization
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1577 @code{re_compile_pattern} takes a pattern buffer as an argument. You
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1578 must initialize the following fields:
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1579
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1580 @table @code
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1581
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1584 @item translate
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1585 @vindex translate @r{initialization}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1586 Initialize this to point to a translate table if you want one, or to
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1587 zero if you don't. We explain translate tables in @ref{GNU Translate
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1588 Tables}.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1589
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1590 @item fastmap
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1591 @vindex fastmap @r{initialization}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1592 Initialize this to point to space for a fastmap if you want one, or to
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1593 zero if you don't (@pxref{Searching with Fastmaps}).
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1594
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1595 @item buffer
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1596 @itemx allocated
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1597 @vindex buffer @r{initialization}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1598 @vindex allocated @r{initialization}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1599 @findex malloc
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1600 If you want @code{re_compile_pattern} to allocate memory for the
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1601 compiled pattern, set both of these to zero. If you have an existing
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1602 block of memory (allocated with @code{malloc}) you want Regex to use,
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1603 set @code{buffer} to its address and @code{allocated} to its size (in
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1604 bytes).
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1605
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1606 @code{re_compile_pattern} uses @code{realloc} to extend the space for
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1607 the compiled pattern as necessary.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1608
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1609 @end table
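
The initialization described in the table above might look like the following sketch. It assumes glibc's @file{regex.h} with @code{_GNU_SOURCE} defined. The helper name @code{init_pattern_buffer} is hypothetical, and the fastmap size of 256 bytes (one entry per possible byte value) is an assumption made for this example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Assumed fastmap size: one entry per possible byte value.  */
#define FASTMAP_SIZE 256

/* Prepare *BUF for re_compile_pattern.  Returns 0 on success and -1
   if memory for the fastmap cannot be allocated.  */
static int
init_pattern_buffer (struct re_pattern_buffer *buf, int want_fastmap)
{
  assert (buf != NULL);
  memset (buf, 0, sizeof *buf);

  buf->translate = NULL;   /* no translate table */
  buf->buffer = NULL;      /* let the library allocate the pattern ... */
  buf->allocated = 0;      /* ... starting from nothing */
  buf->fastmap = NULL;     /* no fastmap unless requested below */

  if (want_fastmap)
    {
      buf->fastmap = malloc (FASTMAP_SIZE);
      if (buf->fastmap == NULL)
        return -1;
    }
  return 0;
}
```

Because the fastmap is obtained from @code{malloc}, a later @code{regfree} on the buffer can release it together with the compiled pattern.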
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1610
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1611 To compile a pattern buffer, use:
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1612
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1613 @findex re_compile_pattern
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1614 @example
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	1615 char *
b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	1616 re_compile_pattern (const char *@var{regex}, const int @var{regex_size},
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1617 struct re_pattern_buffer *@var{pattern_buffer})
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1618 @end example
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1619
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1620 @noindent
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1621 @var{regex} is the regular expression's address, @var{regex_size} is its
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1622 length, and @var{pattern_buffer} is the pattern buffer's address.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1623
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1624 If @code{re_compile_pattern} successfully compiles the regular
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1625 expression, it returns zero and sets @code{*@var{pattern_buffer}} to the
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1626 compiled pattern. It sets the pattern buffer's fields as follows:
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1627
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1628 @table @code
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1629 @item buffer
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1630 @vindex buffer @r{field, set by @code{re_compile_pattern}}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1631 to the compiled pattern.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1632
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1633 @item used
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1634 @vindex used @r{field, set by @code{re_compile_pattern}}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1635 to the number of bytes the compiled pattern in @code{buffer} occupies.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1636
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1637 @item syntax
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1638 @vindex syntax @r{field, set by @code{re_compile_pattern}}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1639 to the current value of @code{re_syntax_options}.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1640
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1641 @item re_nsub
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1642 @vindex re_nsub @r{field, set by @code{re_compile_pattern}}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1643 to the number of subexpressions in @var{regex}.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1644
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1645 @item fastmap_accurate
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1646 @vindex fastmap_accurate @r{field, set by @code{re_compile_pattern}}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1647 to zero on the theory that the pattern you're compiling is different
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1648 from the one previously compiled into @code{buffer}; in that case (since
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	1649 you can't make a fastmap without a compiled pattern),
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1650 @code{fastmap} would either contain an incompatible fastmap, or nothing
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1651 at all.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1652
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1653 @c xx what else?
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1654 @end table
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1655
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1656 If @code{re_compile_pattern} can't compile @var{regex}, it returns an
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1657 error string corresponding to one of the errors listed in @ref{POSIX
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1658 Regular Expression Compiling}.
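
A short sketch of checking the return value of @code{re_compile_pattern} follows. It assumes glibc's @file{regex.h} with @code{_GNU_SOURCE} defined and compiles under @code{RE_SYNTAX_POSIX_EXTENDED}; the helper name @code{try_compile} is hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Compile PATTERN.  On success return NULL and, if NSUB is not NULL,
   store the number of subexpressions there.  On failure return the
   error string supplied by re_compile_pattern.  */
static const char *
try_compile (const char *pattern, size_t *nsub)
{
  struct re_pattern_buffer buf;
  reg_syntax_t saved = re_syntax_options;
  const char *err;

  assert (pattern != NULL);
  memset (&buf, 0, sizeof buf);

  re_syntax_options = RE_SYNTAX_POSIX_EXTENDED;
  err = re_compile_pattern (pattern, strlen (pattern), &buf);
  re_syntax_options = saved;

  if (err == NULL && nsub != NULL)
    *nsub = buf.re_nsub;   /* set by the compiler on success */

  regfree (&buf);
  return err;
}
```

A caller would typically print the returned string when it is nonzero, for example after trying to compile a pattern with an unmatched @samp{(}.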
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1659
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1660
13533 ca70a11e70e2 Integrate the regex documentation. Bruno Haible <bruno@clisp.org> parents: 13532 diff changeset	1661 @node GNU Matching
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	1662 @subsection GNU Matching
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1663
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1664 @cindex matching with GNU functions
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1665
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1666 Matching the @sc{gnu} way means trying to match as much of a string as
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1667 possible, starting at a position within it that you specify. Once you've compiled
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1668 a pattern into a pattern buffer (@pxref{GNU Regular Expression
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1669 Compiling}), you can ask the matcher to match that pattern against a
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1670 string using:
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1671
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1672 @findex re_match
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1673 @example
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1674 int
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	1675 re_match (struct re_pattern_buffer *@var{pattern_buffer},
b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	1676 const char *@var{string}, const int @var{size},
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1677 const int @var{start}, struct re_registers *@var{regs})
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1678 @end example
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1679
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1680 @noindent
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1681 @var{pattern_buffer} is the address of a pattern buffer containing a
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1682 compiled pattern. @var{string} is the string you want to match; it can
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1683 contain newline and null characters. @var{size} is the length of that
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1684 string. @var{start} is the string index at which you want to
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1685 begin matching; the first character of @var{string} is at index zero.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1686 @xref{Using Registers}, for an explanation of @var{regs}; you can safely
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1687 pass zero.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1688
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1689 @code{re_match} matches the regular expression in @var{pattern_buffer}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1690 against the string @var{string} according to the syntax in
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1691 @var{pattern_buffer}'s @code{syntax} field. (@xref{GNU Regular
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1692 Expression Compiling}, for how to set it.) The function returns
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1693 @math{-1} if the compiled pattern does not match at index @var{start} of
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1694 @var{string} and @math{-2} if an internal error happens; otherwise, it
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1695 returns how many (possibly zero) characters of @var{string} the pattern
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1696 matched.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1697
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1698 An example: suppose @var{pattern_buffer} points to a pattern buffer
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1699 containing the compiled pattern for @samp{a*}, and @var{string} points
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1700 to @samp{aaaaab} (whereupon @var{size} should be 6). Then if @var{start}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1701 is 2, @code{re_match} returns 3, i.e., @samp{a*} would have matched the
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1702 last three @samp{a}s in @var{string}. If @var{start} is 0,
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1703 @code{re_match} returns 5, i.e., @samp{a*} would have matched all the
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1704 @samp{a}s in @var{string}. If @var{start} is either 5 or 6, it returns
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1705 zero.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1706
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1707 If @var{start} is not between zero and @var{size}, then
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1708 @code{re_match} returns @math{-1}.
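
The @samp{a*} example above can be tried with a short C sketch.  It
assumes the @sc{gnu} C library, where defining @code{_GNU_SOURCE} before
including @file{regex.h} exposes @code{re_compile_pattern} and
@code{re_match}, and it assumes the default syntax is in effect.  The
helper name @code{match_length_at} and the @math{-3} result for a
compilation failure are illustrative inventions, not part of Regex.

```c
#define _GNU_SOURCE             /* assumption: glibc's GNU regex interface */
#include <regex.h>
#include <string.h>

/* Hypothetical helper: compile PATTERN and return what re_match
   reports for STRING when matching begins at index START.
   Returns -3 (a value re_match itself does not use) if the
   pattern cannot be compiled.  */
int
match_length_at (const char *pattern, const char *string, int start)
{
  struct re_pattern_buffer buf;
  int result;

  /* A zeroed buffer: nothing allocated, no fastmap, no translate table.  */
  memset (&buf, 0, sizeof buf);

  if (re_compile_pattern (pattern, strlen (pattern), &buf) != NULL)
    result = -3;
  else
    /* Zero for REGS: only the length of the match is wanted.  */
    result = re_match (&buf, string, (int) strlen (string), start, NULL);

  regfree (&buf);
  return result;
}
```

With this helper, @code{match_length_at ("a*", "aaaaab", 2)} should
yield 3 and a @var{start} of 0 should yield 5, as described above.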
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1709
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1710
13533 ca70a11e70e2 Integrate the regex documentation. Bruno Haible <bruno@clisp.org> parents: 13532 diff changeset	1711 @node GNU Searching
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	1712 @subsection GNU Searching
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1713
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1714 @cindex searching with GNU functions
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1715
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1716 @dfn{Searching} means trying to match starting at successive positions
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1717 within a string. The function @code{re_search} does this.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1718
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1719 Before calling @code{re_search}, you must compile your regular
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1720 expression. @xref{GNU Regular Expression Compiling}.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1721
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1722 Here is the function declaration:
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1723
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1724 @findex re_search
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1725 @example
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	1726 int
b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	1727 re_search (struct re_pattern_buffer *@var{pattern_buffer},
b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	1728 const char *@var{string}, const int @var{size},
b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	1729 const int @var{start}, const int @var{range},
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1730 struct re_registers *@var{regs})
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1731 @end example
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1732
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1733 @noindent
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1734 @vindex start @r{argument to @code{re_search}}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1735 @vindex range @r{argument to @code{re_search}}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1736 whose arguments are the same as those to @code{re_match} (@pxref{GNU
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1737 Matching}) except that the two arguments @var{start} and @var{range}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1738 replace @code{re_match}'s argument @var{start}.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1739
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1740 If @var{range} is positive, then @code{re_search} attempts a match
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1741 starting first at index @var{start}, then at @math{@var{start} + 1} if
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1742 that fails, and so on, up to @math{@var{start} + @var{range}}; if
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1743 @var{range} is negative, then it attempts a match starting first at
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1744 index @var{start}, then at @math{@var{start} - 1} if that fails, and so
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	1745 on.
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1746
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1747 If @var{start} is not between zero and @var{size}, then @code{re_search}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1748 returns @math{-1}. When @var{range} is positive, @code{re_search}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1749 adjusts @var{range} so that @math{@var{start} + @var{range} - 1} is
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1750 between zero and @var{size}, if necessary; that way it won't search
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1751 outside of @var{string}. Similarly, when @var{range} is negative,
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1752 @code{re_search} adjusts @var{range} so that @math{@var{start} +
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1753 @var{range} + 1} is between zero and @var{size}, if necessary.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1754
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1755 If the @code{fastmap} field of @var{pattern_buffer} is zero,
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1756 @code{re_search} matches starting at consecutive positions; otherwise,
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1757 it uses @code{fastmap} to make the search more efficient.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1758 @xref{Searching with Fastmaps}.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1759
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1760 If no match is found, @code{re_search} returns @math{-1}. If
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1761 a match is found, it returns the index where the match began. If an
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1762 internal error happens, it returns @math{-2}.
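
As a rough illustration of forward and backward searching, the sketch
below wraps @code{re_search} in a helper.  It assumes the @sc{gnu} C
library (hence @code{_GNU_SOURCE}) and the default syntax.  The name
@code{search_from} and the @math{-3} result for a compilation failure
are made up for this example.

```c
#define _GNU_SOURCE             /* assumption: glibc's GNU regex interface */
#include <regex.h>
#include <string.h>

/* Hypothetical helper: search STRING for PATTERN.  Matching is tried
   first at index START and then at later positions (RANGE > 0) or
   earlier positions (RANGE < 0).  Returns re_search's result, or -3
   if the pattern cannot be compiled.  */
int
search_from (const char *pattern, const char *string, int start, int range)
{
  struct re_pattern_buffer buf;
  int result;

  memset (&buf, 0, sizeof buf);         /* fastmap field is zero: no fastmap */

  if (re_compile_pattern (pattern, strlen (pattern), &buf) != NULL)
    result = -3;
  else
    result = re_search (&buf, string, (int) strlen (string),
                        start, range, NULL);

  regfree (&buf);
  return result;
}
```

For instance, @code{search_from ("b", "aaaaab", 0, 6)} searches forward
from index 0, while @code{search_from ("a", "aaaaab", 5, -5)} starts at
the @samp{b} and works backward toward the beginning of the string.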
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1763
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1764
13533 ca70a11e70e2 Integrate the regex documentation. Bruno Haible <bruno@clisp.org> parents: 13532 diff changeset	1765 @node Matching/Searching with Split Data
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1766 @subsection Matching and Searching with Split Data
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1767
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1768 Using the functions @code{re_match_2} and @code{re_search_2}, you can
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	1769 match or search in data that is divided into two strings.
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1770
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1771 The function:
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1772
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1773 @findex re_match_2
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1774 @example
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1775 int
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	1776 re_match_2 (struct re_pattern_buffer *@var{buffer},
b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	1777 const char *@var{string1}, const int @var{size1},
b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	1778 const char *@var{string2}, const int @var{size2},
b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	1779 const int @var{start},
b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	1780 struct re_registers *@var{regs},
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1781 const int @var{stop})
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1782 @end example
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1783
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1784 @noindent
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1785 is similar to @code{re_match} (@pxref{GNU Matching}) except that you
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1786 pass @emph{two} data strings and sizes, and an index @var{stop} beyond
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1787 which you don't want the matcher to try matching. As with
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1788 @code{re_match}, if it succeeds, @code{re_match_2} returns how many
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1789 characters of the data it matched. Regard @var{string1} and
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1790 @var{string2} as concatenated when you set the arguments @var{start} and
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1791 @var{stop} and use the contents of @var{regs}; @code{re_match_2} never
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	1792 returns a value larger than @math{@var{size1} + @var{size2}}.
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1793
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1794 The function:
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1795
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1796 @findex re_search_2
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1797 @example
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1798 int
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	1799 re_search_2 (struct re_pattern_buffer *@var{buffer},
b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	1800 const char *@var{string1}, const int @var{size1},
b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	1801 const char *@var{string2}, const int @var{size2},
b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	1802 const int @var{start}, const int @var{range},
b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	1803 struct re_registers *@var{regs},
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1804 const int @var{stop})
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1805 @end example
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1806
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1807 @noindent
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1808 is similarly related to @code{re_search}.
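
The following sketch shows one way the two split-data functions might
be called.  It assumes the @sc{gnu} C library (hence
@code{_GNU_SOURCE}) and the default syntax.  The helper names
@code{match_split} and @code{search_split}, and the @math{-3} result
for a compilation failure, are hypothetical.  Both helpers set
@var{stop} to @math{@var{size1} + @var{size2}} so that the whole of the
data is available to the matcher.

```c
#define _GNU_SOURCE             /* assumption: glibc's GNU regex interface */
#include <regex.h>
#include <string.h>

/* Hypothetical helper: match PATTERN at index 0 of the data formed by
   STRING1 followed by STRING2.  Returns re_match_2's result, or -3 if
   the pattern cannot be compiled.  */
int
match_split (const char *pattern, const char *string1, const char *string2)
{
  struct re_pattern_buffer buf;
  int size1 = (int) strlen (string1);
  int size2 = (int) strlen (string2);
  int result;

  memset (&buf, 0, sizeof buf);
  if (re_compile_pattern (pattern, strlen (pattern), &buf) != NULL)
    result = -3;
  else
    result = re_match_2 (&buf, string1, size1, string2, size2,
                         0, NULL, size1 + size2);
  regfree (&buf);
  return result;
}

/* Hypothetical helper: search forward through the same two-part data,
   trying every start index from 0 up to the end.  Returns
   re_search_2's result, or -3 if the pattern cannot be compiled.  */
int
search_split (const char *pattern, const char *string1, const char *string2)
{
  struct re_pattern_buffer buf;
  int size1 = (int) strlen (string1);
  int size2 = (int) strlen (string2);
  int result;

  memset (&buf, 0, sizeof buf);
  if (re_compile_pattern (pattern, strlen (pattern), &buf) != NULL)
    result = -3;
  else
    result = re_search_2 (&buf, string1, size1, string2, size2,
                          0, size1 + size2, NULL, size1 + size2);
  regfree (&buf);
  return result;
}
```

A call such as @code{search_split ("ab", "xxa", "byy")} looks for a
match that straddles the boundary between the two strings; the index it
returns counts from the start of @var{string1}.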
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1809
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1810
13533 ca70a11e70e2 Integrate the regex documentation. Bruno Haible <bruno@clisp.org> parents: 13532 diff changeset	1811 @node Searching with Fastmaps
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1812 @subsection Searching with Fastmaps
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1813
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1814 @cindex fastmaps
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1815 If you're searching through a long string, you should use a fastmap.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1816 Without one, the searcher tries to match at consecutive positions in the
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1817 string. Generally, most of the characters in the string could not start
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1818 a match. It takes much longer to try matching at a given position in the
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1819 string than it does to check in a table whether or not the character at
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1820 that position could start a match. A @dfn{fastmap} is such a table.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1821
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1822 More specifically, a fastmap is an array indexed by the characters in
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1823 your character set. Under the @sc{ascii} encoding, therefore, a fastmap
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1824 has 256 elements. If you want the searcher to use a fastmap with a
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1825 given pattern buffer, you must allocate the array and assign the array's
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1826 address to the pattern buffer's @code{fastmap} field. You either can
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1827 compile the fastmap yourself or have @code{re_search} do it for you;
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1828 when @code{fastmap} is nonzero, it automatically compiles a fastmap the
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	1829 first time you search using a particular compiled pattern.
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1830
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1831 To compile a fastmap yourself, use:
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1832
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1833 @findex re_compile_fastmap
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1834 @example
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1835 int
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1836 re_compile_fastmap (struct re_pattern_buffer *@var{pattern_buffer})
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1837 @end example
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1838
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1839 @noindent
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1840 @var{pattern_buffer} is the address of a pattern buffer. If the
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1841 character @var{c} could start a match for the pattern,
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1842 @code{re_compile_fastmap} makes
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1843 @code{@var{pattern_buffer}->fastmap[@var{c}]} nonzero. It returns
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1844 @math{0} if it can compile a fastmap and @math{-2} if there is an
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1845 internal error. For example, if @samp{\|} is the alternation operator
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1846 and @var{pattern_buffer} holds the compiled pattern for @samp{a\|b}, then
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1847 @code{re_compile_fastmap} sets @code{fastmap['a']} and
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1848 @code{fastmap['b']} (and no others).
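
The @samp{a\|b} example can be checked with a sketch along these lines.
It assumes the @sc{gnu} C library (hence @code{_GNU_SOURCE}), selects
@code{RE_SYNTAX_EMACS} so that @samp{\|} is the alternation operator,
and relies on that library's @code{regfree} releasing the fastmap along
with the compiled pattern, which is why the fastmap is allocated with
@code{malloc}.  The helper name @code{can_start_match} is hypothetical.

```c
#define _GNU_SOURCE             /* assumption: glibc's GNU regex interface */
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: compile PATTERN, compile its fastmap, and
   report whether character C could start a match.  Returns 1 or 0,
   or -1 if allocation or compilation fails.  */
int
can_start_match (const char *pattern, unsigned char c)
{
  struct re_pattern_buffer buf;
  int result;

  /* In this syntax `\|' is the alternation operator.  */
  re_set_syntax (RE_SYNTAX_EMACS);

  memset (&buf, 0, sizeof buf);
  buf.fastmap = malloc (256);           /* one element per character */
  if (buf.fastmap == NULL)
    return -1;

  if (re_compile_pattern (pattern, strlen (pattern), &buf) != NULL
      || re_compile_fastmap (&buf) != 0)
    result = -1;
  else
    result = buf.fastmap[c] != 0;

  /* Assumption: as in glibc, regfree also frees the malloc'ed fastmap.  */
  regfree (&buf);
  return result;
}
```

Here @code{can_start_match ("a\\|b", 'a')} and
@code{can_start_match ("a\\|b", 'b')} should both report 1, and any
other character should report 0.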
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1849
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1850 @code{re_search} uses a fastmap as it moves along in the string: it
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1851 checks the string's characters until it finds one that's in the fastmap.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1852 Then it tries matching at that character. If the match fails, it
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1853 repeats the process. So, by using a fastmap, @code{re_search} doesn't
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1854 waste time trying to match at positions in the string that couldn't
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1855 start a match.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1856
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1857 If you don't want @code{re_search} to use a fastmap,
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1858 store zero in the @code{fastmap} field of the pattern buffer before
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1859 calling @code{re_search}.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1860
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1861 Once you've initialized a pattern buffer's @code{fastmap} field, you
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1862 need never do so again---even if you compile a new pattern in
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1863 it---provided the way the field is set still reflects whether or not you
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1864 want a fastmap. @code{re_search} will still either do nothing if
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1865 @code{fastmap} is null or, if it isn't, compile a new fastmap for the
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1866 new pattern.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1867
13533 ca70a11e70e2 Integrate the regex documentation. Bruno Haible <bruno@clisp.org> parents: 13532 diff changeset	1868 @node GNU Translate Tables
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1869 @subsection GNU Translate Tables
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1870
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1871 If you set the @code{translate} field of a pattern buffer to a translate
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1872 table, then the @sc{gnu} Regex functions to which you've passed that
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1873 pattern buffer use it to apply a simple transformation
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1874 to all the regular expression and string characters at which they look.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1875
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1876 A @dfn{translate table} is an array indexed by the characters in your
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1877 character set. Under the @sc{ascii} encoding, therefore, a translate
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1878 table has 256 elements. The array's elements are also characters in
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1879 your character set. When the Regex functions see a character @var{c},
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1880 they use @code{translate[@var{c}]} in its place, with one exception: the
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1881 character after a @samp{\} is not translated. (This ensures that the
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1882 operators, e.g., @samp{\B} and @samp{\b}, are always distinguishable.)
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1883
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1884 For example, a table that maps all lowercase letters to the
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1885 corresponding uppercase ones would cause the matcher to ignore
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1886 differences in case.@footnote{A table that maps all uppercase letters to
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1887 the corresponding lowercase ones would work just as well for this
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1888 purpose.} Such a table would map all characters except lowercase letters
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1889 to themselves, and lowercase letters to the corresponding uppercase
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1890 ones. Under the @sc{ascii} encoding, here's how you could initialize
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1891 such a table (we'll call it @code{case_fold}):
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1892
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1893 @example
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1894 for (i = 0; i < 256; i++)
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1895   case_fold[i] = i;
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1896 for (i = 'a'; i <= 'z'; i++)
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1897   case_fold[i] = i - ('a' - 'A');
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1898 @end example
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1899
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1900 You tell Regex to use a translate table on a given pattern buffer by
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1901 assigning that table's address to the @code{translate} field of that
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1902 buffer. If you don't want Regex to do any translation, put zero into
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1903 this field. You'll get unpredictable results if you change the table's
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1904 contents at any time between compiling the pattern buffer, compiling its
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1905 fastmap, and matching or searching with the pattern buffer.
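
Putting the pieces together, a case-insensitive match might look like
the sketch below.  It assumes the @sc{gnu} C library (hence
@code{_GNU_SOURCE}), the @sc{ascii} encoding, and the default syntax.
The helper name @code{match_ignoring_case} and the @math{-3} result for
a compilation failure are hypothetical.  The table is installed before
the pattern is compiled, so that the pattern and the string are
translated alike.

```c
#define _GNU_SOURCE             /* assumption: glibc's GNU regex interface */
#include <regex.h>
#include <string.h>

/* Hypothetical helper: match PATTERN at index 0 of STRING, ignoring
   differences in case.  Returns re_match's result, or -3 if the
   pattern cannot be compiled.  Assumes the ASCII encoding.  */
int
match_ignoring_case (const char *pattern, const char *string)
{
  unsigned char case_fold[256];
  struct re_pattern_buffer buf;
  int i, result;

  /* Map every character to itself, then lowercase to uppercase.  */
  for (i = 0; i < 256; i++)
    case_fold[i] = i;
  for (i = 'a'; i <= 'z'; i++)
    case_fold[i] = i - ('a' - 'A');

  memset (&buf, 0, sizeof buf);
  buf.translate = case_fold;            /* set before compiling */

  if (re_compile_pattern (pattern, strlen (pattern), &buf) != NULL)
    result = -3;
  else
    result = re_match (&buf, string, (int) strlen (string), 0, NULL);

  /* The table lives on the stack; detach it so that regfree (which
     in glibc frees the translate field) does not try to release it.  */
  buf.translate = NULL;
  regfree (&buf);
  return result;
}
```

With this helper, @code{match_ignoring_case ("hello", "HeLLo")} should
report a match of five characters.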
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1906
13533 ca70a11e70e2 Integrate the regex documentation. Bruno Haible <bruno@clisp.org> parents: 13532 diff changeset	1907 @node Using Registers
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1908 @subsection Using Registers
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1909
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1910 A group in a regular expression can match a (possibly empty) substring
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1911 of the string that the regular expression as a whole matched. The matcher
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1912 remembers the beginning and end of the substring matched by
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1913 each group.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1914
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1915 To find out what they matched, pass a nonzero @var{regs} argument to a
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1916 @sc{gnu} matching or searching function (@pxref{GNU Matching} and
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1917 @ref{GNU Searching}), i.e., the address of a structure of this type, as
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1918 defined in @file{regex.h}:
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1919
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1920 @c We don't bother to include this directly from regex.h,
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1921 @c since it changes so rarely.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1922 @example
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1923 @tindex re_registers
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1924 @vindex num_regs @r{in @code{struct re_registers}}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1925 @vindex start @r{in @code{struct re_registers}}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1926 @vindex end @r{in @code{struct re_registers}}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1927 struct re_registers
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1928 @{
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1929 unsigned num_regs;
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1930 regoff_t *start;
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1931 regoff_t *end;
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1932 @};
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1933 @end example
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1934
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1935 Except for (possibly) the @var{num_regs}'th element (see below), the
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1936 @var{i}th element of the @code{start} and @code{end} arrays records
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1937 information about the @var{i}th group in the pattern. (They're declared
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1938 as C pointers, but this is only because not all C compilers accept
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1939 zero-length arrays; conceptually, it is simplest to think of them as
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1940 arrays.)
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1941
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1942 The @code{start} and @code{end} arrays are allocated in various ways,
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1943 depending on the value of the @code{regs_allocated}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1944 @vindex regs_allocated
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1945 field in the pattern buffer passed to the matcher.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1946
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1947 The simplest and perhaps most useful way is to let the matcher (re)allocate
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1948 enough space to record information for all the groups in the regular
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1949 expression. If @code{regs_allocated} is @code{REGS_UNALLOCATED},
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1950 @vindex REGS_UNALLOCATED
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1951 the matcher allocates @math{1 + @var{re_nsub}} elements (@code{re_nsub} is
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1952 another field in the pattern buffer; @pxref{GNU Pattern Buffers}), sets
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1953 the extra element to @math{-1}, and sets @code{regs_allocated} to @code{REGS_REALLOCATE}.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1954 @vindex REGS_REALLOCATE
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1955 Then on subsequent calls with the same pattern buffer and @var{regs}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1956 arguments, the matcher reallocates more space if necessary.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1957
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1958 It would perhaps be more logical to make the @code{regs_allocated} field
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1959 part of the @code{re_registers} structure, instead of part of the
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1960 pattern buffer. But in that case the caller would be forced to
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1961 initialize the structure before passing it. Much existing code doesn't
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1962 do this initialization, and it's arguably better to avoid it anyway.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1963
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1964 @code{re_compile_pattern} sets @code{regs_allocated} to
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1965 @code{REGS_UNALLOCATED},
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1966 so if you use the GNU regular expression
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1967 functions, you get this behavior by default.
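
The default behavior can be sketched as follows. The helper name @code{group_span} is hypothetical; the code assumes the GNU C Library's @file{regex.h} with @code{_GNU_SOURCE} defined, selects @code{RE_SYNTAX_POSIX_EXTENDED} so that @samp{(} and @samp{)} act as group operators, and assumes the matcher obtains the arrays with @code{malloc}, so that the caller releases them with @code{free}.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: search TEXT for PATTERN and store the span of
   group number GROUP in *START and *END.  Returns 0 on success and -1
   if the pattern does not compile, does not match, or has no such
   group.  */
static int
group_span (const char *pattern, const char *text, unsigned group,
            regoff_t *start, regoff_t *end)
{
  struct re_pattern_buffer buf;
  struct re_registers regs;
  reg_syntax_t old_syntax;
  const char *err;
  regoff_t len;
  int result = -1;

  assert (pattern != NULL && text != NULL && start != NULL && end != NULL);
  memset (&buf, 0, sizeof buf);
  memset (&regs, 0, sizeof regs);

  /* The syntax is a global setting; restore it after compiling.  */
  old_syntax = re_set_syntax (RE_SYNTAX_POSIX_EXTENDED);
  err = re_compile_pattern (pattern, strlen (pattern), &buf);
  re_set_syntax (old_syntax);
  if (err != NULL)
    return -1;

  /* buf.regs_allocated is REGS_UNALLOCATED at this point, so the
     matcher allocates regs.start and regs.end on a successful match.  */
  len = (regoff_t) strlen (text);
  if (re_search (&buf, text, len, 0, len, &regs) >= 0
      && group <= buf.re_nsub)
    {
      *start = regs.start[group];
      *end = regs.end[group];
      result = 0;
    }

  /* Assumption: the arrays came from malloc; free (NULL) is harmless
     if nothing was allocated.  */
  free (regs.start);
  free (regs.end);
  regfree (&buf);
  return result;
}
```

For instance, with the pattern @samp{b(c+)d} and the string @samp{abccd}, group 1 spans indices 2 to 4 and group 0 spans indices 1 to 5.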
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1968
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1969 @c xx document re_set_registers
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1970
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1971 @sc{posix}, on the other hand, requires a different interface: the
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1972 caller is supposed to pass in a fixed-length array which the matcher
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	1973 fills. Therefore, if @code{regs_allocated} is @code{REGS_FIXED},
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1974 @vindex REGS_FIXED
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1975 the matcher simply fills that array.
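
For comparison, here is a small sketch of the @sc{posix} style, in which the caller owns a fixed-length array and the matcher only fills it. It uses the standard @code{regcomp}, @code{regexec}, and @code{regfree} functions; the helper name @code{posix_group_span} and the limit @code{MAX_GROUPS} are hypothetical choices made for this illustration.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

#define MAX_GROUPS 4   /* hypothetical limit chosen for this sketch */

/* Hypothetical helper: match PATTERN (POSIX extended syntax) against
   TEXT and store the span of group GROUP in *START and *END.  Returns
   0 on a successful match and -1 otherwise.  */
static int
posix_group_span (const char *pattern, const char *text, size_t group,
                  regoff_t *start, regoff_t *end)
{
  regex_t preg;
  regmatch_t pmatch[MAX_GROUPS];   /* fixed-length array owned by the caller */
  int result = -1;

  assert (pattern != NULL && text != NULL && start != NULL && end != NULL);

  if (group >= MAX_GROUPS || regcomp (&preg, pattern, REG_EXTENDED) != 0)
    return -1;

  /* The matcher fills at most MAX_GROUPS entries; entries for groups
     that did not participate get -1 offsets.  */
  if (regexec (&preg, text, MAX_GROUPS, pmatch, 0) == 0)
    {
      *start = pmatch[group].rm_so;
      *end = pmatch[group].rm_eo;
      result = 0;
    }

  regfree (&preg);
  return result;
}
```

Nothing is allocated for the match data here, so there is nothing to free besides the pattern buffer itself.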
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1976
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1977 The following examples illustrate the information recorded in the
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1978 @code{re_registers} structure. (In all of them, @samp{(} represents the
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1979 open-group and @samp{)} the close-group operator. The first character
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1980 in the string @var{string} is at index 0.)
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1981
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1982 @c xx i'm not sure this is all true anymore.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1983
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1984 @itemize @bullet
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1985
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	1986 @item
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1987 If the regular expression has an @w{@var{i}-th}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1988 group not contained within another group that matches a
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1989 substring of @var{string}, then the function sets
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1990 @code{@w{@var{regs}->}start[@var{i}]} to the index in @var{string} where
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1991 the substring matched by the @w{@var{i}-th} group begins, and
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1992 @code{@w{@var{regs}->}end[@var{i}]} to the index just beyond that
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1993 substring's end. The function sets @code{@w{@var{regs}->}start[0]} and
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1994 @code{@w{@var{regs}->}end[0]} to analogous information about the entire
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1995 pattern.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1996
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1997 For example, when you match @samp{((a)(b))} against @samp{ab}, you get:
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1998
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	1999 @itemize
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2000 @item
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	2001 0 in @code{@w{@var{regs}->}start[0]} and 2 in @code{@w{@var{regs}->}end[0]}
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2002
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2003 @item
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	2004 0 in @code{@w{@var{regs}->}start[1]} and 2 in @code{@w{@var{regs}->}end[1]}
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2005
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2006 @item
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	2007 0 in @code{@w{@var{regs}->}start[2]} and 1 in @code{@w{@var{regs}->}end[2]}
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2008
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2009 @item
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	2010 1 in @code{@w{@var{regs}->}start[3]} and 2 in @code{@w{@var{regs}->}end[3]}
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2011 @end itemize
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2012
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2013 @item
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2014 If a group matches more than once (as it might if followed by,
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2015 e.g., a repetition operator), then the function reports the information
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2016 about what the group @emph{last} matched.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2017
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2018 For example, when you match the pattern @samp{(a)*} against the string
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2019 @samp{aa}, you get:
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2020
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2021 @itemize
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2022 @item
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	2023 0 in @code{@w{@var{regs}->}start[0]} and 2 in @code{@w{@var{regs}->}end[0]}
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2024
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2025 @item
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	2026 1 in @code{@w{@var{regs}->}start[1]} and 2 in @code{@w{@var{regs}->}end[1]}
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2027 @end itemize
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2028
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2029 @item
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2030 If the @w{@var{i}-th} group does not participate in a
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2031 successful match, e.g., it is an alternative not taken or a
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2032 repetition operator allows zero repetitions of it, then the function
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2033 sets @code{@w{@var{regs}->}start[@var{i}]} and
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2034 @code{@w{@var{regs}->}end[@var{i}]} to @math{-1}.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2035
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2036 For example, when you match the pattern @samp{(a)*b} against
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2037 the string @samp{b}, you get:
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2038
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2039 @itemize
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2040 @item
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	2041 0 in @code{@w{@var{regs}->}start[0]} and 1 in @code{@w{@var{regs}->}end[0]}
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2042
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2043 @item
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	2044 @math{-1} in @code{@w{@var{regs}->}start[1]} and @math{-1} in @code{@w{@var{regs}->}end[1]}
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2045 @end itemize
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2046
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2047 @item
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2048 If the @w{@var{i}-th} group matches a zero-length string, then the
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2049 function sets @code{@w{@var{regs}->}start[@var{i}]} and
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2050 @code{@w{@var{regs}->}end[@var{i}]} to the index just beyond that
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	2051 zero-length string.
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2052
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2053 For example, when you match the pattern @samp{(a*)b} against the string
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2054 @samp{b}, you get:
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2055
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2056 @itemize
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2057 @item
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	2058 0 in @code{@w{@var{regs}->}start[0]} and 1 in @code{@w{@var{regs}->}end[0]}
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2059
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2060 @item
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	2061 0 in @code{@w{@var{regs}->}start[1]} and 0 in @code{@w{@var{regs}->}end[1]}
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2062 @end itemize
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2063
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2064 @ignore
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2065 The function sets @code{@w{@var{regs}->}start[0]} and
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2066 @code{@w{@var{regs}->}end[0]} to analogous information about the entire
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2067 pattern.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2068
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2069 For example, when you match the pattern @samp{(a*)} against the empty
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2070 string, you get:
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2071
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2072 @itemize
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2073 @item
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	2074 0 in @code{@w{@var{regs}->}start[0]} and 0 in @code{@w{@var{regs}->}end[0]}
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2075
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2076 @item
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	2077 0 in @code{@w{@var{regs}->}start[1]} and 0 in @code{@w{@var{regs}->}end[1]}
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2078 @end itemize
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2079 @end ignore
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2080
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2081 @item
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	2082 If an @w{@var{i}-th} group contains a @w{@var{j}-th} group
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2083 in turn not contained within any other group within group @var{i} and
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2084 the function reports a match of the @w{@var{i}-th} group, then it
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2085 records in @code{@w{@var{regs}->}start[@var{j}]} and
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2086 @code{@w{@var{regs}->}end[@var{j}]} the last match (if it matched) of
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2087 the @w{@var{j}-th} group.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2088
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2089 For example, when you match the pattern @samp{((a*)b)*} against the
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2090 string @samp{abb}, @w{group 2} last matches the empty string, so you
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2091 get what it previously matched:
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2092
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2093 @itemize
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2094 @item
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	2095 0 in @code{@w{@var{regs}->}start[0]} and 3 in @code{@w{@var{regs}->}end[0]}
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2096
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2097 @item
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	2098 2 in @code{@w{@var{regs}->}start[1]} and 3 in @code{@w{@var{regs}->}end[1]}
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2099
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2100 @item
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	2101 2 in @code{@w{@var{regs}->}start[2]} and 2 in @code{@w{@var{regs}->}end[2]}
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2102 @end itemize
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2103
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2104 When you match the pattern @samp{((a)*b)*} against the string
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2105 @samp{abb}, @w{group 2} doesn't participate in the last match, so you
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2106 get:
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2107
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2108 @itemize
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2109 @item
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	2110 0 in @code{@w{@var{regs}->}start[0]} and 3 in @code{@w{@var{regs}->}end[0]}
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2111
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2112 @item
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	2113 2 in @code{@w{@var{regs}->}start[1]} and 3 in @code{@w{@var{regs}->}end[1]}
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2114
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2115 @item
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	2116 0 in @code{@w{@var{regs}->}start[2]} and 1 in @code{@w{@var{regs}->}end[2]}
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2117 @end itemize
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2118
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2119 @item
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2120 If an @w{@var{i}-th} group contains a @w{@var{j}-th} group
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2121 in turn not contained within any other group within group @var{i}
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	2122 and the function sets
b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	2123 @code{@w{@var{regs}->}start[@var{i}]} and
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2124 @code{@w{@var{regs}->}end[@var{i}]} to @math{-1}, then it also sets
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2125 @code{@w{@var{regs}->}start[@var{j}]} and
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2126 @code{@w{@var{regs}->}end[@var{j}]} to @math{-1}.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2127
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2128 For example, when you match the pattern @samp{((a)*b)*c} against the
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2129 string @samp{c}, you get:
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2130
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2131 @itemize
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2132 @item
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	2133 0 in @code{@w{@var{regs}->}start[0]} and 1 in @code{@w{@var{regs}->}end[0]}
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2134
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2135 @item
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	2136 @math{-1} in @code{@w{@var{regs}->}start[1]} and @math{-1} in @code{@w{@var{regs}->}end[1]}
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2137
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2138 @item
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	2139 @math{-1} in @code{@w{@var{regs}->}start[2]} and @math{-1} in @code{@w{@var{regs}->}end[2]}
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2140 @end itemize
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2141
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2142 @end itemize
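
The cases listed above can be told apart in code by inspecting the register values. The following is only a sketch under stated assumptions: the enumeration and the helper @code{classify_group} are hypothetical names, the GNU C Library's @file{regex.h} is assumed (with @code{_GNU_SOURCE} defined), and @code{RE_SYNTAX_POSIX_EXTENDED} is selected so that @samp{(} and @samp{)} are the group operators, as in the examples.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical classification of what a group's registers record.  */
enum group_state
{
  GROUP_NO_MATCH,   /* pattern failed to compile or to match */
  GROUP_UNSET,      /* group did not participate: both values are -1 */
  GROUP_EMPTY,      /* group matched a zero-length string */
  GROUP_NONEMPTY    /* group matched one or more characters */
};

/* Hypothetical helper: match PATTERN at the start of STRING and
   classify the registers of group number GROUP.  */
static enum group_state
classify_group (const char *pattern, const char *string, unsigned group)
{
  struct re_pattern_buffer buf;
  struct re_registers regs;
  reg_syntax_t old_syntax;
  const char *err;
  enum group_state state = GROUP_NO_MATCH;

  assert (pattern != NULL && string != NULL);
  memset (&buf, 0, sizeof buf);
  memset (&regs, 0, sizeof regs);

  old_syntax = re_set_syntax (RE_SYNTAX_POSIX_EXTENDED);
  err = re_compile_pattern (pattern, strlen (pattern), &buf);
  re_set_syntax (old_syntax);
  if (err != NULL)
    return GROUP_NO_MATCH;

  if (re_match (&buf, string, (regoff_t) strlen (string), 0, &regs) >= 0
      && group <= buf.re_nsub)
    {
      if (regs.start[group] == -1)
        state = GROUP_UNSET;
      else if (regs.start[group] == regs.end[group])
        state = GROUP_EMPTY;
      else
        state = GROUP_NONEMPTY;
    }

  /* Assumption: the matcher allocated the arrays with malloc.  */
  free (regs.start);
  free (regs.end);
  regfree (&buf);
  return state;
}
```

With this helper, group 1 of @samp{(a)*b} matched against @samp{b} would be classified as unset, while group 1 of @samp{(a*)b} matched against the same string would be classified as empty.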
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2143
13533 ca70a11e70e2 Integrate the regex documentation. Bruno Haible <bruno@clisp.org> parents: 13532 diff changeset	2144 @node Freeing GNU Pattern Buffers
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2145 @subsection Freeing GNU Pattern Buffers
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2146
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2147 To free any allocated fields of a pattern buffer, you can use the
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2148 @sc{posix} function described in @ref{Freeing POSIX Pattern Buffers},
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2149 since the type @code{regex_t}---the type for @sc{posix} pattern
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2150 buffers---is equivalent to the type @code{re_pattern_buffer}. After
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2151 freeing a pattern buffer, you must compile a regular expression in it
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2152 again (@pxref{GNU Regular Expression Compiling}) before passing it to
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2153 a matching or searching function.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2154
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2155
13533 ca70a11e70e2 Integrate the regex documentation. Bruno Haible <bruno@clisp.org> parents: 13532 diff changeset	2156 @node POSIX Regex Functions
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2157 @section POSIX Regex Functions
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2158
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2159 If you're writing code that has to be @sc{posix} compatible, you'll need
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2160 to use these functions. Their interfaces are as specified by @sc{posix},
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2161 draft 1003.2/D11.2.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2162
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2163 @menu
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2164 * POSIX Pattern Buffers:: The regex_t type.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2165 * POSIX Regular Expression Compiling:: regcomp ()
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2166 * POSIX Matching:: regexec ()
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2167 * Reporting Errors:: regerror ()
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2168 * Using Byte Offsets:: The regmatch_t type.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2169 * Freeing POSIX Pattern Buffers:: regfree ()
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2170 @end menu
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2171
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2172
13533 ca70a11e70e2 Integrate the regex documentation. Bruno Haible <bruno@clisp.org> parents: 13532 diff changeset	2173 @node POSIX Pattern Buffers
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2174 @subsection POSIX Pattern Buffers
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2175
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2176 To compile or match a given regular expression the @sc{posix} way, you
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2177 must supply a pattern buffer exactly the way you do for @sc{gnu}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2178 (@pxref{GNU Pattern Buffers}). @sc{posix} pattern buffers have type
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2179 @code{regex_t}, which is equivalent to the @sc{gnu} pattern buffer
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2180 type @code{re_pattern_buffer}.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2181
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2182
13533 ca70a11e70e2 Integrate the regex documentation. Bruno Haible <bruno@clisp.org> parents: 13532 diff changeset	2183 @node POSIX Regular Expression Compiling
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2184 @subsection POSIX Regular Expression Compiling
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2185
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2186 With @sc{posix}, you can only search for a given regular expression; you
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2187 can't match it. To do this, you must first compile it in a
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2188 pattern buffer, using @code{regcomp}.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2189
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2190 @ignore
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2191 Before calling @code{regcomp}, you must initialize this pattern buffer
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2192 as you do for @sc{gnu} (@pxref{GNU Regular Expression Compiling}). See
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2193 below, however, for how to choose a syntax with which to compile.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2194 @end ignore
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2195
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2196 To compile a pattern buffer, use:
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2197
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2198 @findex regcomp
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2199 @example
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2200 int
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2201 regcomp (regex_t *@var{preg}, const char *@var{regex}, int @var{cflags})
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2202 @end example
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2203
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2204 @noindent
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2205 @var{preg} is the initialized pattern buffer's address, @var{regex} is
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2206 the regular expression's address, and @var{cflags} is the compilation
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2207 flags, which Regex considers as a collection of bits. Here are the
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2208 valid bits, as defined in @file{regex.h}:
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2209
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2210 @table @code
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2211
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2212 @item REG_EXTENDED
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2213 @vindex REG_EXTENDED
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2214 says to use @sc{posix} Extended Regular Expression syntax; if this isn't
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2215 set, @code{regcomp} uses @sc{posix} Basic Regular Expression syntax.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2216 @code{regcomp} sets @var{preg}'s @code{syntax} field accordingly.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2217
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2218 @item REG_ICASE
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2219 @vindex REG_ICASE
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2220 @cindex ignoring case
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2221 says to ignore case; @code{regcomp} sets @var{preg}'s @code{translate}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2222 field to a translate table which ignores case, replacing anything you've
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2223 put there before.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2224
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2225 @item REG_NOSUB
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2226 @vindex REG_NOSUB
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2227 says to set @var{preg}'s @code{no_sub} field; @pxref{POSIX Matching},
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2228 for what this means.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2229
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2230 @item REG_NEWLINE
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2231 @vindex REG_NEWLINE
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2232 says that a:
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2233
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2234 @itemize @bullet
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2235
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2236 @item
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2237 match-any-character operator (@pxref{Match-any-character
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2238 Operator}) doesn't match a newline.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2239
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2240 @item
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2241 nonmatching list not containing a newline (@pxref{List
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2242 Operators}) doesn't match a newline.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2243
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2244 @item
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2245 match-beginning-of-line operator (@pxref{Match-beginning-of-line
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2246 Operator}) matches the empty string immediately after a newline,
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2247 regardless of how @code{REG_NOTBOL} is set (@pxref{POSIX Matching}, for
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2248 an explanation of @code{REG_NOTBOL}).
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2249
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2250 @item
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2251 match-end-of-line operator (@pxref{Match-end-of-line
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2252 Operator}) matches the empty string immediately before a newline,
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2253 regardless of how @code{REG_NOTEOL} is set (@pxref{POSIX Matching},
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2254 for an explanation of @code{REG_NOTEOL}).
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2255
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2256 @end itemize
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2257
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2258 @end table
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2259
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2260 If @code{regcomp} successfully compiles the regular expression, it
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2261 returns zero and sets @code{*@var{preg}} to the compiled
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2262 pattern. Except for @code{syntax} (which it sets as explained above), it
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2263 also sets the same fields the same way as does the @sc{gnu} compiling
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2264 function (@pxref{GNU Regular Expression Compiling}).
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2265
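As a minimal sketch of the call described above, the following helper
compiles a pattern with a given set of compilation flags and reports
whether @code{regcomp} returned zero. The name @code{compiles_ok} is
hypothetical and chosen only for illustration; @code{regfree} is the
@sc{posix} freeing function (@pxref{Freeing POSIX Pattern Buffers}).

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper (not part of Regex): return 1 if PATTERN
   compiles under CFLAGS, and 0 otherwise.  */
static int
compiles_ok (const char *pattern, int cflags)
{
  regex_t preg;

  /* regcomp returns zero on success, an error code otherwise.  */
  if (regcomp (&preg, pattern, cflags) != 0)
    return 0;

  /* Release whatever regcomp allocated in the pattern buffer.  */
  regfree (&preg);
  return 1;
}
```

Flags are combined with bitwise or, so a call such as
@code{compiles_ok ("a+b", REG_EXTENDED | REG_ICASE | REG_NOSUB)}
requests extended syntax, case-insensitive matching, and no
subexpression reporting in one step.
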
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2266 If @code{regcomp} can't compile the regular expression, it returns one
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2267 of the error codes listed here. (Except when noted differently, the
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2268 syntax in all examples below is basic regular expression syntax.)
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2269
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2270 @table @code
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2271
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2272 @comment repetitions
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2273 @item REG_BADRPT
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2274 For example, the consecutive repetition operators @samp{**} in
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2275 @samp{a**} are invalid. As another example, if the syntax is extended
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2276 regular expression syntax, then the repetition operator @samp{*} with
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2277 nothing on which to operate in @samp{*} is invalid.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2278
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2279 @item REG_BADBR
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2280 For example, the @var{count} @samp{-1} in @samp{a\@{-1} is invalid.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2281
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2282 @item REG_EBRACE
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2283 For example, @samp{a\@{1} is missing a close-interval operator.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2284
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2285 @comment lists
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2286 @item REG_EBRACK
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2287 For example, @samp{[a} is missing a close-list operator.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2288
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2289 @item REG_ERANGE
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2290 For example, the range ending point @samp{z} that collates lower than
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2291 does its starting point @samp{a} in @samp{[z-a]} is invalid. Also, the
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2292 range with the character class @samp{[:alpha:]} as its starting point in
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2293 @samp{[[:alpha:]-\|]} is invalid.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2294
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2295 @item REG_ECTYPE
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2296 For example, the character class name @samp{foo} in @samp{[[:foo:]} is
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2297 invalid.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2298
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2299 @comment groups
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2300 @item REG_EPAREN
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2301 For example, @samp{a\)} is missing an open-group operator and @samp{\(a}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2302 is missing a close-group operator.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2303
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2304 @item REG_ESUBREG
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2305 For example, the back reference @samp{\2} that refers to a nonexistent
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2306 subexpression in @samp{\(a\)\2} is invalid.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2307
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2308 @comment unfinished business
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2309
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2310 @item REG_EEND
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2311 Returned when a regular expression causes no other more specific error.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2312
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2313 @item REG_EESCAPE
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2314 For example, the trailing backslash @samp{\} in @samp{a\} is invalid, as is the
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2315 one in @samp{\}.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2316
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2317 @comment kitchen sink
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2318 @item REG_BADPAT
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2319 For example, in the extended regular expression syntax, the empty group
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2320 @samp{()} in @samp{a()b} is invalid.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2321
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2322 @comment internal
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2323 @item REG_ESIZE
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2324 Returned when a regular expression needs a pattern buffer larger than
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2325 65536 bytes.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2326
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2327 @item REG_ESPACE
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2328 Returned when a regular expression causes Regex to run out of memory.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2329
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2330 @end table
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2331
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2332
13533 ca70a11e70e2 Integrate the regex documentation. Bruno Haible <bruno@clisp.org> parents: 13532 diff changeset	2333 @node POSIX Matching
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	2334 @subsection POSIX Matching
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2335
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2336 Matching the @sc{posix} way means trying to match a null-terminated
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2337 string starting at its first character. Once you've compiled a pattern
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2338 into a pattern buffer (@pxref{POSIX Regular Expression Compiling}), you
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2339 can ask the matcher to match that pattern against a string using:
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2340
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2341 @findex regexec
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2342 @example
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2343 int
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	2344 regexec (const regex_t *@var{preg}, const char *@var{string},
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2345 size_t @var{nmatch}, regmatch_t @var{pmatch}[], int @var{eflags})
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2346 @end example
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2347
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2348 @noindent
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2349 @var{preg} is the address of a pattern buffer for a compiled pattern.
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	2350 @var{string} is the string you want to match.
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2351
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2352 @xref{Using Byte Offsets}, for an explanation of @var{pmatch}. If you
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2353 pass zero for @var{nmatch} or you compiled @var{preg} with the
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2354 compilation flag @code{REG_NOSUB} set, then @code{regexec} will ignore
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2355 @var{pmatch}; otherwise, you must allocate it to have at least
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2356 @var{nmatch} elements. @code{regexec} will record @var{nmatch} byte
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2357 offsets in @var{pmatch}, and set to @math{-1} any unused elements up to
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2358 @code{@var{pmatch}[@var{nmatch} - 1]}.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2359
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2360 @var{eflags} specifies @dfn{execution flags}---namely, the two bits
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2361 @code{REG_NOTBOL} and @code{REG_NOTEOL} (defined in @file{regex.h}). If
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2362 you set @code{REG_NOTBOL}, then the match-beginning-of-line operator
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2363 (@pxref{Match-beginning-of-line Operator}) always fails to match.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2364 This lets you match against pieces of a line, as you would need to if,
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2365 say, searching for repeated instances of a given pattern in a line; it
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2366 would work correctly for patterns both with and without
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2367 match-beginning-of-line operators. @code{REG_NOTEOL} works analogously
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2368 for the match-end-of-line operator (@pxref{Match-end-of-line
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2369 Operator}); it exists for symmetry.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2370
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2371 @code{regexec} tries to find a match for @var{preg} in @var{string}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2372 according to the syntax in @var{preg}'s @code{syntax} field.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2373 (@xref{POSIX Regular Expression Compiling}, for how to set it.) The
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2374 function returns zero if the compiled pattern matches @var{string} and
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2375 @code{REG_NOMATCH} (defined in @file{regex.h}) if it doesn't.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2376
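The use of @code{REG_NOTBOL} for finding repeated instances of a
pattern in a line can be sketched as follows. This is an illustration
under stated assumptions rather than a definitive implementation: the
helper name @code{count_matches} is hypothetical, the pattern is
compiled with @code{REG_EXTENDED}, and the fields @code{rm_so} and
@code{rm_eo} are the byte offsets of type @code{regmatch_t}
(@pxref{Using Byte Offsets}).

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper (not part of Regex): count the non-overlapping
   matches of the extended regular expression PATTERN in LINE.
   Return -1 if PATTERN does not compile.  */
static int
count_matches (const char *pattern, const char *line)
{
  regex_t preg;
  regmatch_t pmatch[1];
  const char *p = line;
  int eflags = 0;
  int count = 0;

  if (regcomp (&preg, pattern, REG_EXTENDED) != 0)
    return -1;

  while (regexec (&preg, p, 1, pmatch, eflags) == 0)
    {
      count++;

      /* Step past the match.  On an empty match, step one byte
         further so that the loop always makes progress.  */
      if (pmatch[0].rm_eo > pmatch[0].rm_so)
        p += pmatch[0].rm_eo;
      else if (p[pmatch[0].rm_eo] != '\0')
        p += pmatch[0].rm_eo + 1;
      else
        break;

      /* What remains is a piece of the line, not its beginning.  */
      eflags = REG_NOTBOL;
    }

  regfree (&preg);
  return count;
}
```

Without @code{REG_NOTBOL} on the later calls, a pattern such as
@samp{^ab} would match again at the start of each remaining piece of
the line, even though that position is not a line beginning.
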
13533 ca70a11e70e2 Integrate the regex documentation. Bruno Haible <bruno@clisp.org> parents: 13532 diff changeset	2377 @node Reporting Errors
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2378 @subsection Reporting Errors
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2379
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2380 If either @code{regcomp} or @code{regexec} fails, it returns a nonzero
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2381 error code, the possibilities for which are defined in @file{regex.h}.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2382 @xref{POSIX Regular Expression Compiling}, and @ref{POSIX Matching}, for
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2383 what these codes mean. To get an error string corresponding to these
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2384 codes, you can use:
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2385
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2386 @findex regerror
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2387 @example
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2388 size_t
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2389 regerror (int @var{errcode},
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2390 const regex_t *@var{preg},
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2391 char *@var{errbuf},
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2392 size_t @var{errbuf_size})
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2393 @end example
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2394
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2395 @noindent
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2396 @var{errcode} is an error code, @var{preg} is the address of the pattern
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2397 buffer which provoked the error, @var{errbuf} is the error buffer, and
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2398 @var{errbuf_size} is @var{errbuf}'s size.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2399
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2400 @code{regerror} returns the size in bytes of the error string
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2401 corresponding to @var{errcode} (including its terminating null). If
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2402 @var{errbuf} and @var{errbuf_size} are nonzero, it also returns in
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2403 @var{errbuf} the first @math{@var{errbuf_size} - 1} characters of the
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	2404 error string, followed by a null.
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2405 @var{errbuf_size} must be a nonnegative number less than or equal to the
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2406 size in bytes of @var{errbuf}.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2407
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2408 You can call @code{regerror} with a null @var{errbuf} and a zero
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2409 @var{errbuf_size} to determine how large @var{errbuf} need be to
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2410 accommodate @code{regerror}'s error string.
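
The two-call idiom described above might be sketched as follows.  The
helper name @code{regex_error_message} is hypothetical and not part of
Regex; the sketch assumes a @sc{posix} @file{regex.h} and leaves it to
the caller to free the returned string.

```c
#include <regex.h>
#include <stdlib.h>

/* Return a newly allocated copy of the error string for ERRCODE, or
   NULL if memory is exhausted.  The caller frees the result.
   (Hypothetical helper, for illustration only.)  */
char *
regex_error_message (int errcode, const regex_t *preg)
{
  /* First call: null buffer and zero size, so regerror only reports
     how many bytes the message needs, terminating null included.  */
  size_t needed = regerror (errcode, preg, NULL, 0);

  char *errbuf = malloc (needed);
  if (errbuf == NULL)
    return NULL;

  /* Second call: the buffer is now large enough for the whole string.  */
  regerror (errcode, preg, errbuf, needed);
  return errbuf;
}
```

A caller would typically pass the nonzero value returned by
@code{regcomp} or @code{regexec}, together with the pattern buffer
involved, and print the resulting string.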
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2411
13533 ca70a11e70e2 Integrate the regex documentation. Bruno Haible <bruno@clisp.org> parents: 13532 diff changeset	2412 @node Using Byte Offsets
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2413 @subsection Using Byte Offsets
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2414
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2415 In @sc{posix}, variables of type @code{regmatch_t} hold information
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2416 analogous, but not identical, to @sc{gnu}'s registers (@pxref{Using
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2417 Registers}). To get information about registers in @sc{posix}, pass to
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2418 @code{regexec} a nonnull @var{pmatch} of type @code{regmatch_t *}, i.e.,
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2419 the address of an array of structures of this type, defined in
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2420 @file{regex.h}:
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2421
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2422 @tindex regmatch_t
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2423 @example
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2424 typedef struct
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2425 @{
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2426 regoff_t rm_so;
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2427 regoff_t rm_eo;
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2428 @} regmatch_t;
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2429 @end example
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2430
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2431 When reading in @ref{Using Registers}, about how the matching function
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2432 stores the information into the registers, substitute @var{pmatch} for
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2433 @var{regs}, @code{@w{@var{pmatch}[@var{i}].}rm_so} for
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2434 @code{@w{@var{regs}->}start[@var{i}]} and
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2435 @code{@w{@var{pmatch}[@var{i}].}rm_eo} for
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2436 @code{@w{@var{regs}->}end[@var{i}]}.
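
As a rough illustration of reading byte offsets, the sketch below
compiles a pattern, searches a string, and copies the offsets of the
whole match out of the first array element.  The function name
@code{find_match_offsets} and the choice of @code{REG_EXTENDED} are
assumptions made for the example, not something Regex prescribes.

```c
#include <regex.h>
#include <stddef.h>

/* Search STRING for the extended-syntax PATTERN.  On a match, store
   the byte offsets of the whole match in *START and *END and return 0;
   otherwise return the nonzero regcomp or regexec error code.
   (Hypothetical helper, for illustration only.)  */
int
find_match_offsets (const char *pattern, const char *string,
                    regoff_t *start, regoff_t *end)
{
  regex_t preg;
  regmatch_t pmatch[1];		/* pmatch[0] describes the whole match */

  int rc = regcomp (&preg, pattern, REG_EXTENDED);
  if (rc != 0)
    return rc;

  rc = regexec (&preg, string, 1, pmatch, 0);
  if (rc == 0)
    {
      *start = pmatch[0].rm_so;	/* offset of the first matched byte */
      *end = pmatch[0].rm_eo;	/* offset just past the last matched byte */
    }

  regfree (&preg);
  return rc;
}
```

To read subexpression offsets as well, one would enlarge the array and
pass its element count as the third argument of @code{regexec}.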
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2437
13533 ca70a11e70e2 Integrate the regex documentation. Bruno Haible <bruno@clisp.org> parents: 13532 diff changeset	2438 @node Freeing POSIX Pattern Buffers
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2439 @subsection Freeing POSIX Pattern Buffers
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2440
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2441 To free any allocated fields of a pattern buffer, use:
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2442
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2443 @findex regfree
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2444 @example
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	2445 void
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2446 regfree (regex_t *@var{preg})
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2447 @end example
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2448
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2449 @noindent
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2450 @var{preg} is the pattern buffer whose allocated fields you want freed.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2451 @code{regfree} also sets @var{preg}'s @code{allocated} and @code{used}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2452 fields to zero. After freeing a pattern buffer, you must compile a
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2453 regular expression in it again (@pxref{POSIX Regular Expression
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2454 Compiling}) before passing it to the matching function (@pxref{POSIX
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2455 Matching}).
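
The compile, match, free cycle might look like the following sketch,
which reuses one pattern buffer for two patterns in turn.  The function
name @code{matches_either} and the flags @code{REG_EXTENDED} and
@code{REG_NOSUB} are choices made for this example only.

```c
#include <regex.h>
#include <stddef.h>

/* Return 1 if STRING contains a match for either extended-syntax
   pattern FIRST or SECOND, and 0 otherwise.  One pattern buffer is
   reused; it is freed and recompiled between the two patterns.
   (Hypothetical helper, for illustration only.)  */
int
matches_either (const char *first, const char *second, const char *string)
{
  regex_t preg;
  int found = 0;

  if (regcomp (&preg, first, REG_EXTENDED | REG_NOSUB) == 0)
    {
      found = regexec (&preg, string, 0, NULL, 0) == 0;
      regfree (&preg);	/* preg must be recompiled before further use */
    }

  if (!found && regcomp (&preg, second, REG_EXTENDED | REG_NOSUB) == 0)
    {
      found = regexec (&preg, string, 0, NULL, 0) == 0;
      regfree (&preg);
    }

  return found;
}
```

Note that @code{regfree} is called only after a successful
@code{regcomp}, so every buffer that was allocated is freed exactly
once.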
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2456
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2457
13533 ca70a11e70e2 Integrate the regex documentation. Bruno Haible <bruno@clisp.org> parents: 13532 diff changeset	2458 @node BSD Regex Functions
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2459 @section BSD Regex Functions
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2460
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2461 If you're writing code that has to be Berkeley @sc{unix} compatible,
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2462 you'll need to use these functions, whose interfaces are the same as those
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	2463 in Berkeley @sc{unix}.
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2464
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2465 @menu
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2466 * BSD Regular Expression Compiling:: re_comp ()
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2467 * BSD Searching:: re_exec ()
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2468 @end menu
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2469
13533 ca70a11e70e2 Integrate the regex documentation. Bruno Haible <bruno@clisp.org> parents: 13532 diff changeset	2470 @node BSD Regular Expression Compiling
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2471 @subsection BSD Regular Expression Compiling
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2472
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2473 With Berkeley @sc{unix}, you can only search for a given regular
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2474 expression; you can't match one. To search for it, you must first
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2475 compile it. Before you compile it, you must select the regular
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	2476 expression syntax to compile it with by setting the
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2477 variable @code{re_syntax_options} (declared in @file{regex.h}) to some
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2478 syntax (@pxref{Regular Expression Syntax}).
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2479
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2480 To compile a regular expression, use:
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2481
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2482 @findex re_comp
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2483 @example
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2484 char *
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2485 re_comp (char *@var{regex})
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2486 @end example
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2487
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2488 @noindent
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2489 @var{regex} is the address of a null-terminated regular expression.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2490 @code{re_comp} uses an internal pattern buffer, so you can use only the
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2491 most recently compiled pattern buffer. This means that if you want to
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2492 use a given regular expression that you've already compiled---but it
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2493 isn't the latest one you've compiled---you'll have to recompile it. If
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2494 you call @code{re_comp} with the null string (@emph{not} the empty
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2495 string) as the argument, it doesn't change the contents of the pattern
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2496 buffer.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2497
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2498 If @code{re_comp} successfully compiles the regular expression, it
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2499 returns zero. If it can't compile the regular expression, it returns
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2500 an error string. @code{re_comp}'s error messages are identical to those
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2501 of @code{re_compile_pattern} (@pxref{GNU Regular Expression
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2502 Compiling}).
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2503
13533 ca70a11e70e2 Integrate the regex documentation. Bruno Haible <bruno@clisp.org> parents: 13532 diff changeset	2504 @node BSD Searching
13532 b0bea693e638 Whitespace cleanup. Bruno Haible <bruno@clisp.org> parents: 13531 diff changeset	2505 @subsection BSD Searching
13531 de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2506
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2507 Searching the Berkeley @sc{unix} way means searching in a string
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2508 starting at its first character and trying successive positions within
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2509 it to find a match. Once you've compiled a pattern using @code{re_comp}
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2510 (@pxref{BSD Regular Expression Compiling}), you can ask Regex
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2511 to search for that pattern in a string using:
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2512
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2513 @findex re_exec
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2514 @example
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2515 int
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2516 re_exec (char *@var{string})
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2517 @end example
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2518
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2519 @noindent
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2520 @var{string} is the address of the null-terminated string in which you
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2521 want to search.
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2522
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2523 @code{re_exec} returns either 1 for success or 0 for failure. It
de7ebb2f1530 Add regex documentation. Bruno Haible <bruno@clisp.org> parents: diff changeset	2524 automatically uses a @sc{gnu} fastmap (@pxref{Searching with Fastmaps}).