changeset 13532:b0bea693e638
Whitespace cleanup.
author | Bruno Haible <bruno@clisp.org> |
---|---|
date | Sun, 01 Aug 2010 17:29:07 +0200 |
parents | de7ebb2f1530 |
children | ca70a11e70e2 |
files | ChangeLog doc/regex.texi |
diffstat | 2 files changed, 102 insertions(+), 99 deletions(-) |
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,8 @@
 2010-08-01  Bruno Haible  <bruno@clisp.org>
 
+        Whitespace cleanup.
+        * doc/regex.texi: Remove trailing spaces.
+
         Add regex documentation.
         * doc/regex.texi: New file. Taken from regex-0.12/doc/regex.texi in
         http://ftp.gnu.org/old-gnu/regex/regex-0.12.tar.gz.
--- a/doc/regex.texi +++ b/doc/regex.texi @@ -127,7 +127,7 @@ * Back-reference Operator:: \digit * Anchoring Operators:: ^ $ -Repetition Operators +Repetition Operators * Match-zero-or-more Operator:: * * Match-one-or-more Operator:: + @@ -139,7 +139,7 @@ * Character Class Operators:: [:class:] * Range Operator:: start-end -Anchoring Operators +Anchoring Operators * Match-beginning-of-line Operator:: ^ * Match-end-of-line Operator:: $ @@ -159,7 +159,7 @@ * Match-word-constituent Operator:: \w * Match-non-word-constituent Operator:: \W -Buffer Operators +Buffer Operators * Match-beginning-of-buffer Operator:: \` * Match-end-of-buffer Operator:: \' @@ -220,7 +220,7 @@ @itemize @bullet @item -see if a string matches a specified pattern as a whole, and +see if a string matches a specified pattern as a whole, and @item search within a string for a substring matching a specified pattern. @@ -246,7 +246,7 @@ number of times. The Regex library consists of two source files: @file{regex.h} and -@file{regex.c}. +@file{regex.c}. @pindex regex.h @pindex regex.c Regex provides three groups of functions with which you can operate on @@ -302,7 +302,7 @@ @node Syntax Bits, Predefined Syntaxes, , Regular Expression Syntax -@section Syntax Bits +@section Syntax Bits @cindex syntax bits @@ -322,7 +322,7 @@ of bits; we refer to these bits as @dfn{syntax bits}. In most cases, they affect what characters represent what operators. We describe the meanings of the operators to which we refer in @ref{Common Operators}, -@ref{GNU Operators}, and @ref{GNU Emacs Operators}. +@ref{GNU Operators}, and @ref{GNU Emacs Operators}. For reference, here is the complete list of syntax bits, in alphabetical order: @@ -462,7 +462,7 @@ @node Predefined Syntaxes, Collating Elements vs. 
Characters, Syntax Bits, Regular Expression Syntax -@section Predefined Syntaxes +@section Predefined Syntaxes If you're programming with Regex, you can set a pattern buffer's (@pxref{GNU Pattern Buffers}, and @ref{POSIX Pattern Buffers}) @@ -470,10 +470,10 @@ (@pxref{Syntax Bits}) or else to the configurations defined by Regex. These configurations define the syntaxes used by certain programs---@sc{gnu} Emacs, -@cindex Emacs +@cindex Emacs @sc{posix} Awk, @cindex POSIX Awk -traditional Awk, +traditional Awk, @cindex Awk Grep, @cindex Grep @@ -544,7 +544,7 @@ @end example @node Collating Elements vs. Characters, The Backslash Character, Predefined Syntaxes, Regular Expression Syntax -@section Collating Elements vs.@: Characters +@section Collating Elements vs.@: Characters @sc{posix} generalizes the notion of a character to that of a collating element. It defines a @dfn{collating element} to be ``a @@ -674,7 +674,7 @@ represents the open-group operator. Which one does depends on the setting of a syntax bit, in this case @code{RE_NO_BK_PARENS}. Why is this so? Historical reasons dictate some of the varying -representations, while @sc{posix} dictates others. +representations, while @sc{posix} dictates others. Finally, almost all characters lose any special meaning inside a list (@pxref{List Operators}). @@ -731,7 +731,7 @@ example, @samp{xy} (two match-self operators) matches @samp{xy}. @node Repetition Operators, Alternation Operator, Concatenation Operator, Common Operators -@section Repetition Operators +@section Repetition Operators Repetition operators repeat the preceding regular expression a specified number of times. @@ -761,10 +761,10 @@ case when it: @itemize @bullet -@item +@item is first in a regular expression, or -@item +@item follows a match-beginning-of-line, open-group, or alternation operator. 
@@ -791,7 +791,7 @@ @cindex backtracking The matcher processes a match-zero-or-more operator by first matching as many repetitions of the smallest preceding regular expression as it can. -Then it continues to match the rest of the pattern. +Then it continues to match the rest of the pattern. If it can't match the rest of the pattern, it backtracks (as many times as necessary), each time discarding one of the matches until it can @@ -807,7 +807,7 @@ @node Match-one-or-more Operator, Match-zero-or-one Operator, Match-zero-or-more Operator, Repetition Operators @subsection The Match-one-or-more Operator (@code{+} or @code{\+}) -@cindex @samp{+} +@cindex @samp{+} If the syntax bit @code{RE_LIMITED_OPS} is set, then Regex doesn't recognize this operator. Otherwise, if the syntax bit @code{RE_BK_PLUS_QM} isn't @@ -881,7 +881,7 @@ @itemize @bullet @item -@var{min} is greater than @var{max}, or +@var{min} is greater than @var{max}, or @item any of @var{count}, @var{min}, or @var{max} are outside the range @@ -960,11 +960,11 @@ match @samp{foo} or @samp{bar}.) @cindex backtracking -The matcher usually tries all combinations of alternatives so as to +The matcher usually tries all combinations of alternatives so as to match the longest possible string. For example, when matching @samp{(fooq|foo)*(qbarquux|bar)} against @samp{fooqbarquux}, it cannot take, say, the first (``depth-first'') combination it could match, since -then it would be content to match just @samp{fooqbar}. +then it would be content to match just @samp{fooqbar}. @comment xx something about leftmost-longest @@ -987,7 +987,7 @@ more items. An @dfn{item} is a character, @ignore (These get added when they get implemented.) -a collating symbol, an equivalence class expression, +a collating symbol, an equivalence class expression, @end ignore a character class expression, or a range expression. The syntax bits affect which kinds of items you can put in a list. 
We explain the last @@ -996,7 +996,7 @@ A @dfn{matching list} matches a single character represented by one of the list items. You form a matching list by enclosing one or more items within an @dfn{open-matching-list operator} (represented by @samp{[}) -and a @dfn{close-list operator} (represented by @samp{]}). +and a @dfn{close-list operator} (represented by @samp{]}). For example, @samp{[ab]} matches either @samp{a} or @samp{b}. @samp{[ad]*} matches the empty string and any string composed of just @@ -1011,10 +1011,10 @@ the first character in the list. If you put a @samp{^} character first in (what you think is) a matching list, you'll turn it into a nonmatching list.}) instead of an open-matching-list operator to start a -nonmatching list. +nonmatching list. For example, @samp{[^ab]} matches any character except @samp{a} or -@samp{b}. +@samp{b}. If the @code{posix_newline} field in the pattern buffer (@pxref{GNU Pattern Buffers} is set, then nonmatching lists do not match a newline. @@ -1060,15 +1060,15 @@ @code{RE_CHAR_CLASSES} is set and what precedes it is an open-character-class operator followed by a valid character class name. -@item - +@item - represents the range operator (@pxref{Range Operator}) if it's not first or last in a list or the ending point of a range. @end table @noindent -All other characters are ordinary. For example, @samp{[.*]} matches -@samp{.} and @samp{*}. +All other characters are ordinary. For example, @samp{[.*]} matches +@samp{.} and @samp{*}. 
@menu * Character Class Operators:: [:class:] @@ -1129,7 +1129,7 @@ @table @code -@item alnum +@item alnum letters and digits @item alpha @@ -1148,11 +1148,11 @@ @item graph same as @code{print} except omits space -@item lower +@item lower lowercase letters @item print -printable characters (in the @sc{ascii} encoding, space +printable characters (in the @sc{ascii} encoding, space tilde---codes 040 through 0176) @item punct @@ -1183,7 +1183,7 @@ Regex recognizes @dfn{range expressions} inside a list. They represent those characters that fall between two elements in the current collating sequence. You -form a range expression by putting a @dfn{range operator} between two +form a range expression by putting a @dfn{range operator} between two @ignore (If these get implemented, then substitute this for ``characters.'') of any of the following: characters, collating elements, collating symbols, @@ -1262,7 +1262,7 @@ Operator}) or a repetition operator (@pxref{Repetition Operators}). -@item +@item keep track of the indices of the substring that matched a given group. @xref{Using Registers}, for a precise explanation. This lets you: @@ -1271,7 +1271,7 @@ @item use the back-reference operator (@pxref{Back-reference Operator}). -@item +@item use registers (@pxref{Using Registers}). @end itemize @@ -1352,7 +1352,7 @@ @node Anchoring Operators, , Back-reference Operator, Common Operators -@section Anchoring Operators +@section Anchoring Operators @cindex anchoring @cindex regexp anchoring @@ -1463,7 +1463,7 @@ @end menu @node Non-Emacs Syntax Tables, Match-word-boundary Operator, , Word Operators -@subsection Non-Emacs Syntax Tables +@subsection Non-Emacs Syntax Tables A @dfn{syntax table} is an array indexed by the characters in your character set. 
In the @sc{ascii} encoding, therefore, a syntax table @@ -1543,7 +1543,7 @@ @node Buffer Operators, , Word Operators, GNU Operators -@section Buffer Operators +@section Buffer Operators Following are operators which work on buffers. In Emacs, a @dfn{buffer} is, naturally, an Emacs buffer. For other programs, Regex considers the @@ -1577,7 +1577,7 @@ Following are operators that @sc{gnu} defines (and @sc{posix} doesn't) that you can use only when Regex is compiled with the preprocessor -symbol @code{emacs} defined. +symbol @code{emacs} defined. @menu * Syntactic Class Operators:: @@ -1710,7 +1710,7 @@ unsigned long allocated; /* Number of bytes actually used in `buffer'. */ - unsigned long used; + unsigned long used; /* Syntax setting with which the pattern was compiled. */ reg_syntax_t syntax; @@ -1754,7 +1754,7 @@ unsigned no_sub : 1; /* If set, a beginning-of-line anchor doesn't match at the - beginning of the string. */ + beginning of the string. */ unsigned not_bol : 1; /* Similarly for an end-of-line anchor. */ @@ -1824,8 +1824,8 @@ @findex re_compile_pattern @example -char * -re_compile_pattern (const char *@var{regex}, const int @var{regex_size}, +char * +re_compile_pattern (const char *@var{regex}, const int @var{regex_size}, struct re_pattern_buffer *@var{pattern_buffer}) @end example @@ -1858,7 +1858,7 @@ @vindex fastmap_accurate @r{field, set by @code{re_compile_pattern}} to zero on the theory that the pattern you're compiling is different than the one previously compiled into @code{buffer}; in that case (since -you can't make a fastmap without a compiled pattern), +you can't make a fastmap without a compiled pattern), @code{fastmap} would either contain an incompatible fastmap, or nothing at all. 
@@ -1871,7 +1871,7 @@ @node GNU Matching, GNU Searching, GNU Regular Expression Compiling, GNU Regex Functions -@subsection GNU Matching +@subsection GNU Matching @cindex matching with GNU functions @@ -1884,8 +1884,8 @@ @findex re_match @example int -re_match (struct re_pattern_buffer *@var{pattern_buffer}, - const char *@var{string}, const int @var{size}, +re_match (struct re_pattern_buffer *@var{pattern_buffer}, + const char *@var{string}, const int @var{size}, const int @var{start}, struct re_registers *@var{regs}) @end example @@ -1921,7 +1921,7 @@ @node GNU Searching, Matching/Searching with Split Data, GNU Matching, GNU Regex Functions -@subsection GNU Searching +@subsection GNU Searching @cindex searching with GNU functions @@ -1935,10 +1935,10 @@ @findex re_search @example -int -re_search (struct re_pattern_buffer *@var{pattern_buffer}, - const char *@var{string}, const int @var{size}, - const int @var{start}, const int @var{range}, +int +re_search (struct re_pattern_buffer *@var{pattern_buffer}, + const char *@var{string}, const int @var{size}, + const int @var{start}, const int @var{range}, struct re_registers *@var{regs}) @end example @@ -1954,7 +1954,7 @@ that fails, and so on, up to @math{@var{start} + @var{range}}; if @var{range} is negative, then it attempts a match starting first at index @var{start}, then at @math{@var{start} -1} if that fails, and so -on. +on. If @var{start} is not between zero and @var{size}, then @code{re_search} returns @math{-1}. When @var{range} is positive, @code{re_search} @@ -1978,18 +1978,18 @@ @subsection Matching and Searching with Split Data Using the functions @code{re_match_2} and @code{re_search_2}, you can -match or search in data that is divided into two strings. +match or search in data that is divided into two strings. 
The function: @findex re_match_2 @example int -re_match_2 (struct re_pattern_buffer *@var{buffer}, - const char *@var{string1}, const int @var{size1}, - const char *@var{string2}, const int @var{size2}, - const int @var{start}, - struct re_registers *@var{regs}, +re_match_2 (struct re_pattern_buffer *@var{buffer}, + const char *@var{string1}, const int @var{size1}, + const char *@var{string2}, const int @var{size2}, + const int @var{start}, + struct re_registers *@var{regs}, const int @var{stop}) @end example @@ -2001,18 +2001,18 @@ characters of @var{string} it matched. Regard @var{string1} and @var{string2} as concatenated when you set the arguments @var{start} and @var{stop} and use the contents of @var{regs}; @code{re_match_2} never -returns a value larger than @math{@var{size1} + @var{size2}}. +returns a value larger than @math{@var{size1} + @var{size2}}. The function: @findex re_search_2 @example int -re_search_2 (struct re_pattern_buffer *@var{buffer}, - const char *@var{string1}, const int @var{size1}, - const char *@var{string2}, const int @var{size2}, - const int @var{start}, const int @var{range}, - struct re_registers *@var{regs}, +re_search_2 (struct re_pattern_buffer *@var{buffer}, + const char *@var{string1}, const int @var{size1}, + const char *@var{string2}, const int @var{size2}, + const int @var{start}, const int @var{range}, + struct re_registers *@var{regs}, const int @var{stop}) @end example @@ -2038,7 +2038,7 @@ address to the pattern buffer's @code{fastmap} field. You either can compile the fastmap yourself or have @code{re_search} do it for you; when @code{fastmap} is nonzero, it automatically compiles a fastmap the -first time you search using a particular compiled pattern. +first time you search using a particular compiled pattern. To compile a fastmap yourself, use: @@ -2182,7 +2182,7 @@ @sc{posix}, on the other hand, requires a different interface: the caller is supposed to pass in a fixed-length array which the matcher -fills. 
Therefore, if @code{regs_allocated} is @code{REGS_FIXED} +fills. Therefore, if @code{regs_allocated} is @code{REGS_FIXED} @vindex REGS_FIXED the matcher simply fills that array. @@ -2195,7 +2195,7 @@ @itemize @bullet -@item +@item If the regular expression has an @w{@var{i}-th} group not contained within another group that matches a substring of @var{string}, then the function sets @@ -2210,16 +2210,16 @@ @itemize @item -0 in @code{@w{@var{regs}->}start[0]} and 2 in @code{@w{@var{regs}->}end[0]} +0 in @code{@w{@var{regs}->}start[0]} and 2 in @code{@w{@var{regs}->}end[0]} @item -0 in @code{@w{@var{regs}->}start[1]} and 2 in @code{@w{@var{regs}->}end[1]} +0 in @code{@w{@var{regs}->}start[1]} and 2 in @code{@w{@var{regs}->}end[1]} @item -0 in @code{@w{@var{regs}->}start[2]} and 1 in @code{@w{@var{regs}->}end[2]} +0 in @code{@w{@var{regs}->}start[2]} and 1 in @code{@w{@var{regs}->}end[2]} @item -1 in @code{@w{@var{regs}->}start[3]} and 2 in @code{@w{@var{regs}->}end[3]} +1 in @code{@w{@var{regs}->}start[3]} and 2 in @code{@w{@var{regs}->}end[3]} @end itemize @item @@ -2232,10 +2232,10 @@ @itemize @item -0 in @code{@w{@var{regs}->}start[0]} and 2 in @code{@w{@var{regs}->}end[0]} +0 in @code{@w{@var{regs}->}start[0]} and 2 in @code{@w{@var{regs}->}end[0]} @item -1 in @code{@w{@var{regs}->}start[1]} and 2 in @code{@w{@var{regs}->}end[1]} +1 in @code{@w{@var{regs}->}start[1]} and 2 in @code{@w{@var{regs}->}end[1]} @end itemize @item @@ -2250,27 +2250,27 @@ @itemize @item -0 in @code{@w{@var{regs}->}start[0]} and 1 in @code{@w{@var{regs}->}end[0]} +0 in @code{@w{@var{regs}->}start[0]} and 1 in @code{@w{@var{regs}->}end[0]} @item -@math{-1} in @code{@w{@var{regs}->}start[1]} and @math{-1} in @code{@w{@var{regs}->}end[1]} +@math{-1} in @code{@w{@var{regs}->}start[1]} and @math{-1} in @code{@w{@var{regs}->}end[1]} @end itemize @item If the @w{@var{i}-th} group matches a zero-length string, then the function sets @code{@w{@var{regs}->}start[@var{i}]} and 
@code{@w{@var{regs}->}end[@var{i}]} to the index just beyond that -zero-length string. +zero-length string. For example, when you match the pattern @samp{(a*)b} against the string @samp{b}, you get: @itemize @item -0 in @code{@w{@var{regs}->}start[0]} and 1 in @code{@w{@var{regs}->}end[0]} +0 in @code{@w{@var{regs}->}start[0]} and 1 in @code{@w{@var{regs}->}end[0]} @item -0 in @code{@w{@var{regs}->}start[1]} and 0 in @code{@w{@var{regs}->}end[1]} +0 in @code{@w{@var{regs}->}start[1]} and 0 in @code{@w{@var{regs}->}end[1]} @end itemize @ignore @@ -2283,15 +2283,15 @@ @itemize @item -0 in @code{@w{@var{regs}->}start[0]} and 0 in @code{@w{@var{regs}->}end[0]} +0 in @code{@w{@var{regs}->}start[0]} and 0 in @code{@w{@var{regs}->}end[0]} @item -0 in @code{@w{@var{regs}->}start[1]} and 0 in @code{@w{@var{regs}->}end[1]} +0 in @code{@w{@var{regs}->}start[1]} and 0 in @code{@w{@var{regs}->}end[1]} @end itemize @end ignore @item -If an @w{@var{i}-th} group contains a @w{@var{j}-th} group +If an @w{@var{i}-th} group contains a @w{@var{j}-th} group in turn not contained within any other group within group @var{i} and the function reports a match of the @w{@var{i}-th} group, then it records in @code{@w{@var{regs}->}start[@var{j}]} and @@ -2304,13 +2304,13 @@ @itemize @item -0 in @code{@w{@var{regs}->}start[0]} and 3 in @code{@w{@var{regs}->}end[0]} +0 in @code{@w{@var{regs}->}start[0]} and 3 in @code{@w{@var{regs}->}end[0]} @item -2 in @code{@w{@var{regs}->}start[1]} and 3 in @code{@w{@var{regs}->}end[1]} +2 in @code{@w{@var{regs}->}start[1]} and 3 in @code{@w{@var{regs}->}end[1]} @item -2 in @code{@w{@var{regs}->}start[2]} and 2 in @code{@w{@var{regs}->}end[2]} +2 in @code{@w{@var{regs}->}start[2]} and 2 in @code{@w{@var{regs}->}end[2]} @end itemize When you match the pattern @samp{((a)*b)*} against the string @@ -2319,20 +2319,20 @@ @itemize @item -0 in @code{@w{@var{regs}->}start[0]} and 3 in @code{@w{@var{regs}->}end[0]} +0 in @code{@w{@var{regs}->}start[0]} and 3 in 
@code{@w{@var{regs}->}end[0]} @item -2 in @code{@w{@var{regs}->}start[1]} and 3 in @code{@w{@var{regs}->}end[1]} +2 in @code{@w{@var{regs}->}start[1]} and 3 in @code{@w{@var{regs}->}end[1]} @item -0 in @code{@w{@var{regs}->}start[2]} and 1 in @code{@w{@var{regs}->}end[2]} +0 in @code{@w{@var{regs}->}start[2]} and 1 in @code{@w{@var{regs}->}end[2]} @end itemize @item If an @w{@var{i}-th} group contains a @w{@var{j}-th} group in turn not contained within any other group within group @var{i} -and the function sets -@code{@w{@var{regs}->}start[@var{i}]} and +and the function sets +@code{@w{@var{regs}->}start[@var{i}]} and @code{@w{@var{regs}->}end[@var{i}]} to @math{-1}, then it also sets @code{@w{@var{regs}->}start[@var{j}]} and @code{@w{@var{regs}->}end[@var{j}]} to @math{-1}. @@ -2342,13 +2342,13 @@ @itemize @item -0 in @code{@w{@var{regs}->}start[0]} and 1 in @code{@w{@var{regs}->}end[0]} +0 in @code{@w{@var{regs}->}start[0]} and 1 in @code{@w{@var{regs}->}end[0]} @item -@math{-1} in @code{@w{@var{regs}->}start[1]} and @math{-1} in @code{@w{@var{regs}->}end[1]} +@math{-1} in @code{@w{@var{regs}->}start[1]} and @math{-1} in @code{@w{@var{regs}->}end[1]} @item -@math{-1} in @code{@w{@var{regs}->}start[2]} and @math{-1} in @code{@w{@var{regs}->}end[2]} +@math{-1} in @code{@w{@var{regs}->}start[2]} and @math{-1} in @code{@w{@var{regs}->}end[2]} @end itemize @end itemize @@ -2543,7 +2543,7 @@ @node POSIX Matching, Reporting Errors, POSIX Regular Expression Compiling, POSIX Regex Functions -@subsection POSIX Matching +@subsection POSIX Matching Matching the @sc{posix} way means trying to match a null-terminated string starting at its first character. 
Once you've compiled a pattern @@ -2553,13 +2553,13 @@ @findex regexec @example int -regexec (const regex_t *@var{preg}, const char *@var{string}, +regexec (const regex_t *@var{preg}, const char *@var{string}, size_t @var{nmatch}, regmatch_t @var{pmatch}[], int @var{eflags}) @end example @noindent @var{preg} is the address of a pattern buffer for a compiled pattern. -@var{string} is the string you want to match. +@var{string} is the string you want to match. @xref{Using Byte Offsets}, for an explanation of @var{pmatch}. If you pass zero for @var{nmatch} or you compiled @var{preg} with the @@ -2613,7 +2613,7 @@ corresponding to @var{errcode} (including its terminating null). If @var{errbuf} and @var{errbuf_size} are nonzero, it also returns in @var{errbuf} the first @math{@var{errbuf_size} - 1} characters of the -error string, followed by a null. +error string, followed by a null. @var{errbuf_size} must be a nonnegative number less than or equal to the size in bytes of @var{errbuf}. @@ -2654,7 +2654,7 @@ @findex regfree @example -void +void regfree (regex_t *@var{preg}) @end example @@ -2672,7 +2672,7 @@ If you're writing code that has to be Berkeley @sc{unix} compatible, you'll need to use these functions whose interfaces are the same as those -in Berkeley @sc{unix}. +in Berkeley @sc{unix}. @menu * BSD Regular Expression Compiling:: re_comp () @@ -2685,7 +2685,7 @@ With Berkeley @sc{unix}, you can only search for a given regular expression; you can't match one. To search for it, you must first compile it. Before you compile it, you must indicate the regular -expression syntax you want it compiled according to by setting the +expression syntax you want it compiled according to by setting the variable @code{re_syntax_options} (declared in @file{regex.h} to some syntax (@pxref{Regular Expression Syntax}). @@ -2714,7 +2714,7 @@ Compiling}). 
@node BSD Searching, , BSD Regular Expression Compiling, BSD Regex Functions -@subsection BSD Searching +@subsection BSD Searching Searching the Berkeley @sc{unix} way means searching in a string starting at its first character and trying successive positions within