changeset 16149:49dfba4fd3c5
use pure parser and reentrant lexer interfaces
Making the Octave parser and lexer properly reentrant (and perhaps
eventually thread safe as well) is still a work in progress. With the
current set of changes the parser and lexer still use many global
variables, so these changes alone do NOT make the Octave parser
reentrant unless you take care to properly save and restore (typically
with an unwind_protect object) relevant global values before and after
calling the parser. Even if global variables are properly saved and
restored, the parser will NOT be thread safe.
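The save-and-restore discipline described above can be sketched with a small RAII guard. This is only an illustration of the idea, not Octave's unwind_protect class: `scoped_restore`, `g_input_line_number`, and `nested_parse` are hypothetical names invented for this sketch.

```cpp
#include <cassert>

// Hypothetical stand-in for one of the global variables that the
// lexer and parser still share.
static int g_input_line_number = 1;

// Minimal RAII guard (an assumption for illustration, not Octave's
// unwind_protect): it remembers a variable's value on construction
// and restores it on destruction, including on exceptional exit.
template <typename T>
class scoped_restore
{
public:
  explicit scoped_restore (T& var) : m_ref (var), m_saved (var) { }
  ~scoped_restore (void) { m_ref = m_saved; }

  scoped_restore (const scoped_restore&) = delete;
  scoped_restore& operator = (const scoped_restore&) = delete;

private:
  T& m_ref;
  T m_saved;
};

// Hypothetical nested parse that clobbers the global, as a nested
// call to the parser would.
static int
nested_parse (void)
{
  scoped_restore<int> guard (g_input_line_number);

  g_input_line_number = 1;    // nested input starts at line 1
  g_input_line_number += 41;  // pretend 41 more lines were consumed

  return g_input_line_number; // guard restores the outer value on return
}
```

A caller that was at line 7 before calling `nested_parse ()` is still at line 7 afterwards. As the message above notes, this kind of protection helps with nested calls on one thread only; it does nothing for thread safety.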
* lex.ll: Use %option reentrant and %option bison-bridge.
(yylval): Delete macro.
(YY_EXTRA_TYPE, curr_lexer): New macros. Undefine curr_lexer
(YY_FATAL_ERROR): Update decl for reentrant scanner.
(lexical_feedback::reset): Update call to yyrestart for reentrant
scanner interface.
(lexical_feedback::fatal_error): Update call to yy_fatal_error for
reentrant scanner interface.
(lexical_feedback::text_yyinput): Update calls to yyinput and yyunput
for reentrant scanner interface.
(lexical_feedback::flex_yyleng): Use function interface to access
yyleng.
(lexical_feedback::flex_yytext): Use function interface to access
yytext.
(lexical_feedback::push_token, lexical_feedback::current_token):
Use function interface to access yylval.
* oct-parse.yy: Use %define api.pure, %parse-param, and %lex-param
options.
(curr_lexer): Define for syntax rules section.
(scanner): New macro.
* oct-parse.yy: Include oct-parse.h.
(octave_lex): Declare.
(yyerror): Update declaration for pure parser.
* parse.h (octave_lex): Delete decl.
* oct-parse.yy (octave_parser::run): Pass pointer to octave_parser
object to octave_parse.
* lex.ll (lexical_feedback::octave_read): Call fatal_error directly
instead of using YY_FATAL_ERROR.
* oct-parse.yy (parse_fcn_file): Pass line and column info for lexer
to gobble_leading_whitespace. Access prep_for_script_file,
prep_for_function_file, parsing_class_method, input_line_number, and
current_input_column through curr_parser.
* parse.h, oct-parse.yy (YY_BUFFER_STATE, create_buffer,
current_buffer, switch_to_buffer, delete_buffer, clear_all_buffers):
Delete.
* toplev.cc (main_loop): Don't create new buffer for lexer.
* input.cc (get_debug_input): Likewise.
* oct-parse.yy (eval_string, parse_fcn_file): Likewise.
* octave.cc (octave_initialize_interpreter): Likewise.
* input.cc (get_debug_input): Likewise.
* oct-parse.yy (eval_string, parse_fcn_file): Create parser as needed.
* octave.cc (octave_initialize_interpreter): Likewise.
* input.cc (get_debug_input): Likewise.
* input.cc (input_event_hook): Allow function to run even if currently
defining a function.
* lex.h, lex.ll (curr_lexer): Delete global variable.
* parse.h, oct-parse.yy (octave_parser::curr_lexer): New data member.
(octave_parser::octave_parser): Create lexer here.
(curr_parser): Delete global variable.
* toplev.cc (main_loop): Don't protect global curr_lexer and
curr_parser variables.
* oct-parse.yy (eval_string, parse_fcn_file): Likewise.
* input.cc (get_debug_input): Likewise.
* lex.h, lex.ll (curr_lexer): Delete global variable.
* parse.h, oct-parse.yy (CURR_LEXER): New temporary global.
(octave_parser::octave_parser): Set global CURR_LEXER here.
* toplev.cc (main_loop): Protect CURR_LEXER prior to constructing
new parser object.
* input.cc (get_debug_input): Likewise.
* oct-parse.yy (eval_string, parse_fcn_file): Likewise.
* lex.h, lex.ll (lexical_feedback::scanner): New data member.
(lexical_feedback::init): Create it. Call yyset_extra to store
pointer to lexical_feedback object in scanner data.
(lexical_feedback::~lexical_feedback): Delete it.
* lex.ll (YYG): New macro.
(lexical_feedback::reset, lexical_feedback::prep_for_script_file,
lexical_feedback::prep_for_function_file,
lexical_feedback::process_comment,
lexical_feedback::handle_close_bracket,
lexical_feedback::handle_identifier, lexical_feedback::lexer_debug):
Use it to access scanner data.
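The "extra pointer" arrangement in the entries above (each lexical_feedback object owns its scanner state, and the scanner state carries a pointer back to its owner) can be sketched without flex. Everything here is a hypothetical mock: `mock_scanner_state`, `mock_lexer`, and `mock_scan` only play the roles that the flex-generated scanner data, lexical_feedback, and the generated scanner function play in the real code.

```cpp
#include <cassert>

// Hypothetical stand-in for the opaque per-scanner state that a
// reentrant scanner allocates. Only the "extra" slot is modelled.
struct mock_scanner_state
{
  void *extra;
};

// Hypothetical stand-in for lexical_feedback: it owns its scanner
// state instead of relying on a global "current lexer" pointer.
class mock_lexer
{
public:
  mock_lexer (void) : scanner (nullptr), token_count (0) { }

  ~mock_lexer (void)
  {
    delete static_cast<mock_scanner_state *> (scanner);
  }

  mock_lexer (const mock_lexer&) = delete;
  mock_lexer& operator = (const mock_lexer&) = delete;

  // Create the scanner state and store a back pointer to this object
  // in it, which is the role the commit gives to the extra data.
  void init (void)
  {
    if (scanner)
      return;

    mock_scanner_state *s = new mock_scanner_state ();
    s->extra = this;
    scanner = s;
  }

  void *scanner;    // opaque handle, as in the new data member
  int token_count;  // per-lexer state, no globals involved
};

// Stand-in for generated scanner code: it receives only the opaque
// handle and recovers the owning lexer through the extra pointer.
static int
mock_scan (void *scanner)
{
  mock_scanner_state *s = static_cast<mock_scanner_state *> (scanner);
  mock_lexer *lexer = static_cast<mock_lexer *> (s->extra);

  return ++lexer->token_count;
}
```

Two `mock_lexer` objects initialized this way keep independent token counts, which is the property that a single global lexer pointer cannot provide.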
author | John W. Eaton <jwe@octave.org> |
---|---|
date | Wed, 27 Feb 2013 18:49:16 -0500 |
parents | 10abbc493f50 |
children | 891a2a4df71f |
files | libinterp/interpfcn/input.cc libinterp/interpfcn/toplev.cc libinterp/octave.cc libinterp/parse-tree/lex.h libinterp/parse-tree/lex.ll libinterp/parse-tree/oct-parse.yy libinterp/parse-tree/parse.h |
diffstat | 7 files changed, 168 insertions(+), 208 deletions(-) |
--- a/libinterp/interpfcn/input.cc
+++ b/libinterp/interpfcn/input.cc
@@ -667,29 +667,18 @@

       frame.protect_var (get_input_from_eval_string);
       get_input_from_eval_string = false;
-
-      YY_BUFFER_STATE old_buf = current_buffer ();
-      YY_BUFFER_STATE new_buf = create_buffer (get_input_from_stdin ());
-
-      // FIXME: are these safe?
-      frame.add_fcn (switch_to_buffer, old_buf);
-      frame.add_fcn (delete_buffer, new_buf);
-
-      switch_to_buffer (new_buf);
     }

-  frame.protect_var (curr_lexer);
-  curr_lexer = new lexical_feedback ();
-  frame.add_fcn (lexical_feedback::cleanup, curr_lexer);
-
-  frame.protect_var (curr_parser);
-  curr_parser = new octave_parser ();
-  frame.add_fcn (octave_parser::cleanup, curr_parser);
-
   while (Vdebugging)
     {
       unwind_protect middle_frame;

+      // octave_parser constructor sets this for us.
+      middle_frame.protect_var (CURR_LEXER);
+
+      octave_parser *curr_parser = new octave_parser ();
+      middle_frame.add_fcn (octave_parser::cleanup, curr_parser);
+
       reset_error_handler ();

       curr_parser->reset ();
@@ -1199,33 +1188,28 @@
 static int
 input_event_hook (void)
 {
-  if (! curr_lexer->defining_func)
-    {
-      hook_fcn_map_type::iterator p = hook_fcn_map.begin ();
+  hook_fcn_map_type::iterator p = hook_fcn_map.begin ();

-      while (p != hook_fcn_map.end ())
-        {
-          std::string hook_fcn = p->first;
-          octave_value user_data = p->second;
+  while (p != hook_fcn_map.end ())
+    {
+      std::string hook_fcn = p->first;
+      octave_value user_data = p->second;

-          hook_fcn_map_type::iterator q = p++;
+      hook_fcn_map_type::iterator q = p++;

-          if (is_valid_function (hook_fcn))
-            {
-              if (user_data.is_defined ())
-                feval (hook_fcn, user_data, 0);
-              else
-                feval (hook_fcn, octave_value_list (), 0);
-            }
+      if (is_valid_function (hook_fcn))
+        {
+          if (user_data.is_defined ())
+            feval (hook_fcn, user_data, 0);
           else
-            hook_fcn_map.erase (q);
+            feval (hook_fcn, octave_value_list (), 0);
         }
-
-      if (hook_fcn_map.empty ())
-        command_editor::remove_event_hook (input_event_hook);
+      else
+        hook_fcn_map.erase (q);
     }

-  return 0;
+  if (hook_fcn_map.empty ())
+    command_editor::remove_event_hook (input_event_hook);
 }

 DEFUN (add_input_event_hook, args, ,
--- a/libinterp/interpfcn/toplev.cc
+++ b/libinterp/interpfcn/toplev.cc
@@ -557,16 +557,6 @@

   octave_initialized = true;

-  unwind_protect frame;
-
-  frame.protect_var (curr_lexer);
-  curr_lexer = new lexical_feedback ();
-  frame.add_fcn (lexical_feedback::cleanup, curr_lexer);
-
-  frame.protect_var (curr_parser);
-  curr_parser = new octave_parser ();
-  frame.add_fcn (octave_parser::cleanup, curr_parser);
-
   // The big loop.

   int retval = 0;
@@ -574,7 +564,13 @@
     {
       try
         {
-          unwind_protect inner_frame;
+          unwind_protect frame;
+
+          // octave_parser constructor sets this for us.
+          frame.protect_var (CURR_LEXER);
+
+          octave_parser *curr_parser = new octave_parser ();
+          frame.add_fcn (octave_parser::cleanup, curr_parser);

           reset_error_handler ();

--- a/libinterp/octave.cc
+++ b/libinterp/octave.cc
@@ -997,9 +997,6 @@
   // Now argv should have the full set of args.
   intern_argv (octave_cmdline_argc, octave_cmdline_argv);

-  if (! octave_embedded)
-    switch_to_buffer (create_buffer (get_input_from_stdin ()));
-
   // Force input to be echoed if not really interactive, but the user
   // has forced interactive behavior.

--- a/libinterp/parse-tree/lex.h
+++ b/libinterp/parse-tree/lex.h
@@ -27,25 +27,6 @@
 #include <set>
 #include <stack>

-// FIXME -- these input buffer things should be members of a
-// parser input stream class.
-
-typedef struct yy_buffer_state *YY_BUFFER_STATE;
-
-// Associate a buffer with a new file to read.
-extern OCTINTERP_API YY_BUFFER_STATE create_buffer (FILE *f);
-
-// Report the current buffer.
-extern OCTINTERP_API YY_BUFFER_STATE current_buffer (void);
-
-// Connect to new buffer buffer.
-extern OCTINTERP_API void switch_to_buffer (YY_BUFFER_STATE buf);
-
-// Delete a buffer.
-extern OCTINTERP_API void delete_buffer (YY_BUFFER_STATE buf);
-
-extern OCTINTERP_API void clear_all_buffers (void);
-
 extern OCTINTERP_API void cleanup_parser (void);

 // Is the given string a keyword?
@@ -173,8 +154,8 @@
   };

   lexical_feedback (void)
-    : convert_spaces_to_comma (true), do_comma_insert (false),
-      at_beginning_of_statement (true),
+    : scanner (0), convert_spaces_to_comma (true),
+      do_comma_insert (false), at_beginning_of_statement (true),
       looking_at_anon_fcn_args (false), looking_at_return_list (false),
       looking_at_parameter_list (false), looking_at_decl_list (false),
       looking_at_initializer_expression (false),
@@ -195,12 +176,7 @@

   ~lexical_feedback (void);

-  void init (void)
-  {
-    // The closest paren, brace, or bracket nesting is not an object
-    // index.
-    looking_at_object_index.push_front (false);
-  }
+  void init (void);

   void reset (void);

@@ -297,6 +273,9 @@

   void lexer_debug (const char *pattern, const char *text);

+  // Internal state of the flex-generated lexer.
+  void *scanner;
+
   // TRUE means that we should convert spaces to a comma inside a
   // matrix definition.
   bool convert_spaces_to_comma;
@@ -407,7 +386,4 @@
   lexical_feedback& operator = (const lexical_feedback&);
 };

-// The current state of the lexer.
-extern lexical_feedback *curr_lexer;
-
 #endif
--- a/libinterp/parse-tree/lex.ll
+++ b/libinterp/parse-tree/lex.ll
@@ -20,8 +20,19 @@

 */

+// We are using the pure parser interface and the reentrant lexer
+// interface but the Octave parser and lexer are NOT properly
+// reentrant because both still use many global variables. It should be
+// safe to create a parser object and call it while anotehr parser
+// object is active (to parse a callback function while the main
+// interactive parser is waiting for input, for example) if you take
+// care to properly save and restore (typically with an unwind_protect
+// object) relevant global values before and after the nested call.
+
 %option prefix = "octave_"
 %option noyywrap
+%option reentrant
+%option bison-bridge

 %top {
 #ifdef HAVE_CONFIG_H
@@ -96,7 +107,8 @@
 #error lex.l requires flex version 2.5.4 or later
 #endif

-#define yylval octave_lval
+#define YY_EXTRA_TYPE lexical_feedback *
+#define curr_lexer yyextra

 // Arrange to get input via readline.

@@ -107,14 +119,12 @@
   result = curr_lexer->octave_read (buf, max_size)

 // Try to avoid crashing out completely on fatal scanner errors.
-// The call to yy_fatal_error should never happen, but it avoids a
-// 'static function defined but not used' warning from gcc.

 #ifdef YY_FATAL_ERROR
 #undef YY_FATAL_ERROR
 #endif
 #define YY_FATAL_ERROR(msg) \
-  curr_lexer->fatal_error (msg)
+  (yyget_extra (yyscanner))->fatal_error (msg)

 #define DISPLAY_TOK_AND_RETURN(tok) \
   do \
@@ -206,9 +216,6 @@
     } \
   while (0)

-// The state of the lexer.
-lexical_feedback *curr_lexer = 0;
-
 static bool Vdisplay_tokens = false;

 static unsigned int Vtoken_count = 0;
@@ -1117,70 +1124,9 @@
     }
 }

-// Tell us all what the current buffer is.
-
-YY_BUFFER_STATE
-current_buffer (void)
-{
-  return YY_CURRENT_BUFFER;
-}
-
-// Create a new buffer.
-
-YY_BUFFER_STATE
-create_buffer (FILE *f)
-{
-  return yy_create_buffer (f, YY_BUF_SIZE);
-}
-
-// Start reading a new buffer.
-
-void
-switch_to_buffer (YY_BUFFER_STATE buf)
-{
-  yy_switch_to_buffer (buf);
-}
-
-// Delete a buffer.
-
-void
-delete_buffer (YY_BUFFER_STATE buf)
-{
-  yy_delete_buffer (buf);
-
-  // Prevent invalid yyin from being used by yyrestart.
-  if (! current_buffer ())
-    yyin = 0;
-}
-
-// Delete all buffers from the stack.
-void
-clear_all_buffers (void)
-{
-  while (current_buffer ())
-    octave_pop_buffer_state ();
-}
-
 void
 cleanup_parser (void)
 {
-  clear_all_buffers ();
-}
-
-// Restore a buffer (for unwind-prot).
-
-void
-restore_input_buffer (void *buf)
-{
-  switch_to_buffer (static_cast<YY_BUFFER_STATE> (buf));
-}
-
-// Delete a buffer (for unwind-prot).
-
-void
-delete_input_buffer (void *buf)
-{
-  delete_buffer (static_cast<YY_BUFFER_STATE> (buf));
 }

 // Return 1 if the given character matches any character in the given
@@ -1366,11 +1312,38 @@
       delete token_stack.top ();
       token_stack.pop ();
     }
+
+  yylex_destroy (scanner);
 }

 void
+lexical_feedback::init (void)
+{
+  // The closest paren, brace, or bracket nesting is not an object
+  // index.
+  looking_at_object_index.push_front (false);
+
+  yylex_init (&scanner);
+
+  // Make lexical_feedback object available through yyextra in
+  // flex-generated lexer.
+  yyset_extra (this, scanner);
+}
+
+// Inside Flex-generated functions, yyg is the scanner cast to its real
+// type. The BEGIN macro uses yyg and we want to use that in
+// lexical_feedback member functions. If we could set the start state
+// by calling a function instead of using the BEGIN macro, we could
+// eliminate the OCTAVE_YYG macro.
+
+#define OCTAVE_YYG \
+  struct yyguts_t *yyg = static_cast<struct yyguts_t*> (scanner)
+
+void
 lexical_feedback::reset (void)
 {
+  OCTAVE_YYG;
+
   // Start off on the right foot.
   BEGIN (INITIAL);

@@ -1389,7 +1362,7 @@
           || reading_script_file
           || get_input_from_eval_string
           || input_from_startup_file))
-    yyrestart (stdin);
+    yyrestart (stdin, scanner);

   // Clear the buffer for help text.
   while (! help_buf.empty ())
@@ -1399,12 +1372,16 @@
 void
 lexical_feedback::prep_for_script_file (void)
 {
+  OCTAVE_YYG;
+
   BEGIN (SCRIPT_FILE_BEGIN);
 }

 void
 lexical_feedback::prep_for_function_file (void)
 {
+  OCTAVE_YYG;
+
   BEGIN (FUNCTION_FILE_BEGIN);
 }

@@ -1466,7 +1443,7 @@
       status = YY_NULL;

       if (! eof)
-        YY_FATAL_ERROR ("octave_read () in flex scanner failed");
+        fatal_error ("octave_read () in flex scanner failed");
     }

   return status;
@@ -1475,13 +1452,13 @@
 char *
 lexical_feedback::flex_yytext (void)
 {
-  return yytext;
+  return yyget_text (scanner);
 }

 int
 lexical_feedback::flex_yyleng (void)
 {
-  return yyleng;
+  return yyget_leng (scanner);
 }

 // GAG.
@@ -1508,7 +1485,7 @@
 int
 lexical_feedback::text_yyinput (void)
 {
-  int c = yyinput ();
+  int c = yyinput (scanner);

   if (lexer_debug_flag)
     {
@@ -1521,7 +1498,7 @@

   if (c == '\r')
     {
-      c = yyinput ();
+      c = yyinput (scanner);

       if (lexer_debug_flag)
         {
@@ -1556,13 +1533,14 @@
   if (c == '\n')
     input_line_number--;

-  yyunput (c, buf);
+  yyunput (c, buf, scanner);
 }

 void
 lexical_feedback::xunput (char c)
 {
   char *yytxt = flex_yytext ();
+
   xunput (c, yytxt);
 }

@@ -2077,6 +2055,8 @@
 int
 lexical_feedback::process_comment (bool start_in_block, bool& eof)
 {
+  OCTAVE_YYG;
+
   eof = false;

   std::string help_txt;
@@ -2839,6 +2819,8 @@
 int
 lexical_feedback::handle_close_bracket (bool spc_gobbled, int bracket_type)
 {
+  OCTAVE_YYG;
+
   int retval = bracket_type;

   if (! nesting_level.none ())
@@ -3283,6 +3265,8 @@
 int
 lexical_feedback::handle_identifier (void)
 {
+  OCTAVE_YYG;
+
   bool at_bos = at_beginning_of_statement;

   char *yytxt = flex_yytext ();
@@ -3518,14 +3502,16 @@
 void
 lexical_feedback::push_token (token *tok)
 {
-  yylval.tok_val = tok;
+  YYSTYPE *lval = yyget_lval (scanner);
+  lval->tok_val = tok;
   token_stack.push (tok);
 }

 token *
 lexical_feedback::current_token (void)
 {
-  return yylval.tok_val;
+  YYSTYPE *lval = yyget_lval (scanner);
+  return lval->tok_val;
 }

 void
@@ -3706,12 +3692,14 @@

   OCTAVE_QUIT;

-  yy_fatal_error (msg);
+  yy_fatal_error (msg, scanner);
 }

 void
 lexical_feedback::lexer_debug (const char *pattern, const char *text)
 {
+  OCTAVE_YYG;
+
   std::cerr << std::endl;

   display_state (YY_START);
--- a/libinterp/parse-tree/oct-parse.yy
+++ b/libinterp/parse-tree/oct-parse.yy
@@ -77,6 +77,15 @@
 #include "utils.h"
 #include "variables.h"

+// oct-parse.h must be included after pt-all.h
+#include <oct-parse.h>
+
+extern int octave_lex (YYSTYPE *, void *);
+
+// Global access to currently active lexer.
+// FIXME -- to be removed after more parser+lexer refactoring.
+lexical_feedback *CURR_LEXER = 0;
+
 #if defined (GNULIB_NAMESPACE)
 // Calls to the following functions appear in the generated output from
 // Bison without the namespace tag. Redefine them so we will use them
@@ -86,9 +95,6 @@
 #define malloc GNULIB_NAMESPACE::malloc
 #endif

-// The state of the parser.
-octave_parser *curr_parser = 0;
-
 // Buffer for help text snagged from function files.
 std::stack<std::string> help_buf;

@@ -155,7 +161,7 @@
 // Forward declarations for some functions defined at the bottom of
 // the file.

-static void yyerror (const char *s);
+static void yyerror (octave_parser *curr_parser, const char *s);

 // Finish building a statement.
 template <class T>
@@ -182,6 +188,9 @@
     } \
   while (0)

+#define curr_lexer curr_parser->curr_lexer
+#define scanner curr_lexer->scanner
+
 %}

 // Bison declarations.
@@ -191,6 +200,19 @@

 %name-prefix="octave_"

+// We are using the pure parser interface and the reentrant lexer
+// interface but the Octave parser and lexer are NOT properly
+// reentrant because both still use many global variables. It should be
+// safe to create a parser object and call it while anotehr parser
+// object is active (to parse a callback function while the main
+// interactive parser is waiting for input, for example) if you take
+// care to properly save and restore (typically with an unwind_protect
+// object) relevant global values before and after the nested call.
+
+%define api.pure
+%parse-param { octave_parser *curr_parser }
+%lex-param { void *scanner }
+
 %union
 {
   // The type of the basic tokens returned by the lexer.
@@ -1493,8 +1515,10 @@

 // Generic error messages.

+#undef curr_lexer
+
 static void
-yyerror (const char *s)
+yyerror (octave_parser *curr_parser, const char *s)
 {
   curr_parser->bison_error (s);
 }
@@ -1502,7 +1526,7 @@
 int
 octave_parser::run (void)
 {
-  return octave_parse ();
+  return octave_parse (this);
 }

 // Error mesages for mismatched end tokens.
@@ -3281,7 +3305,7 @@
       if (eof)
         break;

-      txt = curr_lexer->grab_comment_block (stdio_reader, true, eof);
+      txt = CURR_LEXER->grab_comment_block (stdio_reader, true, eof);

       if (txt.empty ())
         break;
@@ -3377,17 +3401,19 @@
     {
       bool eof;

-      frame.protect_var (curr_lexer);
-      curr_lexer = new lexical_feedback ();
-      frame.add_fcn (lexical_feedback::cleanup, curr_lexer);
-
-      frame.protect_var (curr_parser);
-      curr_parser = new octave_parser ();
+      // octave_parser constructor sets this for us.
+      frame.protect_var (CURR_LEXER);
+
+      octave_parser *curr_parser = new octave_parser ();
       frame.add_fcn (octave_parser::cleanup, curr_parser);

       curr_parser->reset ();

-      std::string help_txt = gobble_leading_white_space (ffile, eof);
+      std::string help_txt
+        = gobble_leading_white_space
+            (ffile, eof,
+             curr_parser->curr_lexer->input_line_number,
+             curr_parser->curr_lexer->current_input_column);

       if (! help_txt.empty ())
         help_buf.push (help_txt);
@@ -3439,14 +3465,6 @@
           reading_script_file = true;
         }

-      YY_BUFFER_STATE old_buf = current_buffer ();
-      YY_BUFFER_STATE new_buf = create_buffer (ffile);
-
-      frame.add_fcn (switch_to_buffer, old_buf);
-      frame.add_fcn (delete_buffer, new_buf);
-
-      switch_to_buffer (new_buf);
-
       frame.protect_var (primary_fcn_ptr);
       primary_fcn_ptr = 0;

@@ -3460,11 +3478,11 @@
         help_buf.push (help_txt);

       if (reading_script_file)
-        curr_lexer->prep_for_script_file ();
+        curr_parser->curr_lexer->prep_for_script_file ();
       else
-        curr_lexer->prep_for_function_file ();
-
-      curr_lexer->parsing_class_method = ! dispatch_type.empty ();
+        curr_parser->curr_lexer->prep_for_function_file ();
+
+      curr_parser->curr_lexer->parsing_class_method = ! dispatch_type.empty ();

       frame.protect_var (global_command);

@@ -3486,9 +3504,11 @@
         }
       else
         {
+          int l = curr_parser->curr_lexer->input_line_number;
+          int c = curr_parser->curr_lexer->current_input_column;
+
           tree_statement *end_of_script
-            = curr_parser->make_end ("endscript", curr_lexer->input_line_number,
-                                     curr_lexer->current_input_column);
+            = curr_parser->make_end ("endscript", l, c);

           curr_parser->make_script (0, end_of_script);

@@ -4187,12 +4207,10 @@

   unwind_protect frame;

-  frame.protect_var (curr_lexer);
-  curr_lexer = new lexical_feedback ();
-  frame.add_fcn (lexical_feedback::cleanup, curr_lexer);
-
-  frame.protect_var (curr_parser);
-  curr_parser = new octave_parser ();
+  // octave_parser constructor sets this for us.
+  frame.protect_var (CURR_LEXER);
+
+  octave_parser *curr_parser = new octave_parser ();
   frame.add_fcn (octave_parser::cleanup, curr_parser);

   frame.protect_var (get_input_from_eval_string);
@@ -4220,14 +4238,6 @@

   current_eval_string = s;

-  YY_BUFFER_STATE old_buf = current_buffer ();
-  YY_BUFFER_STATE new_buf = create_buffer (0);
-
-  frame.add_fcn (switch_to_buffer, old_buf);
-  frame.add_fcn (delete_buffer, new_buf);
-
-  switch_to_buffer (new_buf);
-
   do
     {
       curr_parser->reset ();
--- a/libinterp/parse-tree/parse.h
+++ b/libinterp/parse-tree/parse.h
@@ -32,8 +32,6 @@
 #include "lex.h"
 #include "token.h"

-extern int octave_lex (void);
-
 class octave_comment_list;
 class octave_function;
 class octave_user_function;
@@ -132,14 +130,25 @@
 extern OCTINTERP_API void
 cleanup_statement_list (tree_statement_list **lst);

+// Global access to currently active lexer.
+// FIXME -- to be removed after more parser+lexer refactoring.
+extern lexical_feedback *CURR_LEXER;
+
 class
 octave_parser
 {
 public:

-  octave_parser (void) : end_of_input (false) { }
+  octave_parser (void)
+    : end_of_input (false), curr_lexer (new lexical_feedback ())
+  {
+    CURR_LEXER = curr_lexer;
+  }

-  ~octave_parser (void) { }
+  ~octave_parser (void)
+  {
+    delete curr_lexer;
+  }

   void reset (void)
   {
@@ -332,6 +341,9 @@
   // TRUE means that we have encountered EOF on the input stream.
   bool end_of_input;

+  // State of the lexer.
+  lexical_feedback *curr_lexer;
+
   // For unwind protect.
   static void cleanup (octave_parser *parser) { delete parser; }

@@ -344,7 +356,4 @@
   octave_parser& operator = (const octave_parser&);
 };

-// The current state of the parser.
-extern octave_parser *curr_parser;
-
 #endif