 .TH ANTLR 1 "September 1995" "ANTLR" "PCCTS Manual Pages"
 .SH NAME
 antlr \- ANother Tool for Language Recognition
 .SH SYNTAX
 .LP
 \fBantlr\fR [\fIoptions\fR] \fIgrammar_files\fR
 .SH DESCRIPTION
 .PP
 \fIAntlr\fP converts an extended form of context-free grammar into a
 set of C functions which directly implement an efficient form of
 deterministic recursive-descent LL(k) parser.  Context-free grammars
 may be augmented with predicates to allow semantics to influence
 parsing; this allows a form of context-sensitive parsing.  Selective
 backtracking is also available to handle non-LL(k) and even
 non-LALR(k) constructs.  \fIAntlr\fP also produces a definition of a
 lexer which can be automatically converted into C code for a DFA-based
 lexer by \fIdlg\fR.  Hence, \fIantlr\fR serves a function much like
 that of \fIyacc\fR; however, it is notably more flexible and is more
 integrated with a lexer generator (\fIantlr\fR directly generates
 \fIdlg\fR code, whereas \fIyacc\fR and \fIlex\fR are given independent
 descriptions).  Unlike \fIyacc\fR, which accepts LALR(1) grammars,
 \fIantlr\fR accepts LL(k) grammars in an extended BNF notation,
 which eliminates the need for precedence rules.
 .PP
 Like \fIyacc\fR grammars, \fIantlr\fR grammars can use
 automatically-maintained symbol attribute values referenced as dollar
 variables.  Further, because \fIantlr\fR generates top-down parsers,
 arbitrary values may be inherited from parent rules (passed like
 function parameters).  \fIAntlr\fP also has a mechanism for creating
 and manipulating abstract-syntax-trees.
 .PP
 There are various other niceties in \fIantlr\fR, including the ability to
 spread one grammar over multiple files or even multiple grammars in a single
 file, the ability to generate a version of the grammar with actions stripped
 out (for documentation purposes), and lots more.
 .SH OPTIONS
 .IP "\fB-ck \fIn\fR"
 Use up to \fIn\fR symbols of lookahead when using compressed (linear
 approximation) lookahead.  This type of lookahead is very cheap to
 compute and is attempted before full LL(k) lookahead, which is of
 exponential complexity in the worst case.  In general, the compressed
 lookahead can be much deeper (e.g., \f(CW-ck 10\fP) than the full
 lookahead (which usually must be less than 4).
 .IP \fB-CC\fP
 Generate C++ output from both ANTLR and DLG.
 .IP \fB-cr\fP
 Generate a cross-reference for all rules.  For each rule, print a list
 of all other rules that reference it.
 .IP \fB-e1\fP
 Ambiguities/errors shown in low detail (default).
 .IP \fB-e2\fP
 Ambiguities/errors shown in more detail.
 .IP \fB-e3\fP
 Ambiguities/errors shown in excruciating detail.
 .IP "\fB-fe\fP file"
 Rename \fBerr.c\fP to file.
 .IP "\fB-fh\fP file"
 Rename \fBstdpccts.h\fP header (turns on \fB-gh\fP) to file.
 .IP "\fB-fl\fP file"
 Rename lexical output, \fBparser.dlg\fP, to file.
 .IP "\fB-fm\fP file"
 Rename file with lexical mode definitions, \fBmode.h\fP, to file.
 .IP "\fB-fr\fP file"
 Rename file which remaps globally visible symbols, \fBremap.h\fP, to file.
 .IP "\fB-ft\fP file"
 Rename \fBtokens.h\fP to file.
 .IP \fB-ga\fP
 Generate ANSI-compatible code (default case).  This has not been
 rigorously tested to be ANSI X3J11 C compliant, but it is close.  The
 normal output of \fIantlr\fP is currently compilable under K&R C,
 ANSI C, and C++; this option does nothing because \fIantlr\fP
 generates #ifdef's to do the right thing depending on the
 language.
 .IP \fB-gc\fP
 Indicates that \fIantlr\fP should generate no C code, i.e., only
 perform analysis on the grammar.
 .IP \fB-gd\fP
 C code is inserted in each of the \fIantlr\fR generated parsing functions to
 provide for user-defined handling of a detailed parse trace.  The inserted
 code consists of calls to the user-supplied macros or functions called
 \fBzzTRACEIN\fR and \fBzzTRACEOUT\fP.  The only argument is a
 \fIchar *\fR pointing to a C-style string which is the grammar rule
 recognized by the current parsing function.  If no definition is given
 for the trace functions, upon rule entry and exit, a message will be
 printed indicating that a particular rule has been entered or exited.
 .IP \fB-ge\fP
 Generate an error class for each non-terminal.
 .IP \fB-gh\fP
 Generate \fBstdpccts.h\fP for non-ANTLR-generated files to include.
 This file contains all defines needed to describe the type of parser
 generated by \fIantlr\fP (e.g. how much lookahead is used and whether
 or not trees are constructed) and contains the \fBheader\fP action
 specified by the user.
 .IP \fB-gk\fP
 Generate parsers that delay lookahead fetches until needed.  Without
 this option, \fIantlr\fP generates parsers which always have \fIk\fP
 tokens of lookahead available.
 .IP \fB-gl\fP
 Generate line info about grammar actions in the C parser, of the form
 \fB#\ \fIline\fP\ "\fIfile\fP"\fR, which makes error messages from
 the C/C++ compiler make more sense as they will \*Qpoint\*U into the
 grammar file, not the resulting C file.  Debugging is easier as well,
 because you will step through the grammar, not the C file.
 .IP \fB-gs\fR
 Do not generate sets for token expression lists; instead generate a
 \fB||\fP-separated sequence of \fBLA(1)==\fItoken_number\fR.  The
 default is to generate sets.
 .IP \fB-gt\fP
 Generate code for Abstract-Syntax Trees.
 .IP \fB-gx\fP
 Do not create the lexical analyzer files (dlg-related).  This option
 should be given when the user wishes to provide a customized lexical
 analyzer.  It may also be used in \fImake\fR scripts to cause only the
 parser to be rebuilt when a change not affecting the lexical structure
 is made to the input grammars.
 .IP "\fB-k \fIn\fR"
 Set k of LL(k) to \fIn\fR; that is, set the number of lookahead tokens
 (default 1).
 .IP "\fB-o\fP dir"
 Directory where output files should go (default=".").  This is very
 nice for keeping the source directory clear of ANTLR and DLG spawn.
 .IP \fB-p\fP
 The complete grammar, collected from all input grammar files and
 stripped of all comments and embedded actions, is listed to
 \fBstdout\fP.  This is intended to aid in viewing the entire grammar
 as a whole and to eliminate the need to keep actions concisely stated
 so that the grammar is easier to read.  Hence, it is preferable to
 embed even complex actions directly in the grammar, rather than to
 call them as subroutines, since the subroutine call overhead will be
 saved.
 .IP \fB-pa\fP
 This option is the same as \fB-p\fP except that the output is
 annotated with the first sets determined from grammar analysis.
 .IP "\fB-prc on\fR"
 Turn on the computation and hoisting of predicate context.
 .IP "\fB-prc off\fR"
 Turn off the computation and hoisting of predicate context.  This
 option makes 1.10 behave like the 1.06 release with option \fB-pr\fR
 on.  Context computation is off by default.
 .IP "\fB-rl \fIn\fR"
 Limit the maximum number of tree nodes used by grammar analysis to
 \fIn\fP.  Occasionally, \fIantlr\fP is unable to analyze a grammar
 submitted by the user.  This rare situation can only occur when the
 grammar is large and the amount of lookahead is greater than one.  A
 nonlinear analysis algorithm is used by PCCTS to handle the general
 case of LL(k) parsing.  The average complexity of analysis, however, is
 near linear due to some fancy footwork in the implementation which
 reduces the number of calls to the full LL(k) algorithm.  If this
 limit is reached, an error message is displayed which indicates
 the grammar construct being analyzed when \fIantlr\fP hit a
 non-linearity.  Use this option if \fIantlr\fP seems to go out to
 lunch and your disk starts thrashing; try \fIn\fP=10000 to start.  Once
 the offending construct has been identified, try to remove the
 ambiguity that \fIantlr\fP was trying to overcome with large lookahead
 analysis.  The introduction of (...)? backtracking blocks eliminates
 some of these problems\ \(em \fIantlr\fP does not analyze alternatives
 that begin with (...)? (it simply backtracks, if necessary, at run
 time).
 .IP \fB-w1\fR
 Set low warning level.  Do not warn if semantic predicates and/or
 (...)? blocks are assumed to cover ambiguous alternatives.
 .IP \fB-w2\fR
 Ambiguous parsing decisions yield warnings even if semantic predicates
 or (...)? blocks are used.  Warn if predicate context is computed and
 semantic predicates incompletely disambiguate alternative productions.
 .IP \fB-\fR
 Read grammar from standard input and generate \fBstdin.c\fP as the
 parser file.
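 .PP
 The trace hooks used by \fB-gd\fP can be sketched as the stand-alone C
 program below.  It is an illustration under stated assumptions, not
 \fIantlr\fP output: \fBexpr\fP and \fBterm\fP are hand-written
 stand-ins for generated rule functions, and \fBtrace_in\fP,
 \fBtrace_out\fP and \fBdepth\fP are hypothetical names.  Only
 \fBzzTRACEIN\fP, \fBzzTRACEOUT\fP and their single \fIchar *\fP
 argument come from the option description.
 .nf
 .ft CW
 #include <stdio.h>

 static int depth = 0;            /* hypothetical nesting counter */

 /* Hypothetical helpers; the argument is the rule name. */
 static void trace_in(char *rule)
 {
     printf("%*senter %s", 2 * depth, "", rule);
     puts("");
     depth++;
 }

 static void trace_out(char *rule)
 {
     depth--;
     printf("%*sexit %s", 2 * depth, "", rule);
     puts("");
 }

 /* User-supplied definitions of the two hooks named by -gd. */
 #define zzTRACEIN(r)  trace_in(r)
 #define zzTRACEOUT(r) trace_out(r)

 /* Hand-written stand-ins for two generated rule functions. */
 static void term(void)
 {
     zzTRACEIN("term");
     /* token matching would happen here */
     zzTRACEOUT("term");
 }

 static void expr(void)
 {
     zzTRACEIN("expr");
     term();
     zzTRACEOUT("expr");
 }

 int main(void)
 {
     expr();
     return 0;
 }
 .ft P
 .fi
 .PP
 Defining the hooks as macros that forward to ordinary functions keeps
 the indentation logic in one place; in a real grammar such definitions
 would presumably be placed where the generated parser can see them,
 for example in the \fBheader\fP action.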
 .SH "SPECIAL CONSIDERATIONS"
 .PP
 \fIAntlr\fP works...  we think.  There is no implicit guarantee of
 anything.  We reserve no \fBlegal\fP rights to the software known as
 the Purdue Compiler Construction Tool Set (PCCTS) \(em PCCTS is in the
 public domain.  An individual or company may do whatever they wish
 with source code distributed with PCCTS or the code generated by
 PCCTS, including the incorporation of PCCTS, or its output, into
 commercial software.  We encourage users to develop software with
 PCCTS.  However, we do ask that credit be given to us for developing
 PCCTS.  By "credit", we mean that if you incorporate our source code
 into one of your programs (commercial product, research project, or
 otherwise), you acknowledge this fact somewhere in the
 documentation, research report, etc.  If you like PCCTS and have
 developed a nice tool with the output, please mention that you
 developed it using PCCTS.  As long as these guidelines are followed,
 we expect to continue enhancing this system and expect to make other
 tools available as they are completed.
 .SH FILES
 .IP *.c
 output C parser.
 .IP *.cpp
 output C++ parser when C++ mode is used.
 .IP \fBparser.dlg\fP
 output \fIdlg\fR lexical analyzer.
 .IP \fBerr.c\fP
 token string array, error sets and error support routines.  Not used in
 C++ mode.
 .IP \fBremap.h\fP
 file that redefines all globally visible parser symbols.  The use of
 the #parser directive creates this file.  Not used in
 C++ mode.
 .IP \fBstdpccts.h\fP
 list of definitions needed by C files, not generated by PCCTS, that
 reference PCCTS objects.  This is not generated by default.  Not used in
 C++ mode.
 .IP \fBtokens.h\fP
 output \fI#defines\fR for tokens used and function prototypes for
 functions generated for rules.
 .SH "SEE ALSO"
 .LP
 dlg(1), pccts(1)