source: trunk/essentials/sys-devel/m4/doc/m4.texinfo

Last change on this file was 3090, checked in by bird, 18 years ago

m4 1.4.8

1\input texinfo @c -*- texinfo -*-
2@comment ========================================================
3@comment %**start of header
4@setfilename m4.info
5@settitle GNU M4 macro processor
6@setchapternewpage odd
7@ifnothtml
8@setcontentsaftertitlepage
9@end ifnothtml
10@finalout
11
12@include version.texi
13
14@c @tabchar{}
15@c ----------
16@c The testsuite expects literal tab output in some examples, but
17@c literal tabs in texinfo lead to formatting issues.
18@macro tabchar
19@ @c
20@end macro
21
22@c @ovar{ARG}
23@c -------------------
24@c The ARG is an optional argument. To be used for macro arguments in
25@c their documentation.
26@macro ovar{varname}
27@r{[}@var{\varname\}@r{]}
28@end macro
29
30@c @dvar{ARG, DEFAULT}
31@c -------------------
32@c The ARG is an optional argument, defaulting to DEFAULT. To be used
33@c for macro arguments in their documentation.
34@macro dvar{varname, default}
35@r{[}@var{\varname\} = @samp{\default\}@r{]}
36@end macro
37
38@comment %**end of header
39@comment ========================================================
40
41@copying
42
43This manual is for @acronym{GNU} M4 (version @value{VERSION}, @value{UPDATED}),
44a package containing an implementation of the m4 macro language.
45
46Copyright @copyright{} 1989, 1990, 1991, 1992, 1993, 1994, 2004, 2005,
472006 Free Software Foundation, Inc.
48
49@quotation
50Permission is granted to copy, distribute and/or modify this document
51under the terms of the @acronym{GNU} Free Documentation License,
52Version 1.2 or any later version published by the Free Software
53Foundation; with no Invariant Sections, no Front-Cover Texts, and no
54Back-Cover Texts. A copy of the license is included in the section
55entitled ``@acronym{GNU} Free Documentation License.''
56@end quotation
57@end copying
58
59@dircategory GNU programming tools
60@direntry
61* M4: (m4). A powerful macro processor.
62@end direntry
63
64@titlepage
65@title GNU M4, version @value{VERSION}
66@subtitle A powerful macro processor
67@subtitle Edition @value{EDITION}, @value{UPDATED}
68@author by Ren@'e Seindal
69
70@page
71@vskip 0pt plus 1filll
72@insertcopying
73@end titlepage
74
75@contents
76
77@ifnottex
78@node Top
79@top GNU M4
80@insertcopying
81@end ifnottex
82
83@acronym{GNU} @code{m4} is an implementation of the traditional UNIX macro
84processor. It is mostly SVR4 compatible, although it has some
85extensions (for example, handling more than 9 positional parameters
86to macros). @code{m4} also has builtin functions for including
87files, running shell commands, doing arithmetic, etc. Autoconf needs
88@acronym{GNU} @code{m4} for generating @file{configure} scripts, but not for
89running them.
90
91@acronym{GNU} @code{m4} was originally written by Ren@'e Seindal, with
92subsequent changes by Fran@,{c}ois Pinard and other volunteers
93on the Internet. All names and email addresses can be found in the
94files @file{m4-@value{VERSION}/@/AUTHORS} and
95@file{m4-@value{VERSION}/@/THANKS} from the @acronym{GNU} M4
96distribution.
97
98This is release @value{VERSION}. It is now considered stable: future
99releases in the 1.4.x series are only meant to fix bugs, increase speed,
100or improve documentation. However@dots{}
101
An experimental feature, which would improve the usefulness of @code{m4},
allows for changing the syntax for what is a @dfn{word} in @code{m4}.
You should use:
105@comment ignore
106@example
107./configure --enable-changeword
108@end example
109@noindent
110if you want this feature compiled in. The current implementation
111slows down @code{m4} considerably and is hardly acceptable. In the
112future, @code{m4} 2.0 will come with a different set of new features
113that provide similar capabilities, but without the inefficiencies, so
114changeword will go away and @emph{you should not count on it}.
115
116@menu
117* Preliminaries:: Introduction and preliminaries
118* Invoking m4:: Invoking @code{m4}
119* Syntax:: Lexical and syntactic conventions
120
121* Macros:: How to invoke macros
122* Definitions:: How to define new macros
123* Conditionals:: Conditionals, loops, and recursion
124
125* Debugging:: How to debug macros and input
126
127* Input Control:: Input control
128* File Inclusion:: File inclusion
129* Diversions:: Diverting and undiverting output
130
131* Text handling:: Macros for text handling
132* Arithmetic:: Macros for doing arithmetic
133* Shell commands:: Macros for running shell commands
134* Miscellaneous:: Miscellaneous builtin macros
135* Frozen files:: Fast loading of frozen state
136
137* Compatibility:: Compatibility with other versions of @code{m4}
138* Answers:: Correct version of some examples
139* Copying This Manual:: How to make copies of this manual
140* Indices:: Indices of concepts and macros
141
142@detailmenu
143 --- The Detailed Node Listing ---
144
145Introduction and preliminaries
146
147* Intro:: Introduction to @code{m4}
148* History:: Historical references
149* Bugs:: Problems and bugs
150* Manual:: Using this manual
151
152Invoking @code{m4}
153
154* Operation modes:: Command line options for operation modes
155* Preprocessor features:: Command line options for preprocessor features
156* Limits control:: Command line options for limits control
157* Frozen state:: Command line options for frozen state
158* Debugging options:: Command line options for debugging
159* Command line files:: Specifying input files on the command line
160
161Lexical and syntactic conventions
162
163* Names:: Macro names
164* Quoted strings:: Quoting input to @code{m4}
165* Comments:: Comments in @code{m4} input
166* Other tokens:: Other kinds of input tokens
167* Input processing:: How @code{m4} copies input to output
168
169How to invoke macros
170
171* Invocation:: Macro invocation
172* Inhibiting Invocation:: Preventing macro invocation
173* Macro Arguments:: Macro arguments
174* Quoting Arguments:: On Quoting Arguments to macros
175* Macro expansion:: Expanding macros
176
177How to define new macros
178
179* Define:: Defining a new macro
180* Arguments:: Arguments to macros
181* Pseudo Arguments:: Special arguments to macros
182* Undefine:: Deleting a macro
183* Defn:: Renaming macros
184* Pushdef:: Temporarily redefining macros
185
186* Indir:: Indirect call of macros
187* Builtin:: Indirect call of builtins
188
189Conditionals, loops, and recursion
190
191* Ifdef:: Testing if a macro is defined
192* Ifelse:: If-else construct, or multibranch
193* Shift:: Recursion in @code{m4}
194* Forloop:: Iteration by counting
195* Foreach:: Iteration by list contents
196
197How to debug macros and input
198
199* Dumpdef:: Displaying macro definitions
200* Trace:: Tracing macro calls
201* Debug Levels:: Controlling debugging output
202* Debug Output:: Saving debugging output
203
204Input control
205
206* Dnl:: Deleting whitespace in input
207* Changequote:: Changing the quote characters
208* Changecom:: Changing the comment delimiters
209* Changeword:: Changing the lexical structure of words
210* M4wrap:: Saving text until end of input
211
212File inclusion
213
214* Include:: Including named files
215* Search Path:: Searching for include files
216
217Diverting and undiverting output
218
219* Divert:: Diverting output
220* Undivert:: Undiverting output
221* Divnum:: Diversion numbers
222* Cleardivert:: Discarding diverted text
223
224Macros for text handling
225
226* Len:: Calculating length of strings
227* Index macro:: Searching for substrings
228* Regexp:: Searching for regular expressions
229* Substr:: Extracting substrings
230* Translit:: Translating characters
231* Patsubst:: Substituting text by regular expression
232* Format:: Formatting strings (printf-like)
233
234Macros for doing arithmetic
235
236* Incr:: Decrement and increment operators
237* Eval:: Evaluating integer expressions
238
239Macros for running shell commands
240
241* Platform macros:: Determining the platform
242* Syscmd:: Executing simple commands
243* Esyscmd:: Reading the output of commands
244* Sysval:: Exit status
245* Mkstemp:: Making temporary files
246
247Miscellaneous builtin macros
248
249* Errprint:: Printing error messages
250* Location:: Printing current location
251* M4exit:: Exiting from @code{m4}
252
253Fast loading of frozen state
254
255* Using frozen files:: Using frozen files
256* Frozen file format:: Frozen file format
257
258Compatibility with other versions of @code{m4}
259
260* Extensions:: Extensions in @acronym{GNU} M4
261* Incompatibilities:: Facilities in System V m4 not in GNU M4
262* Other Incompatibilities:: Other incompatibilities
263
264Correct version of some examples
265
266* Improved exch:: Solution for @code{exch}
267* Improved forloop:: Solution for @code{forloop}
268* Improved foreach:: Solution for @code{foreach}
269* Improved cleardivert:: Solution for @code{cleardivert}
270* Improved fatal_error:: Solution for @code{fatal_error}
271
272How to make copies of this manual
273
274* GNU Free Documentation License:: License for copying this manual
275
276Indices of concepts and macros
277
278* Concept index:: Index for many concepts
279* Macro index:: Index for all @code{m4} macros
280
281@end detailmenu
282@end menu
283
284@node Preliminaries
285@chapter Introduction and preliminaries
286
287This first chapter explains what @acronym{GNU} @code{m4} is, where @code{m4}
288comes from, how to read and use this documentation, how to call the
289@code{m4} program, and how to report bugs about it. It concludes by
290giving tips for reading the remainder of the manual.
291
292The following chapters then detail all the features of the @code{m4}
293language.
294
295@menu
296* Intro:: Introduction to @code{m4}
297* History:: Historical references
298* Bugs:: Problems and bugs
299* Manual:: Using this manual
300@end menu
301
302@node Intro
303@section Introduction to @code{m4}
304
305@code{m4} is a macro processor, in the sense that it copies its
306input to the output, expanding macros as it goes. Macros are either
307builtin or user-defined, and can take any number of arguments.
308Besides just doing macro expansion, @code{m4} has builtin functions
309for including named files, running shell commands, doing integer
310arithmetic, manipulating text in various ways, performing recursion,
311etc.@dots{} @code{m4} can be used either as a front-end to a compiler,
312or as a macro processor in its own right.
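
As a minimal illustration (a sketch written for this introduction, not
one of the examples exercised by the testsuite; the macro name
@code{name} is an arbitrary choice), the following input defines a
macro and then uses it in ordinary text.  Output lines are marked with
@samp{@result{}}:

@comment ignore
@example
define(`name', `world')dnl
Hello, name!
@result{}Hello, world!
@end example

@noindent
Text that is not a macro name is copied through unchanged, while
@samp{name} is replaced by its definition.  The @code{dnl} after the
definition discards the rest of that line, so the definition itself
leaves no blank line in the output.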
313
The @code{m4} macro processor is widely available on all UNIXes, and has
been standardized by @acronym{POSIX}.
Usually, only a small percentage of users are aware of its existence.
However, those who find it often become committed users.  The
popularity of @acronym{GNU} Autoconf, which requires @acronym{GNU}
@code{m4} for @emph{generating} @file{configure} scripts, is an incentive
for many to install it, even though these people will not themselves
program in @code{m4}.  @acronym{GNU} @code{m4} is mostly compatible with the
System V, Release 3 version, except for some minor differences.
@xref{Compatibility}, for more details.
324
Some people find @code{m4} to be fairly addictive.  They first use
@code{m4} for simple problems, then take on bigger and bigger challenges,
learning how to write complex sets of @code{m4} macros along the way.
Once really addicted, users write sophisticated @code{m4}
applications even to solve simple problems, devoting more time to
debugging their @code{m4} scripts than to doing real work.  Beware that
@code{m4} may be dangerous for the health of compulsive programmers.
332
333@node History
334@section Historical references
335
@code{GPM} was an important ancestor of @code{m4}.  See
C. Strachey: ``A General Purpose Macro generator'', Computer Journal
8,3 (1965), pp. 225 ff.  @code{GPM} is also succinctly described in
David Gries's classic ``Compiler Construction for Digital Computers''.
340
341The classic B. Kernighan and P.J. Plauger: ``Software Tools'',
342Addison-Wesley, Inc. (1976) describes and implements a Unix
343macro-processor language, which inspired Dennis Ritchie to write
344@code{m3}, a macro processor for the AP-3 minicomputer.
345
346Kernighan and Ritchie then joined forces to develop the original
347@code{m4}, as described in ``The M4 Macro Processor'', Bell
348Laboratories (1977). It had only 21 builtin macros.
349
350While @code{GPM} was more @emph{pure}, @code{m4} is meant to deal with
351the true intricacies of real life: macros can be recognized without
352being pre-announced, skipping whitespace or end-of-lines is easier,
353more constructs are builtin instead of derived, etc.
354
355Originally, the Kernighan and Plauger macro-processor, and then
356@code{m3}, formed the engine for the Rational FORTRAN preprocessor,
357that is, the @code{Ratfor} equivalent of @code{cpp}. Later, @code{m4}
358was used as a frontend for @code{Ratfor}, @code{C} and @code{Cobol}.
359
360Ren@'e Seindal released his implementation of @code{m4}, @acronym{GNU}
361@code{m4},
362in 1990, with the aim of removing the artificial limitations in many
363of the traditional @code{m4} implementations, such as maximum line
364length, macro size, or number of macros.
365
366The late Professor A. Dain Samples described and implemented a further
367evolution in the form of @code{M5}: ``User's Guide to the M5 Macro
368Language: 2nd edition'', Electronic Announcement on comp.compilers
369newsgroup (1992).
370
371Fran@,{c}ois Pinard took over maintenance of @acronym{GNU} @code{m4} in
3721992, until 1994 when he released @acronym{GNU} @code{m4} 1.4, which was
373the stable release for 10 years. It was at this time that @acronym{GNU}
374Autoconf decided to require @acronym{GNU} @code{m4} as its underlying
375engine, since all other implementations of @code{m4} had too many
376limitations.
377
More recently, in 2004, Paul Eggert released 1.4.1 and 1.4.2, which
addressed some long-standing bugs in the venerable 1.4 release.
Then in 2005 Gary V. Vaughan collected the many
patches to @acronym{GNU} @code{m4} 1.4 that were floating around the net and
released 1.4.3 and 1.4.4.  And in 2006, Eric Blake joined the team and
prepared patches for the release of 1.4.5, 1.4.6, 1.4.7, and 1.4.8.
384
385Meanwhile, development has continued on new features for @code{m4}, such
386as dynamic module loading and additional builtins. When complete,
387@acronym{GNU} @code{m4} 2.0 will start a new series of releases.
388
389@node Bugs
390@section Problems and bugs
391
392If you have problems with @acronym{GNU} M4 or think you've found a bug,
393please report it. Before reporting a bug, make sure you've actually
394found a real bug. Carefully reread the documentation and see if it
395really says you can do what you're trying to do. If it's not clear
396whether you should be able to do something or not, report that too; it's
397a bug in the documentation!
398
399Before reporting a bug or trying to fix it yourself, try to isolate it
400to the smallest possible input file that reproduces the problem. Then
401send us the input file and the exact results @code{m4} gave you. Also
402say what you expected to occur; this will help us decide whether the
403problem was really in the documentation.
404
405Once you've got a precise problem, send e-mail to (Internet)
406@email{bug-m4@@gnu.org}. Please include the version number of @code{m4}
407you are using. You can get this information with the command
408@kbd{m4 --version}. Also provide details about the platform you are
409executing on.
410
411Non-bug suggestions are always welcome as well. If you have questions
412about things that are unclear in the documentation or are just obscure
413features, please report them too.
414
415@node Manual
416@section Using this manual
417
This manual contains a number of examples of @code{m4} input and output,
and a simple notation is used to distinguish input, output and error
messages from @code{m4}.  Examples are set off from the normal text, and
shown in a fixed width font, like this:
422
423@comment ignore
424@example
425This is an example of an example!
426@end example
427
To distinguish input from output, all output from @code{m4} is prefixed
by the string @samp{@result{}}, and all error messages by the string
@samp{@error{}}.  Thus:
431
432@comment ignore
433@example
434Example of input line
435@result{}Output line from m4
436@error{}and an error message
437@end example
438
439The sequence @samp{^D} in an example indicates the end of the input file.
440The majority of these examples are self-contained, and you can run them
441with similar results by invoking @kbd{m4 -d}. In fact, the testsuite
442that is bundled in the @acronym{GNU} M4 package consists of the examples
443in this document!
444
445As each of the predefined macros in @code{m4} is described, a prototype
446call of the macro will be shown, giving descriptive names to the
447arguments, e.g.,
448
449@deffn Composite example (@var{string}, @dvar{count, 1}, @
450 @ovar{argument}@dots{})
451This is a sample prototype. There is not really a macro named
452@code{example}, but this documents that if there were, it would be a
453Composite macro, rather than a Builtin. It requires at least one
454argument, @var{string}. Remember that in @code{m4}, there must not be a
455space between the macro name and the opening parenthesis, unless it was
456intended to call the macro without any arguments. The brackets around
457@var{count} and @var{argument} show that these arguments are optional.
If @var{count} is omitted, the macro behaves as if @var{count} were @samp{1},
459whereas if @var{argument} is omitted, the macro behaves as if it were
460the empty string. A blank argument is not the same as an omitted
461argument. For example, @samp{example(`a')}, @samp{example(`a',`1')},
462and @samp{example(`a',`1',)} would behave identically with @var{count}
463set to @samp{1}; while @samp{example(`a',)} and @samp{example(`a',`')}
464would explicitly pass the empty string for @var{count}. The ellipses
465(@samp{@dots{}}) show that the macro processes additional arguments
466after @var{argument}, rather than ignoring them.
467@end deffn
468
469All macro arguments in @code{m4} are strings, but some are given
470special interpretation, e.g., as numbers, file names, regular
471expressions, etc. The documentation for each macro will state how the
472parameters are interpreted, and what happens if the argument cannot be
473parsed according to the desired interpretation. Unless specified
474otherwise, a parameter specified to be a number is parsed as a decimal,
475even if the argument has leading zeros; and parsing the empty string as
476a number results in 0 rather than an error, although a warning will be
477issued.
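
As a small sketch of the decimal rule (an illustration added here, not
part of the testsuite), the builtin @code{incr} (@pxref{Incr}), which
adds one to its numeric argument, treats a leading zero as an ordinary
decimal digit rather than as an octal prefix:

@comment ignore
@example
incr(`010')
@result{}11
@end example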
478
479This document consistently writes and uses @dfn{builtin}, without a
480hyphen, as if it were an English word. This is how the @code{builtin}
481primitive is spelled within @code{m4}.
482
483@node Invoking m4
484@chapter Invoking @code{m4}
485
486The format of the @code{m4} command is:
487
488@comment ignore
489@example
490@code{m4} @r{[}@var{option}@dots{}@r{]} @r{[}@var{file}@dots{}@r{]}
491@end example
492
493@cindex command line, options
494@cindex options, command line
495@cindex @env{POSIXLY_CORRECT}
All options begin with @samp{-}, or if long option names are used, with
@samp{--}.  A long option name need not be written completely; any
unambiguous prefix is sufficient.  @acronym{POSIX} requires @code{m4} to
recognize arguments intermixed with files, even when
@env{POSIXLY_CORRECT} is set in the environment.  Most options take
effect at startup regardless of their position, but some are documented
below as taking effect after any files that occurred earlier in the
command line.  The argument @option{--} is a marker to denote the end of
options.
505
506With short options, options that do not take arguments may be combined
507into a single command line argument with subsequent options, options
508with mandatory arguments may be provided either as a single command line
509argument or as two arguments, and options with optional arguments must
510be provided as a single argument. In other words,
511@kbd{m4 -QPDfoo -d a -d+f} is equivalent to
512@kbd{m4 -Q -P -D foo -d -d+f -- ./a}, although the latter form is
513considered canonical.
514
515With long options, options with mandatory arguments may be provided with
516an equal sign (@samp{=}) in a single argument, or as two arguments, and
517options with optional arguments must be provided as a single argument.
518In other words, @kbd{m4 --def foo --debug a} is equivalent to
519@kbd{m4 --define=foo --debug= -- ./a}, although the latter form is
520considered canonical (not to mention more robust, in case a future
521version of @code{m4} introduces an option named @option{--default}).
522
523@code{m4} understands the following options, grouped by functionality.
524
525@menu
526* Operation modes:: Command line options for operation modes
527* Preprocessor features:: Command line options for preprocessor features
528* Limits control:: Command line options for limits control
529* Frozen state:: Command line options for frozen state
530* Debugging options:: Command line options for debugging
531* Command line files:: Specifying input files on the command line
532@end menu
533
534@node Operation modes
535@section Command line options for operation modes
536
537Several options control the overall operation of @code{m4}:
538
539@table @code
540@item --help
541Print a help summary on standard output, then immediately exit
542@code{m4} without reading any input files or performing any other
543actions.
544
545@item --version
546Print the version number of the program on standard output, then
547immediately exit @code{m4} without reading any input files or
548performing any other actions.
549
550@item -E
551@itemx --fatal-warnings
552Stop execution and exit @code{m4} once the first warning has been
553issued, considering all of them to be fatal.
554
555@item -i
556@itemx --interactive
557@itemx -e
558Makes this invocation of @code{m4} interactive. This means that all
559output will be unbuffered, and interrupts will be ignored. The
560spelling @option{-e} exists for compatibility with other @code{m4}
561implementations, and issues a warning because it may be withdrawn in a
562future version of @acronym{GNU} M4.
563
564@item -P
565@itemx --prefix-builtins
566Internally modify @emph{all} builtin macro names so they all start with
567the prefix @samp{m4_}. For example, using this option, one should write
568@samp{m4_define} instead of @samp{define}, and @samp{m4___file__}
569instead of @samp{__file__}. This option has no effect if @option{-R}
570is also specified.
571
572@item -Q
573@itemx --quiet
574@itemx --silent
575Suppress warnings, such as missing or superfluous arguments in macro
576calls, or treating the empty string as zero.
577
578@item -W @var{REGEXP}
579@itemx --word-regexp=@var{REGEXP}
580Use @var{REGEXP} as an alternative syntax for macro names. This
581experimental option will not be present in all @acronym{GNU} @code{m4}
582implementations (@pxref{Changeword}).
583@end table
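
As a brief sketch of the effect of @option{-P} (the macro name
@code{greet} is only an illustration, and this example is not part of
the testsuite), unprefixed builtin names such as @code{divnum} are
treated as plain text, while the prefixed spellings are expanded:

@comment ignore
@example
$ @kbd{m4 -P}
m4_define(`greet', `hello')greet divnum m4_divnum
@result{}hello divnum 0
@end example

@noindent
User-defined macros such as @code{greet} are not renamed; only the
builtins receive the @samp{m4_} prefix.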
584
585@node Preprocessor features
586@section Command line options for preprocessor features
587
588@cindex macro definitions, on the command line
589@cindex command line, macro definitions on the
590Several options allow @code{m4} to behave more like a preprocessor.
591Macro definitions and deletions can be made on the command line, the
592search path can be altered, and the output file can track where the
593input came from. These features occur with the following options:
594
595@table @code
596@item -D @var{NAME}@r{[}=@var{VALUE}@r{]}
597@itemx --define=@var{NAME}@r{[}=@var{VALUE}@r{]}
598This enters @var{NAME} into the symbol table, before any input files are
599read. If @samp{=@var{VALUE}} is missing, the value is taken to be the
600empty string. The @var{VALUE} can be any string, and the macro can be
601defined to take arguments, just as if it was defined from within the
602input. This option may be given more than once; order with respect to
603file names is significant, and redefining the same @var{NAME} loses the
604previous value.
605
606@item -I @var{DIRECTORY}
607@itemx --include=@var{DIRECTORY}
608Make @code{m4} search @var{DIRECTORY} for included files that are not
609found in the current working directory. @xref{Search Path}, for more
610details. This option may be given more than once.
611
612@item -s
613@itemx --synclines
614Generate synchronization lines, for use by the C preprocessor or other
615similar tools. Order is significant with respect to file names. This
616option is useful, for example, when @code{m4} is used as a
617front end to a compiler. Source file name and line number information
618is conveyed by directives of the form @samp{#line @var{linenum}
619"@var{file}"}, which are inserted as needed into the middle of the
620output. Such directives mean that the following line originated or was
621expanded from the contents of input file @var{file} at line
622@var{linenum}. The @samp{"@var{file}"} part is often omitted when
623the file name did not change from the previous directive.
624
625Synchronization directives are always given on complete lines by
626themselves. When a synchronization discrepancy occurs in the middle of
627an output line, the associated synchronization directive is delayed
628until the beginning of the next generated line.
629
630@item -U @var{NAME}
631@itemx --undefine=@var{NAME}
632This deletes any predefined meaning @var{NAME} might have. Obviously,
633only predefined macros can be deleted in this way. This option may be
634given more than once; undefining a @var{NAME} that does not have a
635definition is silently ignored. Order is significant with respect to
636file names.
637@end table
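
The following sketch combines @option{-D} and @option{-U} (the macro
name @code{name} and its value are arbitrary choices for illustration,
and the example is not part of the testsuite).  The macro @code{name}
is defined before any input is read, and the builtin @code{len} is
removed, so its name is copied to the output as ordinary text:

@comment ignore
@example
$ @kbd{m4 -Dname=world -Ulen}
Hello, name! len(`abc')
@result{}Hello, world! len(abc)
@end example

@noindent
Note that the quotes around @samp{abc} are still stripped, since quoted
strings are processed whether or not they appear inside a macro call.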
638
639@node Limits control
640@section Command line options for limits control
641
642There are some limits within @code{m4} that can be tuned. For
643compatibility, @code{m4} also accepts some options that control limits
644in other implementations, but which are automatically unbounded (limited
645only by your hardware and operating system constraints) in @acronym{GNU}
646@code{m4}.
647
648@table @code
649@item -G
650@itemx --traditional
651Suppress all the extensions made in this implementation, compared to the
652System V version. @xref{Compatibility}, for a list of these.
653
654@item -H @var{NUM}
655@itemx --hashsize=@var{NUM}
656Make the internal hash table for symbol lookup be @var{NUM} entries big.
657For better performance, the number should be prime, but this is not
658checked. The default is 509 entries. It should not be necessary to
659increase this value, unless you define an excessive number of macros.
660
661@item -L @var{NUM}
662@itemx --nesting-limit=@var{NUM}
663Artificially limit the nesting of macro calls to @var{NUM} levels,
664stopping program execution if this limit is ever exceeded. When not
665specified, nesting is limited to 1024 levels. A value of zero means
666unlimited; but then heavily nested code could potentially cause a stack
667overflow.
668
669The precise effect of this option might be more correctly associated
670with textual nesting than dynamic recursion. It has been useful
671when some complex @code{m4} input was generated by mechanical means.
672Most users would never need this option. If shown to be obtrusive,
673this option (which is still experimental) might well disappear.
674
This option does @emph{not} have the ability to break endless
rescanning loops, since these do not necessarily consume much memory
or stack space.  Through clever usage of rescanning loops, one can
request complex, time-consuming computations from @code{m4} with useful
results.  Putting limitations in this area would reduce the power of
@code{m4}.
There are many pathological cases: @w{@samp{define(`a', `a')a}} is
only the simplest example (but @pxref{Compatibility}).  Expecting @acronym{GNU}
@code{m4} to detect these would be a little like expecting a compiler
system to detect and diagnose endless loops: it is quite a @emph{hard}
problem in general, if not undecidable!
685
686@item -B @var{NUM}
687@itemx -S @var{NUM}
688@itemx -T @var{NUM}
689These options are present for compatibility with System V @code{m4}, but
690do nothing in this implementation. They may disappear in future
691releases, and issue a warning to that effect.
692
693@item -N @var{NUM}
694@itemx --diversions=@var{NUM}
These options are present only for compatibility with previous
versions of @acronym{GNU} @code{m4}, where they controlled the number of
diversions that could be in use at the same time.  They do nothing,
because there is no longer a fixed limit.  They may disappear in future
releases, and issue a warning to that effect.
700@end table
701
702@node Frozen state
703@section Command line options for frozen state
704
705@acronym{GNU} @code{m4} comes with a feature of freezing internal state
706(@pxref{Frozen files}). This can be used to speed up @code{m4}
707execution when reusing a common initialization script.
708
709@table @code
710@item -F @var{FILE}
711@itemx --freeze-state=@var{FILE}
712Once execution is finished, write out the frozen state on the specified
713@var{FILE}. It is conventional, but not required, for @var{FILE} to end
714in @samp{.m4f}.
715
716@item -R @var{FILE}
717@itemx --reload-state=@var{FILE}
718Before execution starts, recover the internal state from the specified
719frozen @var{FILE}. The options @option{-D}, @option{-U}, and
720@option{-t} take effect after state is reloaded, but before the input
721files are read.
722@end table
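
A typical use of these two options might look like the following
sketch.  The file names @file{common.m4}, @file{common.m4f}, and
@file{input.m4} are hypothetical: the first command processes the
shared definitions once and saves the resulting state, and the second
reloads that state instead of reading @file{common.m4} again.

@comment ignore
@example
$ @kbd{m4 -F common.m4f common.m4}
$ @kbd{m4 -R common.m4f input.m4}
@end example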
723
724@node Debugging options
725@section Command line options for debugging
726
727Finally, there are several options for aiding in debugging @code{m4}
728scripts.
729
730@table @code
731@item -d@r{[}@var{FLAGS}@r{]}
732@itemx --debug@r{[}=@var{FLAGS}@r{]}
733Set the debug-level according to the flags @var{FLAGS}. The debug-level
734controls the format and amount of information presented by the debugging
735functions. @xref{Debug Levels}, for more details on the format and
736meaning of @var{FLAGS}. If omitted, @var{FLAGS} defaults to @samp{aeq}.
737
738@item --debugfile=@var{FILE}
739@itemx -o @var{FILE}
740@itemx --error-output=@var{FILE}
741Redirect @code{dumpdef} output, debug messages, and trace output to the
742named @var{FILE}. Warnings, error messages, and @code{errprint} output
743are still printed to standard error. If unspecified, debug output goes
744to standard error; if empty, debug output is discarded. @xref{Debug
745Output}, for more details. The spellings @option{-o} and
746@option{--error-output} are misleading and inconsistent with other
747@acronym{GNU} tools; for now they are silently accepted as synonyms of
748@option{--debugfile}, but in a future version of M4, using them will
749cause a warning to be issued.
750
751@item -l @var{NUM}
752@itemx --arglength=@var{NUM}
753Restrict the size of the output generated by macro tracing to @var{NUM}
754characters per trace line. If unspecified or zero, output is
755unlimited. @xref{Debug Levels}, for more details.
756
757@item -t @var{NAME}
758@itemx --trace=@var{NAME}
759This enables tracing for the macro @var{NAME}, at any point where it is
760defined. @var{NAME} need not be defined when this option is given.
761This option may be given more than once, and order is significant with
762respect to file names. @xref{Trace}, for more details.
763@end table
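
As a sketch of how these options combine (the macro name @code{greet}
is an arbitrary illustration, and this example is not part of the
testsuite), the following session requests tracing of @code{greet}
before it has been defined, with the default debug flags @samp{aeq}
selected by a bare @option{-d}:

@comment ignore
@example
$ @kbd{m4 -d -t greet}
define(`greet', `hello')greet
@error{}m4trace: -1- greet -> `hello'
@result{}hello
@end example

@noindent
The trace line goes to standard error unless @option{--debugfile}
redirects it elsewhere.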
764
765@node Command line files
766@section Specifying input files on the command line
767
768@cindex command line, file names on the
769@cindex file names, on the command line
770The remaining arguments on the command line are taken to be input file
771names. If no names are present, standard input is read. A file
772name of @file{-} is taken to mean standard input. It is
773conventional, but not required, for input files to end in @samp{.m4}.
774
775The input files are read in the sequence given. Standard input can be
776read more than once, so the file name @file{-} may appear multiple times
777on the command line; this makes a difference when input is from a
778terminal or other special file type. It is an error if an input file
779ends in the middle of argument collection, a comment, or a quoted
780string.
781
782The options @option{--define} (@option{-D}), @option{--undefine}
783(@option{-U}), @option{--synclines} (@option{-s}), and @option{--trace}
784(@option{-t}) only take effect after processing input from any file
785names that occur earlier on the command line.
786
787If none of the input files invoked @code{m4exit} (@pxref{M4exit}), the
788exit status of @code{m4} will be 0 for success, 1 for general failure
789(such as problems with reading an input file), and 63 for version
790mismatch (@pxref{Using frozen files}).
791
792If you need to read a file whose name starts with a @file{-}, you can
793specify it as @samp{./-file}, or use @option{--} to mark the end of
794options.
795
796@node Syntax
797@chapter Lexical and syntactic conventions
798
799@cindex input tokens
800@cindex tokens
801As @code{m4} reads its input, it separates it into @dfn{tokens}. A
802token is either a name, a quoted string, or any single character that
803is not part of either a name or a string. Input to @code{m4} can also
804contain comments. @acronym{GNU} @code{m4} does not yet understand
805locales; all operations are byte-oriented rather than
806character-oriented. However, @code{m4} is eight-bit clean, so you can
807use non-@sc{ascii} characters in quoted strings (@pxref{Changequote}),
808comments (@pxref{Changecom}), and macro names (@pxref{Indir}), with the
809exception of the @sc{nul} character (the zero byte @samp{'\0'}).
810
811@menu
812* Names:: Macro names
813* Quoted strings:: Quoting input to @code{m4}
814* Comments:: Comments in @code{m4} input
815* Other tokens:: Other kinds of input tokens
816* Input processing:: How @code{m4} copies input to output
817@end menu
818
819@node Names
820@section Macro names
821
822@cindex names
823A name is any sequence of letters, digits, and the character @samp{_}
824(underscore), where the first character is not a digit. @code{m4} will
825use the longest such sequence found in the input. If a name has a
826macro definition, it will be subject to macro expansion
827(@pxref{Macros}). Names are case-sensitive.
828
829Examples of legal names are: @samp{foo}, @samp{_tmp}, and @samp{name01}.
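
As a small illustration of the longest-match and case-sensitivity rules,
the following sketch defines only @samp{foo} (the name and its expansion
@samp{X} are chosen purely for illustration). The other names are copied
through unchanged, because each is a different name with no definition:

@example
define(`foo', `X')
@result{}
foo foobar Foo foo1 _foo
@result{}X foobar Foo foo1 _foo
@end example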
830
831@node Quoted strings
832@section Quoting input to @code{m4}
833
834@cindex quoted string
835A quoted string is a sequence of characters surrounded by quote
836strings, defaulting to
837@samp{`} and @samp{'}, where the nested begin and end quotes within the
838string are balanced. The value of a string token is the text, with one
839level of quotes stripped off. Thus
840
841@comment ignore
842@example
843`'
844@result{}
845@end example
846
847@noindent
848is the empty string, and double-quoting turns into single-quoting.
849
850@comment ignore
851@example
852``quoted''
853@result{}`quoted'
854@end example
855
856The quote characters can be changed at any time, using the builtin macro
857@code{changequote}. @xref{Changequote}, for more information.
858
859@node Comments
860@section Comments in @code{m4} input
861
862@cindex comments
863Comments in @code{m4} are normally delimited by the characters @samp{#}
864and newline. All characters between the comment delimiters are ignored,
865but the entire comment (including the delimiters) is passed through to
866the output---comments are @emph{not} discarded by @code{m4}.
867
868Comments cannot be nested, so the first newline after a @samp{#} ends
869the comment. The commenting effect of the begin-comment string
870can be inhibited by quoting it.
871
872@example
873`quoted text' # `commented text'
874@result{}quoted text # `commented text'
875`quoting inhibits' `#' `comments'
876@result{}quoting inhibits # comments
877@end example
878
879The comment delimiters can be changed to any string at any time, using
880the builtin macro @code{changecom}. @xref{Changecom}, for more
881information.
882
883@node Other tokens
884@section Other kinds of input tokens
885
886Any character that is not part of a name, a quoted string, or a
887comment is a token by itself. When not in the context of macro
888expansion, all of these tokens are just copied to output. However,
889during macro expansion, whitespace characters (space, tab, newline,
890formfeed, carriage return, vertical tab), parentheses (@samp{(} and
891@samp{)}), comma (@samp{,}), and dollar (@samp{$}) have additional
892roles, explained later.
893
894@node Input processing
895@section How @code{m4} copies input to output
896
897As @code{m4} reads the input token by token, it copies each token
898directly to the output as soon as it is read.
899
900The exception is when it finds a word with a macro definition. In that
901case @code{m4} will calculate the macro's expansion, possibly reading
902more input to get the arguments. It then inserts the expansion in front
903of the remaining input. In other words, the resulting text from a macro
904call will be read and parsed into tokens again.
905
906@code{m4} expands a macro as soon as possible. If it finds a macro call
907when collecting the arguments to another, it will expand the second
908call first. For a running example, examine how @code{m4} handles this
909input:
910
911@comment ignore
912@example
913format(`Result is %d', eval(`2**15'))
914@end example
915
916@noindent
917First, @code{m4} sees that the token @samp{format} is a macro name, so
918it collects the tokens @samp{(}, @samp{`Result is %d'}, @samp{,},
919and @samp{@w{ }}, before encountering another potential macro. Sure
920enough, @samp{eval} is a macro name, so the nested argument collection
921picks up @samp{(}, @samp{`2**15'}, and @samp{)}, invoking the eval macro
922with the lone argument of @samp{2**15}. The expansion of
923@samp{eval(2**15)} is @samp{32768}, which is then rescanned as the five
924tokens @samp{3}, @samp{2}, @samp{7}, @samp{6}, and @samp{8}; and
925combined with the next @samp{)}, the format macro now has all its
926arguments, as if the user had typed:
927
928@comment ignore
929@example
930format(`Result is %d', 32768)
931@end example
932
933@noindent
934The format macro expands to @samp{Result is 32768}, and we have another
935round of scanning for the tokens @samp{Result}, @samp{@w{ }},
936@samp{is}, @samp{@w{ }}, @samp{3}, @samp{2}, @samp{7}, @samp{6}, and
937@samp{8}. None of these are macros, so the final output is
938
939@comment ignore
940@example
941@result{}Result is 32768
942@end example
943
944The order in which @code{m4} expands the macros can be explored using
945the trace facilities of @acronym{GNU} @code{m4} (@pxref{Trace}).
946
947This process continues until there are no more macro calls to expand and
948all the input has been consumed.
949
950@node Macros
951@chapter How to invoke macros
952
953This chapter covers macro invocation, macro arguments and how macro
954expansion is treated.
955
956@menu
957* Invocation:: Macro invocation
958* Inhibiting Invocation:: Preventing macro invocation
959* Macro Arguments:: Macro arguments
960* Quoting Arguments:: On Quoting Arguments to macros
961* Macro expansion:: Expanding macros
962@end menu
963
964@node Invocation
965@section Macro invocation
966
967@cindex macro invocation
968A macro invocation has one of the forms
969
970@comment ignore
971@example
972name
973@end example
974
975@noindent
976which is a macro invocation without any arguments, or
977
978@comment ignore
979@example
980name(arg1, arg2, @dots{}, arg@var{n})
981@end example
982
983@noindent
984which is a macro invocation with @var{n} arguments. Macros can have any
985number of arguments. All arguments are strings, but different macros
986might interpret the arguments in different ways.
987
988The opening parenthesis @emph{must} follow the @var{name} directly, with
989no spaces in between. If it does not, the macro is called with no
990arguments at all.
991
992For a macro call to have no arguments, the parentheses @emph{must} be
993left out. The macro call
994
995@comment ignore
996@example
997name()
998@end example
999
1000@noindent
1001is a macro call with one argument, which is the empty string, not a call
1002with no arguments.
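
The difference between these forms can be observed with a throwaway
macro that expands to its argument count. In this sketch the name
@code{count} is hypothetical, and @code{$#} is the argument-count
notation (@pxref{Pseudo Arguments}). Note that in the third call the
space separates the name from the parenthesis, so the macro is called
with no arguments and the text @samp{ ()} is simply copied to the output:

@example
define(`count', `$#')
@result{}
count
@result{}0
count()
@result{}1
count ()
@result{}0 ()
count(`a', `b')
@result{}2
@end example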
1003
1004@node Inhibiting Invocation
1005@section Preventing macro invocation
1006
1007An innovation of the @code{m4} language, compared to some of its
1008predecessors (like Strachey's @code{GPM}, for example), is the ability
1009to recognize macro calls without resorting to any special, prefixed
1010invocation character. While generally useful, this feature might
1011sometimes be the source of spurious, unwanted macro calls. So, @acronym{GNU}
1012@code{m4} offers several mechanisms or techniques for inhibiting the
1013recognition of names as macro calls.
1014
1015First of all, many builtin macros cannot meaningfully be called
1016without arguments. For any of these macros, whenever an opening
1017parenthesis does not immediately follow their name, the builtin macro
1018call is not triggered. This solves the most usual cases, like for
1019@samp{include} or @samp{eval}. Later in this document, the sentence
1020``This macro is recognized only with parameters'' refers to this
1021specific provision.
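
As a brief illustration of this provision, the builtin @code{eval} is
left alone when it appears as a bare word, and is only invoked when an
opening parenthesis follows it directly:

@example
eval
@result{}eval
eval(`1 + 2')
@result{}3
@end example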
1022
1023There is also a command line option (@option{--prefix-builtins}, or
1024@option{-P}, @pxref{Operation modes, , Invoking m4}) that renames all
1025builtin macros with a prefix of @samp{m4_} at startup. The option has
1026no effect whatsoever on user defined macros. For example, with this option,
1027one has to write @code{m4_dnl} and even @code{m4_m4exit}. It also has
1028no effect on whether a macro requires parameters.
1029
1030Another alternative is to redefine problematic macros to a name less
1031likely to cause conflicts. @xref{Definitions}.
1032
1033If your version of @acronym{GNU} @code{m4} has the @code{changeword} feature
1034compiled in, it offers far more flexibility in specifying the
1035syntax of macro names, both builtin and user-defined. @xref{Changeword},
1036for more information on this experimental feature.
1037
1038Of course, the simplest way to prevent a name from being interpreted
1039as a call to an existing macro is to quote it. The remainder of
1040this section studies a little more deeply how quoting affects macro
1041invocation, and how quoting can be used to inhibit macro invocation.
1042
1043Even if quoting is usually done over the whole macro name, it can also
1044be done over only a few characters of this name (provided, of course,
1045that the unquoted portions are not also a macro). It is also possible
1046to quote the empty string, but this works only @emph{inside} the name.
1047For example:
1048
1049@example
1050`divert'
1051@result{}divert
1052`d'ivert
1053@result{}divert
1054di`ver't
1055@result{}divert
1056div`'ert
1057@result{}divert
1058@end example
1059
1060@noindent
1061all yield the string @samp{divert}. In both of the following, however:
1062
1063@example
1064`'divert
1065@result{}
1066divert`'
1067@result{}
1068@end example
1069
1070@noindent
1071the @code{divert} builtin macro will be called, which expands to the
1072empty string.
1073
1074The output of macro evaluations is always rescanned. The following
1075example would yield the string @samp{de}, exactly as if @code{m4}
1076had been given @w{@samp{substr(`abcde', `3', `2')}} as input:
1077
1078@example
1079define(`x', `substr(ab')
1080@result{}
1081define(`y', `cde, `3', `2')')
1082@result{}
1083x`'y
1084@result{}de
1085@end example
1086
1087Unquoted strings on either side of a quoted string are subject to
1088being recognized as macro names. In the following example, quoting the
1089empty string allows for the second @code{macro} to be recognized as such:
1090
1091@example
1092define(`macro', `m')
1093@result{}
1094macro(`m')macro
1095@result{}mmacro
1096macro(`m')`'macro
1097@result{}mm
1098@end example
1099
1100Quoting may prevent recognizing as a macro name the concatenation of a
1101macro expansion with the surrounding characters. In this example:
1102
1103@example
1104define(`macro', `di$1')
1105@result{}
1106macro(`v')`ert'
1107@result{}divert
1108macro(`v')ert
1109@result{}
1110@end example
1111
1112@noindent
1113the first input produces the string @samp{divert}. In the second, where
1114the quotes are removed, the @code{divert} builtin is called instead.
1115
1116@node Macro Arguments
1117@section Macro arguments
1118
1119@cindex macros, arguments to
1120@cindex arguments to macros
1121When a name is seen, and it has a macro definition, it will be expanded
1122as a macro.
1123
1124If the name is followed by an opening parenthesis, the arguments will be
1125collected before the macro is called. If too few arguments are
1126supplied, the missing arguments are taken to be the empty string.
1127However, some builtins are documented to behave differently for a
1128missing optional argument than for an explicit empty string. If there
1129are too many arguments, the excess arguments are ignored. Unquoted
1130leading whitespace is stripped off all arguments, but whitespace
1131generated by a macro expansion or occurring after a macro that expanded
1132to an empty string remains intact. Whitespace includes space, tab,
1133newline, carriage return, vertical tab, and formfeed.
1134
1135@example
1136define(`macro', `$1')
1137@result{}
1138macro( unquoted leading space lost)
1139@result{}unquoted leading space lost
1140macro(` quoted leading space kept')
1141@result{} quoted leading space kept
1142macro(
1143 divert `unquoted space kept after expansion')
1144@result{} unquoted space kept after expansion
1145macro(macro(`
1146')`whitespace from expansion kept')
1147@result{}
1148@result{}whitespace from expansion kept
1149macro(`unquoted trailing whitespace kept'
1150)
1151@result{}unquoted trailing whitespace kept
1152@result{}
1153@end example
1154
1155Normally @code{m4} will issue warnings if a builtin macro is called
1156with an inappropriate number of arguments, but these can be suppressed with
1157the @option{--quiet} command line option (or @option{--silent}, or
1158@option{-Q}, @pxref{Operation modes, , Invoking m4}). For user
1159defined macros, there is no check of the number of arguments given.
1160
1161Macros are expanded normally during argument collection, and whatever
1162commas, quotes and parentheses that might show up in the resulting
1163expanded text will serve to define the arguments as well. Thus, if
1164@var{foo} expands to @samp{, b, c}, the macro call
1165
1166@comment ignore
1167@example
1168bar(a foo, d)
1169@end example
1170
1171@noindent
1172is a macro call with four arguments, which are @samp{a }, @samp{b},
1173@samp{c} and @samp{d}. To understand why the first argument contains
1174whitespace, remember that unquoted leading whitespace is never part
1175of an argument, but trailing whitespace always is.
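
To see this in action, the following sketch supplies definitions for
both macros. The definition of @code{bar}, which prints its argument
count and then brackets its first four arguments, is an assumption made
only for illustration:

@example
define(`foo', `, b, c')
@result{}
define(`bar', `$#: [$1] [$2] [$3] [$4]')
@result{}
bar(a foo, d)
@result{}4: [a ] [b] [c] [d]
@end example

@noindent
The brackets make the trailing space in the first argument visible.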
1176
1177It is possible for a macro's definition to change during argument
1178collection, in which case the expansion uses the definition that was in
1179effect at the time the opening @samp{(} was seen.
1180
1181@example
1182define(`f', `1')
1183@result{}
1184f(define(`f', `2'))
1185@result{}1
1186f
1187@result{}2
1188@end example
1189
1190It is an error if the end of file occurs while collecting arguments.
1191
1192@example
1193hello world
1194@result{}hello world
1195define(
1196^D
1197@error{}m4:stdin:2: ERROR: end of file in argument list
1198@end example
1199
1200@node Quoting Arguments
1201@section On Quoting Arguments to macros
1202
1203@cindex quoted macro arguments
1204@cindex macros, quoted arguments to
1205@cindex arguments, quoted macro
1206Each argument has unquoted leading whitespace removed. Within each
1207argument, all unquoted parentheses must match. For example, if
1208@var{foo} is a macro,
1209
1210@comment ignore
1211@example
1212foo(() (`(') `(')
1213@end example
1214
1215@noindent
1216is a macro call, with one argument, whose value is @samp{() (() (}.
1217Commas separate arguments, except when they occur inside quotes,
1218comments, or unquoted parentheses. @xref{Pseudo Arguments}, for
1219examples.
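
The call above can be checked by giving @code{foo} a definition. The
one used in this sketch is hypothetical; it prints the argument count
followed by the first argument in brackets:

@example
define(`foo', `$#: [$1]')
@result{}
foo(() (`(') `(')
@result{}1: [() (() (]
@end example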
1220
1221It is common practice to quote all arguments to macros, unless you are
1222sure you want the arguments expanded. Thus, in the above
1223example with the parentheses, the `right' way to do it is like this:
1224
1225@comment ignore
1226@example
1227foo(`() (() (')
1228@end example
1229
1230It is, however, in certain cases necessary or convenient to leave out
1231quotes for some arguments, and there is nothing wrong in doing it. It
1232just makes life a bit harder, if you are not careful. For consistency,
1233this manual follows the rule of thumb that each layer of parentheses
1234introduces another layer of single quoting, except when showing the
1235consequences of quoting rules. This is done even when the quoted string
1236cannot be a macro, such as with integers when you have not changed the
1237syntax via @code{changeword} (@pxref{Changeword}).
1238
1239@node Macro expansion
1240@section Macro expansion
1241
1242@cindex macros, expansion of
1243@cindex expansion of macros
1244When the arguments, if any, to a macro call have been collected, the
1245macro is expanded, and the expansion text is pushed back onto the input
1246(unquoted), and reread. The expansion text from one macro call might
1247therefore result in more macros being called, if the calls are included,
1248completely or partially, in the first macro call's expansion.
1249
1250Taking a very simple example, if @var{foo} expands to @samp{bar}, and
1251@var{bar} expands to @samp{Hello world}, the input
1252
1253@comment ignore
1254@example
1255foo
1256@end example
1257
1258@noindent
1259will expand first to @samp{bar}, and when this is reread and
1260expanded, into @samp{Hello world}.
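
Written out with explicit definitions, as a minimal sketch of the
rescanning just described, this becomes:

@example
define(`foo', `bar')
@result{}
define(`bar', `Hello world')
@result{}
foo
@result{}Hello world
@end example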
1261
1262@node Definitions
1263@chapter How to define new macros
1264
1265@cindex macros, how to define new
1266@cindex defining new macros
1267Macros can be defined, redefined and deleted in several different ways.
1268Also, it is possible to redefine a macro without losing a previous
1269value, and bring back the original value at a later time.
1270
1271@menu
1272* Define:: Defining a new macro
1273* Arguments:: Arguments to macros
1274* Pseudo Arguments:: Special arguments to macros
1275* Undefine:: Deleting a macro
1276* Defn:: Renaming macros
1277* Pushdef:: Temporarily redefining macros
1278
1279* Indir:: Indirect call of macros
1280* Builtin:: Indirect call of builtins
1281@end menu
1282
1283@node Define
1284@section Defining a macro
1285
1286The normal way to define or redefine macros is to use the builtin
1287@code{define}:
1288
1289@deffn Builtin define (@var{name}, @ovar{expansion})
1290Defines @var{name} to expand to @var{expansion}. If
1291@var{expansion} is not given, it is taken to be empty.
1292
1293The expansion of @code{define} is void.
1294The macro @code{define} is recognized only with parameters.
1295@end deffn
1296
1297The following example defines the macro @var{foo} to expand to the text
1298@samp{Hello world.}.
1299
1300@example
1301define(`foo', `Hello world.')
1302@result{}
1303foo
1304@result{}Hello world.
1305@end example
1306
1307The empty line in the output is there because the newline is not
1308a part of the macro definition, and it is consequently copied to
1309the output. This can be avoided by use of the macro @code{dnl}.
1310@xref{Dnl}, for details.
1311
1312The first argument to @code{define} should be quoted; otherwise, if the
1313macro is already defined, you will be defining a different macro. This
1314example shows the problems with underquoting, since we did not want to
1315redefine @code{one}:
1316
1317@example
1318define(foo, one)
1319@result{}
1320define(foo, two)
1321@result{}
1322one
1323@result{}two
1324@end example
1325
1326@cindex @acronym{GNU} extensions
1327@acronym{GNU} @code{m4} normally replaces only the @emph{topmost}
1328definition of a macro if it has several definitions from @code{pushdef}
1329(@pxref{Pushdef}). Some other implementations of @code{m4} replace all
1330definitions of a macro with @code{define}. @xref{Incompatibilities},
1331for more details.
1332
1333As a @acronym{GNU} extension, the first argument to @code{define} does
1334not have to be a simple word.
1335It can be any text string, even the empty string. A macro with a
1336non-standard name cannot be invoked in the normal way, as the name is
1337not recognized. It can only be referenced by the builtins @code{indir}
1338(@pxref{Indir}) and @code{defn} (@pxref{Defn}).
1339
1340@cindex arrays
1341Arrays and associative arrays can be simulated by using this trick.
1342
1343@example
1344define(`array', `defn(format(``array[%d]'', `$1'))')
1345@result{}
1346define(`array_set', `define(format(``array[%d]'', `$1'), `$2')')
1347@result{}
1348array_set(`4', `array element no. 4')
1349@result{}
1350array_set(`17', `array element no. 17')
1351@result{}
1352array(`4')
1353@result{}array element no. 4
1354array(eval(`10 + 7'))
1355@result{}array element no. 17
1356@end example
1357
1358Change the @code{%d} to @code{%s} and it is an associative array.
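
A sketch of that variant follows. The macro names @code{dict} and
@code{dict_set}, as well as the key and value, are hypothetical and
chosen only for illustration; the structure is otherwise the same as the
array example above:

@example
define(`dict', `defn(format(``dict[%s]'', `$1'))')
@result{}
define(`dict_set', `define(format(``dict[%s]'', `$1'), `$2')')
@result{}
dict_set(`color', `blue')
@result{}
dict(`color')
@result{}blue
@end example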
1359
1360@node Arguments
1361@section Arguments to macros
1362
1363@cindex macros, arguments to
1364@cindex arguments to macros
1365Macros can have arguments. The @var{n}th argument is denoted by
1366@code{$n} in the expansion text, and is replaced by the @var{n}th actual
1367argument, when the macro is expanded. Replacement of arguments happens
1368before rescanning, regardless of how many nesting levels of quoting
1369appear in the expansion. Here is an example of a macro with
1370two arguments. It simply exchanges the order of the two arguments.
1371
1372@example
1373define(`exch', `$2, $1')
1374@result{}
1375exch(`arg1', `arg2')
1376@result{}arg2, arg1
1377@end example
1378
1379This can be used, for example, if you like the arguments to
1380@code{define} to be reversed.
1381
1382@example
1383define(`exch', `$2, $1')
1384@result{}
1385define(exch(``expansion text'', ``macro''))
1386@result{}
1387macro
1388@result{}expansion text
1389@end example
1390
1391@xref{Quoting Arguments}, for an explanation of the double quotes.
1392(You should try to improve this example so that clients of @code{exch}
1393do not have to double quote; or @pxref{Improved exch, , Answers}).
1394
1395@cindex @acronym{GNU} extensions
1396@acronym{GNU} @code{m4} allows the number following the @samp{$} to
1397consist of one
1398or more digits, allowing macros to have any number of arguments. This
1399is not so in UNIX implementations of @code{m4}, which only recognize
1400one digit.
1401
1402As a special case, the zeroth argument, @code{$0}, is always the name
1403of the macro being expanded.
1404
1405@example
1406define(`test', ``Macro name: $0'')
1407@result{}
1408test
1409@result{}Macro name: test
1410@end example
1411
1412If you want quoted text to appear as part of the expansion text,
1413remember that quotes can be nested in quoted strings. Thus, in
1414
1415@example
1416define(`foo', `This is macro `foo'.')
1417@result{}
1418foo
1419@result{}This is macro foo.
1420@end example
1421
1422@noindent
1423the @samp{foo} in the expansion text is @emph{not} expanded, since it is
1424a quoted string, and not a name.
1425
1426@node Pseudo Arguments
1427@section Special arguments to macros
1428
1429@cindex special arguments to macros
1430@cindex macros, special arguments to
1431@cindex arguments to macros, special
1432There is a special notation for the number of actual arguments supplied,
1433and for all the actual arguments.
1434
1435The number of actual arguments in a macro call is denoted by @code{$#}
1436in the expansion text. Thus, a macro to display the number of arguments
1437given can be
1438
1439@example
1440define(`nargs', `$#')
1441@result{}
1442nargs
1443@result{}0
1444nargs()
1445@result{}1
1446nargs(`arg1', `arg2', `arg3')
1447@result{}3
1448nargs(`commas can be quoted, like this')
1449@result{}1
1450nargs(arg1#inside comments, commas do not separate arguments
1451still arg1)
1452@result{}1
1453nargs((unquoted parentheses, like this, group arguments))
1454@result{}1
1455@end example
1456
1457The notation @code{$*} can be used in the expansion text to denote all
1458the actual arguments, unquoted, with commas in between. For example
1459
1460@example
1461define(`echo', `$*')
1462@result{}
1463echo(arg1, arg2, arg3 , arg4)
1464@result{}arg1,arg2,arg3 ,arg4
1465@end example
1466
1467Often each argument should be quoted, and the notation @code{$@@} handles
1468that. It is just like @code{$*}, except that it quotes each argument.
1469A simple example of that is:
1470
1471@example
1472define(`echo', `$@@')
1473@result{}
1474echo(arg1, arg2, arg3 , arg4)
1475@result{}arg1,arg2,arg3 ,arg4
1476@end example
1477
1478Where did the quotes go? Of course, they were eaten when the expanded
1479text was reread by @code{m4}. To show the difference, try
1480
1481@example
1482define(`echo1', `$*')
1483@result{}
1484define(`echo2', `$@@')
1485@result{}
1486define(`foo', `This is macro `foo'.')
1487@result{}
1488echo1(foo)
1489@result{}This is macro This is macro foo..
1490echo1(`foo')
1491@result{}This is macro foo.
1492echo2(foo)
1493@result{}This is macro foo.
1494echo2(`foo')
1495@result{}foo
1496@end example
1497
1498@noindent
1499@xref{Trace}, if you do not understand this. As another example of the
1500difference, remember that comments encountered in arguments are passed
1501untouched to the macro, and that quoting disables comments.
1502
1503@example
1504define(`echo1', `$*')
1505@result{}
1506define(`echo2', `$@@')
1507@result{}
1508define(`foo', `bar')
1509@result{}
1510echo1(#foo'foo
1511foo)
1512@result{}#foo'foo
1513@result{}bar
1514echo2(#foo'foo
1515foo)
1516@result{}#foobar
1517@result{}bar'
1518@end example
1519
1520A @samp{$} sign in the expansion text, that is not followed by anything
1521@code{m4} understands, is simply copied to the macro expansion, as any
1522other text is.
1523
1524@example
1525define(`foo', `$$$ hello $$$')
1526@result{}
1527foo
1528@result{}$$$ hello $$$
1529@end example
1530
1531If you want a macro to expand to something like @samp{$12}, the
1532judicious use of nested quoting can put a safe character between the
1533@code{$} and the next character, relying on the rescanning to remove the
1534nested quote. This will prevent @code{m4} from interpreting the
1535@code{$} sign as a reference to an argument.
1536
1537@example
1538define(`foo', `no nested quote: $1')
1539@result{}
1540foo(`arg')
1541@result{}no nested quote: arg
1542define(`foo', `nested quote around $: `$'1')
1543@result{}
1544foo(`arg')
1545@result{}nested quote around $: $1
1546define(`foo', `nested empty quote after $: $`'1')
1547@result{}
1548foo(`arg')
1549@result{}nested empty quote after $: $1
1550define(`foo', `nested quote around next character: $`1'')
1551@result{}
1552foo(`arg')
1553@result{}nested quote around next character: $1
1554define(`foo', `nested quote around both: `$1'')
1555@result{}
1556foo(`arg')
1557@result{}nested quote around both: arg
1558@end example
1559
1560@node Undefine
1561@section Deleting a macro
1562
1563@cindex macros, how to delete
1564@cindex deleting macros
1565@cindex undefining macros
1566A macro definition can be removed with @code{undefine}:
1567
1568@deffn Builtin undefine (@var{name}@dots{})
1569For each argument, remove the macro @var{name}. The macro names must
1570necessarily be quoted, since they will be expanded otherwise.
1571
1572The expansion of @code{undefine} is void.
1573The macro @code{undefine} is recognized only with parameters.
1574@end deffn
1575
1576@example
1577foo bar blah
1578@result{}foo bar blah
1579define(`foo', `some')define(`bar', `other')define(`blah', `text')
1580@result{}
1581foo bar blah
1582@result{}some other text
1583undefine(`foo')
1584@result{}
1585foo bar blah
1586@result{}foo other text
1587undefine(`bar', `blah')
1588@result{}
1589foo bar blah
1590@result{}foo bar blah
1591@end example
1592
1593Undefining a macro inside that macro's expansion is safe; the macro
1594still expands to the definition that was in effect at the @samp{(}.
1595
1596@example
1597define(`f', ``$0':$1')
1598@result{}
1599f(f(f(undefine(`f')`hello world')))
1600@result{}f:f:f:hello world
1601f(`bye')
1602@result{}f(bye)
1603@end example
1604
1605It is not an error for @var{name} to have no macro definition. In that
1606case, @code{undefine} does nothing.
1607
1608@node Defn
1609@section Renaming macros
1610
1611@cindex macros, how to rename
1612@cindex renaming macros
1613It is possible to rename an already defined macro. To do this, you need
1614the builtin @code{defn}:
1615
1616@deffn Builtin defn (@var{name})
1617Expands to the @emph{quoted definition} of @var{name}. If the
1618argument is not a defined macro, the expansion is void.
1619
1620If @var{name} is a user-defined macro, the quoted definition is simply
1621the quoted expansion text. If, instead, @var{name} is a builtin, the
1622expansion is a special token, which points to the builtin's internal
1623definition. This token is only meaningful as the second argument to
1624@code{define} (and @code{pushdef}), and is silently converted to an
1625empty string in most other contexts.
1626
1627The macro @code{defn} is recognized only with parameters.
1628@end deffn
1629
1630Its normal use is best understood through an example, which shows how to
1631rename @code{undefine} to @code{zap}:
1632
1633@example
1634define(`zap', defn(`undefine'))
1635@result{}
1636zap(`undefine')
1637@result{}
1638undefine(`zap')
1639@result{}undefine(zap)
1640@end example
1641
1642In this way, @code{defn} can be used to copy macro definitions, and also
1643definitions of builtin macros. Even if the original macro is removed,
1644the other name can still be used to access the definition.
1645
1646The fact that macro definitions can be transferred also explains why you
1647should use @code{$0}, rather than retyping a macro's name in its
1648definition:
1649
1650@example
1651define(`foo', `This is `$0'')
1652@result{}
1653define(`bar', defn(`foo'))
1654@result{}
1655bar
1656@result{}This is bar
1657@end example
1658
1659Macros used as string variables should be referred to through @code{defn},
1660to avoid unwanted expansion of the text:
1661
1662@example
1663define(`string', `The macro dnl is very useful
1664')
1665@result{}
1666string
1667@result{}The macro@w{ }
1668defn(`string')
1669@result{}The macro dnl is very useful
1670@result{}
1671@end example
1672
1673However, it is important to remember that @code{m4} rescanning is purely
1674textual. If an unbalanced end-quote string occurs in a macro
1675definition, the rescan will see that embedded quote as the termination
1676of the quoted string, and the remainder of the macro's definition will
1677be rescanned unquoted. Thus it is a good idea to avoid unbalanced
1678end-quotes in macro definitions or arguments to macros.
1679
1680@example
1681define(`foo', a'a)
1682@result{}
1683define(`a', `A')
1684@result{}
1685define(`echo', `$@@')
1686@result{}
1687foo
1688@result{}A'A
1689defn(`foo')
1690@result{}aA'
1691echo(foo)
1692@result{}AA'
1693@end example
1694
1695Using @code{defn} to generate special tokens for builtin macros outside
1696of expected contexts can sometimes trigger warnings. But most of the
1697time, such tokens are silently converted to the empty string.
1698
1699@example
1700defn(`defn')
1701@result{}
1702define(defn(`divnum'), `cannot redefine a builtin token')
1703@error{}m4:stdin:2: Warning: define: invalid macro name ignored
1704@result{}
1705divnum
1706@result{}0
1707@end example
1708
1709@node Pushdef
1710@section Temporarily redefining macros
1711
1712@cindex macros, temporary redefinition of
1713@cindex temporary redefinition of macros
1714@cindex redefinition of macros, temporary
1715It is possible to redefine a macro temporarily, reverting to the
1716previous definition at a later time. This is done with the builtins
1717@code{pushdef} and @code{popdef}:
1718
1719@deffn Builtin pushdef (@var{name}, @ovar{expansion})
1720@deffnx Builtin popdef (@var{name}@dots{})
1721Analogous to @code{define} and @code{undefine}.
1722
1723These macros work in a stack-like fashion. A macro is temporarily
1724redefined with @code{pushdef}, which replaces an existing definition of
1725@var{name}, while saving the previous definition, before the new one is
1726installed. If there is no previous definition, @code{pushdef} behaves
1727exactly like @code{define}.
1728
1729If a macro has several definitions (of which only one is accessible),
1730the topmost definition can be removed with @code{popdef}. If there is
1731no previous definition, @code{popdef} behaves like @code{undefine}.
1732
1733The expansion of both @code{pushdef} and @code{popdef} is void.
1734The macros @code{pushdef} and @code{popdef} are recognized only with
1735parameters.
1736@end deffn
1737
1738@example
1739define(`foo', `Expansion one.')
1740@result{}
1741foo
1742@result{}Expansion one.
1743pushdef(`foo', `Expansion two.')
1744@result{}
1745foo
1746@result{}Expansion two.
1747pushdef(`foo', `Expansion three.')
1748@result{}
1749pushdef(`foo', `Expansion four.')
1750@result{}
1751popdef(`foo')
1752@result{}
1753foo
1754@result{}Expansion three.
1755popdef(`foo', `foo')
1756@result{}
1757foo
1758@result{}Expansion one.
1759popdef(`foo')
1760@result{}
1761foo
1762@result{}foo
1763@end example
1764
1765If a macro with several definitions is redefined with @code{define}, the
1766topmost definition is @emph{replaced} with the new definition. If it is
1767removed with @code{undefine}, @emph{all} the definitions are removed,
1768and not only the topmost one.
1769
1770@example
1771define(`foo', `Expansion one.')
1772@result{}
1773foo
1774@result{}Expansion one.
1775pushdef(`foo', `Expansion two.')
1776@result{}
1777foo
1778@result{}Expansion two.
1779define(`foo', `Second expansion two.')
1780@result{}
1781foo
1782@result{}Second expansion two.
1783undefine(`foo')
1784@result{}
1785foo
1786@result{}foo
1787@end example
1788
1789@cindex local variables
1790@cindex variables, local
1791Local variables within macros are made with @code{pushdef} and
1792@code{popdef}. At the start of the macro a new definition is pushed,
1793within the macro it is manipulated and at the end it is popped,
1794revealing the former definition.
1795
1796It is possible to temporarily redefine a builtin with @code{pushdef}
1797and @code{defn}.
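
As a minimal sketch of these techniques (the macro names @code{greet}
and @code{name} are chosen only for illustration), the following first
uses @code{pushdef} and @code{popdef} to give a macro a local variable,
and then temporarily shadows the builtin @code{len} with a user
definition before restoring it:

@example
define(`name', `outer')
@result{}
define(`greet', `pushdef(`name', `$1')Hello, name!popdef(`name')')
@result{}
greet(`world')
@result{}Hello, world!
name
@result{}outer
pushdef(`len', `length of `$1'')
@result{}
len(`abc')
@result{}length of abc
popdef(`len')
@result{}
len(`abc')
@result{}3
@end example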
1798
1799@node Indir
1800@section Indirect call of macros
1801
1802@cindex indirect call of macros
1803@cindex call of macros, indirect
1804@cindex macros, indirect call of
1805@cindex @acronym{GNU} extensions
1806Any macro can be called indirectly with @code{indir}:
1807
1808@deffn Builtin indir (@var{name}, @ovar{args@dots{}})
1809Results in a call to the macro @var{name}, which is passed the
1810rest of the arguments @var{args}. If @var{name} is not defined, an
1811error message is printed, and the expansion is void.
1812
1813The macro @code{indir} is recognized only with parameters.
1814@end deffn
1815
1816This can be used to call macros with computed or ``invalid''
1817names (@code{define} allows such names to be defined):
1818
1819@example
1820define(`$$internal$macro', `Internal macro (name `$0')')
1821@result{}
1822$$internal$macro
1823@result{}$$internal$macro
1824indir(`$$internal$macro')
1825@result{}Internal macro (name $$internal$macro)
1826@end example
1827
1828The point here is that larger macro packages can define private macros
1829that will not be called by accident. They can @emph{only} be
1830called through the builtin @code{indir}.
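
The same mechanism supports computed names. In the following sketch,
the names @code{handler_start}, @code{handler_stop} and @code{dispatch}
are hypothetical; the point is only that the name passed to
@code{indir} is assembled from an argument at expansion time:

@example
define(`handler_start', `starting $1')
@result{}
define(`handler_stop', `stopping $1')
@result{}
define(`dispatch', `indir(`handler_$1', `$2')')
@result{}
dispatch(`start', `server')
@result{}starting server
dispatch(`stop', `server')
@result{}stopping server
@end example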
1831
1832One other point to observe is that argument collection occurs before
1833@code{indir} invokes @var{name}, so if argument collection changes the
1834value of @var{name}, that will be reflected in the final expansion.
1835This is different from the behavior when invoking macros directly,
1836where the definition that was in effect before argument collection is
1837used.
1838
1839@example
1840define(`f', `1')
1841@result{}
1842f(define(`f', `2'))
1843@result{}1
1844indir(`f', define(`f', `3'))
1845@result{}3
1846indir(`f', undefine(`f'))
1847@error{}m4:stdin:4: undefined macro `f'
1848@result{}
1849@end example
1850
1851When handed the result of @code{defn} (@pxref{Defn}) as one of its
1852arguments, @code{indir} defers to the invoked @var{name} for whether a
1853token representing a builtin is recognized or flattened to the empty
1854string.
1855
1856@example
1857indir(defn(`defn'), `divnum')
1858@error{}m4:stdin:1: Warning: indir: invalid macro name ignored
1859@result{}
1860indir(`define', defn(`defn'), `divnum')
1861@error{}m4:stdin:2: Warning: define: invalid macro name ignored
1862@result{}
1863indir(`define', `foo', defn(`divnum'))
1864@result{}
1865foo
1866@result{}0
1867indir(`divert', defn(`foo'))
1868@error{}m4:stdin:5: empty string treated as 0 in builtin `divert'
1869@result{}
1870@end example
1871
1872@node Builtin
1873@section Indirect call of builtins
1874
1875@cindex indirect call of builtins
1876@cindex call of builtins, indirect
1877@cindex builtins, indirect call of
1878@cindex @acronym{GNU} extensions
1879Builtin macros can be called indirectly with @code{builtin}:
1880
1881@deffn Builtin builtin (@var{name}, @ovar{args@dots{}})
1882Results in a call to the builtin @var{name}, which is passed the
1883rest of the arguments @var{args}. If @var{name} does not name a
1884builtin, an error message is printed, and the expansion is void.
1885
1886The macro @code{builtin} is recognized only with parameters.
1887@end deffn
1888
1889This can be used even if @var{name} has been given another definition
1890that has covered the original, or been undefined so that no macro
1891maps to the builtin.
1892
1893@example
1894pushdef(`define', `hidden')
1895@result{}
1896undefine(`undefine')
1897@result{}
1898define(`foo', `bar')
1899@result{}hidden
1900foo
1901@result{}foo
1902builtin(`define', `foo', defn(`divnum'))
1903@result{}
1904foo
1905@result{}0
1906builtin(`define', `foo', `BAR')
1907@result{}
1908foo
1909@result{}BAR
1910undefine(`foo')
1911@result{}undefine(foo)
1912foo
1913@result{}BAR
1914builtin(`undefine', `foo')
1915@result{}
1916foo
1917@result{}foo
1918@end example
1919
1920The @var{name} argument only matches the original name of the builtin,
1921even when the @option{--prefix-builtins} option (or @option{-P},
1922@pxref{Operation modes, , Invoking m4}) is in effect. This is different
1923from @code{indir}, which only tracks current macro names.
1924
1925Note that @code{indir} and @code{builtin} can be used to invoke builtins
1926without arguments, even when they normally require parameters to be
1927recognized; but it will provoke a warning, and result in a void expansion.
1928
1929@example
1930builtin
1931@result{}builtin
1932builtin()
1933@error{}m4:stdin:2: undefined builtin `'
1934@result{}
1935builtin(`builtin')
1936@error{}m4:stdin:3: Warning: too few arguments to builtin `builtin'
1937@result{}
1938builtin(`builtin',)
1939@error{}m4:stdin:4: undefined builtin `'
1940@result{}
1941@end example
1942
1943@node Conditionals
1944@chapter Conditionals, loops, and recursion
1945
1946Macros, expanding to plain text, perhaps with arguments, are not quite
1947enough. We would like to have macros expand to different things, based
1948on decisions taken at run-time. For that, we need some kind of conditionals.
1949Also, we would like to have some kind of loop construct, so we could do
1950something a number of times, or while some condition is true.
1951
1952@menu
1953* Ifdef:: Testing if a macro is defined
1954* Ifelse:: If-else construct, or multibranch
1955* Shift:: Recursion in @code{m4}
1956* Forloop:: Iteration by counting
1957* Foreach:: Iteration by list contents
1958@end menu
1959
1960@node Ifdef
1961@section Testing if a macro is defined
1962
1963@cindex conditionals
1964There are two different builtin conditionals in @code{m4}. The first is
1965@code{ifdef}:
1966
1967@deffn Builtin ifdef (@var{name}, @var{string-1}, @ovar{string-2})
1968If @var{name} is defined as a macro, @code{ifdef} expands to
1969@var{string-1}, otherwise to @var{string-2}. If @var{string-2} is
1970omitted, it is taken to be the empty string (according to the normal
1971rules).
1972
1973The macro @code{ifdef} is recognized only with parameters.
1974@end deffn
1975
1976@example
1977ifdef(`foo', ``foo' is defined', ``foo' is not defined')
1978@result{}foo is not defined
1979define(`foo', `')
1980@result{}
1981ifdef(`foo', ``foo' is defined', ``foo' is not defined')
1982@result{}foo is defined
1983ifdef(`no_such_macro', `yes', `no', `extra argument')
1984@error{}m4:stdin:4: Warning: excess arguments to builtin `ifdef' ignored
1985@result{}no
1986@end example
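
One possible use of @code{ifdef} is to supply a default definition only
when the user has not already provided one. The macro name
@code{prefix} and its value below are purely illustrative:

@example
ifdef(`prefix', `', `define(`prefix', `/usr/local')')
@result{}
prefix
@result{}/usr/local
@end example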
1987
1988@node Ifelse
1989@section If-else construct, or multibranch
1990
1991@cindex comparing strings
1992The other conditional, @code{ifelse}, is much more powerful. It can be
1993used as a way to introduce a long comment, as an if-else construct, or
1994as a multibranch, depending on the number of arguments supplied:
1995
1996@deffn Builtin ifelse (@var{comment})
1997@deffnx Builtin ifelse (@var{string-1}, @var{string-2}, @var{equal}, @
1998 @ovar{not-equal})
1999@deffnx Builtin ifelse (@var{string-1}, @var{string-2}, @var{equal-1}, @
2000 @var{string-3}, @var{string-4}, @var{equal-2}, @dots{})
2001Used with only one argument, @code{ifelse} simply discards it and
2002produces no output.
2003
2004If called with three or four arguments, @code{ifelse} expands into
2005@var{equal}, if @var{string-1} and @var{string-2} are equal (character
2006for character), otherwise it expands to @var{not-equal}. A final fifth
2007argument is ignored, after triggering a warning.
2008
2009If called with six or more arguments, and @var{string-1} and
2010@var{string-2} are equal, @code{ifelse} expands into @var{equal-1},
2011otherwise the first three arguments are discarded and the processing
2012starts again.
2013
2014The macro @code{ifelse} is recognized only with parameters.
2015@end deffn
2016
2017Using only one argument is a common @code{m4} idiom for introducing a
2018block comment, as an alternative to repeatedly using @code{dnl}. This
2019special usage is recognized by @acronym{GNU} @code{m4}, so that in this
2020case, the warning about missing arguments is never triggered.
2021
2022@example
2023ifelse(`some comments')
2024@result{}
2025ifelse(`foo', `bar')
2026@error{}m4:stdin:2: Warning: too few arguments to builtin `ifelse'
2027@result{}
2028@end example
2029
2030Using three or four arguments provides decision points.
2031
2032@example
2033ifelse(`foo', `bar', `true')
2034@result{}
2035ifelse(`foo', `foo', `true')
2036@result{}true
2037define(`foo', `bar')
2038@result{}
2039ifelse(foo, `bar', `true', `false')
2040@result{}true
2041ifelse(foo, `foo', `true', `false')
2042@result{}false
2043@end example
2044
2045Notice how the first argument was used unquoted; it is common to compare
2046the expansion of a macro with a string. With this macro, you can now
2047reproduce the behavior of many of the builtins, where the macro is
2048recognized only with arguments.
2049
2050@example
2051define(`foo', `ifelse(`$#', `0', ``$0'', `arguments:$#')')
2052@result{}
2053foo
2054@result{}foo
2055foo()
2056@result{}arguments:1
2057foo(`a', `b', `c')
2058@result{}arguments:3
2059@end example
2060
2061@cindex multibranches
2062However, @code{ifelse} can take more than four arguments. If given more
2063than four arguments, @code{ifelse} works like a @code{case} or @code{switch}
2064statement in traditional programming languages. If @var{string-1} and
2065@var{string-2} are equal, @code{ifelse} expands into @var{equal-1}, otherwise
2066the procedure is repeated with the first three arguments discarded. This
2067calls for an example:
2068
2069@example
2070ifelse(`foo', `bar', `third', `gnu', `gnats')
2071@error{}m4:stdin:1: Warning: excess arguments to builtin `ifelse' ignored
2072@result{}gnu
2073ifelse(`foo', `bar', `third', `gnu', `gnats', `sixth')
2074@result{}
2075ifelse(`foo', `bar', `third', `gnu', `gnats', `sixth', `seventh')
2076@result{}seventh
2077ifelse(`foo', `bar', `3', `gnu', `gnats', `6', `7', `8')
2078@error{}m4:stdin:4: Warning: excess arguments to builtin `ifelse' ignored
2079@result{}7
2080@end example
2081
2082Naturally, the normal case will be slightly more advanced than these
2083examples. A common use of @code{ifelse} is in macros implementing loops
2084of various kinds.
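
As a slightly more realistic sketch, the multibranch form can be wrapped
in a macro so that the compared string comes from an argument. The
macro name @code{kind} is hypothetical; the final, unpaired argument
acts as the default branch:

@example
define(`kind', `ifelse(`$1', `0', `zero',
  `$1', `1', `one',
  `many')')
@result{}
kind(`0')
@result{}zero
kind(`1')
@result{}one
kind(`7')
@result{}many
@end example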
2085
2086@node Shift
2087@section Recursion in @code{m4}
2088
2089@cindex recursive macros
2090@cindex macros, recursive
2091There is no direct support for loops in @code{m4}, but macros can be
2092recursive. There is no limit on the number of recursion levels, other
2093than those enforced by your hardware and operating system.
2094
2095@cindex loops
2096Loops can be programmed using recursion and the conditionals described
2097previously.
2098
2099There is a builtin macro, @code{shift}, which can, among other things,
2100be used for iterating through the actual arguments to a macro:
2101
2102@deffn Builtin shift (@var{arg1}, @dots{})
2103Takes any number of arguments, and expands to all its arguments except
2104@var{arg1}, separated by commas, with each argument quoted.
2105
2106The macro @code{shift} is recognized only with parameters.
2107@end deffn
2108
2109@example
2110shift
2111@result{}shift
2112shift(`bar')
2113@result{}
2114shift(`foo', `bar', `baz')
2115@result{}bar,baz
2116@end example
2117
2118An example of the use of @code{shift} is this macro:
2119
2120@deffn Composite reverse (@dots{})
2121Takes any number of arguments, and reverses their order.
2122@end deffn
2123
2124It is implemented as:
2125
2126@example
2127define(`reverse', `ifelse(`$#', `0', , `$#', `1', ``$1'',
2128 `reverse(shift($@@)), `$1'')')
2129@result{}
2130reverse
2131@result{}
2132reverse(`foo')
2133@result{}foo
2134reverse(`foo', `bar', `gnats', `and gnus')
2135@result{}and gnus, gnats, bar, foo
2136@end example
2137
2138While not a very interesting macro, it does show how simple loops can be
2139made with @code{shift}, @code{ifelse} and recursion. It also shows
2140that @code{shift} is usually used with @samp{$@@}. Sometimes, a
2141recursive algorithm requires adding quotes to each element:
2142
2143@deffn Composite quote (@dots{})
2144@deffnx Composite dquote (@dots{})
2145@deffnx Composite dquote_elt (@dots{})
2146Takes any number of arguments, and adds quoting. With @code{quote},
2147only one level of quoting is added, effectively removing whitespace
2148after commas and turning multiple arguments into a single string. With
2149@code{dquote}, two levels of quoting are added, one around each element,
2150and one around the list. And with @code{dquote_elt}, two levels of
2151quoting are added around each element.
2152@end deffn
2153
2154An actual implementation of these three macros is distributed as
2155@file{m4-@value{VERSION}/@/examples/@/quote.m4} in this package. First,
2156let's examine their usage:
2157
2158@example
2159include(`quote.m4')
2160@result{}
2161-quote-dquote-dquote_elt-
2162@result{}----
2163-quote()-dquote()-dquote_elt()-
2164@result{}--`'-`'-
2165-quote(`1')-dquote(`1')-dquote_elt(`1')-
2166@result{}-1-`1'-`1'-
2167-quote(`1', `2')-dquote(`1', `2')-dquote_elt(`1', `2')-
2168@result{}-1,2-`1',`2'-`1',`2'-
2169define(`n', `$#')dnl
2170-n(quote(`1', `2'))-n(dquote(`1', `2'))-n(dquote_elt(`1', `2'))-
2171@result{}-1-1-2-
2172dquote(dquote_elt(`1', `2'))
2173@result{}``1'',``2''
2174dquote_elt(dquote(`1', `2'))
2175@result{}``1',`2''
2176@end example
2177
2178The last two lines show that when given two arguments, @code{dquote}
2179results in one string, while @code{dquote_elt} results in two. Now,
2180examine the implementation. Note that @code{quote} and
2181@code{dquote_elt} make decisions based on their number of arguments, so
2182that when called without arguments, they result in nothing instead of a
2183quoted empty string; this is so that it is possible to distinguish
2184between no arguments and an empty first argument. @code{dquote}, on the
2185other hand, results in a string no matter what, since it is still
2186possible to tell whether it was invoked without arguments based on the
2187resulting string.
2188
2189@example
2190undivert(`quote.m4')dnl
2191@result{}divert(`-1')
2192@result{}# quote(args) - convert args to single-quoted string
2193@result{}define(`quote', `ifelse(`$#', `0', `', ``$*'')')
2194@result{}# dquote(args) - convert args to quoted list of quoted strings
2195@result{}define(`dquote', ``$@@'')
2196@result{}# dquote_elt(args) - convert args to list of double-quoted strings
2197@result{}define(`dquote_elt', `ifelse(`$#', `0', `', `$#', `1', ```$1''',
2198@result{} ```$1'',$0(shift($@@))')')
2199@result{}divert`'dnl
2200@end example
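
As a further illustration of recursion with @code{shift}, here is a
sketch of a macro that expands to its last argument. The name
@code{last} is not part of the distribution; it is introduced here only
as an example, and follows the same pattern as @code{reverse} above:

@example
define(`last', `ifelse(`$#', `0', `', `$#', `1', ``$1'',
  `last(shift($@@))')')
@result{}
last(`a', `b', `c')
@result{}c
last(`only')
@result{}only
@end example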
2201
2202@node Forloop
2203@section Iteration by counting
2204
2205@cindex for loops
2206@cindex loops, counting
2207@cindex counting loops
2208Here is an example of a loop macro that implements a simple for loop.
2209
2210@deffn Composite forloop (@var{iterator}, @var{start}, @var{end}, @var{text})
2211Takes the name in @var{iterator}, which must be a valid macro name, and
2212successively assigns it each integer value from @var{start} to @var{end},
2213inclusive. For each assignment to @var{iterator}, appends @var{text} to
2214the expansion of the @code{forloop}. @var{text} may refer to
2215@var{iterator}. Any definition of @var{iterator} prior to this
2216invocation is restored.
2217@end deffn
2218
2219It can, for example, be used for simple counting:
2220
2221@example
2222include(`forloop.m4')
2223@result{}
2224forloop(`i', `1', `8', `i ')
2225@result{}1 2 3 4 5 6 7 8@w{ }
2226@end example
2227
2228For-loops can be nested, like:
2229
2230@example
2231include(`forloop.m4')
2232@result{}
2233forloop(`i', `1', `4', `forloop(`j', `1', `8', ` (i, j)')
2234')
2235@result{} (1, 1) (1, 2) (1, 3) (1, 4) (1, 5) (1, 6) (1, 7) (1, 8)
2236@result{} (2, 1) (2, 2) (2, 3) (2, 4) (2, 5) (2, 6) (2, 7) (2, 8)
2237@result{} (3, 1) (3, 2) (3, 3) (3, 4) (3, 5) (3, 6) (3, 7) (3, 8)
2238@result{} (4, 1) (4, 2) (4, 3) (4, 4) (4, 5) (4, 6) (4, 7) (4, 8)
2239@result{}
2240@end example
2241
2242The implementation of the @code{forloop} macro is fairly
2243straightforward. The @code{forloop} macro itself is simply a wrapper,
2244which saves the previous definition of the first argument, calls the
2245internal macro @code{@w{_forloop}}, and re-establishes the saved
2246definition of the first argument.
2247
2248The macro @code{@w{_forloop}} expands the fourth argument once, and
2249tests to see if the iterator has reached the final value. If it has
2250not finished, it increments the iterator (using the predefined macro
2251@code{incr}, @pxref{Incr}), and recurses.
2252
2253Here is an actual implementation of @code{forloop}, distributed as
2254@file{m4-@value{VERSION}/@/examples/@/forloop.m4} in this package:
2255
2256@example
2257undivert(`forloop.m4')dnl
2258@result{}divert(`-1')
2259@result{}# forloop(var, from, to, stmt) - simple version
2260@result{}define(`forloop', `pushdef(`$1', `$2')_forloop($@@)popdef(`$1')')
2261@result{}define(`_forloop',
2262@result{} `$4`'ifelse($1, `$3', `', `define(`$1', incr($1))$0($@@)')')
2263@result{}divert`'dnl
2264@end example
2265
2266Notice the careful use of quotes. Certain macro arguments are left
2267unquoted, each for its own reason. Try to find out @emph{why} these
2268arguments are left unquoted, and see what happens if they are quoted.
2269(As presented, these two macros are useful but not very robust for
2270general use. They lack even basic error handling for cases like
2271@var{start} greater than @var{end}, @var{end} not numeric, or
2272@var{iterator} not being a macro name. See if you can improve these
2273macros; or @pxref{Improved forloop, , Answers}).
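
One possible guard, shown here only as a sketch, wraps @code{forloop}
in a check that the range is not empty before looping. The name
@code{safe_forloop} is hypothetical, and the sketch assumes that
@var{start} and @var{end} are both numeric; it does nothing about the
other weaknesses mentioned above:

@example
include(`forloop.m4')
@result{}
define(`safe_forloop',
  `ifelse(eval(`($2) <= ($3)'), `1', `forloop($@@)')')
@result{}
safe_forloop(`i', `1', `3', `i ')
@result{}1 2 3@w{ }
safe_forloop(`i', `3', `1', `i ')
@result{}
@end example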
2274
2275@node Foreach
2276@section Iteration by list contents
2277
2278@cindex for each loops
2279@cindex loops, list iteration
2280@cindex iterating over lists
2281Here is an example of a loop macro that implements list iteration.
2282
2283@deffn Composite foreach (@var{iterator}, @var{paren-list}, @var{text})
2284@deffnx Composite foreachq (@var{iterator}, @var{quote-list}, @var{text})
2285Takes the name in @var{iterator}, which must be a valid macro name, and
2286successively assigns it each value from @var{paren-list} or
2287@var{quote-list}. In @code{foreach}, @var{paren-list} is a
2288comma-separated list of elements contained in parentheses. In
2289@code{foreachq}, @var{quote-list} is a comma-separated list of elements
2290contained in a quoted string. For each assignment to @var{iterator},
2291appends @var{text} to the overall expansion. @var{text} may refer to
2292@var{iterator}. Any definition of @var{iterator} prior to this
2293invocation is restored.
2294@end deffn
2295
2296As an example, this displays each word in a list inside of a sentence,
2297using an implementation of @code{foreach} distributed as
2298@file{m4-@value{VERSION}/@/examples/@/foreach.m4}, and @code{foreachq}
2299in @file{m4-@value{VERSION}/@/examples/@/foreachq.m4}.
2300
2301@example
2302include(`foreach.m4')
2303@result{}
2304foreach(`x', (foo, bar, foobar), `Word was: x
2305')dnl
2306@result{}Word was: foo
2307@result{}Word was: bar
2308@result{}Word was: foobar
2309include(`foreachq.m4')
2310@result{}
2311foreachq(`x', `foo, bar, foobar', `Word was: x
2312')dnl
2313@result{}Word was: foo
2314@result{}Word was: bar
2315@result{}Word was: foobar
2316@end example
2317
2318It is possible to be more complex; each element of the @var{paren-list}
2319or @var{quote-list} can itself be a list, to pass as further arguments
2320to a helper macro. This example generates a shell case statement:
2321
2322@example
2323include(`foreach.m4')
2324@result{}
2325define(`_case', ` $1)
2326 $2=" $1";;
2327')dnl
2328define(`_cat', `$1$2')dnl
2329case $`'1 in
2330@result{}case $1 in
2331foreach(`x', `(`(`a', `vara')', `(`b', `varb')', `(`c', `varc')')',
2332 `_cat(`_case', x)')dnl
2333@result{} a)
2334@result{} vara=" a";;
2335@result{} b)
2336@result{} varb=" b";;
2337@result{} c)
2338@result{} varc=" c";;
2339esac
2340@result{}esac
2341@end example
2342
2343The implementation of the @code{foreach} macro is a bit more involved;
2344it is a wrapper around two helper macros. First, @code{@w{_arg1}} is
2345needed to grab the first element of a list. Second,
2346@code{@w{_foreach}} implements the recursion, successively walking
2347through the original list. Here is a simple implementation of
2348@code{foreach}:
2349
2350@example
2351undivert(`foreach.m4')dnl
2352@result{}divert(`-1')
2353@result{}# foreach(x, (item_1, item_2, ..., item_n), stmt)
2354@result{}# parenthesized list, simple version
2355@result{}define(`foreach', `pushdef(`$1')_foreach($@@)popdef(`$1')')
2356@result{}define(`_arg1', `$1')
2357@result{}define(`_foreach', `ifelse(`$2', `()', `',
2358@result{} `define(`$1', _arg1$2)$3`'$0(`$1', (shift$2), `$3')')')
2359@result{}divert`'dnl
2360@end example
2361
2362Unfortunately, that implementation is not robust to macro names as list
2363elements. Each iteration of @code{@w{_foreach}} is stripping another
2364layer of quotes, leading to erratic results if list elements are not
2365already fully expanded. The first cut at implementing @code{foreachq}
2366takes this into account. Also, when using quoted elements in a
2367@var{paren-list}, the overall list must be quoted. A @var{quote-list}
2368has the nice property of requiring fewer characters to create a list
2369containing the same quoted elements. To see the difference between the
2370two macros, we attempt to pass double-quoted macro names in a list,
2371expecting the macro name on output after one layer of quotes is removed
2372during list iteration and the final layer removed during the final
2373rescan:
2374
2375@example
2376define(`a', `1')define(`b', `2')define(`c', `3')
2377@result{}
2378include(`foreach.m4')
2379@result{}
2380include(`foreachq.m4')
2381@result{}
2382foreach(`x', `(``a'', ``(b'', ``c)'')', `x
2383')
2384@result{}1
2385@result{}(2)1
2386@result{}
2387@result{}, x
2388@result{})
2389foreachq(`x', ```a'', ``(b'', ``c)''', `x
2390')dnl
2391@result{}a
2392@result{}(b
2393@result{}c)
2394@end example
2395
2396Obviously, @code{foreachq} did a better job; here is its implementation:
2397
2398@example
2399undivert(`foreachq.m4')dnl
2400@result{}include(`quote.m4')dnl
2401@result{}divert(`-1')
2402@result{}# foreachq(x, `item_1, item_2, ..., item_n', stmt)
2403@result{}# quoted list, simple version
2404@result{}define(`foreachq', `pushdef(`$1')_foreachq($@@)popdef(`$1')')
2405@result{}define(`_arg1', `$1')
2406@result{}define(`_foreachq', `ifelse(quote($2), `', `',
2407@result{} `define(`$1', `_arg1($2)')$3`'$0(`$1', `shift($2)', `$3')')')
2408@result{}divert`'dnl
2409@end example
2410
2411Notice that @code{@w{_foreachq}} had to use the helper macro
2412@code{quote} defined earlier (@pxref{Shift}), to ensure that the
2413embedded @code{ifelse} call does not go haywire if a list element
2414contains a comma. Unfortunately, this implementation of @code{foreachq}
2415has its own severe flaw. Whereas the @code{foreach} implementation was
2416linear, this macro is quadratic in the number of list elements, and is
2417much more likely to trip up the limit set by the command line option
2418@option{--nesting-limit} (or @option{-L}, @pxref{Limits control, ,
2419Invoking m4}). (It is possible to have robust iteration with linear
2420behavior for either list style. See if you can learn from the best
2421elements of both of these implementations to create robust macros; or
2422@pxref{Improved foreach, , Answers}).
2423
2424@node Debugging
2425@chapter How to debug macros and input
2426
2427Macros written for @code{m4} often do not work as intended on
2428the first try (as is the case with most programming languages).
2429Fortunately, there is support for macro debugging in @code{m4}.
2430
2431@menu
2432* Dumpdef:: Displaying macro definitions
2433* Trace:: Tracing macro calls
2434* Debug Levels:: Controlling debugging output
2435* Debug Output:: Saving debugging output
2436@end menu
2437
2438@node Dumpdef
2439@section Displaying macro definitions
2440
2441@cindex displaying macro definitions
2442@cindex macros, displaying definitions
2443@cindex definitions, displaying macro
2444If you want to see what a name expands into, you can use the builtin
2445@code{dumpdef}:
2446
2447@deffn Builtin dumpdef (@ovar{names@dots{}})
2448Accepts any number of arguments. If called without any arguments,
2449it displays the definitions of all known names, otherwise it displays
2450the definitions of the @var{names} given. The output is printed to the
2451current debug file (usually standard error), and is sorted by name. If
2452an unknown name is encountered, a warning is printed.
2453
2454The expansion of @code{dumpdef} is void.
2455@end deffn
2456
2457@example
2458define(`foo', `Hello world.')
2459@result{}
2460dumpdef(`foo')
2461@error{}foo:@tabchar{}`Hello world.'
2462@result{}
2463dumpdef(`define')
2464@error{}define:@tabchar{}<define>
2465@result{}
2466@end example
2467
2468The last example shows how builtin macro definitions are displayed.
2469The definition that is dumped corresponds to what would occur if the
2470macro were to be called at that point, even if other definitions are
2471still live due to redefining a macro during argument collection.
2472
2473@example
2474pushdef(`f', ``$0'1')pushdef(`f', ``$0'2')
2475@result{}
2476f(popdef(`f')dumpdef(`f'))
2477@error{}f:@tabchar{}``$0'1'
2478@result{}f2
2479f(popdef(`f')dumpdef(`f'))
2480@error{}m4:stdin:3: undefined macro `f'
2481@result{}f1
2482@end example
2483
2484@xref{Debug Levels}, for information on controlling the details of the
2485display.
2486
2487@node Trace
2488@section Tracing macro calls
2489
2490@cindex tracing macro expansion
2491@cindex macro expansion, tracing
2492@cindex expansion, tracing macro
2493It is possible to trace macro calls and expansions through the builtins
2494@code{traceon} and @code{traceoff}:
2495
2496@deffn Builtin traceon (@ovar{names@dots{}})
2497@deffnx Builtin traceoff (@ovar{names@dots{}})
2498When called without any arguments, @code{traceon} and @code{traceoff}
2499will turn tracing on and off, respectively, for all defined macros.
2500
2501When called with arguments, only the macros listed in @var{names} are
2502affected, whether or not they are currently defined.
2503
2504The expansion of @code{traceon} and @code{traceoff} is void.
2505@end deffn
2506
2507Whenever a traced macro is called and the arguments have been collected,
2508the call is displayed. If the expansion of the macro call is not void,
2509the expansion can be displayed after the call. The output is printed
2510to the current debug file (defaulting to standard error, @pxref{Debug
2511Output}).
2512
2513@example
2514define(`foo', `Hello World.')
2515@result{}
2516define(`echo', `$@@')
2517@result{}
2518traceon(`foo', `echo')
2519@result{}
2520foo
2521@error{}m4trace: -1- foo -> `Hello World.'
2522@result{}Hello World.
2523echo(`gnus', `and gnats')
2524@error{}m4trace: -1- echo(`gnus', `and gnats') -> ``gnus',`and gnats''
2525@result{}gnus,and gnats
2526@end example
2527
2528The number between dashes is the depth of the expansion. It is one most
2529of the time, signifying an expansion at the outermost level, but it
2530increases when macro arguments contain unquoted macro calls. The
2531maximum number that will appear between dashes is controlled by the
2532option @option{--nesting-limit} (@pxref{Limits control, , Invoking m4}).
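
The following sketch shows a depth greater than one. The macro names
@code{inner} and @code{outer} are illustrative only, and the default
debug flags are assumed. Because @code{inner} is expanded while the
arguments of @code{outer} are being collected, its trace line appears
first and carries the depth two:

@example
define(`inner', `x')
@result{}
define(`outer', `[$1]')
@result{}
traceon(`inner', `outer')
@result{}
outer(inner)
@error{}m4trace: -2- inner -> `x'
@error{}m4trace: -1- outer(`x') -> `[x]'
@result{}[x]
@end example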
2533
2534Tracing by name is an attribute that is preserved whether the macro is
2535defined or not. This allows the @option{-t} option to select macros to
2536trace before those macros are defined.
2537
2538@example
2539traceoff(`foo')
2540@result{}
2541traceon(`foo')
2542@result{}
2543foo
2544@result{}foo
2545define(`foo', `bar')
2546@result{}
2547foo
2548@error{}m4trace: -1- foo -> `bar'
2549@result{}bar
2550undefine(`foo')
2551@result{}
2552ifdef(`foo', `yes', `no')
2553@result{}no
2554indir(`foo')
2555@error{}m4:stdin:8: undefined macro `foo'
2556@result{}
2557define(`foo', `blah')
2558@result{}
2559foo
2560@error{}m4trace: -1- foo -> `blah'
2561@result{}blah
2562traceoff
2563@result{}
2564foo
2565@result{}blah
2566@end example
2567
2568Tracing even works on builtins. However, @command{defn} (@pxref{Defn})
2569does not transfer tracing status.
2570
2571@example
2572traceon(`eval', `m4_divnum')
2573@result{}
2574define(`m4_eval', defn(`eval'))
2575@result{}
2576define(`m4_divnum', defn(`divnum'))
2577@result{}
2578eval(divnum)
2579@error{}m4trace: -1- eval(`0') -> `0'
2580@result{}0
2581m4_eval(m4_divnum)
2582@error{}m4trace: -2- m4_divnum -> `0'
2583@result{}0
2584@end example
2585
2586@xref{Debug Levels}, for information on controlling the details of the
2587display.
2588
2589@node Debug Levels
2590@section Controlling debugging output
2591
2592@cindex controlling debugging output
2593@cindex debugging output, controlling
2594The @option{-d} option to @code{m4} (@pxref{Debugging options, ,
2595Invoking m4}) controls the amount of detail presented when using the
2596macros described in the preceding sections.
2597
2598The @var{flags} following the option can be one or more of the
2599following:
2600
2601@table @code
2602@item a
2603Show the actual arguments in each macro call. This applies to all macro
2604calls if the @samp{t} flag is used, otherwise only the macros covered by
2605calls of @code{traceon}.
2606
2607@item c
2608Show several trace lines for each macro call. A line is shown when the
2609macro is seen, but before the arguments are collected; a second line
2610when the arguments have been collected and a third line after the call
2611has completed.
2612
2613@item e
2614Show the expansion of each macro call, if it is not void. This applies
2615to all macro calls if the @samp{t} flag is used, otherwise only the
2616macros covered by calls of @code{traceon}.
2617
2618@item f
2619Show the name of the current input file in each trace output line.
2620
2621@item i
2622Print a message each time the current input file is changed, giving file
2623name and input line number.
2624
2625@item l
2626Show the current input line number in each trace output line.
2627
2628@item p
2629Print a message when a named file is found through the path search
2630mechanism (@pxref{Search Path}), giving the actual file name used.
2631
2632@item q
2633Quote actual arguments and macro expansions in the display with the
2634current quotes.
2635
2636@item t
2637Trace all macro calls made in this invocation of @code{m4}.
2638
2639@item x
2640Add a unique `macro call id' to each line of the trace output. This is
2641useful in connection with the @samp{c} flag above.
2642
2643@item V
2644A shorthand for all of the above flags.
2645@end table
2646
2647If no flags are specified with the @option{-d} option, the default is
2648@samp{aeq}. The examples throughout this manual assume the default
2649flags.
2650
2651@cindex @acronym{GNU} extensions
2652There is a builtin macro @code{debugmode}, which allows on-the-fly control of
2653the debugging output format:
2654
2655@deffn Builtin debugmode (@ovar{flags})
2656The argument @var{flags} should be a subset of the letters listed above.
2657As special cases, if the argument starts with a @samp{+}, the flags are
2658added to the current debug flags, and if it starts with a @samp{-}, they
2659are removed. If no argument is present, all debugging flags are cleared
2660(as if no @option{-d} was given), and with an empty argument the flags
2661are reset to the default of @samp{aeq}.
2662
2663The expansion of @code{debugmode} is void.
2664@end deffn
2665
2666@example
2667define(`foo', `FOO')
2668@result{}
2669traceon(`foo')
2670@result{}
2671debugmode()
2672@result{}
2673foo
2674@error{}m4trace: -1- foo -> `FOO'
2675@result{}FOO
2676debugmode
2677@result{}
2678foo
2679@error{}m4trace: -1- foo
2680@result{}FOO
2681debugmode(`+l')
2682@result{}
2683foo
2684@error{}m4trace:8: -1- foo
2685@result{}FOO
2686@end example
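
The @samp{+} and @samp{-} prefixes can be sketched in a little more
detail.  The following is only an illustration; it assumes that the
default flags @samp{aeq} are in effect at the start, as in the other
examples of this manual, and the macro @code{foo} is made up for the
occasion.

@example
define(`foo', `$1')
@result{}
traceon(`foo')
@result{}
debugmode(`-e')
@result{}
foo(`bar')
@error{}m4trace: -1- foo(`bar')
@result{}bar
debugmode(`-q')
@result{}
foo(`bar')
@error{}m4trace: -1- foo(bar)
@result{}bar
debugmode(`+eq')
@result{}
foo(`bar')
@error{}m4trace: -1- foo(`bar') -> `bar'
@result{}bar
@end example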
2687
2688@node Debug Output
2689@section Saving debugging output
2690
2691@cindex saving debugging output
2692@cindex debugging output, saving
2693@cindex output, saving debugging
2694@cindex @acronym{GNU} extensions
2695Debug and tracing output can be redirected to files using either the
2696@option{--debugfile} option to @code{m4} (@pxref{Debugging options, ,
2697Invoking m4}), or with the builtin macro @code{debugfile}:
2698
2699@deffn Builtin debugfile (@ovar{file})
2700Sends all further debug and trace output to @var{file}, opened in append
2701mode. If @var{file} is the empty string, debug and trace output are
2702discarded. If @code{debugfile} is called without any arguments, debug
2703and trace output are sent to standard error. This does not affect
2704warnings, error messages, or @code{errprint} output, which are
2705always sent to standard error. If @var{file} cannot be opened, the
2706current debug file is unchanged, and an error is issued.
2707
2708The expansion of @code{debugfile} is void.
2709@end deffn
2710
2711@example
2712traceon(`divnum')
2713@result{}
2714divnum(`extra')
2715@error{}m4:stdin:2: Warning: excess arguments to builtin `divnum' ignored
2716@error{}m4trace: -1- divnum(`extra') -> `0'
2717@result{}0
2718debugfile()
2719@result{}
2720divnum(`extra')
2721@error{}m4:stdin:4: Warning: excess arguments to builtin `divnum' ignored
2722@result{}0
2723debugfile
2724@result{}
2725divnum
2726@error{}m4trace: -1- divnum -> `0'
2727@result{}0
2728@end example
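
As a further sketch, trace output can be collected in a named file and
later switched back to standard error.  The file name @file{trace.out}
is purely hypothetical, and the example is not run by the testsuite
because it writes to the current directory.

@comment ignore
@example
traceon(`divnum')
@result{}
debugfile(`trace.out')
@result{}
divnum
@result{}0
debugfile
@result{}
divnum
@error{}m4trace: -1- divnum -> `0'
@result{}0
@end example

@noindent
After this run, the trace line for the first call of @code{divnum}
should have been appended to @file{trace.out} rather than shown on
standard error.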
2729
2730@node Input Control
2731@chapter Input control
2732
2733This chapter describes various builtin macros for controlling the input
2734to @code{m4}.
2735
2736@menu
2737* Dnl:: Deleting whitespace in input
2738* Changequote:: Changing the quote characters
2739* Changecom:: Changing the comment delimiters
2740* Changeword:: Changing the lexical structure of words
2741* M4wrap:: Saving text until end of input
2742@end menu
2743
2744@node Dnl
2745@section Deleting whitespace in input
2746
2747@cindex deleting whitespace in input
2748The builtin @code{dnl} stands for ``Discard to Next Line'':
2749
2750@deffn Builtin dnl
2751All characters, up to and including the next newline, are discarded
2752without performing any macro expansion. A warning is issued if the end
2753of the file is encountered without a newline.
2754
2755The expansion of @code{dnl} is void.
2756@end deffn
2757
2758It is often used in connection with @code{define}, to remove the
2759newline that follows the call to @code{define}. Thus
2760
2761@example
2762define(`foo', `Macro `foo'.')dnl A very simple macro, indeed.
2763foo
2764@result{}Macro foo.
2765@end example
2766
2767The input up to and including the next newline is discarded, as opposed
2768to the way comments are treated (@pxref{Comments}).
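
The difference can be sketched as follows, assuming the default comment
delimiters are in effect.  The macro @code{foo} and the word @samp{bar}
are only illustrations.

@example
define(`foo', `FOO')dnl
foo # this foo is copied verbatim
@result{}FOO # this foo is copied verbatim
foo dnl this foo is discarded
bar
@result{}FOO bar
@end example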
2769
2770Usually, @code{dnl} is immediately followed by an end of line or some
2771other whitespace. @acronym{GNU} @code{m4} will produce a warning diagnostic if
2772@code{dnl} is followed by an open parenthesis. In this case, @code{dnl}
2773will collect and process all arguments, looking for a matching close
2774parenthesis. All predictable side effects resulting from this
2775collection will take place. @code{dnl} will return no output. The
input following the matching close parenthesis, up to and including the
next newline on whichever line contains it, will still be discarded.
2778
2779@example
2780dnl(`args are ignored, but side effects occur',
2781define(`foo', `like this')) while this text is ignored: undefine(`foo')
2782@error{}m4:stdin:1: Warning: excess arguments to builtin `dnl' ignored
2783See how `foo' was defined, foo?
2784@result{}See how foo was defined, like this?
2785@end example
2786
If the end of file is encountered without a newline character, a
warning is issued and @code{dnl} stops consuming input.
2789
2790@example
2791m4wrap(`m4wrap(`2 hi
2792')0 hi dnl 1 hi')
2793@result{}
2794define(`hi', `HI')
2795@result{}
2796^D
2797@error{}m4:stdin:1: Warning: end of file treated as newline
2798@result{}0 HI 2 HI
2799@end example
2800
2801@node Changequote
2802@section Changing the quote characters
2803
2804@cindex changing the quote delimiters
2805@cindex quote delimiters, changing the
2806The default quote delimiters can be changed with the builtin
2807@code{changequote}:
2808
2809@deffn Builtin changequote (@dvar{start, `}, @dvar{end, '})
2810This sets @var{start} as the new begin-quote delimiter and @var{end} as
2811the new end-quote delimiter. If both arguments are missing, the default
2812quotes (@code{`} and @code{'}) are used. If @var{start} is void, then
2813quoting is disabled. Otherwise, if @var{end} is missing or void, the
2814default end-quote delimiter (@code{'}) is used. The quote delimiters
2815can be of any length.
2816
2817The expansion of @code{changequote} is void.
2818@end deffn
2819
2820@example
2821changequote(`[', `]')
2822@result{}
2823define([foo], [Macro [foo].])
2824@result{}
2825foo
2826@result{}Macro foo.
2827@end example
2828
2829The quotation strings can safely contain eight-bit characters.
2830@ignore
2831@comment Yuck. I know of no clean way to render an 8-bit character in
2832@comment both info and dvi. This example uses the `open-guillemot' and
2833@comment `close-guillemot' characters of the Latin-1 character set.
2834
2835@example
2836define(`a', `b')
2837@result{}
2838«a»
2839@result{}«b»
2840changequote(`«', `»')
2841@result{}
2842«a»
2843@result{}a
2844@end example
2845@end ignore
2846If no single character is appropriate, @var{start} and @var{end} can be
2847of any length. Other implementations cap the delimiter length to five
2848characters, but @acronym{GNU} has no inherent limit.
2849
2850@example
2851changequote(`[[[', `]]]')
2852@result{}
2853define([[[foo]]], [[[Macro [[[[[foo]]]]].]]])
2854@result{}
2855foo
2856@result{}Macro [[foo]].
2857@end example
2858
2859Calling @code{changequote} with @var{start} as the empty string will
2860effectively disable the quoting mechanism, leaving no way to quote text.
2861However, using an empty string is not portable, as some other
2862implementations of @code{m4} revert to the default quoting, while others
2863preserve the prior non-empty delimiter. If @var{start} is not empty,
2864then an empty @var{end} will use the default end-quote delimiter of
2865@samp{'}, as otherwise, it would be impossible to end a quoted string.
2866Again, this is not portable, as some other @code{m4} implementations
2867reuse @var{start} as the end-quote delimiter, while others preserve the
2868previous non-empty value. Omitting both arguments restores the default
2869begin-quote and end-quote delimiters; fortunately this behavior is
2870portable to all implementations of @code{m4}.
2871
2872@example
2873define(`foo', `Macro `FOO'.')
2874@result{}
2875changequote(`', `')
2876@result{}
2877foo
2878@result{}Macro `FOO'.
2879`foo'
2880@result{}`Macro `FOO'.'
2881changequote(`,)
2882@result{}
2883foo
2884@result{}Macro FOO.
2885@end example
2886
2887There is no way in @code{m4} to quote a string containing an unmatched
2888begin-quote, except using @code{changequote} to change the current
2889quotes.
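
As a small sketch of this workaround, once the quotes have been changed
to square brackets, a lone @samp{`} is ordinary text and can appear
inside a quoted string.  The macro name @code{foo} is only an
illustration.

@example
changequote(`[', `]')
@result{}
define([foo], [a lone ` is ordinary text here])
@result{}
foo
@result{}a lone ` is ordinary text here
@end example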
2890
2891If the quotes should be changed from, say, @samp{[} to @samp{[[},
2892temporary quote characters have to be defined. To achieve this, two
2893calls of @code{changequote} must be made, one for the temporary quotes
2894and one for the new quotes.
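
One way this might be done is sketched below.  The temporary delimiters
@samp{<} and @samp{>} are an arbitrary choice for the illustration; any
pair that does not clash with the old or the new quotes should do.

@example
changequote(`[', `]')
@result{}
changequote([<], [>])
@result{}
changequote(<[[>, <]]>)
@result{}
define([[foo]], [[Macro [[foo]].]])
@result{}
foo
@result{}Macro foo.
@end example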
2895
2896Macros are recognized in preference to the begin-quote string, so if a
2897prefix of @var{start} can be recognized as part of a potential macro
2898name, the quoting mechanism is effectively disabled. Unless you use
2899@code{changeword} (@pxref{Changeword}), this means that @var{start}
2900should not begin with a letter, digit, or @samp{_} (underscore).
2901However, even though quoted strings are not recognized, the quote
2902characters can still be discerned in macro expansion and in trace
2903output.
2904
2905@example
2906define(`echo', `$@@')
2907@result{}
2908define(`hi', `HI')
2909@result{}
2910changequote(`q', `Q')
2911@result{}
2912q hi Q hi
2913@result{}q HI Q HI
2914echo(hi)
2915@result{}qHIQ
2916changequote
2917@result{}
2918changequote(`-', `EOF')
2919@result{}
2920- hi EOF hi
2921@result{} hi HI
2922changequote
2923@result{}
2924changequote(`1', `2')
2925@result{}
2926hi1hi2
2927@result{}hi1hi2
2928hi 1hi2
2929@result{}HI hi
2930@end example
2931
2932Quotes are recognized in preference to argument collection. In
2933particular, if @var{start} is a single @samp{(}, then argument
2934collection is effectively disabled. For portability with other
2935implementations, it is a good idea to avoid @samp{(}, @samp{,}, and
2936@samp{)} as the first character in @var{start}.
2937
2938@example
2939define(`echo', `$#:$@@:')
2940@result{}
2941define(`hi', `HI')
2942@result{}
2943changequote(`(',`)')
2944@result{}
2945echo(hi)
2946@result{}0::hi
2947changequote
2948@result{}
2949changequote(`((', `))')
2950@result{}
2951echo(hi)
2952@result{}1:HI:
2953echo((hi))
2954@result{}0::hi
2955changequote
2956@result{}
2957changequote(`,', `)')
2958@result{}
2959echo(hi,hi)bye)
2960@result{}1:HIhibye:
2961@end example
2962
2963If @var{end} is a prefix of @var{start}, the end-quote will be
2964recognized in preference to a nested begin-quote. In particular,
2965changing the quotes to have the same string for @var{start} and
2966@var{end} disables nesting of quotes. When quote nesting is disabled,
it is impossible to double-quote strings across macro expansions, so
using the same string for both delimiters is rarely a good idea.
2969
2970@example
2971define(`hi', `HI')
2972@result{}
2973changequote(`""', `"')
2974@result{}
2975""hi"""hi"
2976@result{}hihi
2977""hi" ""hi"
2978@result{}hi hi
2979""hi"" "hi"
2980@result{}hi" "HI"
2981changequote
2982@result{}
2983`hi`hi'hi'
2984@result{}hi`hi'hi
2985changequote(`"', `"')
2986@result{}
2987"hi"hi"hi"
2988@result{}hiHIhi
2989@end example
2990
2991It is an error if the end of file occurs within a quoted string.
2992
2993@example
2994`hello world'
2995@result{}hello world
2996`dangling quote
2997^D
2998@error{}m4:stdin:2: ERROR: end of file in string
2999@end example
3000
3001@node Changecom
3002@section Changing the comment delimiters
3003
3004@cindex changing comment delimiters
3005@cindex comment delimiters, changing
3006The default comment delimiters can be changed with the builtin
3007macro @code{changecom}:
3008
3009@deffn Builtin changecom (@ovar{start}, @dvar{end, @key{NL}})
3010This sets @var{start} as the new begin-comment delimiter and @var{end}
3011as the new end-comment delimiter. If both arguments are missing, or
3012@var{start} is void, then comments are disabled. Otherwise, if
3013@var{end} is missing or void, the default end-comment delimiter of
3014newline is used. The comment delimiters can be of any length.
3015
3016The expansion of @code{changecom} is void.
3017@end deffn
3018
3019@example
3020define(`comment', `COMMENT')
3021@result{}
3022# A normal comment
3023@result{}# A normal comment
3024changecom(`/*', `*/')
3025@result{}
3026# Not a comment anymore
3027@result{}# Not a COMMENT anymore
3028But: /* this is a comment now */ while this is not a comment
3029@result{}But: /* this is a comment now */ while this is not a COMMENT
3030@end example
3031
3032@cindex comments, copied to output
3033Note how comments are copied to the output, much as if they were quoted
3034strings. If you want the text inside a comment expanded, quote the
3035begin-comment delimiter.
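
A minimal sketch of this, assuming the default quote and comment
delimiters are in effect:

@example
define(`comment', `COMMENT')
@result{}
# an ordinary comment
@result{}# an ordinary comment
`#' no longer a comment
@result{}# no longer a COMMENT
@end example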
3036
3037Calling @code{changecom} without any arguments, or with @var{start} as
3038the empty string, will effectively disable the commenting mechanism. To
3039restore the original comment start of @samp{#}, you must explicitly ask
3040for it. If @var{start} is not empty, then an empty @var{end} will use
3041the default end-comment delimiter of newline, as otherwise, it would be
3042impossible to end a comment. However, this is not portable, as some
3043other @code{m4} implementations preserve the previous non-empty
3044delimiters instead.
3045
3046@example
3047define(`comment', `COMMENT')
3048@result{}
3049changecom
3050@result{}
3051# Not a comment anymore
3052@result{}# Not a COMMENT anymore
3053changecom(`#', `')
3054@result{}
3055# comment again
3056@result{}# comment again
3057@end example
3058
3059The comment strings can safely contain eight-bit characters.
3060@ignore
3061@comment Yuck. I know of no clean way to render an 8-bit character in
3062@comment both info and dvi. This example uses the `open-guillemot' and
3063@comment `close-guillemot' characters of the Latin-1 character set.
3064
3065@example
3066define(`a', `b')
3067@result{}
3068«a»
3069@result{}«b»
3070changecom(`«', `»')
3071@result{}
3072«a»
3073@result{}«a»
3074@end example
3075@end ignore
3076If no single character is appropriate, @var{start} and @var{end} can be
3077of any length. Other implementations cap the delimiter length to five
3078characters, but @acronym{GNU} has no inherent limit.
3079
3080Comments are recognized in preference to macros. However, this is not
compatible with other implementations, where macros and even quoting
take precedence over comments, so it may change in a future release.
3083For portability, this means that @var{start} should not begin with a
3084letter, digit, or @samp{_} (underscore), and that neither the
3085start-quote nor the start-comment string should be a prefix of the
3086other.
3087
3088@example
3089define(`hi', `HI')
3090@result{}
3091define(`hi1hi2', `hello')
3092@result{}
3093changecom(`q', `Q')
3094@result{}
3095q hi Q hi
3096@result{}q hi Q HI
3097changecom(`1', `2')
3098@result{}
3099hi1hi2
3100@result{}hello
3101hi 1hi2
3102@result{}HI 1hi2
3103@end example
3104
3105Comments are recognized in preference to argument collection. In
3106particular, if @var{start} is a single @samp{(}, then argument
3107collection is effectively disabled. For portability with other
3108implementations, it is a good idea to avoid @samp{(}, @samp{,}, and
3109@samp{)} as the first character in @var{start}.
3110
3111@example
3112define(`echo', `$#:$@@:')
3113@result{}
3114define(`hi', `HI')
3115@result{}
3116changecom(`(',`)')
3117@result{}
3118echo(hi)
3119@result{}0::(hi)
3120changecom
3121@result{}
3122changecom(`((', `))')
3123@result{}
3124echo(hi)
3125@result{}1:HI:
3126echo((hi))
3127@result{}0::((hi))
3128changecom(`,', `)')
3129@result{}
3130echo(hi,hi)bye)
3131@result{}1:HI,hi)bye:
3132@end example
3133
3134It is an error if the end of file occurs within a comment.
3135
3136@example
3137changecom(`/*', `*/')
3138@result{}
3139/*dangling comment
3140^D
3141@error{}m4:stdin:2: ERROR: end of file in comment
3142@end example
3143
3144@node Changeword
3145@section Changing the lexical structure of words
3146
3147@cindex lexical structure of words
3148@cindex words, lexical structure of
3149@quotation
The macro @code{changeword} and all associated functionality are
experimental. It is only available if the @option{--enable-changeword}
option was given to @code{configure}, at @acronym{GNU} @code{m4} installation
time. The functionality will go away in the future, to be replaced by
other new features that are more efficient at providing the same
capabilities. @emph{Do not rely on it}. Please report comments about
it the same way you would report bugs.
3157@end quotation
3158
3159A file being processed by @code{m4} is split into quoted strings, words
3160(potential macro names) and simple tokens (any other single character).
3161Initially a word is defined by the following regular expression:
3162
3163@comment ignore
3164@example
3165[_a-zA-Z][_a-zA-Z0-9]*
3166@end example
3167
3168Using @code{changeword}, you can change this regular expression:
3169
3170@deffn {Optional builtin} changeword (@var{regex})
3171Changes the regular expression for recognizing macro names to be
3172@var{regex}. If @var{regex} is empty, use
3173@samp{[_a-zA-Z][_a-zA-Z0-9]*}. @var{regex} must obey the constraint
3174that every prefix of the desired final pattern is also accepted by the
3175regular expression. If @var{regex} contains grouping parentheses, the
3176macro invoked is the portion that matched the first group, rather than
3177the entire matching string.
3178
3179The expansion of @code{changeword} is void.
3180The macro @code{changeword} is recognized only with parameters.
3181@end deffn
3182
3183Relaxing the lexical rules of @code{m4} might be useful (for example) if
3184you wanted to apply translations to a file of numbers:
3185
3186@example
3187ifdef(`changeword', `', `errprint(` skipping: no changeword support
3188')m4exit(`77')')dnl
3189changeword(`[_a-zA-Z0-9]+')
3190@result{}
3191define(`1', `0')1
3192@result{}0
3193@end example
3194
3195Tightening the lexical rules is less useful, because it will generally
3196make some of the builtins unavailable. You could use it to prevent
3197accidental call of builtins, for example:
3198
3199@example
3200ifdef(`changeword', `', `errprint(` skipping: no changeword support
3201')m4exit(`77')')dnl
3202define(`_indir', defn(`indir'))
3203@result{}
3204changeword(`_[_a-zA-Z0-9]*')
3205@result{}
3206esyscmd(`foo')
3207@result{}esyscmd(foo)
3208_indir(`esyscmd', `echo hi')
3209@result{}hi
3210@result{}
3211@end example
3212
3213Because @code{m4} constructs its words a character at a time, there
3214is a restriction on the regular expressions that may be passed to
3215@code{changeword}. This is that if your regular expression accepts
3216@samp{foo}, it must also accept @samp{f} and @samp{fo}.
3217
3218@example
3219ifdef(`changeword', `', `errprint(` skipping: no changeword support
3220')m4exit(`77')')dnl
3221define(`foo
3222', `bar
3223')
3224@result{}
3225dnl This example wants to recognize changeword, dnl, and `foo\n'.
3226dnl First, we check that our regexp will match.
3227regexp(`changeword', `[cd][a-z]*\|foo[
3228]')
3229@result{}0
3230regexp(`foo
3231', `[cd][a-z]*\|foo[
3232]')
3233@result{}0
3234regexp(`f', `[cd][a-z]*\|foo[
3235]')
3236@result{}-1
3237foo
3238@result{}foo
3239changeword(`[cd][a-z]*\|foo[
3240]')
3241@result{}
3242dnl Even though `foo\n' matches, we forgot to allow `f'.
3243foo
3244@result{}foo
3245changeword(`[cd][a-z]*\|fo*[
3246]?')
3247@result{}
3248dnl Now we can call `foo\n'.
3249foo
3250@result{}bar
3251@end example
3252
3253@ignore
3254@comment One more test of including newline in a macro name; but this
3255@comment does not need to be displayed in the manual. This ensures
3256@comment that line numbering is correct when dnl cuts across include
3257@comment file boundaries, and when __file__ or __line__ is the last
3258@comment token in an include file.
3259
3260@example
3261ifdef(`changeword', `', `errprint(` skipping: no changeword support
3262')m4exit(`77')')dnl
3263define(`bar
3264', defn(`dnl'))dnl
3265define(`baz', `dnl
3266include(`foo') ignored
3267dnl')dnl
3268changeword(`\([_a-zA-Z][_a-zA-Z0-9]*\|bar
3269\)')
3270@result{}
3271__file__:__line__
3272@result{}stdin:10
3273include(`foo') ignored
3274__file__:__line__
3275@result{}stdin:12
3276baz ignored
3277__file__:__line__
3278@result{}stdin:14
3279define(`bar
3280', defn(`__file__'))
3281@result{}
3282include(`foo')
3283@result{}../examples/foo
3284define(`bar
3285', defn(`__line__'))
3286@result{}
3287include(`foo')
3288@result{}1
3289__file__:__line__
3290@result{}stdin:21
3291@end example
3292@end ignore
3293
3294@code{changeword} has another function. If the regular expression
3295supplied contains any grouped subexpressions, then text outside
3296the first of these is discarded before symbol lookup. So:
3297
3298@example
3299ifdef(`changeword', `', `errprint(` skipping: no changeword support
3300')m4exit(`77')')dnl
3301ifdef(`__unix__', ,
3302 `errprint(` skipping: syscmd does not have unix semantics
3303')m4exit(`77')')dnl
3304changecom(`/*', `*/')dnl
3305define(`foo', `bar')dnl
3306changeword(`#\([_a-zA-Z0-9]*\)')
3307@result{}
3308#esyscmd(`echo foo \#foo')
3309@result{}foo bar
3310@result{}
3311@end example
3312
3313@code{m4} now requires a @samp{#} mark at the beginning of every
3314macro invocation, so one can use @code{m4} to preprocess plain
3315text without losing various words like @samp{divert}.
3316
3317In @code{m4}, macro substitution is based on text, while in @TeX{}, it
3318is based on tokens. @code{changeword} can throw this difference into
3319relief. For example, here is the same idea represented in @TeX{} and
3320@code{m4}. First, the @TeX{} version:
3321
3322@comment ignore
3323@example
3324\def\a@{\message@{Hello@}@}
3325\catcode`\@@=0
3326\catcode`\\=12
3327@@a
3328@@bye
3329@result{}Hello
3330@end example
3331
3332@noindent
3333Then, the @code{m4} version:
3334
3335@example
3336ifdef(`changeword', `', `errprint(` skipping: no changeword support
3337')m4exit(`77')')dnl
3338define(`a', `errprint(`Hello')')dnl
3339changeword(`@@\([_a-zA-Z0-9]*\)')
3340@result{}
3341@@a
3342@result{}errprint(Hello)
3343@end example
3344
3345In the @TeX{} example, the first line defines a macro @code{a} to
3346print the message @samp{Hello}. The second line defines @key{@@} to
3347be usable instead of @key{\} as an escape character. The third line
3348defines @key{\} to be a normal printing character, not an escape.
3349The fourth line invokes the macro @code{a}. So, when @TeX{} is run
3350on this file, it displays the message @samp{Hello}.
3351
3352When the @code{m4} example is passed through @code{m4}, it outputs
@samp{errprint(Hello)}. The reason for this is that @TeX{} does
lexical analysis of a macro definition when the macro is @emph{defined}.
3355@code{m4} just stores the text, postponing the lexical analysis until
3356the macro is @emph{used}.
3357
3358You should note that using @code{changeword} will slow @code{m4} down
3359by a factor of about seven, once it is changed to something other
3360than the default regular expression. You can invoke @code{changeword}
3361with the empty string to restore the default word definition, and regain
3362the parsing speed.
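
Restoring the default might look like the following sketch, which is
only meaningful when @code{changeword} support was compiled in.  The
macro name @samp{1} is just an illustration; after the reset its
definition is still stored, but the name can no longer be recognized as
a word.

@example
ifdef(`changeword', `', `errprint(` skipping: no changeword support
')m4exit(`77')')dnl
changeword(`[_a-zA-Z0-9]+')
@result{}
define(`1', `one')1
@result{}one
changeword(`')
@result{}
1
@result{}1
@end example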
3363
3364@node M4wrap
3365@section Saving text until end of input
3366
3367@cindex saving input
3368@cindex input, saving
It is possible to `save' some text, to be read again by @code{m4} once
the normal input has been exhausted. This feature is normally used to
initiate cleanup actions before normal exit, e.g., deleting temporary
files.
3374
3375To save input text, use the builtin @code{m4wrap}:
3376
3377@deffn Builtin m4wrap (@var{string}, @dots{})
3378Stores @var{string} in a safe place, to be reread when end of input is
3379reached. As a @acronym{GNU} extension, additional arguments are
3380concatenated with a space to the @var{string}.
3381
3382The expansion of @code{m4wrap} is void.
3383The macro @code{m4wrap} is recognized only with parameters.
3384@end deffn
3385
3386@example
3387define(`cleanup', `This is the `cleanup' action.
3388')
3389@result{}
3390m4wrap(`cleanup')
3391@result{}
3392This is the first and last normal input line.
3393@result{}This is the first and last normal input line.
3394^D
3395@result{}This is the cleanup action.
3396@end example
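
The @acronym{GNU} extension for additional arguments can be sketched in
the same way.  This is only an illustration, and it assumes that
@acronym{GNU} extensions have not been disabled.

@example
m4wrap(`Saved', `text
')
@result{}
^D
@result{}Saved text
@end example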
3397
3398The saved input is only reread when the end of normal input is seen, and
3399not if @code{m4exit} is used to exit @code{m4}.
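
A minimal sketch of this behavior, with an explicit exit status of 0
chosen only for the illustration:

@example
m4wrap(`This text is never reread.
')
@result{}
m4exit(`0')
@end example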
3400
3401@comment FIXME: this contradicts POSIX, which requires that "If the
3402@comment m4wrap macro is used multiple times, the arguments specified
3403@comment shall be processed in the order in which the m4wrap macros were
3404@comment processed."
3405It is safe to call @code{m4wrap} from saved text, but then the order in
3406which the saved text is reread is undefined. If @code{m4wrap} is not used
recursively, the saved pieces of text are reread in the reverse of the
order in which they were saved (LIFO---last in, first out). However, this
3409behavior is likely to change in a future release, to match
3410@acronym{POSIX}, so you should not depend on this order.
3411
3412Here is an example of implementing a factorial function using
3413@code{m4wrap}:
3414
3415@example
3416define(`f', `ifelse(`$1', `0', `Answer: 0!=1
3417', eval(`$1>1'), `0', `Answer: $2$1=eval(`$2$1')
3418', `m4wrap(`f(decr(`$1'), `$2$1*')')')')
3419@result{}
3420f(`10')
3421@result{}
3422^D
3423@result{}Answer: 10*9*8*7*6*5*4*3*2*1=3628800
3424@end example
3425
3426Invocations of @code{m4wrap} at the same recursion level are
3427concatenated and rescanned as usual:
3428
3429@example
3430define(`aa', `AA
3431')
3432@result{}
3433m4wrap(`a')m4wrap(`a')
3434@result{}
3435^D
3436@result{}AA
3437@end example
3438
3439@noindent
3440however, the transition between recursion levels behaves like an end of
3441file condition between two input files.
3442
3443@example
3444m4wrap(`m4wrap(`)')len(abc')
3445@result{}
3446^D
3447@error{}m4:stdin:1: ERROR: end of file in argument list
3448@end example
3449
3450@node File Inclusion
3451@chapter File inclusion
3452
3453@cindex file inclusion
3454@cindex inclusion, of files
3455
3456@code{m4} allows you to include named files at any point in the input.
3457
3458@menu
3459* Include:: Including named files
3460* Search Path:: Searching for include files
3461@end menu
3462
3463@node Include
3464@section Including named files
3465
3466There are two builtin macros in @code{m4} for including files:
3467
3468@deffn Builtin include (@var{file})
3469@deffnx Builtin sinclude (@var{file})
3470Both macros cause the file named @var{file} to be read by
3471@code{m4}. When the end of the file is reached, input is resumed from
3472the previous input file.
3473
3474The expansion of @code{include} and @code{sinclude} is therefore the
3475contents of @var{file}.
3476
3477If @var{file} does not exist (or cannot be read), the expansion is void,
3478and @code{include} will fail with an error while @code{sinclude} is
3479silent. The empty string counts as a file that does not exist.
3480
3481The macros @code{include} and @code{sinclude} are recognized only with
3482parameters.
3483@end deffn
3484
3485@example
3486include(`none')
3487@error{}m4:stdin:1: cannot open `none': No such file or directory
3488@result{}
3489include()
3490@error{}m4:stdin:2: cannot open `': No such file or directory
3491@result{}
3492sinclude(`none')
3493@result{}
3494sinclude()
3495@result{}
3496@end example
3497
3498The rest of this section assumes that @code{m4} is invoked with the
3499@option{-I} option (@pxref{Preprocessor features, , Invoking m4})
3500pointing to the @file{m4-@value{VERSION}/@/examples}
3501directory shipped as part of the @acronym{GNU} @code{m4} package. The
3502file @file{m4-@value{VERSION}/@/examples/@/incl.m4} in the distribution
3503contains the lines:
3504
3505@comment ignore
3506@example
3507Include file start
3508foo
3509Include file end
3510@end example
3511
3512Normally file inclusion is used to insert the contents of a file
3513into the input stream. The contents of the file will be read by
3514@code{m4} and macro calls in the file will be expanded:
3515
3516@example
3517define(`foo', `FOO')
3518@result{}
3519include(`incl.m4')
3520@result{}Include file start
3521@result{}FOO
3522@result{}Include file end
3523@result{}
3524@end example
3525
3526The fact that @code{include} and @code{sinclude} expand to the contents
3527of the file can be used to define macros that operate on entire files.
3528Here is an example, which defines @samp{bar} to expand to the contents
3529of @file{incl.m4}:
3530
3531@example
3532define(`bar', include(`incl.m4'))
3533@result{}
3534This is `bar': >>bar<<
3535@result{}This is bar: >>Include file start
3536@result{}foo
3537@result{}Include file end
3538@result{}<<
3539@end example
3540
3541This use of @code{include} is not trivial, though, as files can contain
3542quotes, commas, and parentheses, which can interfere with the way the
3543@code{m4} parser works. @acronym{GNU} @code{m4} seamlessly concatenates
3544the file contents with the next character, even if the included file
ended in the middle of a comment, string, or macro call. These
conditions are only treated as end of file errors when they occur in
input files named on the command line.
3548
3549In @acronym{GNU} @code{m4}, an alternative method of reading files is
3550using @code{undivert} (@pxref{Undivert}) on a named file.
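
As a rough sketch, and still assuming that @option{-I} points to the
@file{examples} directory, the file is then copied to the output without
being scanned for macro calls:

@example
define(`foo', `FOO')
@result{}
undivert(`incl.m4')
@result{}Include file start
@result{}foo
@result{}Include file end
@result{}
@end example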
3551
3552@node Search Path
3553@section Searching for include files
3554
3555@cindex search path for included files
3556@cindex included files, search path for
3557@cindex @acronym{GNU} extensions
@acronym{GNU} @code{m4} allows included files to be found in directories
other than the current working directory.
3560
3561If the @option{--prepend-include} or @option{-B} command-line option was
3562provided (@pxref{Preprocessor features, , Invoking m4}), those
directories are searched first, in the reverse of the order in which
those options were listed on the command line. Then @code{m4} looks in
the current working directory. Next come the directories specified with the
3566@option{--include} or @option{-I} option, in the order found on the
3567command line. Finally, if the @env{M4PATH} environment variable is set,
3568it is expected to contain a colon-separated list of directories, which
3569will be searched in order.
3570
3571If the automatic search for include-files causes trouble, the @samp{p}
3572debug flag (@pxref{Debug Levels}) can help isolate the problem.
3573
3574@node Diversions
3575@chapter Diverting and undiverting output
3576
3577Diversions are a way of temporarily saving output. The output of
3578@code{m4} can at any time be diverted to a temporary file, and be
3579reinserted into the output stream, @dfn{undiverted}, again at a later
3580time.
3581
3582Numbered diversions are counted from 0 upwards, diversion number 0
3583being the normal output stream. The number of simultaneous diversions
3584is limited mainly by the memory used to describe them, because @acronym{GNU}
3585@code{m4} tries to keep diversions in memory. However, there is a
3586limit to the overall memory usable by all diversions taken altogether
3587(512K, currently). When this maximum is about to be exceeded,
3588a temporary file is opened to receive the contents of the biggest
3589diversion still in memory, freeing this memory for other diversions.
3590When creating the temporary file, @code{m4} honors the value of the
3591environment variable @env{TMPDIR}, and falls back to @file{/tmp}.
3592So, it is theoretically possible that the number and aggregate size of
3593diversions is limited only by available disk space.
3594
3595@ignore
3596@comment We need to test spilled diversions, but don't need to expose
3597@comment this highly repetitive test in the manual.
3598
3599@example
3600divert(`-1')define(`f', `.')
3601define(`f', defn(`f')defn(`f'))
3602define(`f', defn(`f')defn(`f'))
3603define(`f', defn(`f')defn(`f'))
3604define(`f', defn(`f')defn(`f'))
3605define(`f', defn(`f')defn(`f'))
3606define(`f', defn(`f')defn(`f'))
3607define(`f', defn(`f')defn(`f'))
3608define(`f', defn(`f')defn(`f'))
3609define(`f', defn(`f')defn(`f'))
3610define(`f', defn(`f')defn(`f'))
3611define(`f', defn(`f')defn(`f'))
3612define(`f', defn(`f')defn(`f'))
3613define(`f', defn(`f')defn(`f'))
3614define(`f', defn(`f')defn(`f'))
3615define(`f', defn(`f')defn(`f'))
3616define(`f', defn(`f')defn(`f'))
3617define(`f', defn(`f')defn(`f'))
3618define(`f', defn(`f')defn(`f'))
3619define(`f', defn(`f')defn(`f'))
3620define(`f', defn(`f')defn(`f'))
3621divert`'dnl
3622len(f)
3623@result{}1048576
3624divert(`1')
3625f
3626divert(`-1')undivert
3627@end example
3628
3629@comment Another test of spilled diversions.
3630
3631@example
3632divert(`-1')define(`f', `.')
3633define(`f', defn(`f')defn(`f'))
3634define(`f', defn(`f')defn(`f'))
3635define(`f', defn(`f')defn(`f'))
3636define(`f', defn(`f')defn(`f'))
3637define(`f', defn(`f')defn(`f'))
3638define(`f', defn(`f')defn(`f'))
3639define(`f', defn(`f')defn(`f'))
3640define(`f', defn(`f')defn(`f'))
3641define(`f', defn(`f')defn(`f'))
3642define(`f', defn(`f')defn(`f'))
3643define(`f', defn(`f')defn(`f'))
3644define(`f', defn(`f')defn(`f'))
3645define(`f', defn(`f')defn(`f'))
3646define(`f', defn(`f')defn(`f'))
3647define(`f', defn(`f')defn(`f'))
3648define(`f', defn(`f')defn(`f'))
3649define(`f', defn(`f')defn(`f'))
3650define(`f', defn(`f')defn(`f'))
3651define(`f', defn(`f')defn(`f'))
3652define(`f', defn(`f')defn(`f'))
3653divert`'dnl
3654len(f)
3655@result{}1048576
3656divert(`1')
3657f
3658m4exit
3659@end example
3660@end ignore
3661
3662Diversions make it possible to generate output in a different order than
3663the input was read. This makes it possible to implement a topological
3664sort of dependencies. For example, @acronym{GNU} Autoconf makes use of
3665diversions under the hood to ensure that the expansion of a prerequisite
3666macro appears in the output prior to the expansion of a dependent macro,
3667regardless of which order the two macros were invoked in the user's
3668input file.
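
As a minimal sketch of such reordering (the words @samp{header} and
@samp{body} are placeholder text, not taken from any real input), text
can be written in one order and emitted in another simply by choosing
the diversion numbers accordingly:

@example
divert(`2')dnl
body
divert(`1')dnl
header
divert`'dnl
^D
@result{}header
@result{}body
@end example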
3669
3670@menu
3671* Divert:: Diverting output
3672* Undivert:: Undiverting output
3673* Divnum:: Diversion numbers
3674* Cleardivert:: Discarding diverted text
3675@end menu
3676
3677@node Divert
3678@section Diverting output
3679
3680@cindex diverting output to files
3681@cindex output, diverting to files
3682@cindex files, diverting output to
3683Output is diverted using @code{divert}:
3684
3685@deffn Builtin divert (@dvar{number, 0})
3686The current diversion is changed to @var{number}. If @var{number} is left
3687out or empty, it is assumed to be zero. If @var{number} cannot be
3688parsed, the diversion is unchanged.
3689
3690The expansion of @code{divert} is void.
3691@end deffn
3692
3693When all the @code{m4} input has been processed, all existing
3694diversions are automatically undiverted, in numerical order.
3695
3696@example
3697divert(`1')
3698This text is diverted.
3699divert
3700@result{}
3701This text is not diverted.
3702@result{}This text is not diverted.
3703^D
3704@result{}
3705@result{}This text is diverted.
3706@end example
3707
3708Several calls of @code{divert} with the same argument do not overwrite
3709the previous diverted text, but append to it. Diversions are printed
3710after any wrapped text is expanded.
3711
3712@example
3713define(`text', `TEXT')
3714@result{}
3715divert(`1')`diverted text.'
3716divert
3717@result{}
3718m4wrap(`Wrapped text precedes ')
3719@result{}
3720^D
3721@result{}Wrapped TEXT precedes diverted text.
3722@end example
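
The appending behavior can also be seen in isolation. The following is
only a sketch, with @samp{first} and @samp{second} as illustrative text:
diversion 1 is selected twice, and a later @code{undivert} brings back
both pieces in the order they were written.

@example
divert(`1')first
divert`'dnl
divert(`1')second
divert`'dnl
undivert(`1')dnl
@result{}first
@result{}second
@end example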
3723
3724If output is diverted to a negative diversion, it is simply discarded.
3725This can be used to suppress unwanted output. A common example of
3726unwanted output is the trailing newlines after macro definitions. Here
3727is a common programming idiom in @code{m4} for avoiding them.
3728
3729@example
3730divert(`-1')
3731define(`foo', `Macro `foo'.')
3732define(`bar', `Macro `bar'.')
3733divert
3734@result{}
3735@end example
3736
3737@cindex @acronym{GNU} extensions
3738Traditional implementations only supported ten diversions. But as a
3739@acronym{GNU} extension, diversion numbers can be as large as positive
3740integers will allow, rather than treating a multi-digit diversion number
3741as a request to discard text.
3742
3743@example
3744divert(eval(`1<<28'))world
3745divert(`2')hello
3746^D
3747@result{}hello
3748@result{}world
3749@end example
3750
3751Note that @code{divert} is an English word, but also an active macro
3752without arguments. When processing plain text, the word might appear in
3753normal text and be unintentionally swallowed as a macro invocation. One
3754way to avoid this is to use the @option{-P} option to rename all
3755builtins (@pxref{Operation modes, , Invoking m4}). Another is to write
3756a wrapper that requires a parameter to be recognized.
3757
3758@example
3759We decided to divert the stream for irrigation.
3760@result{}We decided to the stream for irrigation.
3761define(`divert', `ifelse(`$#', `0', ``$0'', `builtin(`$0', $@@)')')
3762@result{}
3763divert(`-1')
3764Ignored text.
3765divert(`0')
3766@result{}
3767We decided to divert the stream for irrigation.
3768@result{}We decided to divert the stream for irrigation.
3769@end example
3770
3771@node Undivert
3772@section Undiverting output
3773
3774Diverted text can be undiverted explicitly using the builtin
3775@code{undivert}:
3776
3777@deffn Builtin undivert (@ovar{diversions@dots{}})
3778Undiverts the numeric @var{diversions} given by the arguments, in the
3779order given. If no arguments are supplied, all diversions are
3780undiverted, in numerical order.
3781
3782As a @acronym{GNU} extension, @var{diversions} may contain non-numeric
3783strings, which are treated as the names of files to copy into the output
3784without expansion. A warning is issued if a file could not be opened.
3785
3786The expansion of @code{undivert} is void.
3787@end deffn
3788
3789@example
3790divert(`1')
3791This text is diverted.
3792divert
3793@result{}
3794This text is not diverted.
3795@result{}This text is not diverted.
3796undivert(`1')
3797@result{}
3798@result{}This text is diverted.
3799@result{}
3800@end example
3801
3802Notice the last two blank lines. One of them comes from the newline
3803following @code{undivert}, the other from the newline that followed the
3804@code{divert}! A diversion often starts with a blank line like this.
3805
3806When diverted text is undiverted, it is @emph{not} reread by @code{m4},
3807but rather copied directly to the current output, and it is therefore
3808not an error to undivert into a diversion. Undiverting the empty string
3809is the same as specifying diversion 0; in either case nothing happens
3810since the output has already been flushed.
3811
3812@example
3813divert(`1')diverted text
3814divert
3815@result{}
3816undivert()
3817@result{}
3818undivert(`0')
3819@result{}
3820undivert
3821@result{}diverted text
3822@result{}
3823@end example
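
Undiverting into another diversion can be sketched as follows (the
diverted words are illustrative). While diversion 2 is current, the
contents of diversion 1 are appended to it rather than sent to the
output, so they come out only when diversion 2 itself is flushed at the
end of input:

@example
divert(`1')one
divert(`2')two
undivert(`1')dnl
divert`'dnl
^D
@result{}two
@result{}one
@end example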
3824
3825When a diversion has been undiverted, the diverted text is discarded,
3826and it is not possible to bring back diverted text more than once.
3827
3828@example
3829divert(`1')
3830This text is diverted first.
3831divert(`0')undivert(`1')dnl
3832@result{}
3833@result{}This text is diverted first.
3834undivert(`1')
3835@result{}
3836divert(`1')
3837This text is also diverted but not appended.
3838divert(`0')undivert(`1')dnl
3839@result{}
3840@result{}This text is also diverted but not appended.
3841@end example
3842
3843Attempts to undivert the current diversion are silently ignored. Thus,
3844when the current diversion is not 0, the current diversion does not get
3845rearranged among the other diversions.
3846
3847@example
3848divert(`1')one
3849divert(`2')two
3850divert(`3')three
3851divert(`2')undivert`'dnl
3852divert`'undivert`'dnl
3853@result{}two
3854@result{}one
3855@result{}three
3856@end example
3857
3858@cindex @acronym{GNU} extensions
3859@cindex file inclusion
3860@cindex inclusion, of files
3861@acronym{GNU} @code{m4} allows named files to be undiverted. Given a
3862non-numeric
3863argument, the contents of the file named will be copied, uninterpreted, to
3864the current output. This complements the builtin @code{include}
3865(@pxref{Include}). To illustrate the difference, the file
3866@file{m4-@value{VERSION}/@/examples/@/foo} contains the word @samp{bar}:
3867
3868@example
3869define(`bar', `BAR')
3870@result{}
3871undivert(`foo')
3872@result{}bar
3873@result{}
3874include(`foo')
3875@result{}BAR
3876@result{}
3877@end example
3878
3879If the file is not found (or cannot be read), an error message is
3880issued, and the expansion is void.
3881
3882@node Divnum
3883@section Diversion numbers
3884
3885@cindex diversion numbers
3886The current diversion is tracked by the builtin @code{divnum}:
3887
3888@deffn Builtin divnum
3889Expands to the number of the current diversion.
3890@end deffn
3891
3892@example
3893Initial divnum
3894@result{}Initial 0
3895divert(`1')
3896Diversion one: divnum
3897divert(`2')
3898Diversion two: divnum
3899^D
3900@result{}
3901@result{}Diversion one: 1
3902@result{}
3903@result{}Diversion two: 2
3904@end example
3905
3906@node Cleardivert
3907@section Discarding diverted text
3908
3909@cindex discarding diverted text
3910@cindex diverted text, discarding
3911Often it is not known, when output is diverted, whether the diverted
3912text is actually needed. Since all non-empty diversions are brought back
3913on the main output stream when the end of input is seen, a method of
3914discarding a diversion is needed. If all diversions should be
3915discarded, the easiest is to end the input to @code{m4} with
3916@samp{divert(`-1')} followed by an explicit @samp{undivert}:
3917
3918@example
3919divert(`1')
3920Diversion one: divnum
3921divert(`2')
3922Diversion two: divnum
3923divert(`-1')
3924undivert
3925^D
3926@end example
3927
3928@noindent
3929No output is produced at all.
3930
3931Clearing selected diversions can be done with the following macro:
3932
3933@deffn Composite cleardivert (@ovar{diversions@dots{}})
3934Discard the contents of each of the listed numeric @var{diversions}.
3935@end deffn
3936
3937@example
3938define(`cleardivert',
3939`pushdef(`_n', divnum)divert(`-1')undivert($@@)divert(_n)popdef(`_n')')
3940@result{}
3941@end example
3942
3943It is called just like @code{undivert}, but the effect is to clear the
3944diversions, given by the arguments. (This macro has a nasty bug! You
3945should try to see if you can find it and correct it; or @pxref{Improved
3946cleardivert, , Answers}).
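
As a usage sketch, assuming the definition shown above and an explicit
diversion number (the words @samp{one} and @samp{two} are illustrative),
clearing diversion 1 leaves diversion 2 untouched, so only the latter
appears at the end of input:

@example
define(`cleardivert',
`pushdef(`_n', divnum)divert(`-1')undivert($@@)divert(_n)popdef(`_n')')
@result{}
divert(`1')one
divert(`2')two
divert`'dnl
cleardivert(`1')
@result{}
^D
@result{}two
@end example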
3947
3948@node Text handling
3949@chapter Macros for text handling
3950
3951There are a number of builtins in @code{m4} for manipulating text in
3952various ways, extracting substrings, searching, substituting, and so on.
3953
3954@menu
3955* Len:: Calculating length of strings
3956* Index macro:: Searching for substrings
3957* Regexp:: Searching for regular expressions
3958* Substr:: Extracting substrings
3959* Translit:: Translating characters
3960* Patsubst:: Substituting text by regular expression
3961* Format:: Formatting strings (printf-like)
3962@end menu
3963
3964@node Len
3965@section Calculating length of strings
3966
3967@cindex length of strings
3968@cindex strings, length of
3969The length of a string can be calculated by @code{len}:
3970
3971@deffn Builtin len (@var{string})
3972Expands to the length of @var{string}, as a decimal number.
3973
3974The macro @code{len} is recognized only with parameters.
3975@end deffn
3976
3977@example
3978len()
3979@result{}0
3980len(`abcdef')
3981@result{}6
3982@end example
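
Because arguments are expanded before @code{len} sees them, quoting
decides what is measured. In this small sketch, @code{greeting} is a
macro invented for the illustration; the last line shows that the bare
word @code{len}, without parameters, is not recognized as a macro call.

@example
define(`greeting', `hello')
@result{}
len(greeting)
@result{}5
len(`greeting')
@result{}8
len
@result{}len
@end example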
3983
3984@node Index macro
3985@section Searching for substrings
3986
3987Searching for substrings is done with @code{index}:
3988
3989@deffn Builtin index (@var{string}, @var{substring})
3990Expands to the index of the first occurrence of @var{substring} in
3991@var{string}. The first character in @var{string} has index 0. If
3992@var{substring} does not occur in @var{string}, @code{index} expands to
3993@samp{-1}.
3994
3995The macro @code{index} is recognized only with parameters.
3996@end deffn
3997
3998@example
3999index(`gnus, gnats, and armadillos', `nat')
4000@result{}7
4001index(`gnus, gnats, and armadillos', `dag')
4002@result{}-1
4003@end example
4004
4005Omitting @var{substring} evokes a warning, but still produces output.
4006
4007@example
4008index(`abc')
4009@error{}m4:stdin:1: Warning: too few arguments to builtin `index'
4010@result{}0
4011@end example
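
Since a missing substring yields @samp{-1}, the result of @code{index}
can be compared directly in @code{ifelse} to test for containment. The
strings below are illustrative only:

@example
index(`banana', `an')
@result{}1
ifelse(index(`banana', `nab'), `-1', `absent', `present')
@result{}absent
@end example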
4012
4013@node Regexp
4014@section Searching for regular expressions
4015
4016@cindex regular expressions
4017@cindex @acronym{GNU} extensions
4018Searching for regular expressions is done with the builtin
4019@code{regexp}:
4020
4021@deffn Builtin regexp (@var{string}, @var{regexp}, @ovar{replacement})
4022Searches for @var{regexp} in @var{string}. The syntax for regular
4023expressions is the same as in @acronym{GNU} Emacs.
4024@ifnothtml
4025@xref{Regexps, , Syntax of Regular Expressions, emacs, The GNU Emacs
4026Manual}.
4027@end ifnothtml
4028@ifhtml
4029See
4030@uref{http://www.gnu.org/@/software/@/emacs/@/manual/@/emacs.html#Regexps,
4031Syntax of Regular Expressions} in the @acronym{GNU} Emacs Manual.
4032@end ifhtml
4033
4034If @var{replacement} is omitted, @code{regexp} expands to the index of
4035the first match of @var{regexp} in @var{string}. If @var{regexp} does
4036not match anywhere in @var{string}, it expands to -1.
4037
4038If @var{replacement} is supplied, and there was a match, @code{regexp}
4039changes the expansion to this argument, with @samp{\@var{n}} substituted
4040by the text matched by the @var{n}th parenthesized sub-expression of
4041@var{regexp}, up to nine sub-expressions. The escape @samp{\&} is
4042replaced by the text of the entire regular expression matched. For
4043all other characters, @samp{\} treats the next character literally. A
4044warning is issued if there were fewer sub-expressions than the
4045@samp{\@var{n}} requested, or if there is a trailing @samp{\}. If there
4046was no match, @code{regexp} expands to the empty string.
4047
4048The macro @code{regexp} is recognized only with parameters.
4049@end deffn
4050
4051@example
4052regexp(`GNUs not Unix', `\<[a-z]\w+')
4053@result{}5
4054regexp(`GNUs not Unix', `\<Q\w*')
4055@result{}-1
4056regexp(`GNUs not Unix', `\w\(\w+\)$', `*** \& *** \1 ***')
4057@result{}*** Unix *** nix ***
4058regexp(`GNUs not Unix', `\<Q\w*', `*** \& *** \1 ***')
4059@result{}
4060@end example
4061
4062Here are some more examples on the handling of backslash:
4063
4064@example
4065regexp(`abc', `\(b\)', `\\\10\a')
4066@result{}\b0a
4067regexp(`abc', `b', `\1\')
4068@error{}m4:stdin:2: Warning: sub-expression 1 not present
4069@error{}m4:stdin:2: Warning: trailing \ ignored in replacement
4070@result{}
4071regexp(`abc', `\(\(d\)?\)\(c\)', `\1\2\3\4\5\6')
4072@error{}m4:stdin:3: Warning: sub-expression 4 not present
4073@error{}m4:stdin:3: Warning: sub-expression 5 not present
4074@error{}m4:stdin:3: Warning: sub-expression 6 not present
4075@result{}c
4076@end example
4077
4078Omitting @var{regexp} evokes a warning, but still produces output.
4079
4080@example
4081regexp(`abc')
4082@error{}m4:stdin:1: Warning: too few arguments to builtin `regexp'
4083@result{}0
4084@end example
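
As a sketch of a more practical use, sub-expressions can pick fields out
of a string. The input text and the wording of the replacement are
made up for this illustration; the first call returns the index of the
first digit, the second extracts the first two numeric components.

@example
regexp(`version 1.4.8', `[0-9]')
@result{}8
regexp(`version 1.4.8', `\([0-9]+\)\.\([0-9]+\)', `major \1, minor \2')
@result{}major 1, minor 4
@end example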
4085
4086@node Substr
4087@section Extracting substrings
4088
4089@cindex extracting substrings
4090@cindex substrings, extracting
4091Substrings are extracted with @code{substr}:
4092
4093@deffn Builtin substr (@var{string}, @var{from}, @ovar{length})
4094Expands to the substring of @var{string}, which starts at index
4095@var{from}, and extends for @var{length} characters, or to the end of
4096@var{string}, if @var{length} is omitted. The starting index of a string
4097is always 0. The expansion is empty if there is an error parsing
4098@var{from} or @var{length}, if @var{from} is beyond the end of
4099@var{string}, or if @var{length} is negative.
4100
4101The macro @code{substr} is recognized only with parameters.
4102@end deffn
4103
4104@example
4105substr(`gnus, gnats, and armadillos', `6')
4106@result{}gnats, and armadillos
4107substr(`gnus, gnats, and armadillos', `6', `5')
4108@result{}gnats
4109@end example
4110
4111Omitting @var{from} evokes a warning, but still produces output.
4112
4113@example
4114substr(`abc')
4115@error{}m4:stdin:1: Warning: too few arguments to builtin `substr'
4116@result{}abc
4117substr(`abc',)
4118@error{}m4:stdin:2: empty string treated as 0 in builtin `substr'
4119@result{}abc
4120@end example
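
The starting index is often computed rather than written out. In this
sketch (the strings are illustrative), @code{index} supplies @var{from};
the second call shows the empty expansion when @var{from} lies beyond
the end of the string.

@example
substr(`hello world', index(`hello world', `world'))
@result{}world
substr(`abc', `5')
@result{}
@end example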
4121
4122@node Translit
4123@section Translating characters
4124
4125@cindex translating characters
4126@cindex characters, translating
4127Character translation is done with @code{translit}:
4128
4129@deffn Builtin translit (@var{string}, @var{chars}, @ovar{replacement})
4130Expands to @var{string}, with each character that occurs in
4131@var{chars} translated into the character from @var{replacement} with
4132the same index.
4133
4134If @var{replacement} is shorter than @var{chars}, the excess characters
4135of @var{chars} are deleted from the expansion; if @var{chars} is
4136shorter, the excess characters in @var{replacement} are silently
4137ignored. If @var{replacement} is omitted, all characters in
4138@var{string} that are present in @var{chars} are deleted from the
4139expansion. If a character appears more than once in @var{chars}, only
4140the first instance is used in making the translation. Only a single
4141translation pass is made, even if characters in @var{replacement} also
4142appear in @var{chars}.
4143
4144As a @acronym{GNU} extension, both @var{chars} and @var{replacement} can
4145contain character-ranges, e.g., @samp{a-z} (meaning all lowercase
4146letters) or @samp{0-9} (meaning all digits). To include a dash @samp{-}
4147in @var{chars} or @var{replacement}, place it first or last in the
4148entire string, or as the last character of a range. Back-to-back ranges
4149can share a common endpoint. It is not an error for the last character
4150in the range to be `smaller' than the first. In that case, the range
4151runs backwards, i.e., @samp{9-0} means the string @samp{9876543210}.
4152The expansion of a range is dependent on the underlying encoding of
4153characters, so using ranges is not always portable between machines.
4154
4155The macro @code{translit} is recognized only with parameters.
4156@end deffn
4157
4158@example
4159translit(`GNUs not Unix', `A-Z')
4160@result{}s not nix
4161translit(`GNUs not Unix', `a-z', `A-Z')
4162@result{}GNUS NOT UNIX
4163translit(`GNUs not Unix', `A-Z', `z-a')
4164@result{}tmfs not fnix
4165translit(`+,-12345', `+--1-5', `<;>a-c-a')
4166@result{}<;>abcba
4167translit(`abcdef', `aabdef', `bcged')
4168@result{}bgced
4169@end example
4170
4171In the @sc{ascii} encoding, the first example deletes all uppercase
4172letters, the second converts lowercase to uppercase, and the third
4173`mirrors' all uppercase letters, while converting them to lowercase.
4174The first two cases are by far the most common, even though they are not
4175portable to @sc{ebcdic} or other encodings. The fourth example shows a
4176range ending in @samp{-}, as well as back-to-back ranges. The final
4177example shows that @samp{a} is mapped to @samp{b}, not @samp{c}; the
4178resulting @samp{b} is not further remapped to @samp{g}; the @samp{d} and
4179@samp{e} are swapped, and the @samp{f} is discarded.
4180
4181@ignore
4182@comment No need to fight 8-bit characters, as it is difficult to get
4183@comment rendering right in both info and dvi.
4184
4185@example
4186translit(`«abc~', `~-»')
4187@result{}abc
4188@end example
4189@end ignore
4190
4191Omitting @var{chars} evokes a warning, but still produces output.
4192
4193@example
4194translit(`abc')
4195@error{}m4:stdin:1: Warning: too few arguments to builtin `translit'
4196@result{}abc
4197@end example
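
Back-to-back ranges make simple substitution ciphers easy to express.
The macro @code{rot13} below is a hypothetical helper written for this
sketch, not a builtin, and like the other range examples it assumes the
@sc{ascii} encoding. Applying it twice restores the original text.

@example
define(`rot13', `translit(`$1', `a-zA-Z', `n-za-mN-ZA-M')')
@result{}
rot13(`Hello')
@result{}Uryyb
rot13(rot13(`Hello'))
@result{}Hello
@end example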
4198
4199@node Patsubst
4200@section Substituting text by regular expression
4201
4202@cindex regular expressions
4203@cindex pattern substitution
4204@cindex substitution by regular expression
4205@cindex @acronym{GNU} extensions
4206Global substitution in a string is done by @code{patsubst}:
4207
4208@deffn Builtin patsubst (@var{string}, @var{regexp}, @ovar{replacement})
4209Searches @var{string} for matches of @var{regexp}, and substitutes
4210@var{replacement} for each match. The syntax for regular expressions
4211is the same as in @acronym{GNU} Emacs (@pxref{Regexp}).
4212
4213The parts of @var{string} that are not covered by any match of
4214@var{regexp} are copied to the expansion. Whenever a match is found, the
4215search proceeds from the end of the match, so a character from
4216@var{string} will never be substituted twice. If @var{regexp} matches a
4217string of zero length, the start position for the search is incremented,
4218to avoid infinite loops.
4219
4220When a replacement is to be made, @var{replacement} is inserted into
4221the expansion, with @samp{\@var{n}} substituted by the text matched by
4222the @var{n}th parenthesized sub-expression of @var{regexp}, for up to
4223nine sub-expressions. The escape @samp{\&} is replaced by the text of
4224the entire regular expression matched. For all other characters,
4225@samp{\} treats the next character literally. A warning is issued if
4226there were fewer sub-expressions than the @samp{\@var{n}} requested, or
4227if there is a trailing @samp{\}.
4228
4229The @var{replacement} argument can be omitted, in which case the text
4230matched by @var{regexp} is deleted.
4231
4232The macro @code{patsubst} is recognized only with parameters.
4233@end deffn
4234
4235@example
4236patsubst(`GNUs not Unix', `^', `OBS: ')
4237@result{}OBS: GNUs not Unix
4238patsubst(`GNUs not Unix', `\<', `OBS: ')
4239@result{}OBS: GNUs OBS: not OBS: Unix
4240patsubst(`GNUs not Unix', `\w*', `(\&)')
4241@result{}(GNUs)() (not)() (Unix)()
4242patsubst(`GNUs not Unix', `\w+', `(\&)')
4243@result{}(GNUs) (not) (Unix)
4244patsubst(`GNUs not Unix', `[A-Z][a-z]+')
4245@result{}GN not@w{ }
4246patsubst(`GNUs not Unix', `not', `NOT\')
4247@error{}m4:stdin:6: Warning: trailing \ ignored in replacement
4248@result{}GNUs NOT Unix
4249@end example
4250
4251Here is a slightly more realistic example, which capitalizes individual
4252words or whole sentences, by substituting calls of the macros
4253@code{upcase} and @code{downcase} into the strings.
4254
4255@deffn Composite upcase (@var{text})
4256@deffnx Composite downcase (@var{text})
4257@deffnx Composite capitalize (@var{text})
4258Expand to @var{text}, but with capitalization changed: @code{upcase}
4259changes all letters to upper case, @code{downcase} changes all letters
4260to lower case, and @code{capitalize} changes the first character of each
4261word to upper case and the remaining characters to lower case.
4262@end deffn
4263
4264@example
4265define(`upcase', `translit(`$*', `a-z', `A-Z')')dnl
4266define(`downcase', `translit(`$*', `A-Z', `a-z')')dnl
4267define(`capitalize1',
4268 `regexp(`$1', `^\(\w\)\(\w*\)',
4269 `upcase(`\1')`'downcase(`\2')')')dnl
4270define(`capitalize',
4271 `patsubst(`$1', `\w+', `capitalize1(`\&')')')dnl
4272capitalize(`GNUs not Unix')
4273@result{}Gnus Not Unix
4274@end example
4275
4276While @code{regexp} replaces the whole input with the replacement as
4277soon as there is a match, @code{patsubst} replaces each
4278@emph{occurrence} of a match and preserves non-matching pieces:
4279
4280@example
4281define(`patreg',
4282`patsubst($@@)
4283regexp($@@)')dnl
4284patreg(`bar foo baz Foo', `foo\|Foo', `FOO')
4285@result{}bar FOO baz FOO
4286@result{}FOO
4287patreg(`aba abb 121', `\(.\)\(.\)\1', `\2\1\2')
4288@result{}bab abb 212
4289@result{}bab
4290@end example
4291
4292Omitting @var{regexp} evokes a warning, but still produces output.
4293
4294@example
4295patsubst(`abc')
4296@error{}m4:stdin:1: Warning: too few arguments to builtin `patsubst'
4297@result{}abc
4298@end example
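
Two further sketches, using made-up input strings, show a plain
separator replacement and the use of sub-expressions to reorder the
matched text:

@example
patsubst(`a,b,c', `,', `; ')
@result{}a; b; c
patsubst(`key=value', `\(\w+\)=\(\w+\)', `\2=\1')
@result{}value=key
@end example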
4299
4300@node Format
4301@section Formatted output
4302
4303@cindex formatted output
4304@cindex output, formatted
4305@cindex @acronym{GNU} extensions
4306Formatted output can be made with @code{format}:
4307
4308@deffn Builtin format (@var{format-string}, @dots{})
4309Works much like the C function @code{printf}. The first argument
4310@var{format-string} can contain @samp{%} specifications which are
4311satisfied by additional arguments, and the expansion of @code{format} is
4312the formatted string.
4313
4314The macro @code{format} is recognized only with parameters.
4315@end deffn
4316
4317Its use is best described by a few examples:
4318
4319@example
4320define(`foo', `The brown fox jumped over the lazy dog')
4321@result{}
4322format(`The string "%s" uses %d characters', foo, len(foo))
4323@result{}The string "The brown fox jumped over the lazy dog" uses 38 characters
4324format(`%.0f', `56789.9876')
4325@result{}56790
4326len(format(`%-*X', `300', `1'))
4327@result{}300
4328@end example
4329
4330Using the @code{forloop} macro defined earlier (@pxref{Forloop}), this
4331example shows how @code{format} can be used to produce tabular output.
4332
4333@example
4334include(`forloop.m4')
4335@result{}
4336forloop(`i', `1', `10', `format(`%6d squared is %10d
4337', i, eval(i**2))')
4338@result{} 1 squared is 1
4339@result{} 2 squared is 4
4340@result{} 3 squared is 9
4341@result{} 4 squared is 16
4342@result{} 5 squared is 25
4343@result{} 6 squared is 36
4344@result{} 7 squared is 49
4345@result{} 8 squared is 64
4346@result{} 9 squared is 81
4347@result{} 10 squared is 100
4348@result{}
4349@end example
4350
4351The builtin @code{format} is modeled after the ANSI C @samp{printf}
4352function, and supports these @samp{%} specifiers: @samp{c},
4353@samp{s}, @samp{d}, @samp{o}, @samp{x}, @samp{X}, @samp{u}, @samp{e},
4354@samp{E}, @samp{f}, @samp{F}, @samp{g}, @samp{G}, and @samp{%}; it
4355supports field widths and precisions, and the
4356modifiers @samp{+}, @samp{-}, @samp{@w{ }}, @samp{0}, @samp{#}, @samp{h} and
4357@samp{l}. For more details on the functioning of @code{printf}, see the
4358C Library Manual.
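
The following sketch exercises a few of these specifiers and modifiers
with arbitrary sample values, assuming they behave as they do in
@code{printf}: zero padding, left and right justification within a field
width, alternate radices, and a precision.

@example
format(`%05d', `42')
@result{}00042
format(`%-6s|%6s|', `ab', `cd')
@result{}ab    |    cd|
format(`%x %X %o', `255', `255', `8')
@result{}ff FF 10
format(`%.2f', `3.14159')
@result{}3.14
@end example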
4359
4360For now, unrecognized specifiers are silently ignored, but it is
4361anticipated that a future release of @acronym{GNU} @code{m4} will support more
4362specifiers, and give warnings when problems are encountered. Likewise,
4363escape sequences are not yet recognized.
4364
4365@node Arithmetic
4366@chapter Macros for doing arithmetic
4367
4368@cindex arithmetic
4369@cindex integer arithmetic
4370Integer arithmetic is included in @code{m4}, with a C-like syntax. As
4371convenient shorthands, there are builtins for simple increment and
4372decrement operations.
4373
4374@menu
4375* Incr:: Decrement and increment operators
4376* Eval:: Evaluating integer expressions
4377@end menu
4378
4379@node Incr
4380@section Decrement and increment operators
4381
4382@cindex decrement operator
4383@cindex increment operator
4384Increment and decrement of integers are supported using the builtins
4385@code{incr} and @code{decr}:
4386
4387@deffn Builtin incr (@var{number})
4388@deffnx Builtin decr (@var{number})
4389Expand to the numerical value of @var{number}, incremented
4390or decremented, respectively, by one. Except for the empty string, the
4391expansion is empty if @var{number} could not be parsed.
4392
4393The macros @code{incr} and @code{decr} are recognized only with
4394parameters.
4395@end deffn
4396
4397@example
4398incr(`4')
4399@result{}5
4400decr(`7')
4401@result{}6
4402incr()
4403@error{}m4:stdin:3: empty string treated as 0 in builtin `incr'
4404@result{}1
4405decr()
4406@error{}m4:stdin:4: empty string treated as 0 in builtin `decr'
4407@result{}-1
4408@end example
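
A common use is to maintain a counter by redefining a macro in terms of
its own current value. The macro name @code{count} is invented for this
sketch; it is left unquoted in the second argument so that it expands
before @code{incr} is applied.

@example
define(`count', `0')
@result{}
define(`count', incr(count))
@result{}
count
@result{}1
@end example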
4409
4410@node Eval
4411@section Evaluating integer expressions
4412
4413@cindex integer expression evaluation
4414@cindex evaluation, of integer expressions
4415@cindex expressions, evaluation of integer
4416Integer expressions are evaluated with @code{eval}:
4417
4418@deffn Builtin eval (@var{expression}, @dvar{radix, 10}, @ovar{width})
4419Expands to the value of @var{expression}. The expansion is empty
4420if an error is encountered while parsing the arguments. If specified,
4421@var{radix} and @var{width} control the format of the output.
4422
4423The macro @code{eval} is recognized only with parameters.
4424@end deffn
4425
4426Expressions can contain the following operators, listed in order of
4427decreasing precedence.
4428
4429@table @code
4430@item + -
4431Unary plus and minus
4432@item **
4433Exponentiation
4434@item * / %
4435Multiplication, division and modulo
4436@item + -
4437Addition and subtraction
4438@item << >>
4439Shift left or right
4440@item == != > >= < <=
4441Relational operators
4442@item !
4443Logical negation
4444@item ~
4445Bitwise negation
4446@item &
4447Bitwise and
4448@item ^
4449Bitwise exclusive-or
4450@item |
4451Bitwise or
4452@item &&
4453Logical and
4454@item ||
4455Logical or
4456@end table
4457
4458All operators, except exponentiation, are left associative.
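
A few sketches show how these rules play out, using only operators from
the table above. Exponentiation groups from the right, whereas the
other binary operators group from the left, and division truncates to
an integer:

@example
eval(`2 ** 3 ** 2')
@result{}512
eval(`(2 ** 3) ** 2')
@result{}64
eval(`7 / 2 * 2')
@result{}6
eval(`10 - 4 - 3')
@result{}3
eval(`1 + 2 * 3')
@result{}7
@end example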
4459
4460Note that some older @code{m4} implementations use @samp{^} as an
4461alternate operator for exponentiation, although @acronym{POSIX} requires
4462the C behavior of bitwise exclusive-or. On the other hand, the
4463precedence of @samp{~} and @samp{!} is different in @acronym{GNU}
4464@code{m4} than
4465it is in C, matching the precedence in traditional @code{m4}
4466implementations. This behavior is likely to change in a future
4467version to match @acronym{POSIX}, so use parentheses to force the
4468desired precedence.
4469
4470Within @var{expression} (but not @var{radix} or @var{width}),
4471numbers without a special prefix are decimal. A simple @samp{0}
4472prefix introduces an octal number. @samp{0x} introduces a hexadecimal
4473number. @samp{0b} introduces a binary number. @samp{0r} introduces a
4474number expressed in any radix between 1 and 36: the prefix should be
4475immediately followed by the decimal expression of the radix, a colon,
4476then the digits making the number. For radix 1, leading zeros are
4477ignored and all remaining digits must be @samp{1}; for all other
4478radices, the digits are
4479@samp{0}, @samp{1}, @samp{2}, @dots{}. Beyond @samp{9}, the digits are
4480@samp{a}, @samp{b} @dots{} up to @samp{z}. Lower and upper case letters
4481can be used interchangeably in number prefixes and as number digits.
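
As a brief sketch of these prefixes, with values chosen arbitrarily, the
first call mixes hexadecimal, octal, and binary operands, the second
uses the largest radix, and the third shows that letter case does not
matter in either the prefix or the digits:

@example
eval(`0x1F + 010 + 0b101')
@result{}44
eval(`0r36:z')
@result{}35
eval(`0R16:ff == 0XFF')
@result{}1
@end example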
4482
4483Parentheses may be used to group subexpressions whenever needed. For the
4484relational operators, a true relation returns @code{1}, and a false
4485relation returns @code{0}.
4486
4487Here are a few examples of use of @code{eval}.
4488
4489@example
4490eval(`-3 * 5')
4491@result{}-15
4492eval(index(`Hello world', `llo') >= 0)
4493@result{}1
4494eval(`0r1:0111 + 0b100 + 0r3:12')
4495@result{}12
4496define(`square', `eval(`('$1`)**2')')
4497@result{}
4498square(`9')
4499@result{}81
4500square(square(`5')`+1')
4501@result{}676
4502define(`foo', `666')
4503@result{}
4504eval(`foo/6')
4505@error{}m4:stdin:8: bad expression in eval: foo/6
4506@result{}
4507eval(foo/6)
4508@result{}111
4509@end example
4510
4511As the last two lines show, @code{eval} does not handle macro
4512names, even if they expand to a valid expression (or part of a valid
4513expression). Therefore all macros must be expanded before they are
4514passed to @code{eval}.
4515
4516All evaluation is done with 32-bit signed integers, assuming
45172's-complement with wrap-around. The shift operators are defined in
4518@acronym{GNU}
4519@code{m4} by doing an implicit bit-wise and of the right-hand operand
4520with 0x1f, and sign-extension with right shift.
4521
4522@example
4523eval(0x80000000 / -1)
4524@result{}-2147483648
4525eval(0x80000000 % -1)
4526@result{}0
4527eval(0x7fffffff)
4528@result{}2147483647
4529incr(eval(0x7fffffff))
4530@result{}-2147483648
4531eval(-4 >> 33)
4532@result{}-2
4533@end example
4534
4535If @var{radix} is specified, it specifies the radix to be used in the
4536expansion. The default radix is 10; this is also the case if
4537@var{radix} is the empty string. It is an error if the radix is outside
4538the range of 1 through 36, inclusive. The result of @code{eval} is
4539always taken to be signed. No radix prefix is output, and for radices
4540greater than 10, the digits are lower case. The @var{width} argument
4541specifies the minimum output width, excluding any negative sign. The
4542result is zero-padded to extend the expansion to the requested width.
4543It is an error if the width is negative. On error, the expansion of
4544@code{eval} is empty.
4545
4546@example
4547eval(`666', `10')
4548@result{}666
4549eval(`666', `11')
4550@result{}556
4551eval(`666', `6')
4552@result{}3030
4553eval(`666', `6', `10')
4554@result{}0000003030
4555eval(`-666', `6', `10')
4556@result{}-0000003030
4557eval(`10', `', `0')
4558@result{}10
4559`0r1:'eval(`10', `1', `11')
4560@result{}0r1:01111111111
4561eval(`10', `16')
4562@result{}a
4563@end example
4564
4565@node Shell commands
4566@chapter Macros for running shell commands
4567
4568@cindex executing UNIX commands
4569@cindex running UNIX commands
4570@cindex UNIX commands, running
4571@cindex commands, running UNIX
4572@cindex executing shell commands
4573@cindex running shell commands
4574@cindex shell commands, running
4575@cindex commands, running shell
4576There are a few builtin macros in @code{m4} that allow you to run shell
4577commands from within @code{m4}.
4578
4579Note that the definition of a valid shell command is system dependent.
4580On UNIX systems, this is the typical @code{/bin/sh}. But on other
4581systems, such as native Windows, the shell has a different syntax of
4582commands that it understands. Some examples in this chapter assume
4583@code{/bin/sh}, and also demonstrate how to quit early with a known
4584exit value if this is not the case.
4585
4586@menu
4587* Platform macros:: Determining the platform
4588* Syscmd:: Executing simple commands
4589* Esyscmd:: Reading the output of commands
4590* Sysval:: Exit status
4591* Mkstemp:: Making temporary files
4592@end menu
4593
4594@node Platform macros
4595@section Determining the platform
4596
4597@cindex platform macros
4598Sometimes it is desirable for an input file to know which
4599platform @code{m4} is running on. @acronym{GNU} @code{m4} provides several
4600macros that are predefined to expand to the empty string; checking for
4601their existence will confirm platform details.
4602
4603@deffn {Optional builtin} __gnu__
4604@deffnx {Optional builtin} __os2__
4605@deffnx {Optional builtin} os2
4606@deffnx {Optional builtin} __unix__
4607@deffnx {Optional builtin} unix
4608@deffnx {Optional builtin} __windows__
4609@deffnx {Optional builtin} windows
4610Each of these macros is conditionally defined as needed to describe the
4611environment of @code{m4}. If defined, each macro expands to the empty
4612string.
4613@end deffn
4614
4615When @acronym{GNU} extensions are in effect (that is, when you did not
4616use the @option{-G} option, @pxref{Limits control, , Invoking m4}),
4617@acronym{GNU} @code{m4} will define the macro @code{@w{__gnu__}} to
4618expand to the empty string.
4619
4620@example
4621__gnu__
4622@result{}
4623ifdef(`__gnu__', `Extensions are active')
4624@result{}Extensions are active
4625@end example
4626
4627@cindex platform macro
4628On UNIX systems, @acronym{GNU} @code{m4} will define @code{@w{__unix__}}
4629by default, or @code{unix} when the @option{-G} option is specified.
4630
4631On native Windows systems, @acronym{GNU} @code{m4} will define
4632@code{@w{__windows__}} by default, or @code{windows} when the
4633@option{-G} option is specified.
4634
4635On OS/2 systems, @acronym{GNU} @code{m4} will define @code{@w{__os2__}}
4636by default, or @code{os2} when the @option{-G} option is specified.
4637
4638If @acronym{GNU} @code{m4} does not provide a platform macro for your system,
4639please report that as a bug.
4640
4641@example
4642define(`provided', `0')
4643@result{}
4644ifdef(`__unix__', `define(`provided', incr(provided))')
4645@result{}
4646ifdef(`__windows__', `define(`provided', incr(provided))')
4647@result{}
4648ifdef(`__os2__', `define(`provided', incr(provided))')
4649@result{}
4650provided
4651@result{}1
4652@end example
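
Building on these macros, an input file can choose platform-specific
text. The following is only a sketch: the macro name
@code{null_device} is invented for this illustration, and the output
shown assumes a UNIX system.

@comment ignore
@example
ifdef(`__unix__', `define(`null_device', `/dev/null')',
  `ifdef(`__windows__', `define(`null_device', `NUL')',
    `errprint(`unsupported platform
')m4exit(`1')')')
@result{}
null_device
@result{}/dev/null
@end example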
4653
4654@node Syscmd
4655@section Executing simple commands
4656
4657Any shell command can be executed, using @code{syscmd}:
4658
4659@deffn Builtin syscmd (@var{shell-command})
4660Executes @var{shell-command} as a shell command.
4661
4662The expansion of @code{syscmd} is void, @emph{not} the output from
4663@var{shell-command}! Output or error messages from @var{shell-command}
4664are not read by @code{m4}. @xref{Esyscmd}, if you need to process the
4665command output.
4666
4667Prior to executing the command, @code{m4} flushes its buffers.
4668The default standard input, output and error of @var{shell-command} are
4669the same as those of @code{m4}.
4670
4671The macro @code{syscmd} is recognized only with parameters.
4672@end deffn
4673
4674@example
4675define(`foo', `FOO')
4676@result{}
4677syscmd(`echo foo')
4678@result{}foo
4679@result{}
4680@end example
4681
Note how the output of the command keeps its trailing newline, and how
the newline that appeared after the macro is output as well.
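
To illustrate that the expansion of @code{syscmd} is void, this sketch
tries to store the command output in a macro. It assumes a shell that
provides @command{echo}, and the macro name @code{captured} is
hypothetical. The output goes straight to standard output while the
argument is being collected, and the macro ends up empty:

@example
define(`captured', syscmd(`echo hi'))dnl
@result{}hi
[captured]
@result{}[]
@end example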
4684
4685As an example of @var{shell-command} using the same standard input as
4686@code{m4}, the command line @kbd{echo "m4wrap(\`syscmd(\`cat')')" | m4}
4687will tell @code{m4} to read all of its input before executing the
4688wrapped text, then hand a valid (albeit emptied) pipe as standard input
4689for the @code{cat} subcommand. Therefore, you should be careful when
4690using standard input (either by specifying no files, or by passing
4691@samp{-} as a file name on the command line, @pxref{Command line files,
4692, Invoking m4}), and
4693also invoking subcommands via @code{syscmd} or @code{esyscmd} that
4694consume data from standard input. When standard input is a seekable
4695file, the subprocess will pick up with the next character not yet
4696processed by @code{m4}; when it is a pipe or other non-seekable file,
4697there is no guarantee how much data will already be buffered by
4698@code{m4} and thus unavailable to the child.
4699
4700@ignore
4701@comment If the user types the example below with stdin being an
4702@comment interactive terminal, then cat will hang waiting for additional
4703@comment input after m4 has exited. But the testsuite is using a pipe
4704@comment for stdin. Hence, we have two versions - the one we feed the
4705@comment testsuite below, and the one we display to the user above that
4706@comment more accurately shows what the testsuite is really doing but
4707@comment which the testsuite cannot parse.
4708
4709@example
4710m4wrap(`syscmd(`cat')')
4711@result{}
4712^D
4713@end example
4714@end ignore
4715
4716@node Esyscmd
4717@section Reading the output of commands
4718
4719@cindex @acronym{GNU} extensions
4720If you want @code{m4} to read the output of a shell command, use
4721@code{esyscmd}:
4722
4723@deffn Builtin esyscmd (@var{shell-command})
4724Expands to the standard output of the shell command
4725@var{shell-command}.
4726
4727Prior to executing the command, @code{m4} flushes its buffers.
4728The default standard input and standard error of @var{shell-command} are
4729the same as those of @code{m4}. The error output of @var{shell-command}
4730is not a part of the expansion: it will appear along with the error
4731output of @code{m4}.
4732
4733The macro @code{esyscmd} is recognized only with parameters.
4734@end deffn
4735
4736@example
4737define(`foo', `FOO')
4738@result{}
4739esyscmd(`echo foo')
4740@result{}FOO
4741@result{}
4742@end example
4743
4744Note how the expansion of @code{esyscmd} keeps the trailing newline of
4745the command, as well as using the newline that appeared after the macro.
4746
4747Just as with @code{syscmd}, care must be exercised when sharing standard
4748input between @code{m4} and the child process of @code{esyscmd}.
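
Since the expansion of @code{esyscmd} is ordinary text, it can be passed
on to other macros. As a sketch, the hypothetical helper @code{chomp}
below uses @code{translit} to delete newlines from the command output
(all of them, not just the trailing one); it assumes a shell that
provides @command{echo}:

@example
define(`chomp', `translit(`$1', `
')')
@result{}
[chomp(esyscmd(`echo hello'))]
@result{}[hello]
@end example

Because @code{esyscmd} is not quoted here, its output is rescanned for
macro names before @code{chomp} sees it.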
4749
4750@node Sysval
4751@section Exit status
4752
4753@cindex UNIX commands, exit status from
4754@cindex exit status from shell commands
4755@cindex shell commands, exit status from
4756@cindex commands, exit status from shell
4757@cindex status of shell commands
4758To see whether a shell command succeeded, use @code{sysval}:
4759
4760@deffn Builtin sysval
4761Expands to the exit status of the last shell command run with
4762@code{syscmd} or @code{esyscmd}. Expands to 0 if no command has been
4763run yet.
4764@end deffn
4765
4766@example
4767syscmd(`false')
4768@result{}
4769ifelse(sysval, `0', `zero', `non-zero')
4770@result{}non-zero
4771syscmd(`exit 2')
4772@result{}
4773sysval
4774@result{}2
4775syscmd(`true')
4776@result{}
4777sysval
4778@result{}0
4779esyscmd(`false')
4780@result{}
4781ifelse(sysval, `0', `zero', `non-zero')
4782@result{}non-zero
4783esyscmd(`exit 2')
4784@result{}
4785sysval
4786@result{}2
4787esyscmd(`true')
4788@result{}
4789sysval
4790@result{}0
4791@end example
4792
@code{sysval} results in 127 if there was a problem executing the
4794command, for example, if the system-imposed argument length is exceeded,
4795or if there were not enough resources to fork. It is not possible to
4796distinguish between failed execution and successful execution that had
4797an exit status of 127.
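
A sketch of the ambiguity, assuming a shell that understands
@code{exit}: the command below runs successfully, yet @code{sysval}
is the same as if the command could not be executed at all.

@example
syscmd(`exit 127')
@result{}
sysval
@result{}127
@end example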
4798
4799On UNIX platforms, where it is possible to detect when command execution
4800is terminated by a signal, rather than a normal exit, the result is the
4801signal number shifted left by eight bits.
4802
4803@comment This test has difficulties being portable, even on platforms
4804@comment where syscmd invokes /bin/sh. Kill is not portable with signal
4805@comment names. According to autoconf, the only portable signal numbers
4806@comment are 1 (HUP), 2 (INT), 9 (KILL), 13 (PIPE) and 15 (TERM). But
4807@comment all shells handle SIGINT, and ksh handles HUP (as in, the shell
4808@comment exits normally rather than letting the signal terminate it).
4809@comment Also, TERM is flaky, as it can also kill the running m4 on
4810@comment systems where /bin/sh does not create its own process group.
4811@comment That leaves KILL and PIPE as the two signals tested.
4812@example
4813dnl This test assumes kill is a shell builtin, and that signals are
4814dnl recognizable.
4815ifdef(`__unix__', ,
4816 `errprint(` skipping: syscmd does not have unix semantics
4817')m4exit(`77')')dnl
4818syscmd(`kill -13 $$')
4819@result{}
4820sysval
4821@result{}3328
4822esyscmd(`kill -9 $$')
4823@result{}
4824sysval
4825@result{}2304
4826@end example
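
The signal number can be recovered by shifting right again. As a
sketch, the helper @code{signal_of} below is hypothetical and not part
of @code{m4}; it is applied to the two @code{sysval} results shown
above, and is only meaningful for values greater than 255.

@example
define(`signal_of', `eval(`($1) >> 8')')
@result{}
signal_of(`3328')
@result{}13
signal_of(`2304')
@result{}9
@end example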
4827
4828@node Mkstemp
4829@section Making temporary files
4830
4831@cindex temporary file names
4832@cindex files, names of temporary
4833Commands specified to @code{syscmd} or @code{esyscmd} might need a
4834temporary file, for output or for some other purpose. There is a
4835builtin macro, @code{mkstemp}, for making a temporary file:
4836
4837@deffn Builtin mkstemp (@var{template})
4838@deffnx Builtin maketemp (@var{template})
Expands to the name of a new, empty file, made from the string
4840@var{template}, which should end with the string @samp{XXXXXX}. The six
4841@samp{X} characters are then replaced with random characters matching
4842the regular expression @samp{[a-zA-Z0-9._-]}, in order to make the file
4843name unique. If fewer than six @samp{X} characters are found at the end
of @var{template}, the result will be longer than the template. The
4845created file will have access permissions as if by @kbd{chmod =rw,go=},
4846meaning that the current umask of the @code{m4} process is taken into
4847account, and at most only the current user can read and write the file.
4848
4849The traditional behavior, standardized by @acronym{POSIX}, is that
4850@code{maketemp} merely replaces the trailing @samp{X} with the process
4851id, without creating a file, and without ensuring that the resulting
4852string is a unique file name. In part, this means that using the same
4853@var{template} twice in the same input file will result in the same
4854expansion. This behavior is a security hole, as it is very easy for
4855another process to guess the name that will be generated, and thus
4856interfere with a subsequent use of @code{syscmd} trying to manipulate
4857that file name. Hence, @acronym{POSIX} has recommended that all new
4858implementations of @code{m4} provide the secure @code{mkstemp} builtin,
4859and that users of @code{m4} check for its existence.
4860
4861The macros @code{mkstemp} and @code{maketemp} are recognized only with
4862parameters.
4863@end deffn
4864
4865If you try this next example, you will most likely get different output
4866for the two file names, since the replacement characters are randomly
4867chosen:
4868
4869@comment ignore
4870@example
4871maketemp(`/tmp/fooXXXXXX')
4872@result{}/tmp/fooa07346
4873ifdef(`mkstemp', `define(`maketemp', defn(`mkstemp'))',
4874 `define(`mkstemp', defn(`maketemp'))dnl
4875errprint(`warning: potentially insecure maketemp implementation
4876')')
4877@result{}
4878mkstemp(`doc')
4879@result{}docQv83Uw
4880@end example
4881
4882@cindex @acronym{GNU} extensions
4883Unless you use the @option{--traditional} command line option (or
4884@option{-G}, @pxref{Limits control, , Invoking m4}), the @acronym{GNU}
4885version of @code{maketemp} is secure. This means that using the same
4886template to multiple calls will generate multiple files. However, we
4887recommend that you use the new @code{mkstemp} macro, introduced in
4888@acronym{GNU} M4 1.4.8, which is secure even in traditional mode.
4889
4890@example
4891syscmd(`echo foo??????')dnl
4892@result{}foo??????
4893define(`file1', maketemp(`fooXXXXXX'))dnl
4894ifelse(esyscmd(`echo foo??????'), `foo??????', `no file', `created')
4895@result{}created
4896define(`file2', maketemp(`fooXX'))dnl
4897define(`file3', mkstemp(`fooXXXXXX'))dnl
4898ifelse(len(file1), len(file2), `same length', `different')
4899@result{}same length
4900ifelse(file1, file2, `same', `different file')
4901@result{}different file
4902ifelse(file2, file3, `same', `different file')
4903@result{}different file
4904ifelse(file1, file3, `same', `different file')
4905@result{}different file
4906syscmd(`rm 'file1 file2 file3)
4907@result{}
4908sysval
4909@result{}0
4910@end example
4911
4912@node Miscellaneous
4913@chapter Miscellaneous builtin macros
4914
This chapter describes various builtins that do not really belong in
4916any of the previous chapters.
4917
4918@menu
4919* Errprint:: Printing error messages
4920* Location:: Printing current location
4921* M4exit:: Exiting from @code{m4}
4922@end menu
4923
4924@node Errprint
4925@section Printing error messages
4926
4927@cindex printing error messages
4928@cindex error messages, printing
4929@cindex messages, printing error
4930You can print error messages using @code{errprint}:
4931
4932@deffn Builtin errprint (@var{message}, @dots{})
4933Prints @var{message} and the rest of the arguments to standard error,
4934separated by spaces. Standard error is used, regardless of the
4935@option{--debugfile} option (@pxref{Debugging options, , Invoking m4}).
4936
4937The expansion of @code{errprint} is void.
4938The macro @code{errprint} is recognized only with parameters.
4939@end deffn
4940
4941@example
4942errprint(`Invalid arguments to forloop
4943')
4944@error{}Invalid arguments to forloop
4945@result{}
4946errprint(`1')errprint(`2',`3
4947')
4948@error{}12 3
4949@result{}
4950@end example
4951
4952A trailing newline is @emph{not} printed automatically, so it should be
4953supplied as part of the argument, as in the example. Unfortunately, the
4954exact output of @code{errprint} is not very portable to other @code{m4}
4955implementations: @acronym{POSIX} requires that all arguments be printed,
4956but some implementations of @code{m4} only print the first.
4957Furthermore, some BSD implementations always append a newline for each
4958@code{errprint} call, regardless of whether the last argument already
4959had one, and @acronym{POSIX} is silent on whether this is acceptable.
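
One way to sidestep the first difference is to always pass a single
argument. As a sketch, the hypothetical wrapper @code{say} below joins
its arguments with @samp{$*} before calling @code{errprint}; it does
nothing about the BSD newline behavior:

@example
define(`say', `errprint(`$*
')')
@result{}
say(`one', `two')
@error{}one,two
@result{}
@end example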
4960
4961@node Location
4962@section Printing current location
4963
4964To make it possible to specify the location of an error, three
4965utility builtins exist:
4966
4967@deffn Builtin __file__
4968@deffnx Builtin __line__
4969@deffnx Builtin __program__
4970Expand to the quoted name of the current input file, the
4971current input line number in that file, and the quoted name of the
4972current invocation of @code{m4}.
4973@end deffn
4974
4975@example
4976errprint(__program__:__file__:__line__: `input error
4977')
4978@error{}m4:stdin:1: input error
4979@result{}
4980@end example
4981
4982Line numbers start at 1 for each file. If the file was found due to the
4983@option{-I} option or @env{M4PATH} environment variable, that is
4984reflected in the file name. The syncline option (@option{-s},
4985@pxref{Preprocessor features, , Invoking m4}), and the
4986@samp{f} and @samp{l} flags of @code{debugmode} (@pxref{Debug Levels}),
4987also use this notion of current file and line. Redefining the three
4988location macros has no effect on syncline, debug, or warning message
4989output. Assume this example is run in the
4990@file{m4-@value{VERSION}/@/checks} directory of the @acronym{GNU} M4
4991package, using @samp{--include=../examples} in the command line to find
4992the file @file{incl.m4} mentioned earlier:
4993
4994@example
4995define(`foo', ``$0' called at __file__:__line__')
4996@result{}
4997foo
4998@result{}foo called at stdin:2
4999include(`incl.m4')
5000@result{}Include file start
5001@result{}foo called at ../examples/incl.m4:2
5002@result{}Include file end
5003@result{}
5004@end example
5005
5006The location of macros invoked during the rescanning of macro expansion
5007text corresponds to the location in the file where the expansion was
5008triggered, regardless of how many newline characters the expansion text
5009contains. As of @acronym{GNU} M4 1.4.8, the location of text wrapped
5010with @code{m4wrap} (@pxref{M4wrap}) is the point at which the
5011@code{m4wrap} was invoked. Previous versions, however, behaved as
5012though wrapped text came from line 0 of the file ``''.
5013
5014@example
5015define(`echo', `$@@')
5016@result{}
5017define(`foo', `echo(__line__
5018__line__)')
5019@result{}
5020echo(__line__
5021__line__)
5022@result{}4
5023@result{}5
5024m4wrap(`foo
5025')
5026@result{}
5027foo(errprint(__line__
5028__line__
5029))
5030@error{}8
5031@error{}9
5032@result{}8
5033@result{}8
5034__line__
5035@result{}11
5036^D
5037@result{}6
5038@result{}6
5039@end example
5040
5041The @code{@w{__program__}} macro behaves like @samp{$0} in shell
5042terminology. If you invoke @code{m4} through an absolute path or a link
5043with a different spelling, rather than by relying on a @env{PATH} search
5044for plain @samp{m4}, it will affect how @code{@w{__program__}} expands.
5045The intent is that you can use it to produce error messages with the
5046same formatting that @code{m4} produces internally. It can also be used
5047within @code{syscmd} (@pxref{Syscmd}) to pick the same version of
5048@code{m4} that is currently running, rather than whatever version of
5049@code{m4} happens to be first in @env{PATH}. It was first introduced in
5050@acronym{GNU} M4 1.4.6.
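
As a sketch of that last use, the following asks the running
@code{m4} for its version. No output is shown, since it varies between
installations; the sketch also assumes that the program name needs no
shell quoting (for instance, that it contains no spaces).

@comment ignore
@example
syscmd(__program__` --version')
@end example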
5051
5052@node M4exit
5053@section Exiting from @code{m4}
5054
5055@cindex exiting from @code{m4}
5056@cindex status, setting @code{m4} exit
5057If you need to exit from @code{m4} before the entire input has been
5058read, you can use @code{m4exit}:
5059
5060@deffn Builtin m4exit (@dvar{code, 0})
5061Causes @code{m4} to exit, with exit status @var{code}. If @var{code} is
5062left out, the exit status is zero. If @var{code} cannot be parsed, or
5063is outside the range of 0 to 255, the exit status is one. No further
5064input is read, and all wrapped and diverted text is discarded.
5065@end deffn
5066
5067@example
5068m4wrap(`This text is lost due to `m4exit'.')
5069@result{}
5070divert(`1') So is this.
5071divert
5072@result{}
5073m4exit And this is never read.
5074@end example
5075
5076A common use of this is to abort processing:
5077
5078@deffn Composite fatal_error (@var{message})
5079Abort processing with an error message and non-zero status. Prefix
5080@var{message} with details about where the error occurred, and print the
5081resulting string to standard error.
5082@end deffn
5083
5084@example
5085define(`fatal_error',
5086 `errprint(__program__:__file__:__line__`: fatal error: $*
5087')m4exit(`1')')
5088@result{}
5089fatal_error(`this is a BAD one, buster')
5090@error{}m4:stdin:4: fatal error: this is a BAD one, buster
5091@end example
5092
5093After this macro call, @code{m4} will exit with exit status 1. This macro
5094is only intended for error exits, since the normal exit procedures are
5095not followed, e.g., diverted text is not undiverted, and saved text
5096(@pxref{M4wrap}) is not reread. (This macro could be made more robust
5097to earlier versions of @code{m4}. You should try to see if you can find
5098weaknesses and correct them; or @pxref{Improved fatal_error, , Answers}).
5099
5100Note that it is still possible for the exit status to be different than
5101what was requested by @code{m4exit}. If @code{m4} detects some other
5102error, such as a write error on standard output, the exit status will be
5103non-zero even if @code{m4exit} requested zero.
5104
5105If standard input is seekable, then the file will be positioned at the
5106next unread character. If it is a pipe or other non-seekable file,
5107then there are no guarantees how much data @code{m4} might have read
5108into buffers, and thus discarded.
5109
5110@node Frozen files
5111@chapter Fast loading of frozen state
5112
5113Some bigger @code{m4} applications may be built over a common base
5114containing hundreds of definitions and other costly initializations.
5115Usually, the common base is kept in one or more declarative files,
5116which files are listed on each @code{m4} invocation prior to the
5117user's input file, or else each input file uses @code{include}.
5118
5119Reading the common base of a big application, over and over again, may
5120be time consuming. @acronym{GNU} @code{m4} offers some machinery to speed up
5121the start of an application using lengthy common bases.
5122
5123@menu
5124* Using frozen files:: Using frozen files
5125* Frozen file format:: Frozen file format
5126@end menu
5127
5128@node Using frozen files
5129@section Using frozen files
5130@cindex fast loading of frozen files
5131@cindex frozen files for fast loading
5132@cindex initialization, frozen states
5133@cindex dumping into frozen file
5134@cindex reloading a frozen file
5135@cindex @acronym{GNU} extensions
5136Suppose a user has a library of @code{m4} initializations in
5137@file{base.m4}, which is then used with multiple input files:
5138
5139@comment ignore
5140@example
5141m4 base.m4 input1.m4
5142m4 base.m4 input2.m4
5143m4 base.m4 input3.m4
5144@end example
5145
5146Rather than spending time parsing the fixed contents of @file{base.m4}
5147every time, the user might rather execute:
5148
5149@comment ignore
5150@example
5151m4 -F base.m4f base.m4
5152@end example
5153
5154@noindent
5155once, and further execute, as often as needed:
5156
5157@comment ignore
5158@example
5159m4 -R base.m4f input1.m4
5160m4 -R base.m4f input2.m4
5161m4 -R base.m4f input3.m4
5162@end example
5163
5164@noindent
5165with the varying input. The first call, containing the @option{-F}
5166option, only reads and executes file @file{base.m4}, defining
5167various application macros and computing other initializations.
5168Once the input file @file{base.m4} has been completely processed, @acronym{GNU}
5169@code{m4} produces on @file{base.m4f} a @dfn{frozen} file, that is, a
5170file which contains a kind of snapshot of the @code{m4} internal state.
5171
5172Later calls, containing the @option{-R} option, are able to reload
5173the internal state of @code{m4}, from @file{base.m4f},
5174@emph{prior} to reading any other input files. This means
5175instead of starting with a virgin copy of @code{m4}, input will be
5176read after having effectively recovered the effect of a prior run.
5177In our example, the effect is the same as if file @file{base.m4} has
5178been read anew. However, this effect is achieved a lot faster.
5179
5180Only one frozen file may be created or read in any one @code{m4}
5181invocation. It is not possible to recover two frozen files at once.
5182However, frozen files may be updated incrementally, through using
5183@option{-R} and @option{-F} options simultaneously. For example, if
5184some care is taken, the command:
5185
5186@comment ignore
5187@example
5188m4 file1.m4 file2.m4 file3.m4 file4.m4
5189@end example
5190
5191@noindent
5192could be broken down in the following sequence, accumulating the same
5193output:
5194
5195@comment ignore
5196@example
5197m4 -F file1.m4f file1.m4
5198m4 -R file1.m4f -F file2.m4f file2.m4
5199m4 -R file2.m4f -F file3.m4f file3.m4
5200m4 -R file3.m4f file4.m4
5201@end example
5202
5203Some care is necessary because not every effort has been made for
5204this to work in all cases. In particular, the trace attribute of
5205macros is not handled, nor the current setting of @code{changeword}.
5206Currently, @code{m4wrap} and @code{sysval} also have problems.
5207Also, interactions for some options of @code{m4}, being used in one call
5208and not in the next, have not been fully analyzed yet. On the other
5209end, you may be confident that stacks of @code{pushdef} definitions
5210are handled correctly, as well as undefined or renamed builtins, and
5211changed strings for quotes or comments. And future releases of
5212@acronym{GNU} M4 will improve on the utility of frozen files.
5213
5214When an @code{m4} run is to be frozen, the automatic undiversion
5215which takes place at end of execution is inhibited. Instead, all
5216positively numbered diversions are saved into the frozen file.
5217The active diversion number is also transmitted.
5218
5219A frozen file to be reloaded need not reside in the current directory.
5220It is looked up the same way as an @code{include} file (@pxref{Search
5221Path}).
5222
5223If the frozen file was generated with a newer version of @code{m4}, and
5224contains directives that an older @code{m4} cannot parse, attempting to
5225load the frozen file with option @option{-R} will cause @code{m4} to
5226exit with status 63 to indicate version mismatch.
5227
5228@node Frozen file format
5229@section Frozen file format
5230@cindex frozen file format
5231@cindex file format, frozen file
5232Frozen files are sharable across architectures. It is safe to write
5233a frozen file on one machine and read it on another, given that the
5234second machine uses the same or newer version of @acronym{GNU} @code{m4}.
5235It is conventional, but not required, to give a frozen file the suffix
5236of @code{.m4f}.
5237
5238These are simple (editable) text files, made up of directives,
5239each starting with a capital letter and ending with a newline
5240(@key{NL}). Wherever a directive is expected, the character
5241@samp{#} introduces a comment line; empty lines are also ignored if they
5242are not part of an embedded string.
5243In the following descriptions, each @var{len} refers to the length of
5244the corresponding strings @var{str} in the next line of input. Numbers
5245are always expressed in decimal. There are no escape characters. The
5246directives are:
5247
5248@table @code
5249@item C @var{len1} , @var{len2} @key{NL} @var{str1} @var{str2} @key{NL}
5250Uses @var{str1} and @var{str2} as the begin-comment and
5251end-comment strings. If omitted, then @samp{#} and @key{NL} are the
5252comment delimiters.
5253
5254@item D @var{number}, @var{len} @key{NL} @var{str} @key{NL}
5255Selects diversion @var{number}, making it current, then copy
5256@var{str} in the current diversion. @var{number} may be a negative
5257number for a non-existing diversion. To merely specify an active
5258selection, use this command with an empty @var{str}. With 0 as the
5259diversion @var{number}, @var{str} will be issued on standard output
5260at reload time. @acronym{GNU} @code{m4} will not produce the @samp{D}
5261directive with non-zero length for diversion 0, but this can be done
5262with manual edits. This directive may
5263appear more than once for the same diversion, in which case the
5264diversion is the concatenation of the various uses. If omitted, then
5265diversion 0 is current.
5266
5267@item F @var{len1} , @var{len2} @key{NL} @var{str1} @var{str2} @key{NL}
5268Defines, through @code{pushdef}, a definition for @var{str1}
5269expanding to the function whose builtin name is @var{str2}. If the
5270builtin does not exist (for example, if the frozen file was produced by
5271a copy of @code{m4} compiled with changeword support, but the version
5272of @code{m4} reloading was compiled without it), the reload is silent,
5273but any subsequent use of the definition of @var{str1} will result in
5274a warning. This directive may appear more than once for the same name,
5275and its order, along with @samp{T}, is important. If omitted, you will
5276have no access to any builtins.
5277
5278@item Q @var{len1} , @var{len2} @key{NL} @var{str1} @var{str2} @key{NL}
5279Uses @var{str1} and @var{str2} as the begin-quote and end-quote
5280strings. If omitted, then @samp{`} and @samp{'} are the quote
5281delimiters.
5282
5283@item T @var{len1} , @var{len2} @key{NL} @var{str1} @var{str2} @key{NL}
5284Defines, though @code{pushdef}, a definition for @var{str1}
5285expanding to the text given by @var{str2}. This directive may appear
5286more than once for the same name, and its order, along with @samp{F}, is
5287important.
5288
5289@item V @var{number} @key{NL}
5290Confirms the format of the file. @code{m4} @value{VERSION} only creates
5291and understands frozen files where @var{number} is 1. This directive
5292must be the first non-comment in the file, and may not appear more than
5293once.
5294@end table
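
To make the directives concrete, here is a hand-written sketch of a
small frozen file. It is an illustration composed from the descriptions
above, not the output of an actual @option{-F} run:

@comment ignore
@example
# Hand-written sketch of a frozen file.
V1
F6,6
definedefine
T5,5
helloworld
@end example

Reloading this file with @option{-R} should result in an @code{m4}
where @code{define} is the only available builtin and the macro
@code{hello} expands to @samp{world}; the quote and comment delimiters
keep their defaults because no @samp{Q} or @samp{C} directive appears.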
5295
5296@node Compatibility
5297@chapter Compatibility with other versions of @code{m4}
5298
5299@cindex compatibility
5300This chapter describes the differences between this implementation of
5301@code{m4}, and the implementation found under UNIX, notably System V,
5302Release 3.
5303
5304There are also differences in BSD flavors of @code{m4}. No attempt
5305is made to summarize these here.
5306
5307@menu
5308* Extensions:: Extensions in @acronym{GNU} M4
5309* Incompatibilities:: Facilities in System V m4 not in GNU M4
5310* Other Incompatibilities:: Other incompatibilities
5311@end menu
5312
5313@node Extensions
5314@section Extensions in @acronym{GNU} @code{m4}
5315
5316@cindex @acronym{GNU} extensions
5317This version of @code{m4} contains a few facilities that do not exist
5318in System V @code{m4}. These extra facilities are all suppressed by
5319using the @option{-G} command line option (@pxref{Limits control, ,
5320Invoking m4}), unless overridden by other command line options.
5321
5322@itemize @bullet
5323@item
5324In the @code{$}@var{n} notation for macro arguments, @var{n} can contain
5325several digits, while the System V @code{m4} only accepts one digit.
5326This allows macros in @acronym{GNU} @code{m4} to take any number of
5327arguments, and not only nine (@pxref{Arguments}).
5328
5329This means that @code{define(`foo', `$11')} is ambiguous between
5330implementations. To portably choose between grabbing the first
5331parameter and appending 1 to the expansion, or grabbing the eleventh
5332parameter, you can do the following:
5333
5334@example
5335define(`a1', `A1')
5336@result{}
5337dnl First argument, concatenated with 1
5338define(`_1', `$1')define(`first1', `_1($@@)1')
5339@result{}
5340dnl Eleventh argument, portable
5341define(`_9', `$9')define(`eleventh', `_9(shift(shift($@@)))')
5342@result{}
5343dnl Eleventh argument, GNU style
5344define(`Eleventh', `$11')
5345@result{}
5346first1(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j', `k')
5347@result{}A1
5348eleventh(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j', `k')
5349@result{}k
5350Eleventh(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j', `k')
5351@result{}k
5352@end example
5353
5354@item
5355The @code{divert} (@pxref{Divert}) macro can manage more than 9
5356diversions. @acronym{GNU} @code{m4} treats all positive numbers as valid
5357diversions, rather than discarding diversions greater than 9.
5358
5359@item
5360Files included with @code{include} and @code{sinclude} are sought in a
5361user specified search path, if they are not found in the working
5362directory. The search path is specified by the @option{-I} option and the
5363@env{M4PATH} environment variable (@pxref{Search Path}).
5364
5365@item
5366Arguments to @code{undivert} can be non-numeric, in which case the named
5367file will be included uninterpreted in the output (@pxref{Undivert}).
5368
5369@item
5370Formatted output is supported through the @code{format} builtin, which
5371is modeled after the C library function @code{printf} (@pxref{Format}).
5372
5373@item
5374Searches and text substitution through regular expressions are
5375supported by the @code{regexp} (@pxref{Regexp}) and @code{patsubst}
5376(@pxref{Patsubst}) builtins.
5377
5378@item
5379The output of shell commands can be read into @code{m4} with
5380@code{esyscmd} (@pxref{Esyscmd}).
5381
5382@item
5383There is indirect access to any builtin macro with @code{builtin}
5384(@pxref{Builtin}).
5385
5386@item
5387Macros can be called indirectly through @code{indir} (@pxref{Indir}).
5388
5389@item
5390The name of the program, the current input file, and the current input
5391line number are accessible through the builtins @code{@w{__program__}},
5392@code{@w{__file__}}, and @code{@w{__line__}} (@pxref{Location}).
5393
5394@item
5395The format of the output from @code{dumpdef} and macro tracing can be
5396controlled with @code{debugmode} (@pxref{Debug Levels}).
5397
5398@item
5399The destination of trace and debug output can be controlled with
5400@code{debugfile} (@pxref{Debug Output}).
5401
5402@item
5403The @code{maketemp} (@pxref{Mkstemp}) macro behaves like @code{mkstemp},
5404creating a new file with a unique name on every invocation, rather than
5405following the insecure behavior of replacing the trailing @samp{X}
5406characters with the @code{m4} process id.
5407@end itemize
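
The following sketch exercises a few of the extensions listed above.
The macro name @samp{odd-name} and the sample strings are chosen only
for illustration. These builtins are not available when @option{-G} is
in effect.

@comment ignore
@example
format(`%05d', `42')
@result{}00042
regexp(`version 1.4.8', `[0-9.]+')
@result{}8
patsubst(`hello world', `o', `0')
@result{}hell0 w0rld
define(`odd-name', `called with $1')
@result{}
indir(`odd-name', `x')
@result{}called with x
builtin(`len', `abc')
@result{}3
@end example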
5408
5409In addition to the above extensions, @acronym{GNU} @code{m4} implements the
5410following command line options: @option{-F}, @option{-G}, @option{-I},
5411@option{-L}, @option{-R}, @option{-V}, @option{-W}, @option{-d}, @option{-i},
5412@option{-l}, @option{--debugfile} and @option{-t}. @xref{Invoking m4}, for a
5413description of these options.
5414
5415Also, the debugging and tracing facilities in @acronym{GNU} @code{m4} are much
5416more extensive than in most other versions of @code{m4}.
5417
5418@node Incompatibilities
5419@section Facilities in System V @code{m4} not in @acronym{GNU} @code{m4}
5420
5421The version of @code{m4} from System V contains a few facilities that
5422have not been implemented in @acronym{GNU} @code{m4} yet. Additionally,
5423@acronym{POSIX} requires some behaviors that @acronym{GNU} @code{m4} has not
5424implemented yet. Relying on these behaviors is non-portable, as a
5425future release of @acronym{GNU} @code{m4} may change.
5426
5427@itemize @bullet
5428@item
5429System V @code{m4} supports multiple arguments to @code{defn}, and
5430@acronym{POSIX} requires it. This is not yet implemented in @acronym{GNU}
5431@code{m4}. Unfortunately, this means it is not possible to mix builtins
5432and other text into a single macro; a helper macro is required.
5433
5434@item
5435@acronym{POSIX} requires an application to exit with non-zero status if
5436it wrote an error message to stderr. This has not yet been consistently
5437implemented for the various builtins that are required to issue an error
5438(such as @code{include} (@pxref{Include}) when a file is unreadable,
5439@code{eval} (@pxref{Eval}) when an argument cannot be parsed, or using
5440@code{m4exit} (@pxref{M4exit}) with a non-numeric argument).
5441
5442@item
5443Some traditional implementations only allow reading standard input
5444once, but @acronym{GNU} @code{m4} correctly handles multiple instances
5445of @samp{-} on the command line.
5446
5447@item
5448@acronym{POSIX} requires @code{m4wrap} (@pxref{M4wrap}) to act in FIFO
5449(first-in, first-out) order, but @acronym{GNU} @code{m4} currently uses
5450LIFO order. Furthermore, @acronym{POSIX} states that only the first
5451argument to @code{m4wrap} is saved for later evaluation, but
5452@acronym{GNU} @code{m4} saves and processes all arguments, with output
5453separated by spaces.
5454
5455However, it is possible to emulate @acronym{POSIX} behavior by
5456including the file @file{m4-@value{VERSION}/@/examples/@/wrapfifo.m4}
5457from the distribution:
5458
5459@example
5460undivert(`wrapfifo.m4')dnl
5461@result{}dnl Redefine m4wrap to have FIFO semantics.
5462@result{}define(`_m4wrap_level', `0')dnl
5463@result{}define(`m4wrap',
5464@result{}`ifdef(`m4wrap'_m4wrap_level,
5465@result{} `define(`m4wrap'_m4wrap_level,
5466@result{} defn(`m4wrap'_m4wrap_level)`$1')',
5467@result{} `builtin(`m4wrap', `define(`_m4wrap_level',
5468@result{} incr(_m4wrap_level))dnl
5469@result{}m4wrap'_m4wrap_level)dnl
5470@result{}define(`m4wrap'_m4wrap_level, `$1')')')dnl
5471include(`wrapfifo.m4')
5472@result{}
5473m4wrap(`a`'m4wrap(`c
5474', `d')')m4wrap(`b')
5475@result{}
5476^D
5477@result{}abc
5478@end example
5479
5480@item
5481@acronym{POSIX} requires that all builtins that require arguments, but
5482are called without arguments, behave as though empty strings had been
5483passed. For example, @code{a`'define`'b} would expand to @code{ab}.
5484But @acronym{GNU} @code{m4} ignores certain builtins if they have missing
5485arguments, giving @code{adefineb} for the above example.
5486
5487@item
5488Traditional implementations handle @code{define(`f',`1')} (@pxref{Define})
5489by undefining the entire stack of previous definitions, as if doing
5490@code{undefine(`f')} first. @acronym{GNU} @code{m4} replaces just the top
5491definition on the stack, as if doing @code{popdef(`f')} followed by
5492@code{pushdef(`f',`1')}.
5493
5494@item
5495@acronym{POSIX} requires @code{syscmd} (@pxref{Syscmd}) to evaluate
5496command output for macro expansion, but this appears to be a mistake
5497in @acronym{POSIX} since traditional implementations did not do this.
5498@acronym{GNU} @code{m4} follows traditional behavior in @code{syscmd}, and
5499provides the extension @code{esyscmd} that provides the @acronym{POSIX}
5500semantics.
5501
5502@item
5503At one point, @acronym{POSIX} required @code{changequote(@var{arg})}
5504(@pxref{Changequote}) to use newline as the close quote, but this was a
5505bug, and the next version of @acronym{POSIX} is anticipated to state
5506that using empty strings or just one argument is unspecified.
5507Meanwhile, the @acronym{GNU} @code{m4} behavior of treating an empty
5508end-quote delimiter as @samp{'} is not portable, as Solaris treats it as
5509repeating the start-quote delimiter, and BSD treats it as leaving the
5510previous end-quote delimiter unchanged. For predictable results, never
5511call @code{changequote} with just one argument, or with empty strings for
5512arguments.
5513
5514@item
5515At one point, @acronym{POSIX} required @code{changecom(@var{arg},)}
5516(@pxref{Changecom}) to make it impossible to end a comment, but this is
5517a bug, and the next version of @acronym{POSIX} is anticipated to state
5518that using empty strings is unspecified. Meanwhile, the @acronym{GNU}
5519@code{m4} behavior of treating an empty end-comment delimiter as newline
5520is not portable, as BSD treats it as leaving the previous end-comment
5521delimiter unchanged. It is also impossible in BSD implementations to
5522disable comments, even though that is required by @acronym{POSIX}. For
5523predictable results, never call @code{changecom} with empty strings for
5524arguments.
5525
5526@item
5527Most implementations of @code{m4} give macros a higher precedence than
5528comments when parsing, meaning that if the start delimiter given to
5529@code{changecom} (@pxref{Changecom}) starts with a macro name, comments
5530are effectively disabled. @acronym{POSIX} does not specify what the
5531precedence is, so the @acronym{GNU} @code{m4} parser recognizes
5532comments, then macros, then quoted strings.
5533
5534@item
5535Traditional implementations allow argument collection, but not string
5536and comment processing, to span file boundaries. Thus, if @file{a.m4}
5537contains @samp{len(}, and @file{b.m4} contains @samp{abc)},
5538@kbd{m4 a.m4 b.m4} outputs @samp{3} with traditional @code{m4}, but
5539gives an error message that the end of file was encountered inside a
5540macro with @acronym{GNU} @code{m4}. On the other hand, traditional
5541implementations do end of file processing for files included with
5542@code{include} or @code{sinclude} (@pxref{Include}), while @acronym{GNU}
5543@code{m4} seamlessly integrates the content of those files. Thus
5544@code{include(`a.m4')include(`b.m4')} will output @samp{3} instead of
5545giving an error.
5546
5547@item
5548Traditional @code{m4} treats @code{traceon} (@pxref{Trace}) without
5549arguments as a global variable, independent of named macro tracing.
5550Also, once a macro is undefined, named tracing of that macro is lost.
5551On the other hand, when @acronym{GNU} @code{m4} encounters
5552@code{traceon} without
5553arguments, it turns tracing on for all existing definitions at the time,
5554but does not trace future definitions; @code{traceoff} without arguments
5555turns tracing off for all definitions regardless of whether they were
5556also traced by name; and tracing by name, such as with @option{-tfoo} at
5557the command line or @code{traceon(`foo')} in the input, is an attribute
5558that is preserved even if the macro is currently undefined.
5559
5560@item
5561@acronym{POSIX} requires @code{eval} (@pxref{Eval}) to treat all
5562operators with the same precedence as C. However, @acronym{GNU} @code{m4}
5563currently follows the traditional precedence of other @code{m4}
5564implementations, where bitwise and logical negation (@samp{~} and
5565@samp{!}) have lower precedence than equality operators, rather than
5566equal precedence with other unary operators. Use explicit parentheses
5567to ensure proper precedence. As extensions to @acronym{POSIX}, @acronym{GNU}
5568@code{m4} treats the shift operators @samp{<<} and @samp{>>} as
5569well-defined on signed integers (even though they are not in C), and
5570adds the exponentiation operator @samp{**}.
5571
5572@item
5573@acronym{POSIX} requires @code{translit} (@pxref{Translit}) to treat
5574each character of the second and third arguments literally, but @acronym{GNU}
5575@code{m4} treats @samp{-} as a range operator.
5576
5577@item
5578@acronym{POSIX} requires @code{m4} to honor the locale environment
5579variables of @env{LANG}, @env{LC_ALL}, @env{LC_CTYPE},
5580@env{LC_MESSAGES}, and @env{NLSPATH}, but this has not yet been
5581implemented in @acronym{GNU} @code{m4}.
5582
5583@item
5584@acronym{POSIX} states that only unquoted leading newlines and blanks
5585(that is, space and tab) are ignored when collecting macro arguments.
5586However, this appears to be a bug in @acronym{POSIX}, since most
5587traditional implementations also ignore all whitespace (formfeed,
5588carriage return, and vertical tab). @acronym{GNU} @code{m4} follows
5589tradition and ignores all leading unquoted whitespace.
5590@end itemize
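
Some of the @acronym{GNU} behaviors described in this list can be
observed directly. The following sketch, which uses the hypothetical
macro name @samp{f}, shows a call to @code{define} without arguments,
the effect of @code{define} on a @code{pushdef} stack, the range
operator of @code{translit}, the @samp{**} operator of @code{eval}, and
the LIFO order of @code{m4wrap}. Other implementations may give
different results.

@comment ignore
@example
a`'define`'b
@result{}adefineb
pushdef(`f', `old')pushdef(`f', `new')define(`f', `1')
@result{}
f
@result{}1
popdef(`f')f
@result{}old
translit(`abc', `a-c', `A-C')
@result{}ABC
eval(`2 ** 10')
@result{}1024
m4wrap(`a')m4wrap(`b')
@result{}
^D
@result{}ba
@end example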
5591
5592@node Other Incompatibilities
5593@section Other incompatibilities
5594
5595There are a few other incompatibilities between this implementation of
5596@code{m4}, and the System V version.
5597
5598@itemize @bullet
5599@item
5600@acronym{GNU} @code{m4} implements sync lines differently from System V
5601@code{m4}, when text is being diverted. @acronym{GNU} @code{m4} outputs
5602the sync lines when the text is being diverted, whereas System V @code{m4}
5603outputs them when the diverted text is being brought back.
5604
5605The problem is which lines and file names should be attached to text that
5606is being, or has been, diverted. System V @code{m4} regards all the
5607diverted text as being generated by the source line containing the
5608@code{undivert} call, whereas @acronym{GNU} @code{m4} regards the
5609diverted text as being generated at the time it is diverted.
5610
5611The sync line option is used mostly when using @code{m4} as
5612a front end to a compiler. If a diverted line causes a compiler error,
5613the error messages should most probably refer to the place where the
5614diversion were made, and not where it was inserted again.
5615
5616@item
5617@acronym{GNU} @code{m4} makes no attempt at prohibiting self-referential
5618definitions like:
5619
5620@comment ignore
5621@example
5622define(`x', `x')
5623@result{}
5624define(`x', `x ')
5625@result{}
5626@end example
5627
5628There is nothing inherently wrong with defining @samp{x} to
5629return @samp{x}. The wrong thing is to expand @samp{x} unquoted.
5630In @code{m4}, one might use macros to hold strings, as we do for
5631variables in other programming languages, further checking them with:
5632
5633@comment ignore
5634@example
5635ifelse(defn(`@var{holder}'), `@var{value}', @dots{})
5636@end example
5637
5638@noindent
5639In cases like this one, an interdiction for a macro to hold its own
5640name would be a useless limitation. Of course, this leaves more rope
5641for the @acronym{GNU} @code{m4} user to hang himself! Rescanning hangs may be
5642avoided through careful programming, a little like for endless loops
5643in traditional programming languages.
5644@end itemize
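
The holder idiom described in the last item can be sketched as follows.
The macro name @samp{x} and the two result strings are illustrative
only. Because @code{defn} returns the quoted definition, @samp{x} is
inspected without ever being expanded unquoted.

@comment ignore
@example
define(`x', `x')
@result{}
defn(`x')
@result{}x
ifelse(defn(`x'), `x', `holds its own name', `holds something else')
@result{}holds its own name
@end example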
5645
5646@node Answers
5647@chapter Correct version of some examples
5648
5649Some of the examples in this manual are buggy or not very robust, for
5650demonstration purposes. Improved versions of these composite macros are
5651presented here.
5652
5653@menu
5654* Improved exch:: Solution for @code{exch}
5655* Improved forloop:: Solution for @code{forloop}
5656* Improved foreach:: Solution for @code{foreach}
5657* Improved cleardivert:: Solution for @code{cleardivert}
5658* Improved fatal_error:: Solution for @code{fatal_error}
5659@end menu
5660
5661@node Improved exch
5662@section Solution for @code{exch}
5663
5664The @code{exch} macro (@pxref{Arguments}) as presented requires clients
5665to double quote their arguments. A nicer definition, which lets
5666clients follow the rule of thumb of one level of quoting per level of
5667parentheses, involves adding quotes in the definition of @code{exch}, as
5668follows:
5669
5670@example
5671define(`exch', ``$2', `$1'')
5672@result{}
5673define(exch(`expansion text', `macro'))
5674@result{}
5675macro
5676@result{}expansion text
5677@end example
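
To see why the extra quoting matters, consider what the earlier
definition of @code{exch} does when clients quote only once and the
target macro already exists. The strings in this sketch are
hypothetical.

@comment ignore
@example
define(`exch', `$2, $1')
@result{}
define(exch(`first text', `macro'))
@result{}
macro
@result{}first text
define(exch(`second text', `macro'))
@result{}
macro
@result{}first text
@end example

@noindent
The second call expands @samp{macro} during argument collection, so it
defines a macro named @samp{first text} rather than redefining
@samp{macro}.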
5678
5679@node Improved forloop
5680@section Solution for @code{forloop}
5681
5682The @code{forloop} macro (@pxref{Forloop}) as presented earlier can go
5683into an infinite loop if given an iterator that is not parsed as a macro
5684name. It does not do any sanity checking on its numeric bounds, and
5685only permits decimal numbers for bounds. Here is an improved version,
5686shipped as @file{m4-@value{VERSION}/@/examples/@/forloop2.m4}; this
5687version also optimizes based on the fact that the starting bound does
5688not need to be passed to the helper @code{@w{_forloop}}.
5689
5690@example
5691undivert(`forloop2.m4')dnl
5692@result{}divert(`-1')
5693@result{}# forloop(var, from, to, stmt) - improved version:
5694@result{}# works even if VAR is not a strict macro name
5695@result{}# performs sanity check that FROM is larger than TO
5696@result{}# allows complex numerical expressions in TO and FROM
5697@result{}define(`forloop', `ifelse(eval(`($3) >= ($2)'), `1',
5698@result{} `pushdef(`$1', eval(`$2'))_forloop(`$1',
5699@result{} eval(`$3'), `$4')popdef(`$1')')')
5700@result{}define(`_forloop',
5701@result{} `$3`'ifelse(indir(`$1'), `$2', `',
5702@result{} `define(`$1', incr(indir(`$1')))$0($@@)')')
5703@result{}divert`'dnl
5704include(`forloop2.m4')
5705@result{}
5706forloop(`i', `2', `1', `no iteration occurs')
5707@result{}
5708forloop(`', `1', `2', ` odd iterator name')
5709@result{} odd iterator name odd iterator name
5710forloop(`i', `5 + 5', `0xc', ` 0x`'eval(i, `16')')
5711@result{} 0xa 0xb 0xc
5712forloop(`i', `a', `b', `non-numeric bounds')
5713@error{}m4:stdin:6: bad expression in eval (bad input): (b) >= (a)
5714@result{}
5715@end example
5716
5717Of course, it is possible to make even more improvements, such as
5718adding an optional step argument, or allowing iteration through
5719descending sequences. @acronym{GNU} Autoconf provides some of these
5720additional bells and whistles in its @code{m4_for} macro.
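
As a minimal sketch of the step argument mentioned above, the following
variant adds a fourth argument giving a positive increment. The names
@code{forloop_step} and @code{@w{_forloop_step}} are hypothetical and
are not shipped with the distribution; descending sequences and
non-positive steps are simply rejected by the sanity check.

@comment ignore
@example
define(`forloop_step', `ifelse(eval(`($3) >= ($2) && ($4) > 0'), `1',
  `pushdef(`$1', eval(`$2'))_forloop_step(`$1',
    eval(`$3'), eval(`$4'), `$5')popdef(`$1')')')
@result{}
define(`_forloop_step',
  `$4`'ifelse(eval(indir(`$1')` + $3 <= $2'), `1',
    `define(`$1', eval(indir(`$1')` + $3'))$0($@@)')')
@result{}
forloop_step(`i', `1', `10', `3', ` i')
@result{} 1 4 7 10
@end example

@noindent
Like @file{forloop2.m4}, this sketch reads the iterator through
@code{indir}, so the iterator need not be a strict macro name.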
5721
5722@node Improved foreach
5723@section Solution for @code{foreach}
5724
5725The @code{foreach} and @code{foreachq} macros (@pxref{Foreach}) as
5726presented earlier each have flaws. First, we will examine and fix the
5727quadratic behavior of @code{foreachq}:
5728
5729@example
5730include(`foreachq.m4')
5731@result{}
5732traceon(`shift')debugmode(`aq')
5733@result{}
5734foreachq(`x', ``1', `2', `3', `4'', `x
5735')dnl
5736@result{}1
5737@error{}m4trace: -3- shift(`1', `2', `3', `4')
5738@error{}m4trace: -2- shift(`1', `2', `3', `4')
5739@result{}2
5740@error{}m4trace: -4- shift(`1', `2', `3', `4')
5741@error{}m4trace: -3- shift(`2', `3', `4')
5742@error{}m4trace: -3- shift(`1', `2', `3', `4')
5743@error{}m4trace: -2- shift(`2', `3', `4')
5744@result{}3
5745@error{}m4trace: -5- shift(`1', `2', `3', `4')
5746@error{}m4trace: -4- shift(`2', `3', `4')
5747@error{}m4trace: -3- shift(`3', `4')
5748@error{}m4trace: -4- shift(`1', `2', `3', `4')
5749@error{}m4trace: -3- shift(`2', `3', `4')
5750@error{}m4trace: -2- shift(`3', `4')
5751@result{}4
5752@error{}m4trace: -6- shift(`1', `2', `3', `4')
5753@error{}m4trace: -5- shift(`2', `3', `4')
5754@error{}m4trace: -4- shift(`3', `4')
5755@error{}m4trace: -3- shift(`4')
5756@end example
5757
5758Each successive iteration was adding more quoted @code{shift}
5759invocations, and the entire list contents were passing through every
5760iteration. In general, when recursing, it is a good idea to make the
5761recursion use fewer arguments, rather than adding additional quoted
5762uses of @code{shift}. By doing so, @code{m4} uses less memory, invokes
5763fewer macros, is less likely to run into machine limits, and most
5764importantly, performs faster. The fixed version of @code{foreachq} can
5765be found in @file{m4-@value{VERSION}/@/examples/@/foreachq2.m4}:
5766
5767@example
5768include(`foreachq2.m4')
5769@result{}
5770undivert(`foreachq2.m4')dnl
5771@result{}include(`quote.m4')dnl
5772@result{}divert(`-1')
5773@result{}# foreachq(x, `item_1, item_2, ..., item_n', stmt)
5774@result{}# quoted list, improved version
5775@result{}define(`foreachq', `pushdef(`$1')_foreachq($@@)popdef(`$1')')
5776@result{}define(`_arg1q', ``$1'')
5777@result{}define(`_rest', `ifelse(`$#', `1', `', `dquote(shift($@@))')')
5778@result{}define(`_foreachq', `ifelse(`$2', `', `',
5779@result{} `define(`$1', _arg1q($2))$3`'$0(`$1', _rest($2), `$3')')')
5780@result{}divert`'dnl
5781traceon(`shift')debugmode(`aq')
5782@result{}
5783foreachq(`x', ``1', `2', `3', `4'', `x
5784')dnl
5785@result{}1
5786@error{}m4trace: -3- shift(`1', `2', `3', `4')
5787@result{}2
5788@error{}m4trace: -3- shift(`2', `3', `4')
5789@result{}3
5790@error{}m4trace: -3- shift(`3', `4')
5791@result{}4
5792@end example
5793
5794Note that the fixed version calls unquoted helper macros in
5795@code{@w{_foreachq}} to trim elements immediately; those helper macros
5796in turn must re-supply the layer of quotes lost in the macro invocation.
5797Contrast the use of @code{@w{_arg1q}}, which quotes the first list
5798element, with @code{@w{_arg1}} of the earlier implementation that
5799returned the first list element directly.
5800
5801For a different approach, the improved version of @code{foreach},
5802available in @file{m4-@value{VERSION}/@/examples/@/foreach2.m4}, simply
5803overquotes the arguments to @code{@w{_foreach}} to begin with, using
5804@code{dquote_elt}. Then @code{@w{_foreach}} can just use
5805@code{@w{_arg1}} to remove the extra layer of quoting that was added up
5806front:
5807
5808@example
5809include(`foreach2.m4')
5810@result{}
5811undivert(`foreach2.m4')dnl
5812@result{}include(`quote.m4')dnl
5813@result{}divert(`-1')
5814@result{}# foreach(x, (item_1, item_2, ..., item_n), stmt)
5815@result{}# parenthesized list, improved version
5816@result{}define(`foreach', `pushdef(`$1')_foreach(`$1',
5817@result{} (dquote(dquote_elt$2)), `$3')popdef(`$1')')
5818@result{}define(`_arg1', `$1')
5819@result{}define(`_foreach', `ifelse(`$2', `(`')', `',
5820@result{} `define(`$1', _arg1$2)$3`'$0(`$1', (dquote(shift$2)), `$3')')')
5821@result{}divert`'dnl
5822traceon(`shift')debugmode(`aq')
5823@result{}
5824foreach(`x', `(`1', `2', `3', `4')', `x
5825')dnl
5826@error{}m4trace: -4- shift(`1', `2', `3', `4')
5827@error{}m4trace: -4- shift(`2', `3', `4')
5828@error{}m4trace: -4- shift(`3', `4')
5829@result{}1
5830@error{}m4trace: -3- shift(``1'', ``2'', ``3'', ``4'')
5831@result{}2
5832@error{}m4trace: -3- shift(``2'', ``3'', ``4'')
5833@result{}3
5834@error{}m4trace: -3- shift(``3'', ``4'')
5835@result{}4
5836@error{}m4trace: -3- shift(``4'')
5837@end example
5838
5839In summary, recursion over list elements is trickier than it appeared at
5840first glance, but provides a powerful idiom within @code{m4} processing.
5841As a final demonstration, both list styles are now able to handle
5842several scenarios that would wreak havoc on the original
5843implementations. This points out one other difference between the two
5844list styles. @code{foreach} evaluates unquoted list elements only once,
5845in preparation for calling @code{@w{_foreach}}. But @code{foreachq}
5846evaluates unquoted list elements twice while visiting the first list
5847element, once in @code{@w{_arg1q}} and once in @code{@w{_rest}}. When
5848deciding which list style to use, one must take into account whether
5849repeating the side effects of unquoted list elements will have any
5850detrimental effects.
5851
5852@example
5853include(`foreach2.m4')
5854@result{}
5855include(`foreachq2.m4')
5856@result{}
5857dnl 0-element list:
5858foreach(`x', `', `<x>') / foreachq(`x', `', `<x>')
5859@result{} /@w{ }
5860dnl 1-element list of empty element
5861foreach(`x', `()', `<x>') / foreachq(`x', ``'', `<x>')
5862@result{}<> / <>
5863dnl 2-element list of empty elements
5864foreach(`x', `(`',`')', `<x>') / foreachq(`x', ``',`'', `<x>')
5865@result{}<><> / <><>
5866dnl 1-element list of a comma
5867foreach(`x', `(`,')', `<x>') / foreachq(`x', ``,'', `<x>')
5868@result{}<,> / <,>
5869dnl 2-element list of unbalanced parentheses
5870foreach(`x', `(`(', `)')', `<x>') / foreachq(`x', ``(', `)'', `<x>')
5871@result{}<(><)> / <(><)>
5872define(`active', `ACT, IVE')
5873@result{}
5874traceon(`active')
5875@result{}
5876dnl list of unquoted macros; expansion occurs before recursion
5877foreach(`x', `(active, active)', `<x>
5878')dnl
5879@error{}m4trace: -4- active -> `ACT, IVE'
5880@error{}m4trace: -4- active -> `ACT, IVE'
5881@result{}<ACT>
5882@result{}<IVE>
5883@result{}<ACT>
5884@result{}<IVE>
5885foreachq(`x', `active, active', `<x>
5886')dnl
5887@error{}m4trace: -3- active -> `ACT, IVE'
5888@error{}m4trace: -3- active -> `ACT, IVE'
5889@result{}<ACT>
5890@error{}m4trace: -3- active -> `ACT, IVE'
5891@error{}m4trace: -3- active -> `ACT, IVE'
5892@result{}<IVE>
5893@result{}<ACT>
5894@result{}<IVE>
5895dnl list of quoted macros; expansion occurs during recursion
5896foreach(`x', `(`active', `active')', `<x>
5897')dnl
5898@error{}m4trace: -1- active -> `ACT, IVE'
5899@result{}<ACT, IVE>
5900@error{}m4trace: -1- active -> `ACT, IVE'
5901@result{}<ACT, IVE>
5902foreachq(`x', ``active', `active'', `<x>
5903')dnl
5904@error{}m4trace: -1- active -> `ACT, IVE'
5905@result{}<ACT, IVE>
5906@error{}m4trace: -1- active -> `ACT, IVE'
5907@result{}<ACT, IVE>
5908dnl list of double-quoted macro names; no expansion
5909foreach(`x', `(``active'', ``active'')', `<x>
5910')dnl
5911@result{}<active>
5912@result{}<active>
5913foreachq(`x', ```active'', ``active''', `<x>
5914')dnl
5915@result{}<active>
5916@result{}<active>
5917@end example
5918
5919@node Improved cleardivert
5920@section Solution for @code{cleardivert}
5921
5922The @code{cleardivert} macro (@pxref{Cleardivert}) cannot, as it stands, be
5923called without arguments to clear all pending diversions. That is
5924because using undivert with an empty string for an argument is different
5925than using it with no arguments at all. Compare the earlier definition
5926with one that takes the number of arguments into account:
5927
5928@example
5929define(`cleardivert',
5930 `pushdef(`_n', divnum)divert(`-1')undivert($@@)divert(_n)popdef(`_n')')
5931@result{}
5932divert(`1')one
5933divert
5934@result{}
5935cleardivert
5936@result{}
5937undivert
5938@result{}one
5939@result{}
5940define(`cleardivert',
5941 `pushdef(`_num', divnum)divert(`-1')ifelse(`$#', `0',
5942 `undivert`'', `undivert($@@)')divert(_num)popdef(`_num')')
5943@result{}
5944divert(`2')two
5945divert
5946@result{}
5947cleardivert
5948@result{}
5949undivert
5950@result{}
5951@end example
5952
5953@node Improved fatal_error
5954@section Solution for @code{fatal_error}
5955
5956The @code{fatal_error} macro (@pxref{M4exit}) is not robust to versions
5957of @acronym{GNU} M4 earlier than 1.4.8, where invoking
5958@code{@w{__file__}} (@pxref{Location}) inside @code{m4wrap} would result
5959in an empty string, and @code{@w{__line__}} resulted in @samp{0} even
5960though all files start at line 1. Furthermore, versions earlier than
59611.4.6 did not support the @code{@w{__program__}} macro. If you want
5962@code{fatal_error} to work across the entire 1.4.x release series, a
5963better implementation would be:
5964
5965@example
5966define(`fatal_error',
5967 `errprint(ifdef(`__program__', `__program__', ``m4'')'dnl
5968`:ifelse(__line__, `0', `',
5969 `__file__:__line__:')` fatal error: $*
5970')m4exit(`1')')
5971@result{}
5972m4wrap(`divnum(`demo of internal message')
5973fatal_error(`inside wrapped text')')
5974@result{}
5975^D
5976@error{}m4:stdin:6: Warning: excess arguments to builtin `divnum' ignored
5977@result{}0
5978@error{}m4:stdin:6: fatal error: inside wrapped text
5979@end example
5980
5981@c ========================================================== Appendices
5982
5983@node Copying This Manual
5984@appendix How to make copies of this manual
5985@cindex License
5986
5987@menu
5988* GNU Free Documentation License:: License for copying this manual
5989@end menu
5990
5991@include fdl.texi
5992
5993@node Indices
5994@appendix Indices of concepts and macros
5995
5996@menu
5997* Concept index:: Index for many concepts
5998* Macro index:: Index for all @code{m4} macros
5999@end menu
6000
6001@node Concept index
6002@appendixsec Index for many concepts
6003
6004@printindex cp
6005
6006@node Macro index
6007@appendixsec Index for all @code{m4} macros
6008
6009References are exclusively to the places where a builtin is introduced
6010the first time.
6011
6012@iftex
6013@sp 1
6014@end iftex
6015
6016@printindex fn
6017
6018@bye
6019
6020@c Local Variables:
6021@c coding: ISO-8859-1
6022@c fill-column: 72
6023@c ispell-local-dictionary: "american"
6024@c indent-tabs-mode: nil
6025@c whitespace-check-buffer-indent: nil
6026@c End: