1 | .ds PX \s-1POSIX\s+1
|
---|
2 | .ds UX \s-1UNIX\s+1
|
---|
3 | .ds AN \s-1ANSI\s+1
|
---|
4 | .ds GN \s-1GNU\s+1
|
---|
5 | .ds AK \s-1AWK\s+1
|
---|
6 | .ds EP \fIGAWK: Effective AWK Programming\fP
|
---|
7 | .if !\n(.g \{\
|
---|
8 | . if !\w|\*(lq| \{\
|
---|
9 | . ds lq ``
|
---|
10 | . if \w'\(lq' .ds lq "\(lq
|
---|
11 | . \}
|
---|
12 | . if !\w|\*(rq| \{\
|
---|
13 | . ds rq ''
|
---|
14 | . if \w'\(rq' .ds rq "\(rq
|
---|
15 | . \}
|
---|
16 | .\}
|
---|
17 | .TH GAWK 1 "June 26 2005" "Free Software Foundation" "Utility Commands"
|
---|
18 | .SH NAME
|
---|
19 | gawk \- pattern scanning and processing language
|
---|
20 | .SH SYNOPSIS
|
---|
21 | .B gawk
|
---|
22 | [ \*(PX or \*(GN style options ]
|
---|
23 | .B \-f
|
---|
24 | .I program-file
|
---|
25 | [
|
---|
26 | .B \-\^\-
|
---|
27 | ] file .\|.\|.
|
---|
28 | .br
|
---|
29 | .B gawk
|
---|
30 | [ \*(PX or \*(GN style options ]
|
---|
31 | [
|
---|
32 | .B \-\^\-
|
---|
33 | ]
|
---|
34 | .I program-text
|
---|
35 | file .\|.\|.
|
---|
36 | .sp
|
---|
37 | .B pgawk
|
---|
38 | [ \*(PX or \*(GN style options ]
|
---|
39 | .B \-f
|
---|
40 | .I program-file
|
---|
41 | [
|
---|
42 | .B \-\^\-
|
---|
43 | ] file .\|.\|.
|
---|
44 | .br
|
---|
45 | .B pgawk
|
---|
46 | [ \*(PX or \*(GN style options ]
|
---|
47 | [
|
---|
48 | .B \-\^\-
|
---|
49 | ]
|
---|
50 | .I program-text
|
---|
51 | file .\|.\|.
|
---|
52 | .SH DESCRIPTION
|
---|
53 | .I Gawk
|
---|
54 | is the \*(GN Project's implementation of the \*(AK programming language.
|
---|
55 | It conforms to the definition of the language in
|
---|
56 | the \*(PX 1003.2 Command Language And Utilities Standard.
|
---|
57 | This version in turn is based on the description in
|
---|
58 | .IR "The AWK Programming Language" ,
|
---|
59 | by Aho, Kernighan, and Weinberger,
|
---|
60 | with the additional features found in the System V Release 4 version
|
---|
61 | of \*(UX
|
---|
62 | .IR awk .
|
---|
63 | .I Gawk
|
---|
64 | also provides more recent Bell Laboratories
|
---|
65 | .I awk
|
---|
66 | extensions, and a number of \*(GN-specific extensions.
|
---|
67 | .PP
|
---|
68 | .I Pgawk
|
---|
69 | is the profiling version of
|
---|
70 | .IR gawk .
|
---|
71 | It is identical in every way to
|
---|
72 | .IR gawk ,
|
---|
73 | except that programs run more slowly,
|
---|
74 | and it automatically produces an execution profile in the file
|
---|
75 | .B awkprof.out
|
---|
76 | when done.
|
---|
77 | See the
|
---|
78 | .B \-\^\-profile
|
---|
79 | option, below.
|
---|
80 | .PP
|
---|
81 | The command line consists of options to
|
---|
82 | .I gawk
|
---|
83 | itself, the \*(AK program text (if not supplied via the
|
---|
84 | .B \-f
|
---|
85 | or
|
---|
86 | .B \-\^\-file
|
---|
87 | options), and values to be made
|
---|
88 | available in the
|
---|
89 | .B ARGC
|
---|
90 | and
|
---|
91 | .B ARGV
|
---|
92 | pre-defined \*(AK variables.
|
---|
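As a minimal sketch of how operands show up in ARGC and ARGV, the snippet below uses plain `awk` so that it should run with any POSIX awk (substituting `gawk` is assumed to behave the same). The operands `alpha` and `beta` are placeholders, not real files; a program that has only a BEGIN block never tries to open them.

```shell
# ARGV[0] holds the command name, so the loop starts at index 1.
# Options and the program text itself are not counted in ARGC.
out=$(awk 'BEGIN { print ARGC; for (i = 1; i < ARGC; i++) print ARGV[i] }' alpha beta)
printf '%s\n' "$out"
```

ARGC is 3 here: the command name plus the two operands.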
93 | .SH OPTION FORMAT
|
---|
94 | .PP
|
---|
95 | .I Gawk
|
---|
96 | options may be either traditional \*(PX one-letter options,
|
---|
97 | or \*(GN style long options. \*(PX options start with a single \*(lq\-\*(rq,
|
---|
98 | while long options start with \*(lq\-\^\-\*(rq.
|
---|
99 | Long options are provided for both \*(GN-specific features and
|
---|
100 | for \*(PX-mandated features.
|
---|
101 | .PP
|
---|
102 | Following the \*(PX standard,
|
---|
103 | .IR gawk -specific
|
---|
104 | options are supplied via arguments to the
|
---|
105 | .B \-W
|
---|
106 | option. Multiple
|
---|
107 | .B \-W
|
---|
108 | options may be supplied.
|
---|
109 | Each
|
---|
110 | .B \-W
|
---|
111 | option has a corresponding long option, as detailed below.
|
---|
112 | Arguments to long options are either joined with the option
|
---|
113 | by an
|
---|
114 | .B =
|
---|
115 | sign, with no intervening spaces, or they may be provided in the
|
---|
116 | next command line argument.
|
---|
117 | Long options may be abbreviated, as long as the abbreviation
|
---|
118 | remains unique.
|
---|
119 | .SH OPTIONS
|
---|
120 | .PP
|
---|
121 | .I Gawk
|
---|
122 | accepts the following options, listed alphabetically.
|
---|
123 | .TP
|
---|
124 | .PD 0
|
---|
125 | .BI \-F " fs"
|
---|
126 | .TP
|
---|
127 | .PD
|
---|
128 | .BI \-\^\-field-separator " fs"
|
---|
129 | Use
|
---|
130 | .I fs
|
---|
131 | for the input field separator (the value of the
|
---|
132 | .B FS
|
---|
133 | predefined
|
---|
134 | variable).
|
---|
135 | .TP
|
---|
136 | .PD 0
|
---|
137 | \fB\-v\fI var\fB\^=\^\fIval\fR
|
---|
138 | .TP
|
---|
139 | .PD
|
---|
140 | \fB\-\^\-assign \fIvar\fB\^=\^\fIval\fR
|
---|
141 | Assign the value
|
---|
142 | .I val
|
---|
143 | to the variable
|
---|
144 | .IR var ,
|
---|
145 | before execution of the program begins.
|
---|
146 | Such variable values are available to the
|
---|
147 | .B BEGIN
|
---|
148 | block of an \*(AK program.
|
---|
149 | .TP
|
---|
150 | .PD 0
|
---|
151 | .BI \-f " program-file"
|
---|
152 | .TP
|
---|
153 | .PD
|
---|
154 | .BI \-\^\-file " program-file"
|
---|
155 | Read the \*(AK program source from the file
|
---|
156 | .IR program-file ,
|
---|
157 | instead of from the first command line argument.
|
---|
158 | Multiple
|
---|
159 | .B \-f
|
---|
160 | (or
|
---|
161 | .BR \-\^\-file )
|
---|
162 | options may be used.
|
---|
163 | .TP
|
---|
164 | .PD 0
|
---|
165 | .BI \-mf " NNN"
|
---|
166 | .TP
|
---|
167 | .PD
|
---|
168 | .BI \-mr " NNN"
|
---|
169 | Set various memory limits to the value
|
---|
170 | .IR NNN .
|
---|
171 | The
|
---|
172 | .B f
|
---|
173 | flag sets the maximum number of fields, and the
|
---|
174 | .B r
|
---|
175 | flag sets the maximum record size. These two flags and the
|
---|
176 | .B \-m
|
---|
177 | option are from the Bell Laboratories research version of \*(UX
|
---|
178 | .IR awk .
|
---|
179 | They are ignored by
|
---|
180 | .IR gawk ,
|
---|
181 | since
|
---|
182 | .I gawk
|
---|
183 | has no pre-defined limits.
|
---|
184 | .TP
|
---|
185 | .PD 0
|
---|
186 | .B "\-W compat"
|
---|
187 | .TP
|
---|
188 | .PD 0
|
---|
189 | .B "\-W traditional"
|
---|
190 | .TP
|
---|
191 | .PD 0
|
---|
192 | .B \-\^\-compat
|
---|
193 | .TP
|
---|
194 | .PD
|
---|
195 | .B \-\^\-traditional
|
---|
196 | Run in
|
---|
197 | .I compatibility
|
---|
198 | mode. In compatibility mode,
|
---|
199 | .I gawk
|
---|
200 | behaves identically to \*(UX
|
---|
201 | .IR awk ;
|
---|
202 | none of the \*(GN-specific extensions are recognized.
|
---|
203 | The use of
|
---|
204 | .B \-\^\-traditional
|
---|
205 | is preferred over the other forms of this option.
|
---|
206 | See
|
---|
207 | .BR "GNU EXTENSIONS" ,
|
---|
208 | below, for more information.
|
---|
209 | .TP
|
---|
210 | .PD 0
|
---|
211 | .B "\-W copyleft"
|
---|
212 | .TP
|
---|
213 | .PD 0
|
---|
214 | .B "\-W copyright"
|
---|
215 | .TP
|
---|
216 | .PD 0
|
---|
217 | .B \-\^\-copyleft
|
---|
218 | .TP
|
---|
219 | .PD
|
---|
220 | .B \-\^\-copyright
|
---|
221 | Print the short version of the \*(GN copyright information message on
|
---|
222 | the standard output and exit successfully.
|
---|
223 | .TP
|
---|
224 | .PD 0
|
---|
225 | \fB\-W dump-variables\fR[\fB=\fIfile\fR]
|
---|
226 | .TP
|
---|
227 | .PD
|
---|
228 | \fB\-\^\-dump-variables\fR[\fB=\fIfile\fR]
|
---|
229 | Print a sorted list of global variables, their types and final values to
|
---|
230 | .IR file .
|
---|
231 | If no
|
---|
232 | .I file
|
---|
233 | is provided,
|
---|
234 | .I gawk
|
---|
235 | uses a file named
|
---|
236 | .I awkvars.out
|
---|
237 | in the current directory.
|
---|
238 | .sp .5
|
---|
239 | Having a list of all the global variables is a good way to look for
|
---|
240 | typographical errors in your programs.
|
---|
241 | You would also use this option if you have a large program with a lot of
|
---|
242 | functions, and you want to be sure that your functions don't
|
---|
243 | inadvertently use global variables that you meant to be local.
|
---|
244 | (This is a particularly easy mistake to make with simple variable
|
---|
245 | names like
|
---|
246 | .BR i ,
|
---|
247 | .BR j ,
|
---|
248 | and so on.)
|
---|
249 | .TP
|
---|
250 | .PD 0
|
---|
251 | .BI "\-W exec " file
|
---|
252 | .TP
|
---|
253 | .PD
|
---|
254 | .BI \-\^\-exec " file"
|
---|
255 | Similar to
|
---|
256 | .BR \-f ,
|
---|
257 | however, this option is the last one processed.
|
---|
258 | This should be used with
|
---|
259 | .B #!
|
---|
260 | scripts, particularly for CGI applications, to avoid
|
---|
261 | passing in options or source code (!) on the command line
|
---|
262 | from a URL.
|
---|
263 | This option disables command-line variable assignments.
|
---|
264 | .TP
|
---|
265 | .PD 0
|
---|
266 | .B "\-W gen\-po"
|
---|
267 | .TP
|
---|
268 | .PD
|
---|
269 | .B \-\^\-gen\-po
|
---|
270 | Scan and parse the \*(AK program, and generate a \*(GN
|
---|
271 | .B \&.po
|
---|
272 | format file on standard output with entries for all localizable
|
---|
273 | strings in the program. The program itself is not executed.
|
---|
274 | See the \*(GN
|
---|
275 | .I gettext
|
---|
276 | distribution for more information on
|
---|
277 | .B \&.po
|
---|
278 | files.
|
---|
279 | .TP
|
---|
280 | .PD 0
|
---|
281 | .B "\-W help"
|
---|
282 | .TP
|
---|
283 | .PD 0
|
---|
284 | .B "\-W usage"
|
---|
285 | .TP
|
---|
286 | .PD 0
|
---|
287 | .B \-\^\-help
|
---|
288 | .TP
|
---|
289 | .PD
|
---|
290 | .B \-\^\-usage
|
---|
291 | Print a relatively short summary of the available options on
|
---|
292 | the standard output.
|
---|
293 | (Per the
|
---|
294 | .IR "GNU Coding Standards" ,
|
---|
295 | these options cause an immediate, successful exit.)
|
---|
296 | .TP
|
---|
297 | .PD 0
|
---|
298 | .BR "\-W lint" [ =\fIvalue\fR ]
|
---|
299 | .TP
|
---|
300 | .PD
|
---|
301 | .BR \-\^\-lint [ =\fIvalue\fR ]
|
---|
302 | Provide warnings about constructs that are
|
---|
303 | dubious or non-portable to other \*(AK implementations.
|
---|
304 | With an optional argument of
|
---|
305 | .BR fatal ,
|
---|
306 | lint warnings become fatal errors.
|
---|
307 | This may be drastic, but its use will certainly encourage the
|
---|
308 | development of cleaner \*(AK programs.
|
---|
309 | With an optional argument of
|
---|
310 | .BR invalid ,
|
---|
311 | only warnings about things that are
|
---|
312 | actually invalid are issued. (This is not fully implemented yet.)
|
---|
313 | .TP
|
---|
314 | .PD 0
|
---|
315 | .B "\-W lint\-old"
|
---|
316 | .TP
|
---|
317 | .PD
|
---|
318 | .B \-\^\-lint\-old
|
---|
319 | Provide warnings about constructs that are
|
---|
320 | not portable to the original version of \*(UX
|
---|
321 | .IR awk .
|
---|
322 | .TP
|
---|
323 | .PD 0
|
---|
324 | .B "\-W non\-decimal\-data"
|
---|
325 | .TP
|
---|
326 | .PD
|
---|
327 | .B "\-\^\-non\-decimal\-data"
|
---|
328 | Recognize octal and hexadecimal values in input data.
|
---|
329 | .I "Use this option with great caution!"
|
---|
330 | .ig
|
---|
331 | .\" This option is left undocumented, on purpose.
|
---|
332 | .TP
|
---|
333 | .PD 0
|
---|
334 | .B "\-W nostalgia"
|
---|
335 | .TP
|
---|
336 | .PD
|
---|
337 | .B \-\^\-nostalgia
|
---|
338 | Provide a moment of nostalgia for long time
|
---|
339 | .I awk
|
---|
340 | users.
|
---|
341 | ..
|
---|
342 | .TP
|
---|
343 | .PD 0
|
---|
344 | .B "\-W posix"
|
---|
345 | .TP
|
---|
346 | .PD
|
---|
347 | .B \-\^\-posix
|
---|
348 | This turns on
|
---|
349 | .I compatibility
|
---|
350 | mode, with the following additional restrictions:
|
---|
351 | .RS
|
---|
352 | .TP "\w'\(bu'u+1n"
|
---|
353 | \(bu
|
---|
354 | .B \ex
|
---|
355 | escape sequences are not recognized.
|
---|
356 | .TP
|
---|
357 | \(bu
|
---|
358 | Only space and tab act as field separators when
|
---|
359 | .B FS
|
---|
360 | is set to a single space; newline does not.
|
---|
361 | .TP
|
---|
362 | \(bu
|
---|
363 | You cannot continue lines after
|
---|
364 | .B ?
|
---|
365 | and
|
---|
366 | .BR : .
|
---|
367 | .TP
|
---|
368 | \(bu
|
---|
369 | The synonym
|
---|
370 | .B func
|
---|
371 | for the keyword
|
---|
372 | .B function
|
---|
373 | is not recognized.
|
---|
374 | .TP
|
---|
375 | \(bu
|
---|
376 | The operators
|
---|
377 | .B **
|
---|
378 | and
|
---|
379 | .B **=
|
---|
380 | cannot be used in place of
|
---|
381 | .B ^
|
---|
382 | and
|
---|
383 | .BR ^= .
|
---|
384 | .TP
|
---|
385 | \(bu
|
---|
386 | The
|
---|
387 | .B fflush()
|
---|
388 | function is not available.
|
---|
389 | .RE
|
---|
390 | .TP
|
---|
391 | .PD 0
|
---|
392 | \fB\-W profile\fR[\fB=\fIprof_file\fR]
|
---|
393 | .TP
|
---|
394 | .PD
|
---|
395 | \fB\-\^\-profile\fR[\fB=\fIprof_file\fR]
|
---|
396 | Send profiling data to
|
---|
397 | .IR prof_file .
|
---|
398 | The default is
|
---|
399 | .BR awkprof.out .
|
---|
400 | When run with
|
---|
401 | .IR gawk ,
|
---|
402 | the profile is just a \*(lqpretty printed\*(rq version of the program.
|
---|
403 | When run with
|
---|
404 | .IR pgawk ,
|
---|
405 | the profile contains execution counts of each statement in the program
|
---|
406 | in the left margin and function call counts for each user-defined function.
|
---|
407 | .TP
|
---|
408 | .PD 0
|
---|
409 | .B "\-W re\-interval"
|
---|
410 | .TP
|
---|
411 | .PD
|
---|
412 | .B \-\^\-re\-interval
|
---|
413 | Enable the use of
|
---|
414 | .I "interval expressions"
|
---|
415 | in regular expression matching
|
---|
416 | (see
|
---|
417 | .BR "Regular Expressions" ,
|
---|
418 | below).
|
---|
419 | Interval expressions were not traditionally available in the
|
---|
420 | \*(AK language. The \*(PX standard added them, to make
|
---|
421 | .I awk
|
---|
422 | and
|
---|
423 | .I egrep
|
---|
424 | consistent with each other.
|
---|
425 | However, their use is likely
|
---|
426 | to break old \*(AK programs, so
|
---|
427 | .I gawk
|
---|
428 | only provides them if they are requested with this option, or when
|
---|
429 | .B \-\^\-posix
|
---|
430 | is specified.
|
---|
431 | .TP
|
---|
432 | .PD 0
|
---|
433 | .BI "\-W source " program-text
|
---|
434 | .TP
|
---|
435 | .PD
|
---|
436 | .BI \-\^\-source " program-text"
|
---|
437 | Use
|
---|
438 | .I program-text
|
---|
439 | as \*(AK program source code.
|
---|
440 | This option allows the easy intermixing of library functions (used via the
|
---|
441 | .B \-f
|
---|
442 | and
|
---|
443 | .B \-\^\-file
|
---|
444 | options) with source code entered on the command line.
|
---|
445 | It is intended primarily for medium to large \*(AK programs used
|
---|
446 | in shell scripts.
|
---|
447 | .TP
|
---|
448 | .PD 0
|
---|
449 | .B "\-W version"
|
---|
450 | .TP
|
---|
451 | .PD
|
---|
452 | .B \-\^\-version
|
---|
453 | Print version information for this particular copy of
|
---|
454 | .I gawk
|
---|
455 | on the standard output.
|
---|
456 | This is useful mainly for knowing if the current copy of
|
---|
457 | .I gawk
|
---|
458 | on your system
|
---|
459 | is up to date with respect to whatever the Free Software Foundation
|
---|
460 | is distributing.
|
---|
461 | This is also useful when reporting bugs.
|
---|
462 | (Per the
|
---|
463 | .IR "GNU Coding Standards" ,
|
---|
464 | these options cause an immediate, successful exit.)
|
---|
465 | .TP
|
---|
466 | .PD 0
|
---|
467 | .B \-\^\-
|
---|
468 | Signal the end of options. This is useful to allow further arguments to the
|
---|
469 | \*(AK program itself to start with a \*(lq\-\*(rq.
|
---|
470 | This is mainly for consistency with the argument parsing convention used
|
---|
471 | by most other \*(PX programs.
|
---|
472 | .PP
|
---|
473 | In compatibility mode,
|
---|
474 | any other options are flagged as invalid, but are otherwise ignored.
|
---|
475 | In normal operation, as long as program text has been supplied, unknown
|
---|
476 | options are passed on to the \*(AK program in the
|
---|
477 | .B ARGV
|
---|
478 | array for processing. This is particularly useful for running \*(AK
|
---|
479 | programs via the \*(lq#!\*(rq executable interpreter mechanism.
|
---|
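The short options described above can be combined as in the following sketch. It sticks to the POSIX options (-F, -v, -f) so that it should run under any awk, not only gawk; the variable names `col` and `prefix` and the temporary program file are hypothetical, chosen only for illustration.

```shell
# Hypothetical program file, created only for this sketch.
prog=$(mktemp)
printf '%s\n' '{ print prefix $col }' > "$prog"

# -F sets FS to ":", each -v assigns a variable before the program starts,
# and -f reads the program text from the file instead of the command line.
out=$(printf 'root:x:0\nbin:x:1\n' | awk -F: -v col=1 -v prefix='user=' -f "$prog")
rm -f "$prog"
printf '%s\n' "$out"
```

With gawk, the long forms (--field-separator, --assign, --file) are documented above as equivalents of these three options.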
480 | .SH AWK PROGRAM EXECUTION
|
---|
481 | .PP
|
---|
482 | An \*(AK program consists of a sequence of pattern-action statements
|
---|
483 | and optional function definitions.
|
---|
484 | .RS
|
---|
485 | .PP
|
---|
486 | \fIpattern\fB { \fIaction statements\fB }\fR
|
---|
487 | .br
|
---|
488 | \fBfunction \fIname\fB(\fIparameter list\fB) { \fIstatements\fB }\fR
|
---|
489 | .RE
|
---|
490 | .PP
|
---|
491 | .I Gawk
|
---|
492 | first reads the program source from the
|
---|
493 | .IR program-file (s)
|
---|
494 | if specified,
|
---|
495 | from arguments to
|
---|
496 | .BR \-\^\-source ,
|
---|
497 | or from the first non-option argument on the command line.
|
---|
498 | The
|
---|
499 | .B \-f
|
---|
500 | and
|
---|
501 | .B \-\^\-source
|
---|
502 | options may be used multiple times on the command line.
|
---|
503 | .I Gawk
|
---|
504 | reads the program text as if all the
|
---|
505 | .IR program-file s
|
---|
506 | and command line source texts
|
---|
507 | had been concatenated together. This is useful for building libraries
|
---|
508 | of \*(AK functions, without having to include them in each new \*(AK
|
---|
509 | program that uses them. It also provides the ability to mix library
|
---|
510 | functions with command line programs.
|
---|
511 | .PP
|
---|
512 | The environment variable
|
---|
513 | .B AWKPATH
|
---|
514 | specifies a search path to use when finding source files named with
|
---|
515 | the
|
---|
516 | .B \-f
|
---|
517 | option. If this variable does not exist, the default path is
|
---|
518 | \fB".:/usr/local/share/awk"\fR.
|
---|
519 | (The actual directory may vary, depending upon how
|
---|
520 | .I gawk
|
---|
521 | was built and installed.)
|
---|
522 | If a file name given to the
|
---|
523 | .B \-f
|
---|
524 | option contains a \*(lq/\*(rq character, no path search is performed.
|
---|
525 | .PP
|
---|
526 | .I Gawk
|
---|
527 | executes \*(AK programs in the following order.
|
---|
528 | First,
|
---|
529 | all variable assignments specified via the
|
---|
530 | .B \-v
|
---|
531 | option are performed.
|
---|
532 | Next,
|
---|
533 | .I gawk
|
---|
534 | compiles the program into an internal form.
|
---|
535 | Then,
|
---|
536 | .I gawk
|
---|
537 | executes the code in the
|
---|
538 | .B BEGIN
|
---|
539 | block(s) (if any),
|
---|
540 | and then proceeds to read
|
---|
541 | each file named in the
|
---|
542 | .B ARGV
|
---|
543 | array.
|
---|
544 | If there are no files named on the command line,
|
---|
545 | .I gawk
|
---|
546 | reads the standard input.
|
---|
547 | .PP
|
---|
548 | If a filename on the command line has the form
|
---|
549 | .IB var = val
|
---|
550 | it is treated as a variable assignment. The variable
|
---|
551 | .I var
|
---|
552 | will be assigned the value
|
---|
553 | .IR val .
|
---|
554 | (This happens after any
|
---|
555 | .B BEGIN
|
---|
556 | block(s) have been run.)
|
---|
557 | Command line variable assignment
|
---|
558 | is most useful for dynamically assigning values to the variables
|
---|
559 | \*(AK uses to control how input is broken into fields and records.
|
---|
560 | It is also useful for controlling state if multiple passes are needed over
|
---|
561 | a single data file.
|
---|
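A small sketch of command line variable assignment, assuming two throwaway input files created with mktemp (the file names and the variable `tag` are hypothetical). It illustrates that the assignment is not yet visible in BEGIN and that each assignment takes effect when awk reaches it in the argument list.

```shell
dir=$(mktemp -d)
printf 'a\n' > "$dir/first.txt"
printf 'b\n' > "$dir/second.txt"

# tag is still unset while BEGIN runs; it becomes "one" before the first
# file is read and "two" before the second file is read.
out=$(awk 'BEGIN { print "BEGIN sees [" tag "]" } { print tag ": " $0 }' \
    tag=one "$dir/first.txt" tag=two "$dir/second.txt")
rm -rf "$dir"
printf '%s\n' "$out"
```

Use -v instead when the value must already be available inside the BEGIN block.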
562 | .PP
|
---|
563 | If the value of a particular element of
|
---|
564 | .B ARGV
|
---|
565 | is empty (\fB""\fR),
|
---|
566 | .I gawk
|
---|
567 | skips over it.
|
---|
568 | .PP
|
---|
569 | For each record in the input,
|
---|
570 | .I gawk
|
---|
571 | tests to see if it matches any
|
---|
572 | .I pattern
|
---|
573 | in the \*(AK program.
|
---|
574 | For each pattern that the record matches, the associated
|
---|
575 | .I action
|
---|
576 | is executed.
|
---|
577 | The patterns are tested in the order they occur in the program.
|
---|
578 | .PP
|
---|
579 | Finally, after all the input is exhausted,
|
---|
580 | .I gawk
|
---|
581 | executes the code in the
|
---|
582 | .B END
|
---|
583 | block(s) (if any).
|
---|
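The execution order described in this section can be observed with a short sketch like the one below. The input lines and the messages are made up for illustration, and plain `awk` is used on the assumption that gawk follows the same order.

```shell
# BEGIN runs before any input, the rules are tested in program order
# for every record, and END runs after the input is exhausted.
out=$(printf 'apple\nbanana\n' | awk '
BEGIN { print "start" }
/an/  { print "match: " $0 }
      { print "line " NR ": " $0 }
END   { print "total " NR }')
printf '%s\n' "$out"
```

For the record `banana`, the `/an/` rule fires before the unconditional rule because it appears first in the program.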
584 | .SH VARIABLES, RECORDS AND FIELDS
|
---|
585 | \*(AK variables are dynamic; they come into existence when they are
|
---|
586 | first used. Their values are either floating-point numbers or strings,
|
---|
587 | or both,
|
---|
588 | depending upon how they are used. \*(AK also has one-dimensional
|
---|
589 | arrays; arrays with multiple dimensions may be simulated.
|
---|
590 | Several pre-defined variables are set as a program
|
---|
591 | runs; these will be described as needed and summarized below.
|
---|
592 | .SS Records
|
---|
593 | Normally, records are separated by newline characters. You can control how
|
---|
594 | records are separated by assigning values to the built-in variable
|
---|
595 | .BR RS .
|
---|
596 | If
|
---|
597 | .B RS
|
---|
598 | is any single character, that character separates records.
|
---|
599 | Otherwise,
|
---|
600 | .B RS
|
---|
601 | is a regular expression. Text in the input that matches this
|
---|
602 | regular expression separates the record.
|
---|
603 | However, in compatibility mode,
|
---|
604 | only the first character of its string
|
---|
605 | value is used for separating records.
|
---|
606 | If
|
---|
607 | .B RS
|
---|
608 | is set to the null string, then records are separated by
|
---|
609 | blank lines.
|
---|
610 | When
|
---|
611 | .B RS
|
---|
612 | is set to the null string, the newline character always acts as
|
---|
613 | a field separator, in addition to whatever value
|
---|
614 | .B FS
|
---|
615 | may have.
|
---|
616 | .SS Fields
|
---|
617 | .PP
|
---|
618 | As each input record is read,
|
---|
619 | .I gawk
|
---|
620 | splits the record into
|
---|
621 | .IR fields ,
|
---|
622 | using the value of the
|
---|
623 | .B FS
|
---|
624 | variable as the field separator.
|
---|
625 | If
|
---|
626 | .B FS
|
---|
627 | is a single character, fields are separated by that character.
|
---|
628 | If
|
---|
629 | .B FS
|
---|
630 | is the null string, then each individual character becomes a
|
---|
631 | separate field.
|
---|
632 | Otherwise,
|
---|
633 | .B FS
|
---|
634 | is expected to be a full regular expression.
|
---|
635 | In the special case that
|
---|
636 | .B FS
|
---|
637 | is a single space, fields are separated
|
---|
638 | by runs of spaces and/or tabs and/or newlines.
|
---|
639 | (But see the discussion of
|
---|
640 | .BR \-\^\-posix ,
|
---|
641 | above).
|
---|
642 | .B NOTE:
|
---|
643 | The value of
|
---|
644 | .B IGNORECASE
|
---|
645 | (see below) also affects how fields are split when
|
---|
646 | .B FS
|
---|
647 | is a regular expression, and how records are separated when
|
---|
648 | .B RS
|
---|
649 | is a regular expression.
|
---|
650 | .PP
|
---|
651 | If the
|
---|
652 | .B FIELDWIDTHS
|
---|
653 | variable is set to a space separated list of numbers, each field is
|
---|
654 | expected to have fixed width, and
|
---|
655 | .I gawk
|
---|
656 | splits up the record using the specified widths. The value of
|
---|
657 | .B FS
|
---|
658 | is ignored.
|
---|
659 | Assigning a new value to
|
---|
660 | .B FS
|
---|
661 | overrides the use of
|
---|
662 | .BR FIELDWIDTHS ,
|
---|
663 | and restores the default behavior.
|
---|
664 | .PP
|
---|
665 | Each field in the input record may be referenced by its position,
|
---|
666 | .BR $1 ,
|
---|
667 | .BR $2 ,
|
---|
668 | and so on.
|
---|
669 | .B $0
|
---|
670 | is the whole record.
|
---|
671 | Fields need not be referenced by constants:
|
---|
672 | .RS
|
---|
673 | .PP
|
---|
674 | .ft B
|
---|
675 | n = 5
|
---|
676 | .br
|
---|
677 | print $n
|
---|
678 | .ft R
|
---|
679 | .RE
|
---|
680 | .PP
|
---|
681 | prints the fifth field in the input record.
|
---|
682 | .PP
|
---|
683 | The variable
|
---|
684 | .B NF
|
---|
685 | is set to the total number of fields in the input record.
|
---|
686 | .PP
|
---|
687 | References to non-existent fields (i.e., fields after
|
---|
688 | .BR $NF )
|
---|
689 | produce the null string. However, assigning to a non-existent field
|
---|
690 | (e.g.,
|
---|
691 | .BR "$(NF+2) = 5" )
|
---|
692 | increases the value of
|
---|
693 | .BR NF ,
|
---|
694 | creates any intervening fields with the null string as their value, and
|
---|
695 | causes the value of
|
---|
696 | .B $0
|
---|
697 | to be recomputed, with the fields being separated by the value of
|
---|
698 | .BR OFS .
|
---|
699 | References to negative-numbered fields cause a fatal error.
|
---|
700 | Decrementing
|
---|
701 | .B NF
|
---|
702 | causes the values of fields past the new value to be lost, and the value of
|
---|
703 | .B $0
|
---|
704 | to be recomputed, with the fields being separated by the value of
|
---|
705 | .BR OFS .
|
---|
706 | .PP
|
---|
707 | Assigning a value to an existing field
|
---|
708 | causes the whole record to be rebuilt when
|
---|
709 | .B $0
|
---|
710 | is referenced.
|
---|
711 | Similarly, assigning a value to
|
---|
712 | .B $0
|
---|
713 | causes the record to be resplit, creating new
|
---|
714 | values for the fields.
|
---|
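The effect of assigning past the last field can be sketched as follows. The input record and the OFS value are arbitrary choices for illustration; plain `awk` is used on the assumption that gawk behaves identically here.

```shell
# Assigning to $(NF + 2) raises NF from 3 to 5, creates an empty fourth
# field, and rebuilds $0 with OFS ("-") between the fields.
out=$(echo 'a b c' | awk 'BEGIN { OFS = "-" } { $(NF + 2) = "e"; print NF; print $0 }')
printf '%s\n' "$out"
```

The doubled separator in the rebuilt record marks the empty intervening field.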
715 | .SS Built-in Variables
|
---|
716 | .PP
|
---|
717 | .IR Gawk\^ "'s"
|
---|
718 | built-in variables are:
|
---|
719 | .PP
|
---|
720 | .TP "\w'\fBFIELDWIDTHS\fR'u+1n"
|
---|
721 | .B ARGC
|
---|
722 | The number of command line arguments (does not include options to
|
---|
723 | .IR gawk ,
|
---|
724 | or the program source).
|
---|
725 | .TP
|
---|
726 | .B ARGIND
|
---|
727 | The index in
|
---|
728 | .B ARGV
|
---|
729 | of the current file being processed.
|
---|
730 | .TP
|
---|
731 | .B ARGV
|
---|
732 | Array of command line arguments. The array is indexed from
|
---|
733 | 0 to
|
---|
734 | .B ARGC
|
---|
735 | \- 1.
|
---|
736 | Dynamically changing the contents of
|
---|
737 | .B ARGV
|
---|
738 | can control the files used for data.
|
---|
739 | .TP
|
---|
740 | .B BINMODE
|
---|
741 | On non-POSIX systems, specifies use of \*(lqbinary\*(rq mode for all file I/O.
|
---|
742 | Numeric values of 1, 2, or 3 specify that input files, output files, or
|
---|
743 | all files, respectively, should use binary I/O.
|
---|
744 | String values of \fB"r"\fR or \fB"w"\fR specify that input files or output files,
|
---|
745 | respectively, should use binary I/O.
|
---|
746 | String values of \fB"rw"\fR or \fB"wr"\fR specify that all files
|
---|
747 | should use binary I/O.
|
---|
748 | Any other string value is treated as \fB"rw"\fR, but generates a warning message.
|
---|
749 | .TP
|
---|
750 | .B CONVFMT
|
---|
751 | The conversion format for numbers, \fB"%.6g"\fR, by default.
|
---|
752 | .TP
|
---|
753 | .B ENVIRON
|
---|
754 | An array containing the values of the current environment.
|
---|
755 | The array is indexed by the environment variables, each element being
|
---|
756 | the value of that variable (e.g., \fBENVIRON["HOME"]\fP might be
|
---|
757 | .BR /home/arnold ).
|
---|
758 | Changing this array does not affect the environment seen by programs which
|
---|
759 | .I gawk
|
---|
760 | spawns via redirection or the
|
---|
761 | .B system()
|
---|
762 | function.
|
---|
763 | .TP
|
---|
764 | .B ERRNO
|
---|
765 | If a system error occurs either doing a redirection for
|
---|
766 | .BR getline ,
|
---|
767 | during a read for
|
---|
768 | .BR getline ,
|
---|
769 | or during a
|
---|
770 | .BR close() ,
|
---|
771 | then
|
---|
772 | .B ERRNO
|
---|
773 | will contain
|
---|
774 | a string describing the error.
|
---|
775 | The value is subject to translation in non-English locales.
|
---|
776 | .TP
|
---|
777 | .B FIELDWIDTHS
|
---|
778 | A whitespace-separated list of field widths. When set,
|
---|
779 | .I gawk
|
---|
780 | parses the input into fields of fixed width, instead of using the
|
---|
781 | value of the
|
---|
782 | .B FS
|
---|
783 | variable as the field separator.
|
---|
784 | .TP
|
---|
785 | .B FILENAME
|
---|
786 | The name of the current input file.
|
---|
787 | If no files are specified on the command line, the value of
|
---|
788 | .B FILENAME
|
---|
789 | is \*(lq\-\*(rq.
|
---|
790 | However,
|
---|
791 | .B FILENAME
|
---|
792 | is undefined inside the
|
---|
793 | .B BEGIN
|
---|
794 | block
|
---|
795 | (unless set by
|
---|
796 | .BR getline ).
|
---|
797 | .TP
|
---|
798 | .B FNR
|
---|
799 | The input record number in the current input file.
|
---|
800 | .TP
|
---|
801 | .B FS
|
---|
802 | The input field separator, a space by default. See
|
---|
803 | .BR Fields ,
|
---|
804 | above.
|
---|
805 | .TP
|
---|
806 | .B IGNORECASE
|
---|
807 | Controls the case-sensitivity of all regular expression
|
---|
808 | and string operations. If
|
---|
809 | .B IGNORECASE
|
---|
810 | has a non-zero value, then string comparisons and
|
---|
811 | pattern matching in rules,
|
---|
812 | field splitting with
|
---|
813 | .BR FS ,
|
---|
814 | record separating with
|
---|
815 | .BR RS ,
|
---|
816 | regular expression
|
---|
817 | matching with
|
---|
818 | .B ~
|
---|
819 | and
|
---|
820 | .BR !~ ,
|
---|
821 | and the
|
---|
822 | .BR gensub() ,
|
---|
823 | .BR gsub() ,
|
---|
824 | .BR index() ,
|
---|
825 | .BR match() ,
|
---|
826 | .BR split() ,
|
---|
827 | and
|
---|
828 | .B sub()
|
---|
829 | built-in functions all ignore case when doing regular expression
|
---|
830 | operations.
|
---|
831 | .B NOTE:
|
---|
832 | Array subscripting is
|
---|
833 | .I not
|
---|
834 | affected.
|
---|
835 | However, the
|
---|
836 | .B asort()
|
---|
837 | and
|
---|
838 | .B asorti()
|
---|
839 | functions are affected.
|
---|
840 | .sp .5
|
---|
841 | Thus, if
|
---|
842 | .B IGNORECASE
|
---|
843 | is not equal to zero,
|
---|
844 | .B /aB/
|
---|
845 | matches all of the strings \fB"ab"\fP, \fB"aB"\fP, \fB"Ab"\fP,
|
---|
846 | and \fB"AB"\fP.
|
---|
847 | As with all \*(AK variables, the initial value of
|
---|
848 | .B IGNORECASE
|
---|
849 | is zero, so all regular expression and string
|
---|
850 | operations are normally case-sensitive.
|
---|
851 | Under \*(UX, the full ISO 8859-1 Latin-1 character set is used
|
---|
852 | when ignoring case.
|
---|
853 | As of
|
---|
854 | .I gawk
|
---|
855 | 3.1.4, the case equivalencies are fully locale-aware, based on
|
---|
856 | the C
|
---|
857 | .B <ctype.h>
|
---|
858 | facilities such as
|
---|
859 | .BR isalpha() ,
|
---|
860 | and
|
---|
861 | .BR toupper() .
|
---|
862 | .TP
|
---|
863 | .B LINT
|
---|
864 | Provides dynamic control of the
|
---|
865 | .B \-\^\-lint
|
---|
866 | option from within an \*(AK program.
|
---|
867 | When true,
|
---|
868 | .I gawk
|
---|
869 | prints lint warnings. When false, it does not.
|
---|
870 | When assigned the string value \fB"fatal"\fP,
|
---|
871 | lint warnings become fatal errors, exactly like
|
---|
872 | .BR \-\^\-lint=fatal .
|
---|
873 | Any other true value just prints warnings.
|
---|
874 | .TP
|
---|
875 | .B NF
|
---|
876 | The number of fields in the current input record.
|
---|
877 | .TP
|
---|
878 | .B NR
|
---|
879 | The total number of input records seen so far.
|
---|
880 | .TP
|
---|
881 | .B OFMT
|
---|
882 | The output format for numbers, \fB"%.6g"\fR, by default.
|
---|
883 | .TP
|
---|
884 | .B OFS
|
---|
885 | The output field separator, a space by default.
|
---|
886 | .TP
|
---|
887 | .B ORS
|
---|
888 | The output record separator, by default a newline.
|
---|
889 | .TP
|
---|
890 | .B PROCINFO
|
---|
891 | The elements of this array provide access to information about the
|
---|
892 | running \*(AK program.
|
---|
893 | On some systems,
|
---|
894 | there may be elements in the array, \fB"group1"\fP through
|
---|
895 | \fB"group\fIn\fB"\fR for some
|
---|
896 | .IR n ,
|
---|
897 | which is the number of supplementary groups that the process has.
|
---|
898 | Use the
|
---|
899 | .B in
|
---|
900 | operator to test for these elements.
|
---|
901 | The following elements are guaranteed to be available:
|
---|
902 | .RS
|
---|
903 | .TP \w'\fBPROCINFO["pgrpid"]\fR'u+1n
|
---|
904 | \fBPROCINFO["egid"]\fP
|
---|
905 | the value of the
|
---|
906 | .IR getegid (2)
|
---|
907 | system call.
|
---|
908 | .TP
|
---|
909 | \fBPROCINFO["euid"]\fP
|
---|
910 | the value of the
|
---|
911 | .IR geteuid (2)
|
---|
912 | system call.
|
---|
913 | .TP
|
---|
914 | \fBPROCINFO["FS"]\fP
|
---|
915 | \fB"FS"\fP if field splitting with
|
---|
916 | .B FS
|
---|
917 | is in effect, or \fB"FIELDWIDTHS"\fP if field splitting with
|
---|
918 | .B FIELDWIDTHS
|
---|
919 | is in effect.
|
---|
920 | .TP
|
---|
921 | \fBPROCINFO["gid"]\fP
|
---|
922 | the value of the
|
---|
923 | .IR getgid (2)
|
---|
924 | system call.
|
---|
925 | .TP
|
---|
926 | \fBPROCINFO["pgrpid"]\fP
|
---|
927 | the process group ID of the current process.
|
---|
928 | .TP
|
---|
929 | \fBPROCINFO["pid"]\fP
|
---|
930 | the process ID of the current process.
|
---|
931 | .TP
|
---|
932 | \fBPROCINFO["ppid"]\fP
|
---|
933 | the parent process ID of the current process.
|
---|
934 | .TP
|
---|
935 | \fBPROCINFO["uid"]\fP
|
---|
936 | the value of the
|
---|
937 | .IR getuid (2)
|
---|
938 | system call.
|
---|
939 | .TP
|
---|
940 | \fBPROCINFO["version"]\fP
|
---|
941 | The version of
|
---|
942 | .IR gawk .
|
---|
943 | This is available in
|
---|
944 | version 3.1.4 and later.
|
---|
945 | .RE
|
---|
946 | .TP
|
---|
947 | .B RS
|
---|
948 | The input record separator, by default a newline.
|
---|
949 | .TP
|
---|
950 | .B RT
|
---|
951 | The record terminator.
|
---|
952 | .I Gawk
|
---|
953 | sets
|
---|
954 | .B RT
|
---|
955 | to the input text that matched the character or regular expression
|
---|
956 | specified by
|
---|
957 | .BR RS .
|
---|
958 | .TP
|
---|
959 | .B RSTART
|
---|
960 | The index of the first character matched by
|
---|
961 | .BR match() ;
|
---|
962 | 0 if no match.
|
---|
963 | (This implies that character indices start at one.)
|
---|
964 | .TP
|
---|
965 | .B RLENGTH
|
---|
966 | The length of the string matched by
|
---|
967 | .BR match() ;
|
---|
968 | \-1 if no match.
|
---|
969 | .TP
|
---|
970 | .B SUBSEP
|
---|
971 | The character used to separate multiple subscripts in array
|
---|
972 | elements, by default \fB"\e034"\fR.
|
---|
973 | .TP
|
---|
974 | .B TEXTDOMAIN
|
---|
975 | The text domain of the \*(AK program; used to find the localized
|
---|
976 | translations for the program's strings.
|
---|
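The relationship between match(), RSTART, and RLENGTH can be sketched as follows. This is a minimal illustration, not part of the original page: the function name match_demo is hypothetical, and awk is assumed to resolve to gawk or another POSIX-compatible awk.

```shell
# Locate a run of "o" characters and report where match() found it.
# Indices are 1-based; a failed match leaves RSTART = 0 and RLENGTH = -1.
match_demo() {
    awk 'BEGIN {
        s = "foobar"
        if (match(s, /o+/))
            print RSTART, RLENGTH, substr(s, RSTART, RLENGTH)
        match(s, /z/)               # no "z" in the string
        print RSTART, RLENGTH
    }'
}
match_demo
```

Passing RSTART and RLENGTH straight to substr() recovers the matched text, which is the usual reason for consulting the two variables together.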
977 | .SS Arrays
|
---|
978 | .PP
|
---|
979 | Arrays are subscripted with an expression between square brackets
|
---|
980 | .RB ( [ " and " ] ).
|
---|
981 | If the expression is an expression list
|
---|
982 | .RI ( expr ", " expr " .\|.\|.)"
|
---|
983 | then the array subscript is a string consisting of the
|
---|
984 | concatenation of the (string) value of each expression,
|
---|
985 | separated by the value of the
|
---|
986 | .B SUBSEP
|
---|
987 | variable.
|
---|
988 | This facility is used to simulate multiply dimensioned
|
---|
989 | arrays. For example:
|
---|
990 | .PP
|
---|
991 | .RS
|
---|
992 | .ft B
|
---|
993 | i = "A";\^ j = "B";\^ k = "C"
|
---|
994 | .br
|
---|
995 | x[i, j, k] = "hello, world\en"
|
---|
996 | .ft R
|
---|
997 | .RE
|
---|
998 | .PP
|
---|
999 | assigns the string \fB"hello, world\en"\fR to the element of the array
|
---|
1000 | .B x
|
---|
1001 | which is indexed by the string \fB"A\e034B\e034C"\fR. All arrays in \*(AK
|
---|
1002 | are associative, i.e., indexed by string values.
|
---|
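A short sketch of how a multiple subscript is stored and recovered, assuming awk is gawk or another POSIX-compatible awk. The function name subsep_demo and the array name part are illustrative only.

```shell
# Store one element under a three-part subscript, then split the
# composite key back into its parts using SUBSEP.
subsep_demo() {
    awk 'BEGIN {
        x["A", "B", "C"] = "hello"
        for (key in x) {
            # The key is the string "A" SUBSEP "B" SUBSEP "C".
            n = split(key, part, SUBSEP)
            print n, part[1], part[2], part[3]
        }
        if (("A", "B", "C") in x)
            print "found"
    }'
}
subsep_demo
```

Because the key is an ordinary string, a subscript value that itself contains the SUBSEP character would be split into extra parts; the default value was presumably chosen because it rarely appears in text.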
1003 | .PP
|
---|
1004 | The special operator
|
---|
1005 | .B in
|
---|
1006 | may be used in an
|
---|
1007 | .B if
|
---|
1008 | or
|
---|
1009 | .B while
|
---|
1010 | statement to see if an array has an index consisting of a particular
|
---|
1011 | value.
|
---|
1012 | .PP
|
---|
1013 | .RS
|
---|
1014 | .ft B
|
---|
1015 | .nf
|
---|
1016 | if (val in array)
|
---|
1017 | print array[val]
|
---|
1018 | .fi
|
---|
1019 | .ft
|
---|
1020 | .RE
|
---|
1021 | .PP
|
---|
1022 | If the array has multiple subscripts, use
|
---|
1023 | .BR "(i, j) in array" .
|
---|
1024 | .PP
|
---|
1025 | The
|
---|
1026 | .B in
|
---|
1027 | construct may also be used in a
|
---|
1028 | .B for
|
---|
1029 | loop to iterate over all the elements of an array.
|
---|
1030 | .PP
|
---|
1031 | An element may be deleted from an array using the
|
---|
1032 | .B delete
|
---|
1033 | statement.
|
---|
1034 | The
|
---|
1035 | .B delete
|
---|
1036 | statement may also be used to delete the entire contents of an array,
|
---|
1037 | just by specifying the array name without a subscript.
|
---|
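The membership test and both forms of delete can be tried with a sketch like this one. The function name array_demo is hypothetical, and awk is assumed to be gawk or a compatible awk that accepts delete with a bare array name.

```shell
# Exercise "in", "delete array[index]", and "delete array".
array_demo() {
    awk 'BEGIN {
        a["x"] = 1; a["y"] = 2
        delete a["x"]                 # remove one element
        has_x = ("x" in a)            # testing with "in" does not create a["x"]
        has_y = ("y" in a)
        print has_x, has_y
        delete a                      # remove everything that is left
        n = 0
        for (k in a) n++              # count the remaining elements
        print n
    }'
}
array_demo
```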
1038 | .SS Variable Typing And Conversion
|
---|
1039 | .PP
|
---|
1040 | Variables and fields
|
---|
1041 | may be (floating point) numbers, or strings, or both. How the
|
---|
1042 | value of a variable is interpreted depends upon its context. If used in
|
---|
1043 | a numeric expression, it will be treated as a number; if used as a string,
|
---|
1044 | it will be treated as a string.
|
---|
1045 | .PP
|
---|
1046 | To force a variable to be treated as a number, add 0 to it; to force it
|
---|
1047 | to be treated as a string, concatenate it with the null string.
|
---|
1048 | .PP
|
---|
1049 | When a string must be converted to a number, the conversion is accomplished
|
---|
1050 | using
|
---|
1051 | .IR strtod (3).
|
---|
1052 | A number is converted to a string by using the value of
|
---|
1053 | .B CONVFMT
|
---|
1054 | as a format string for
|
---|
1055 | .IR sprintf (3),
|
---|
1056 | with the numeric value of the variable as the argument.
|
---|
1057 | However, even though all numbers in \*(AK are floating-point,
|
---|
1058 | integral values are
|
---|
1059 | .I always
|
---|
1060 | converted as integers. Thus, given
|
---|
1061 | .PP
|
---|
1062 | .RS
|
---|
1063 | .ft B
|
---|
1064 | .nf
|
---|
1065 | CONVFMT = "%2.2f"
|
---|
1066 | a = 12
|
---|
1067 | b = a ""
|
---|
1068 | .fi
|
---|
1069 | .ft R
|
---|
1070 | .RE
|
---|
1071 | .PP
|
---|
1072 | the variable
|
---|
1073 | .B b
|
---|
1074 | has a string value of \fB"12"\fR and not \fB"12.00"\fR.
|
---|
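The integer exception can be checked next to a non-integral value, which does go through CONVFMT. This is a sketch under the assumption that awk is gawk or another POSIX-compatible awk; the function name convfmt_demo is illustrative.

```shell
# Compare how an integral and a non-integral number become strings.
convfmt_demo() {
    awk 'BEGIN {
        CONVFMT = "%2.2f"
        a = 12;   b = a ""    # integral: converted as an integer
        c = 12.5; d = c ""    # non-integral: formatted through CONVFMT
        print b
        print d
    }'
}
convfmt_demo
```

With these settings the first line printed should be 12 and the second 12.50.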
1075 | .PP
|
---|
1076 | .I Gawk
|
---|
1077 | performs comparisons as follows:
|
---|
1078 | If two variables are numeric, they are compared numerically.
|
---|
1079 | If one value is numeric and the other has a string value that is a
|
---|
1080 | \*(lqnumeric string,\*(rq then comparisons are also done numerically.
|
---|
1081 | Otherwise, the numeric value is converted to a string and a string
|
---|
1082 | comparison is performed.
|
---|
1083 | Two strings are compared, of course, as strings.
|
---|
1084 | Note that the \*(PX standard applies the concept of
|
---|
1085 | \*(lqnumeric string\*(rq everywhere, even to string constants.
|
---|
1086 | However, this is
|
---|
1087 | clearly incorrect, and
|
---|
1088 | .I gawk
|
---|
1089 | does not do this.
|
---|
1090 | (Fortunately, this is fixed in the next version of the standard.)
|
---|
1091 | .PP
|
---|
1092 | Note that string constants, such as \fB"57"\fP, are
|
---|
1093 | .I not
|
---|
1094 | numeric strings; they are string constants.
|
---|
1095 | The idea of \*(lqnumeric string\*(rq
|
---|
1096 | only applies to fields,
|
---|
1097 | .B getline
|
---|
1098 | input,
|
---|
1099 | .BR FILENAME ,
|
---|
1100 | .B ARGV
|
---|
1101 | elements,
|
---|
1102 | .B ENVIRON
|
---|
1103 | elements and the elements of an array created by
|
---|
1104 | .B split()
|
---|
1105 | that are numeric strings.
|
---|
1106 | The basic idea is that
|
---|
1107 | .IR "user input" ,
|
---|
1108 | and only user input, that looks numeric,
|
---|
1109 | should be treated that way.
|
---|
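The difference between numeric strings and string constants shows up when comparing 10 with 9. The sketch below assumes awk is gawk or an awk with the same comparison rules; the function name strnum_demo and the variable names are illustrative.

```shell
# Fields that look numeric compare as numbers; string constants
# with the same text compare character by character.
strnum_demo() {
    echo "10 9" | awk '{
        from_fields = ($1 > $2) ? "numeric" : "string"
        from_consts = ("10" > "9") ? "numeric" : "string"
        print from_fields, from_consts
    }'
}
strnum_demo
```

The fields yield a true result because 10 exceeds 9 numerically, while the constants yield false because the character 1 sorts before 9.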
1110 | .PP
|
---|
1111 | Uninitialized variables have the numeric value 0 and the string value ""
|
---|
1112 | (the null, or empty, string).
|
---|
1113 | .SS Octal and Hexadecimal Constants
|
---|
1114 | Starting with version 3.1 of
|
---|
1115 | .IR gawk ,
|
---|
1116 | you may use C-style octal and hexadecimal constants in your \*(AK
|
---|
1117 | program source code.
|
---|
1118 | For example, the octal value
|
---|
1119 | .B 011
|
---|
1120 | is equal to decimal
|
---|
1121 | .BR 9 ,
|
---|
1122 | and the hexadecimal value
|
---|
1123 | .B 0x11
|
---|
1124 | is equal to decimal 17.
|
---|
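A minimal check of the two constants named above. This is specific to gawk in its default mode, so the sketch only runs when a gawk binary is found; the function name radix_demo is hypothetical.

```shell
# Octal 011 and hexadecimal 0x11 written directly in program text.
radix_demo() {
    gawk 'BEGIN { print 011, 0x11, 011 + 0x11 }'
}
# Other awk implementations may read these constants differently,
# so skip the call when gawk is not installed.
if command -v gawk >/dev/null 2>&1; then
    radix_demo
fi
```

Under gawk this should print 9 17 26.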
1125 | .SS String Constants
|
---|
1126 | .PP
|
---|
1127 | String constants in \*(AK are sequences of characters enclosed
|
---|
1128 | between double quotes (\fB"\fR). Within strings, certain
|
---|
1129 | .I "escape sequences"
|
---|
1130 | are recognized, as in C. These are:
|
---|
1131 | .PP
|
---|
1132 | .TP "\w'\fB\e\^\fIddd\fR'u+1n"
|
---|
1133 | .B \e\e
|
---|
1134 | A literal backslash.
|
---|
1135 | .TP
|
---|
1136 | .B \ea
|
---|
1137 | The \*(lqalert\*(rq character; usually the \s-1ASCII\s+1 \s-1BEL\s+1 character.
|
---|
1138 | .TP
|
---|
1139 | .B \eb
|
---|
1140 | backspace.
|
---|
1141 | .TP
|
---|
1142 | .B \ef
|
---|
1143 | form-feed.
|
---|
1144 | .TP
|
---|
1145 | .B \en
|
---|
1146 | newline.
|
---|
1147 | .TP
|
---|
1148 | .B \er
|
---|
1149 | carriage return.
|
---|
1150 | .TP
|
---|
1151 | .B \et
|
---|
1152 | horizontal tab.
|
---|
1153 | .TP
|
---|
1154 | .B \ev
|
---|
1155 | vertical tab.
|
---|
1156 | .TP
|
---|
1157 | .BI \ex "\^hex digits"
|
---|
1158 | The character represented by the string of hexadecimal digits following
|
---|
1159 | the
|
---|
1160 | .BR \ex .
|
---|
1161 | As in \*(AN C, all following hexadecimal digits are considered part of
|
---|
1162 | the escape sequence.
|
---|
1163 | (This feature should tell us something about language design by committee.)
|
---|
1164 | E.g., \fB"\ex1B"\fR is the \s-1ASCII\s+1 \s-1ESC\s+1 (escape) character.
|
---|
1165 | .TP
|
---|
1166 | .BI \e ddd
|
---|
1167 | The character represented by the 1-, 2-, or 3-digit sequence of octal
|
---|
1168 | digits.
|
---|
1169 | E.g., \fB"\e033"\fR is the \s-1ASCII\s+1 \s-1ESC\s+1 (escape) character.
|
---|
1170 | .TP
|
---|
1171 | .BI \e c
|
---|
1172 | The literal character
|
---|
1173 | .IR c\^ .
|
---|
1174 | .PP
|
---|
1175 | The escape sequences may also be used inside constant regular expressions
|
---|
1176 | (e.g.,
|
---|
1177 | .B "/[\ \et\ef\en\er\ev]/"
|
---|
1178 | matches whitespace characters).
|
---|
1179 | .PP
|
---|
1180 | In compatibility mode, the characters represented by octal and
|
---|
1181 | hexadecimal escape sequences are treated literally when used in
|
---|
1182 | regular expression constants. Thus,
|
---|
1183 | .B /a\e52b/
|
---|
1184 | is equivalent to
|
---|
1185 | .BR /a\e*b/ .
|
---|
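A few of these escape sequences in use, as a sketch. It assumes an ASCII-based character set and an awk that is gawk or POSIX-compatible; the function name escape_demo is illustrative.

```shell
# Octal and single-letter escapes in strings and in a regexp constant.
escape_demo() {
    awk 'BEGIN {
        same = ("\101" == "A") ? "yes" : "no"   # octal 101 is ASCII "A"
        print same
        print length("a\tb")                    # the tab counts as one character
        if ("x\ty" ~ /x[ \t]y/)                 # \t also works inside a regexp
            print "whitespace matched"
    }'
}
escape_demo
```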
1186 | .SH PATTERNS AND ACTIONS
|
---|
1187 | \*(AK is a line-oriented language. The pattern comes first, and then the
|
---|
1188 | action. Action statements are enclosed in
|
---|
1189 | .B {
|
---|
1190 | and
|
---|
1191 | .BR } .
|
---|
1192 | Either the pattern may be missing, or the action may be missing, but,
|
---|
1193 | of course, not both. If the pattern is missing, the action is
|
---|
1194 | executed for every single record of input.
|
---|
1195 | A missing action is equivalent to
|
---|
1196 | .RS
|
---|
1197 | .PP
|
---|
1198 | .B "{ print }"
|
---|
1199 | .RE
|
---|
1200 | .PP
|
---|
1201 | which prints the entire record.
|
---|
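Both defaults can be seen with two one-line programs. This is a sketch assuming awk is gawk or another POSIX-compatible awk; the function name default_parts_demo is hypothetical.

```shell
# One program with only a pattern, one with only an action.
default_parts_demo() {
    printf 'alpha\nbeta\n' | awk '/et/'                  # pattern only: matching records are printed
    printf 'alpha\nbeta\n' | awk '{ print NR ": " $0 }'  # action only: runs for every record
}
default_parts_demo
```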
1202 | .PP
|
---|
1203 | Comments begin with the \*(lq#\*(rq character, and continue until the
|
---|
1204 | end of the line.
|
---|
1205 | Blank lines may be used to separate statements.
|
---|
1206 | Normally, a statement ends with a newline; however, this is not the
|
---|
1207 | case for lines ending in
|
---|
1208 | a \*(lq,\*(rq,
|
---|
1209 | .BR { ,
|
---|
1210 | .BR ? ,
|
---|
1211 | .BR : ,
|
---|
1212 | .BR && ,
|
---|
1213 | or
|
---|
1214 | .BR || .
|
---|
1215 | Lines ending in
|
---|
1216 | .B do
|
---|
1217 | or
|
---|
1218 | .B else
|
---|
1219 | also have their statements automatically continued on the following line.
|
---|
1220 | In other cases, a line can be continued by ending it with a \*(lq\e\*(rq,
|
---|
1221 | in which case the newline will be ignored.
|
---|
1222 | .PP
|
---|
1223 | Multiple statements may
|
---|
1224 | be put on one line by separating them with a \*(lq;\*(rq.
|
---|
1225 | This applies to both the statements within the action part of a
|
---|
1226 | pattern-action pair (the usual case),
|
---|
1227 | and to the pattern-action statements themselves.
|
---|
1228 | .SS Patterns
|
---|
1229 | \*(AK patterns may be one of the following:
|
---|
1230 | .PP
|
---|
1231 | .RS
|
---|
1232 | .nf
|
---|
1233 | .B BEGIN
|
---|
1234 | .B END
|
---|
1235 | .BI / "regular expression" /
|
---|
1236 | .I "relational expression"
|
---|
1237 | .IB pattern " && " pattern
|
---|
1238 | .IB pattern " || " pattern
|
---|
1239 | .IB pattern " ? " pattern " : " pattern
|
---|
1240 | .BI ( pattern )
|
---|
1241 | .BI ! " pattern"
|
---|
1242 | .IB pattern1 ", " pattern2
|
---|
1243 | .fi
|
---|
1244 | .RE
|
---|
1245 | .PP
|
---|
1246 | .B BEGIN
|
---|
1247 | and
|
---|
1248 | .B END
|
---|
1249 | are two special kinds of patterns which are not tested against
|
---|
1250 | the input.
|
---|
1251 | The action parts of all
|
---|
1252 | .B BEGIN
|
---|
1253 | patterns are merged as if all the statements had
|
---|
1254 | been written in a single
|
---|
1255 | .B BEGIN
|
---|
1256 | block. They are executed before any
|
---|
1257 | of the input is read. Similarly, all the
|
---|
1258 | .B END
|
---|
1259 | blocks are merged,
|
---|
1260 | and executed when all the input is exhausted (or when an
|
---|
1261 | .B exit
|
---|
1262 | statement is executed).
|
---|
1263 | .B BEGIN
|
---|
1264 | and
|
---|
1265 | .B END
|
---|
1266 | patterns cannot be combined with other patterns in pattern expressions.
|
---|
1267 | .B BEGIN
|
---|
1268 | and
|
---|
1269 | .B END
|
---|
1270 | patterns cannot have missing action parts.
|
---|
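The merging of BEGIN blocks and the timing of END can be observed with a sketch such as this. The function name begin_end_demo is illustrative, and awk is assumed to be gawk or another POSIX-compatible awk.

```shell
# Two BEGIN blocks run in source order before any input is read;
# the END block runs once the input is exhausted.
begin_end_demo() {
    printf 'x\ny\n' | awk '
        BEGIN { print "first BEGIN" }
        END   { print "records read:", NR }
        BEGIN { print "second BEGIN" }
    '
}
begin_end_demo
```

Although the END rule appears between the two BEGIN rules in the source, its output comes last.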
1271 | .PP
|
---|
1272 | For
|
---|
1273 | .BI / "regular expression" /
|
---|
1274 | patterns, the associated statement is executed for each input record that matches
|
---|
1275 | the regular expression.
|
---|
1276 | Regular expressions are the same as those in
|
---|
1277 | .IR egrep (1),
|
---|
1278 | and are summarized below.
|
---|
1279 | .PP
|
---|
1280 | A
|
---|
1281 | .I "relational expression"
|
---|
1282 | may use any of the operators defined below in the section on actions.
|
---|
1283 | These generally test whether certain fields match certain regular expressions.
|
---|
1284 | .PP
|
---|
1285 | The
|
---|
1286 | .BR && ,
|
---|
1287 | .BR || ,
|
---|
1288 | and
|
---|
1289 | .B !
|
---|
1290 | operators are logical AND, logical OR, and logical NOT, respectively, as in C.
|
---|
1291 | They do short-circuit evaluation, also as in C, and are used for combining
|
---|
1292 | more primitive pattern expressions. As in most languages, parentheses
|
---|
1293 | may be used to change the order of evaluation.
|
---|
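As an illustration of combining primitive patterns, the sketch below skips the first record and any record that starts with a comment character. The function name logic_demo and the sample input are made up for the example; awk is assumed to be gawk or POSIX-compatible.

```shell
# Print records after the first that do not begin with "#".
logic_demo() {
    printf '# header\nkeep 1\n# note\nkeep 2\n' | awk 'NR > 1 && !/^#/'
}
logic_demo
```

Because of short-circuit evaluation, the regular expression is not tested at all for the first record.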
1294 | .PP
|
---|
1295 | The
|
---|
1296 | .B ?\^:
|
---|
1297 | operator is like the same operator in C. If the first pattern is true
|
---|
1298 | then the pattern used for testing is the second pattern, otherwise it is
|
---|
1299 | the third. Only one of the second and third patterns is evaluated.
|
---|
1300 | .PP
|
---|
1301 | The
|
---|
1302 | .IB pattern1 ", " pattern2
|
---|
1303 | form of an expression is called a
|
---|
1304 | .IR "range pattern" .
|
---|
1305 | It matches all input records starting with a record that matches
|
---|
1306 | .IR pattern1 ,
|
---|
1307 | and continuing until a record that matches
|
---|
1308 | .IR pattern2 ,
|
---|
1309 | inclusive. It does not combine with any other sort of pattern expression.
|
---|
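A range pattern in its simplest form, sketched with made-up marker lines. The function name range_demo is hypothetical, and awk is assumed to be gawk or another POSIX-compatible awk.

```shell
# Print everything from the START record through the END record.
range_demo() {
    printf 'before\nSTART\ninside\nEND\nafter\n' | awk '/START/, /END/'
}
range_demo
```

Both marker records are included in the output, since the range is inclusive.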
1310 | .SS Regular Expressions
|
---|
1311 | Regular expressions are the extended kind found in
|
---|
1312 | .IR egrep .
|
---|
1313 | They are composed of characters as follows:
|
---|
1314 | .TP "\w'\fB[^\fIabc.\|.\|.\fB]\fR'u+2n"
|
---|
1315 | .I c
|
---|
1316 | matches the non-metacharacter
|
---|
1317 | .IR c .
|
---|
1318 | .TP
|
---|
1319 | .I \ec
|
---|
1320 | matches the literal character
|
---|
1321 | .IR c .
|
---|
1322 | .TP
|
---|
1323 | .B .
|
---|
1324 | matches any character
|
---|
1325 | .I including
|
---|
1326 | newline.
|
---|
1327 | .TP
|
---|
1328 | .B ^
|
---|
1329 | matches the beginning of a string.
|
---|
1330 | .TP
|
---|
1331 | .B $
|
---|
1332 | matches the end of a string.
|
---|
1333 | .TP
|
---|
1334 | .BI [ abc.\|.\|. ]
|
---|
1335 | character list, matches any of the characters
|
---|
1336 | .IR abc.\|.\|. .
|
---|
1337 | .TP
|
---|
1338 | .BI [^ abc.\|.\|. ]
|
---|
1339 | negated character list, matches any character except
|
---|
1340 | .IR abc.\|.\|. .
|
---|
1341 | .TP
|
---|
1342 | .IB r1 | r2
|
---|
1343 | alternation: matches either
|
---|
1344 | .I r1
|
---|
1345 | or
|
---|
1346 | .IR r2 .
|
---|
1347 | .TP
|
---|
1348 | .I r1r2
|
---|
1349 | concatenation: matches
|
---|
1350 | .IR r1 ,
|
---|
1351 | and then
|
---|
1352 | .IR r2 .
|
---|
1353 | .TP
|
---|
1354 | .IB r\^ +
|
---|
1355 | matches one or more
|
---|
1356 | .IR r\^ "'s."
|
---|
1357 | .TP
|
---|
1358 | .IB r *
|
---|
1359 | matches zero or more
|
---|
1360 | .IR r\^ "'s."
|
---|
1361 | .TP
|
---|
1362 | .IB r\^ ?
|
---|
1363 | matches zero or one
|
---|
1364 | .IR r\^ "'s."
|
---|
1365 | .TP
|
---|
1366 | .BI ( r )
|
---|
1367 | grouping: matches
|
---|
1368 | .IR r .
|
---|
1369 | .TP
|
---|
1370 | .PD 0
|
---|
1371 | .IB r { n }
|
---|
1372 | .TP
|
---|
1373 | .PD 0
|
---|
1374 | .IB r { n ,}
|
---|
1375 | .TP
|
---|
1376 | .PD
|
---|
1377 | .IB r { n , m }
|
---|
1378 | One or two numbers inside braces denote an
|
---|
1379 | .IR "interval expression" .
|
---|
1380 | If there is one number in the braces, the preceding regular expression
|
---|
1381 | .I r
|
---|
1382 | is repeated
|
---|
1383 | .I n
|
---|
1384 | times. If there are two numbers separated by a comma,
|
---|
1385 | .I r
|
---|
1386 | is repeated
|
---|
1387 | .I n
|
---|
1388 | to
|
---|
1389 | .I m
|
---|
1390 | times.
|
---|
1391 | If there is one number followed by a comma, then
|
---|
1392 | .I r
|
---|
1393 | is repeated at least
|
---|
1394 | .I n
|
---|
1395 | times.
|
---|
1396 | .sp .5
|
---|
1397 | Interval expressions are only available if either
|
---|
1398 | .B \-\^\-posix
|
---|
1399 | or
|
---|
1400 | .B \-\^\-re\-interval
|
---|
1401 | is specified on the command line.
|
---|
1402 | .TP
|
---|
1403 | .B \ey
|
---|
1404 | matches the empty string at either the beginning or the
|
---|
1405 | end of a word.
|
---|
1406 | .TP
|
---|
1407 | .B \eB
|
---|
1408 | matches the empty string within a word.
|
---|
1409 | .TP
|
---|
1410 | .B \e<
|
---|
1411 | matches the empty string at the beginning of a word.
|
---|
1412 | .TP
|
---|
1413 | .B \e>
|
---|
1414 | matches the empty string at the end of a word.
|
---|
1415 | .TP
|
---|
1416 | .B \ew
|
---|
1417 | matches any word-constituent character (letter, digit, or underscore).
|
---|
1418 | .TP
|
---|
1419 | .B \eW
|
---|
1420 | matches any character that is not word-constituent.
|
---|
1421 | .TP
|
---|
1422 | .B \e`
|
---|
1423 | matches the empty string at the beginning of a buffer (string).
|
---|
1424 | .TP
|
---|
1425 | .B \e'
|
---|
1426 | matches the empty string at the end of a buffer.
|
---|
1427 | .PP
|
---|
1428 | The escape sequences that are valid in string constants (see above)
|
---|
1429 | are also valid in regular expressions.
|
---|
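Several of the basic operators (repetition, concatenation, alternation, and anchors) can be exercised together. This is a sketch using only operators that are not gawk-specific; the function name regex_demo is illustrative, and awk is assumed to be gawk or another POSIX-compatible awk.

```shell
# Apply two substitutions to the record and report how many
# replacements each one made.
regex_demo() {
    echo "ab aab b" | awk '{
        n = gsub(/a+b/, "X")      # one or more "a" followed by "b"
        print n, $0
        m = gsub(/^X|b$/, "-")    # "X" at the start, or "b" at the end
        print m, $0
    }'
}
regex_demo
```

The second expression groups as (^X)|(b$), because alternation has lower precedence than concatenation.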
1430 | .PP
|
---|
1431 | .I "Character classes"
|
---|
1432 | are a new feature introduced in the \*(PX standard.
|
---|
1433 | A character class is a special notation for describing
|
---|
1434 | lists of characters that have a specific attribute, but where the
|
---|
1435 | actual characters themselves can vary from country to country and/or
|
---|
1436 | from character set to character set. For example, the notion of what
|
---|
1437 | is an alphabetic character differs in the USA and in France.
|
---|
1438 | .PP
|
---|
1439 | A character class is only valid in a regular expression
|
---|
1440 | .I inside
|
---|
1441 | the brackets of a character list. Character classes consist of
|
---|
1442 | .BR [: ,
|
---|
1443 | a keyword denoting the class, and
|
---|
1444 | .BR :] .
|
---|
1445 | The character
|
---|
1446 | classes defined by the \*(PX standard are:
|
---|
1447 | .TP "\w'\fB[:alnum:]\fR'u+2n"
|
---|
1448 | .B [:alnum:]
|
---|
1449 | Alphanumeric characters.
|
---|
1450 | .TP
|
---|
1451 | .B [:alpha:]
|
---|
1452 | Alphabetic characters.
|
---|
1453 | .TP
|
---|
1454 | .B [:blank:]
|
---|
1455 | Space or tab characters.
|
---|
1456 | .TP
|
---|
1457 | .B [:cntrl:]
|
---|
1458 | Control characters.
|
---|
1459 | .TP
|
---|
1460 | .B [:digit:]
|
---|
1461 | Numeric characters.
|
---|
1462 | .TP
|
---|
1463 | .B [:graph:]
|
---|
1464 | Characters that are both printable and visible.
|
---|
1465 | (A space is printable, but not visible, while an
|
---|
1466 | .B a
|
---|
1467 | is both.)
|
---|
1468 | .TP
|
---|
1469 | .B [:lower:]
|
---|
1470 | Lower-case alphabetic characters.
|
---|
1471 | .TP
|
---|
1472 | .B [:print:]
|
---|
1473 | Printable characters (characters that are not control characters).
|
---|
1474 | .TP
|
---|
1475 | .B [:punct:]
|
---|
1476 | Punctuation characters (characters that are not letters, digits,
|
---|
1477 | control characters, or space characters).
|
---|
1478 | .TP
|
---|
1479 | .B [:space:]
|
---|
1480 | Space characters (such as space, tab, and formfeed, to name a few).
|
---|
1481 | .TP
|
---|
1482 | .B [:upper:]
|
---|
1483 | Upper-case alphabetic characters.
|
---|
1484 | .TP
|
---|
1485 | .B [:xdigit:]
|
---|
1486 | Characters that are hexadecimal digits.
|
---|
1487 | .PP
|
---|
1488 | For example, before the \*(PX standard, to match alphanumeric
|
---|
1489 | characters, you would have had to write
|
---|
1490 | .BR /[A\-Za\-z0\-9]/ .
|
---|
1491 | If your character set had other alphabetic characters in it, this would not
|
---|
1492 | match them, and if your character set collated differently from
|
---|
1493 | \s-1ASCII\s+1, this might not even match the
|
---|
1494 | \s-1ASCII\s+1 alphanumeric characters.
|
---|
1495 | With the \*(PX character classes, you can write
|
---|
1496 | .BR /[[:alnum:]]/ ,
|
---|
1497 | and this matches
|
---|
1498 | the alphabetic and numeric characters in your character set.
|
---|
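A character class inside a negated character list, as a small sketch. It assumes an awk with \*(PX character class support (gawk in its default mode qualifies) and the C or an English locale; the function name class_demo is hypothetical.

```shell
# Delete every character that is not alphanumeric.
class_demo() {
    echo "abc-123_!?" | awk '{ gsub(/[^[:alnum:]]/, ""); print }'
}
class_demo
```

Note the doubled brackets: the outer pair is the character list, and the inner [:alnum:] is the class name within it.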
1499 | .PP
|
---|
1500 | Two additional special sequences can appear in character lists.
|
---|
1501 | These apply to non-\s-1ASCII\s+1 character sets, which can have single symbols
|
---|
1502 | (called
|
---|
1503 | .IR "collating elements" )
|
---|
1504 | that are represented with more than one
|
---|
1505 | character, as well as several characters that are equivalent for
|
---|
1506 | .IR collating ,
|
---|
1507 | or sorting, purposes. (E.g., in French, a plain \*(lqe\*(rq
|
---|
1508 | and a grave-accented e\` are equivalent.)
|
---|
1509 | .TP
|
---|
1510 | Collating Symbols
|
---|
1511 | A collating symbol is a multi-character collating element enclosed in
|
---|
1512 | .B [.
|
---|
1513 | and
|
---|
1514 | .BR .] .
|
---|
1515 | For example, if
|
---|
1516 | .B ch
|
---|
1517 | is a collating element, then
|
---|
1518 | .B [[.ch.]]
|
---|
1519 | is a regular expression that matches this collating element, while
|
---|
1520 | .B [ch]
|
---|
1521 | is a regular expression that matches either
|
---|
1522 | .B c
|
---|
1523 | or
|
---|
1524 | .BR h .
|
---|
1525 | .TP
|
---|
1526 | Equivalence Classes
|
---|
1527 | An equivalence class is a locale-specific name for a list of
|
---|
1528 | characters that are equivalent. The name is enclosed in
|
---|
1529 | .B [=
|
---|
1530 | and
|
---|
1531 | .BR =] .
|
---|
1532 | For example, the name
|
---|
1533 | .B e
|
---|
1534 | might be used to represent all of
|
---|
1535 | \*(lqe,\*(rq \*(lqe\h'-\w:e:u'\',\*(rq and \*(lqe\h'-\w:e:u'\`.\*(rq
|
---|
1536 | In this case,
|
---|
1537 | .B [[=e=]]
|
---|
1538 | is a regular expression
|
---|
1539 | that matches any of
|
---|
1540 | .BR e ,
|
---|
1541 | .BR "e\h'-\w:e:u'\'" ,
|
---|
1542 | or
|
---|
1543 | .BR "e\h'-\w:e:u'\`" .
|
---|
1544 | .PP
|
---|
1545 | These features are very valuable in non-English-speaking locales.
|
---|
1546 | The library functions that
|
---|
1547 | .I gawk
|
---|
1548 | uses for regular expression matching
|
---|
1549 | currently only recognize \*(PX character classes; they do not recognize
|
---|
1550 | collating symbols or equivalence classes.
|
---|
1551 | .PP
|
---|
1552 | The
|
---|
1553 | .BR \ey ,
|
---|
1554 | .BR \eB ,
|
---|
1555 | .BR \e< ,
|
---|
1556 | .BR \e> ,
|
---|
1557 | .BR \ew ,
|
---|
1558 | .BR \eW ,
|
---|
1559 | .BR \e` ,
|
---|
1560 | and
|
---|
1561 | .B \e'
|
---|
1562 | operators are specific to
|
---|
1563 | .IR gawk ;
|
---|
1564 | they are extensions based on facilities in the \*(GN regular expression libraries.
|
---|
1565 | .PP
|
---|
1566 | The various command line options
|
---|
1567 | control how
|
---|
1568 | .I gawk
|
---|
1569 | interprets characters in regular expressions.
|
---|
1570 | .TP
|
---|
1571 | No options
|
---|
1572 | In the default case,
|
---|
1573 | .I gawk
|
---|
1574 | provides all the facilities of
|
---|
1575 | \*(PX regular expressions and the \*(GN regular expression operators described above.
|
---|
1576 | However, interval expressions are not supported.
|
---|
1577 | .TP
|
---|
1578 | .B \-\^\-posix
|
---|
1579 | Only \*(PX regular expressions are supported; the \*(GN operators are not special.
|
---|
1580 | (E.g.,
|
---|
1581 | .B \ew
|
---|
1582 | matches a literal
|
---|
1583 | .BR w ).
|
---|
1584 | Interval expressions are allowed.
|
---|
1585 | .TP
|
---|
1586 | .B \-\^\-traditional
|
---|
1587 | Traditional Unix
|
---|
1588 | .I awk
|
---|
1589 | regular expressions are matched. The \*(GN operators
|
---|
1590 | are not special, interval expressions are not available, and neither
|
---|
1591 | are the \*(PX character classes
|
---|
1592 | .RB ( [[:alnum:]]
|
---|
1593 | and so on).
|
---|
1594 | Characters described by octal and hexadecimal escape sequences are
|
---|
1595 | treated literally, even if they represent regular expression metacharacters.
|
---|
1596 | .TP
|
---|
1597 | .B \-\^\-re\-interval
|
---|
1598 | Allow interval expressions in regular expressions, even if
|
---|
1599 | .B \-\^\-traditional
|
---|
1600 | has been provided.
|
---|
1601 | .SS Actions
|
---|
1602 | Action statements are enclosed in braces,
|
---|
1603 | .B {
|
---|
1604 | and
|
---|
1605 | .BR } .
|
---|
1606 | Action statements consist of the usual assignment, conditional, and looping
|
---|
1607 | statements found in most languages. The operators, control statements,
|
---|
1608 | and input/output statements
|
---|
1609 | available are patterned after those in C.
|
---|
1610 | .SS Operators
|
---|
1611 | .PP
|
---|
1612 | The operators in \*(AK, in order of decreasing precedence, are
|
---|
1613 | .PP
|
---|
1614 | .TP "\w'\fB*= /= %= ^=\fR'u+1n"
|
---|
1615 | .BR ( \&.\|.\|. )
|
---|
1616 | Grouping
|
---|
1617 | .TP
|
---|
1618 | .B $
|
---|
1619 | Field reference.
|
---|
1620 | .TP
|
---|
1621 | .B "++ \-\^\-"
|
---|
1622 | Increment and decrement, both prefix and postfix.
|
---|
1623 | .TP
|
---|
1624 | .B ^
|
---|
1625 | Exponentiation (\fB**\fR may also be used, and \fB**=\fR for
|
---|
1626 | the assignment operator).
|
---|
1627 | .TP
|
---|
1628 | .B "+ \- !"
|
---|
1629 | Unary plus, unary minus, and logical negation.
|
---|
1630 | .TP
|
---|
1631 | .B "* / %"
|
---|
1632 | Multiplication, division, and modulus.
|
---|
1633 | .TP
|
---|
1634 | .B "+ \-"
|
---|
1635 | Addition and subtraction.
|
---|
1636 | .TP
|
---|
1637 | .I space
|
---|
1638 | String concatenation.
|
---|
1639 | .TP
|
---|
1640 | .PD 0
|
---|
1641 | .B "< >"
|
---|
1642 | .TP
|
---|
1643 | .PD 0
|
---|
1644 | .B "<= >="
|
---|
1645 | .TP
|
---|
1646 | .PD
|
---|
1647 | .B "!= =="
|
---|
1648 | The regular relational operators.
|
---|
1649 | .TP
|
---|
1650 | .B "~ !~"
|
---|
1651 | Regular expression match, negated match.
|
---|
1652 | .B NOTE:
|
---|
1653 | Do not use a constant regular expression
|
---|
1654 | .RB ( /foo/ )
|
---|
1655 | on the left-hand side of a
|
---|
1656 | .B ~
|
---|
1657 | or
|
---|
1658 | .BR !~ .
|
---|
1659 | Only use one on the right-hand side. The expression
|
---|
1660 | .BI "/foo/ ~ " exp
|
---|
1661 | has the same meaning as \fB(($0 ~ /foo/) ~ \fIexp\fB)\fR.
|
---|
1662 | This is usually
|
---|
1663 | .I not
|
---|
1664 | what was intended.
|
---|
1665 | .TP
|
---|
1666 | .B in
|
---|
1667 | Array membership.
|
---|
1668 | .TP
|
---|
1669 | .B &&
|
---|
1670 | Logical AND.
|
---|
1671 | .TP
|
---|
1672 | .B ||
|
---|
1673 | Logical OR.
|
---|
1674 | .TP
|
---|
1675 | .B ?:
|
---|
1676 | The C conditional expression. This has the form
|
---|
1677 | .IB expr1 " ? " expr2 " : " expr3\c
|
---|
1678 | \&.
|
---|
1679 | If
|
---|
1680 | .I expr1
|
---|
1681 | is true, the value of the expression is
|
---|
1682 | .IR expr2 ,
|
---|
1683 | otherwise it is
|
---|
1684 | .IR expr3 .
|
---|
1685 | Only one of
|
---|
1686 | .I expr2
|
---|
1687 | and
|
---|
1688 | .I expr3
|
---|
1689 | is evaluated.
|
---|
1690 | .TP
|
---|
1691 | .PD 0
|
---|
1692 | .B "= += \-="
|
---|
1693 | .TP
|
---|
1694 | .PD
|
---|
1695 | .B "*= /= %= ^="
|
---|
1696 | Assignment. Both absolute assignment
|
---|
1697 | .BI ( var " = " value )
|
---|
1698 | and operator-assignment (the other forms) are supported.
|
---|
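Three consequences of the precedence order, sketched with small expressions. The function name precedence_demo and the variable names are illustrative; awk is assumed to be gawk or another POSIX-compatible awk.

```shell
# Show how precedence groups operands when no parentheses are given.
precedence_demo() {
    awk 'BEGIN {
        base = 2
        print -base ^ 2       # ^ binds tighter than unary minus: -(base ^ 2)
        print 1 " " 2 + 3     # + binds tighter than concatenation: 1 " " (2 + 3)
        x = 5
        x += 2 * 3            # operator-assignment: x = x + (2 * 3)
        print x
    }'
}
precedence_demo
```

When in doubt, explicit parentheses make the intended grouping clear to both the reader and the parser.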
1699 | .SS Control Statements
|
---|
1700 | .PP
|
---|
1701 | The control statements are
|
---|
1702 | as follows:
|
---|
1703 | .PP
|
---|
1704 | .RS
|
---|
1705 | .nf
|
---|
1706 | \fBif (\fIcondition\fB) \fIstatement\fR [ \fBelse\fI statement \fR]
|
---|
1707 | \fBwhile (\fIcondition\fB) \fIstatement \fR
|
---|
1708 | \fBdo \fIstatement \fBwhile (\fIcondition\fB)\fR
|
---|
1709 | \fBfor (\fIexpr1\fB; \fIexpr2\fB; \fIexpr3\fB) \fIstatement\fR
|
---|
1710 | \fBfor (\fIvar \fBin\fI array\fB) \fIstatement\fR
|
---|
1711 | \fBbreak\fR
|
---|
1712 | \fBcontinue\fR
|
---|
1713 | \fBdelete \fIarray\^\fB[\^\fIindex\^\fB]\fR
|
---|
1714 | \fBdelete \fIarray\^\fR
|
---|
1715 | \fBexit\fR [ \fIexpression\fR ]
|
---|
1716 | \fB{ \fIstatements \fB}\fR
|
---|
1717 | .fi
|
---|
1718 | .RE
|
---|
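A sketch combining several of these statements: a for loop with continue and break, followed by a do loop. The function name control_demo and the variable names are made up for the example; awk is assumed to be gawk or another POSIX-compatible awk.

```shell
# Collect the odd numbers up to 7, then count down from 3.
control_demo() {
    awk 'BEGIN {
        for (i = 1; i <= 10; i++) {
            if (i % 2 == 0) continue      # skip even values
            if (i > 7) break              # stop once past 7
            sep = (odd == "") ? "" : " "
            odd = odd sep i
        }
        print odd
        n = 3
        do {
            countdown = countdown n       # body runs before the test
            n--
        } while (n > 0)
        print countdown
    }'
}
control_demo
```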
1719 | .SS "I/O Statements"
|
---|
1720 | .PP
|
---|
1721 | The input/output statements are as follows:
|
---|
1722 | .PP
|
---|
1723 | .TP "\w'\fBprintf \fIfmt, expr-list\fR'u+1n"
|
---|
1724 | \fBclose(\fIfile \fR[\fB, \fIhow\fR]\fB)\fR
|
---|
1725 | Close file, pipe or co-process.
|
---|
1726 | The optional
|
---|
1727 | .I how
|
---|
1728 | should only be used when closing one end of a
|
---|
1729 | two-way pipe to a co-process.
|
---|
1730 | It must be a string value, either
|
---|
1731 | \fB"to"\fR or \fB"from"\fR.
|
---|
1732 | .TP
|
---|
1733 | .B getline
|
---|
1734 | Set
|
---|
1735 | .B $0
|
---|
1736 | from next input record; set
|
---|
1737 | .BR NF ,
|
---|
1738 | .BR NR ,
|
---|
1739 | .BR FNR .
|
---|
1740 | .TP
|
---|
1741 | .BI "getline <" file
|
---|
1742 | Set
|
---|
1743 | .B $0
|
---|
1744 | from next record of
|
---|
1745 | .IR file ;
|
---|
1746 | set
|
---|
1747 | .BR NF .
|
---|
1748 | .TP
|
---|
1749 | .BI getline " var"
|
---|
1750 | Set
|
---|
1751 | .I var
|
---|
1752 | from next input record; set
|
---|
1753 | .BR NR ,
|
---|
1754 | .BR FNR .
|
---|
1755 | .TP
|
---|
1756 | .BI getline " var" " <" file
|
---|
1757 | Set
|
---|
1758 | .I var
|
---|
1759 | from next record of
|
---|
1760 | .IR file .
|
---|
1761 | .TP
|
---|
1762 | \fIcommand\fB | getline \fR[\fIvar\fR]
|
---|
1763 | Run
|
---|
1764 | .I command
|
---|
1765 | piping the output either into
|
---|
1766 | .B $0
|
---|
1767 | or
|
---|
1768 | .IR var ,
|
---|
1769 | as above.
|
---|
1770 | .TP
|
---|
1771 | \fIcommand\fB |& getline \fR[\fIvar\fR]
|
---|
1772 | Run
|
---|
1773 | .I command
|
---|
1774 | as a co-process
|
---|
1775 | piping the output either into
|
---|
1776 | .B $0
|
---|
1777 | or
|
---|
1778 | .IR var ,
|
---|
1779 | as above.
|
---|
1780 | Co-processes are a
|
---|
1781 | .I gawk
|
---|
1782 | extension.
|
---|
1783 | .TP
|
---|
1784 | .B next
|
---|
1785 | Stop processing the current input record. The next input record
|
---|
1786 | is read and processing starts over with the first pattern in the
|
---|
1787 | \*(AK program. If the end of the input data is reached, the
|
---|
1788 | .B END
|
---|
1789 | block(s), if any, are executed.
|
---|
1790 | .TP
|
---|
1791 | .B "nextfile"
|
---|
1792 | Stop processing the current input file. The next input record read
|
---|
1793 | comes from the next input file.
|
---|
1794 | .B FILENAME
|
---|
1795 | and
|
---|
1796 | .B ARGIND
|
---|
1797 | are updated,
|
---|
1798 | .B FNR
|
---|
1799 | is reset to 1, and processing starts over with the first pattern in the
|
---|
1800 | \*(AK program. If the end of the input data is reached, the
|
---|
1801 | .B END
|
---|
1802 | block(s), if any, are executed.
|
---|
1803 | .TP
|
---|
1804 | .B print
|
---|
1805 | Prints the current record.
|
---|
1806 | The output record is terminated with the value of the
|
---|
1807 | .B ORS
|
---|
1808 | variable.
|
---|
1809 | .TP
|
---|
1810 | .BI print " expr-list"
|
---|
1811 | Prints expressions.
|
---|
1812 | Each expression is separated by the value of the
|
---|
1813 | .B OFS
|
---|
1814 | variable.
|
---|
1815 | The output record is terminated with the value of the
|
---|
1816 | .B ORS
|
---|
1817 | variable.
|
---|
1818 | .TP
|
---|
1819 | .BI print " expr-list" " >" file
|
---|
1820 | Prints expressions on
|
---|
1821 | .IR file .
|
---|
1822 | Each expression is separated by the value of the
|
---|
1823 | .B OFS
|
---|
1824 | variable. The output record is terminated with the value of the
|
---|
1825 | .B ORS
|
---|
1826 | variable.
|
---|
1827 | .TP
|
---|
1828 | .BI printf " fmt, expr-list"
|
---|
1829 | Format and print.
|
---|
1830 | .TP
|
---|
1831 | .BI printf " fmt, expr-list" " >" file
|
---|
1832 | Format and print on
|
---|
1833 | .IR file .
|
---|
1834 | .TP
|
---|
1835 | .BI system( cmd-line )
|
---|
1836 | Execute the command
|
---|
1837 | .IR cmd-line ,
|
---|
1838 | and return the exit status.
|
---|
1839 | (This may not be available on non-\*(PX systems.)
|
---|
1840 | .TP
|
---|
1841 | \&\fBfflush(\fR[\fIfile\^\fR]\fB)\fR
|
---|
1842 | Flush any buffers associated with the open output file or pipe
|
---|
1843 | .IR file .
|
---|
1844 | If
|
---|
1845 | .I file
|
---|
1846 | is missing, then standard output is flushed.
|
---|
1847 | If
|
---|
1848 | .I file
|
---|
1849 | is the null string,
|
---|
1850 | then all open output files and pipes
|
---|
1851 | have their buffers flushed.
|
---|
1852 | .PP
|
---|
1853 | Additional output redirections are allowed for
|
---|
1854 | .B print
|
---|
1855 | and
|
---|
1856 | .BR printf .
|
---|
1857 | .TP
|
---|
1858 | .BI "print .\|.\|. >>" " file"
|
---|
1859 | appends output to the
|
---|
1860 | .IR file .
|
---|
1861 | .TP
|
---|
1862 | .BI "print .\|.\|. |" " command"
|
---|
1863 | writes on a pipe.
|
---|
1864 | .TP
|
---|
1865 | .BI "print .\|.\|. |&" " command"
|
---|
1866 | sends data to a co-process.
|
---|
1867 | .PP
|
---|
1868 | The
|
---|
1869 | .BR getline
|
---|
1870 | command returns 0 on end of file and \-1 on an error.
|
---|
1871 | Upon an error,
|
---|
1872 | .B ERRNO
|
---|
1873 | contains a string describing the problem.
|
---|
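As an illustration of these return values, here is a minimal shell sketch that wraps a one-line awk program. The function name read_status and the file names are made up for the example. Only POSIX features are used, so the command is written as plain awk; gawk should behave the same way.

```shell
# read_status FILE: report what a single getline from FILE returned.
read_status() {
    awk -v f="$1" 'BEGIN {
        # 1 = a record was read, 0 = end of file, -1 = error
        status = (getline line < f)
        if (status < 0)
            print "error"
        else if (status == 0)
            print "end of file"
        else
            print "read: " line
    }'
}

read_status /dev/null            # empty input
read_status /nonexistent/file    # cannot be opened
```

Parenthesizing the getline expression before comparing or assigning its result avoids ambiguity with the < redirection.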
1874 | .PP
|
---|
1875 | .B NOTE:
|
---|
1876 | If using a pipe or co-process to
|
---|
1877 | .BR getline ,
|
---|
1878 | or from
|
---|
1879 | .B print
|
---|
1880 | or
|
---|
1881 | .B printf
|
---|
1882 | within a loop, you
|
---|
1883 | .I must
|
---|
1884 | use
|
---|
1885 | .B close()
|
---|
1886 | to create new instances of the command.
|
---|
1887 | \*(AK does not automatically close pipes or co-processes when
|
---|
1888 | they return EOF.
|
---|
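The note above can be sketched as follows. The function name count_lines_twice and the command string are assumptions chosen for the example; the program reads the output of the same command on two passes and calls close() between them so that the command is run again.

```shell
# count_lines_twice: read the output of one command on two passes.
count_lines_twice() {
    awk 'BEGIN {
        cmd = "echo one; echo two"
        for (pass = 1; pass <= 2; pass++) {
            n = 0
            while ((cmd | getline line) > 0)
                n++
            close(cmd)    # without this, pass 2 would see EOF immediately
            print "pass " pass ": " n " lines"
        }
    }'
}

count_lines_twice
```

The string passed to close() must be the same string that was used to open the pipe, which is why the command is kept in a variable.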
1889 | .SS The \fIprintf\fP\^ Statement
|
---|
1890 | .PP
|
---|
1891 | The \*(AK versions of the
|
---|
1892 | .B printf
|
---|
1893 | statement and
|
---|
1894 | .B sprintf()
|
---|
1895 | function
|
---|
1896 | (see below)
|
---|
1897 | accept the following conversion specification formats:
|
---|
1898 | .TP "\w'\fB%g\fR, \fB%G\fR'u+2n"
|
---|
1899 | .B %c
|
---|
1900 | An \s-1ASCII\s+1 character.
|
---|
1901 | If the argument used for
|
---|
1902 | .B %c
|
---|
1903 | is numeric, it is treated as a character and printed.
|
---|
1904 | Otherwise, the argument is assumed to be a string, and only the first
|
---|
1905 | character of that string is printed.
|
---|
1906 | .TP
|
---|
1907 | .BR "%d" "," " %i"
|
---|
1908 | A decimal number (the integer part).
|
---|
1909 | .TP
|
---|
1910 | .B %e , " %E"
|
---|
1911 | A floating point number of the form
|
---|
1912 | .BR [\-]d.dddddde[+\^\-]dd .
|
---|
1913 | The
|
---|
1914 | .B %E
|
---|
1915 | format uses
|
---|
1916 | .B E
|
---|
1917 | instead of
|
---|
1918 | .BR e .
|
---|
1919 | .TP
|
---|
1920 | .B %f
|
---|
1921 | A floating point number of the form
|
---|
1922 | .BR [\-]ddd.dddddd .
|
---|
1923 | .TP
|
---|
1924 | .B %g , " %G"
|
---|
1925 | Use
|
---|
1926 | .B %e
|
---|
1927 | or
|
---|
1928 | .B %f
|
---|
1929 | conversion, whichever is shorter, with nonsignificant zeros suppressed.
|
---|
1930 | The
|
---|
1931 | .B %G
|
---|
1932 | format uses
|
---|
1933 | .B %E
|
---|
1934 | instead of
|
---|
1935 | .BR %e .
|
---|
1936 | .TP
|
---|
1937 | .B %o
|
---|
1938 | An unsigned octal number (also an integer).
|
---|
1939 | .TP
|
---|
1940 | .PD
|
---|
1941 | .B %u
|
---|
1942 | An unsigned decimal number (again, an integer).
|
---|
1943 | .TP
|
---|
1944 | .B %s
|
---|
1945 | A character string.
|
---|
1946 | .TP
|
---|
1947 | .B %x , " %X"
|
---|
1948 | An unsigned hexadecimal number (an integer).
|
---|
1949 | The
|
---|
1950 | .B %X
|
---|
1951 | format uses
|
---|
1952 | .B ABCDEF
|
---|
1953 | instead of
|
---|
1954 | .BR abcdef .
|
---|
1955 | .TP
|
---|
1956 | .B %%
|
---|
1957 | A single
|
---|
1958 | .B %
|
---|
1959 | character; no argument is converted.
|
---|
1960 | .PP
|
---|
1961 | .BR NOTE :
|
---|
1962 | When using the integer format-control letters for values that are
|
---|
1963 | outside the range of a C
|
---|
1964 | .B long
|
---|
1965 | integer,
|
---|
1966 | .I gawk
|
---|
1967 | switches to the
|
---|
1968 | .B %g
|
---|
1969 | format specifier. If
|
---|
1970 | .B \-\^\-lint
|
---|
1971 | is provided on the command line,
|
---|
1972 | .I gawk
|
---|
1973 | warns about this. Other versions of
|
---|
1974 | .I awk
|
---|
1975 | may print invalid values or do something else entirely.
|
---|
1976 | .PP
|
---|
1977 | Optional, additional parameters may lie between the
|
---|
1978 | .B %
|
---|
1979 | and the control letter:
|
---|
1980 | .TP
|
---|
1981 | .IB count $
|
---|
1982 | Use the
|
---|
1983 | .IR count "'th"
|
---|
1984 | argument at this point in the formatting.
|
---|
1985 | This is called a
|
---|
1986 | .I "positional specifier"
|
---|
1987 | and
|
---|
1988 | is intended primarily for use in translated versions of
|
---|
1989 | format strings, not in the original text of an AWK program.
|
---|
1990 | It is a
|
---|
1991 | .I gawk
|
---|
1992 | extension.
|
---|
1993 | .TP
|
---|
1994 | .B \-
|
---|
1995 | The expression should be left-justified within its field.
|
---|
1996 | .TP
|
---|
1997 | .I space
|
---|
1998 | For numeric conversions, prefix positive values with a space, and
|
---|
1999 | negative values with a minus sign.
|
---|
2000 | .TP
|
---|
2001 | .B +
|
---|
2002 | The plus sign, used before the width modifier (see below),
|
---|
2003 | says to always supply a sign for numeric conversions, even if the data
|
---|
2004 | to be formatted is positive. The
|
---|
2005 | .B +
|
---|
2006 | overrides the space modifier.
|
---|
2007 | .TP
|
---|
2008 | .B #
|
---|
2009 | Use an \*(lqalternate form\*(rq for certain control letters.
|
---|
2010 | For
|
---|
2011 | .BR %o ,
|
---|
2012 | supply a leading zero.
|
---|
2013 | For
|
---|
2014 | .BR %x ,
|
---|
2015 | and
|
---|
2016 | .BR %X ,
|
---|
2017 | supply a leading
|
---|
2018 | .BR 0x
|
---|
2019 | or
|
---|
2020 | .BR 0X
|
---|
2021 | for
|
---|
2022 | a nonzero result.
|
---|
2023 | For
|
---|
2024 | .BR %e ,
|
---|
2025 | .BR %E ,
|
---|
2026 | and
|
---|
2027 | .BR %f ,
|
---|
2028 | the result always contains a
|
---|
2029 | decimal point.
|
---|
2030 | For
|
---|
2031 | .BR %g ,
|
---|
2032 | and
|
---|
2033 | .BR %G ,
|
---|
2034 | trailing zeros are not removed from the result.
|
---|
2035 | .TP
|
---|
2036 | .B 0
|
---|
2037 | A leading
|
---|
2038 | .B 0
|
---|
2039 | (zero) acts as a flag that indicates output should be
|
---|
2040 | padded with zeroes instead of spaces.
|
---|
2041 | This applies even to non-numeric output formats.
|
---|
2042 | This flag only has an effect when the field width is wider than the
|
---|
2043 | value to be printed.
|
---|
2044 | .TP
|
---|
2045 | .I width
|
---|
2046 | The field should be padded to this width. The field is normally padded
|
---|
2047 | with spaces. If the
|
---|
2048 | .B 0
|
---|
2049 | flag has been used, it is padded with zeroes.
|
---|
2050 | .TP
|
---|
2051 | .BI \&. prec
|
---|
2052 | A number that specifies the precision to use when printing.
|
---|
2053 | For the
|
---|
2054 | .BR %e ,
|
---|
2055 | .BR %E ,
|
---|
2056 | and
|
---|
2057 | .BR %f
|
---|
2058 | formats, this specifies the
|
---|
2059 | number of digits you want printed to the right of the decimal point.
|
---|
2060 | For the
|
---|
2061 | .BR %g ,
|
---|
2062 | and
|
---|
2063 | .B %G
|
---|
2064 | formats, it specifies the maximum number
|
---|
2065 | of significant digits. For the
|
---|
2066 | .BR %d ,
|
---|
2067 | .BR %o ,
|
---|
2068 | .BR %i ,
|
---|
2069 | .BR %u ,
|
---|
2070 | .BR %x ,
|
---|
2071 | and
|
---|
2072 | .B %X
|
---|
2073 | formats, it specifies the minimum number of
|
---|
2074 | digits to print. For
|
---|
2075 | .BR %s ,
|
---|
2076 | it specifies the maximum number of
|
---|
2077 | characters from the string that should be printed.
|
---|
2078 | .PP
|
---|
2079 | The dynamic
|
---|
2080 | .I width
|
---|
2081 | and
|
---|
2082 | .I prec
|
---|
2083 | capabilities of the \*(AN C
|
---|
2084 | .B printf()
|
---|
2085 | routines are supported.
|
---|
2086 | A
|
---|
2087 | .B *
|
---|
2088 | in place of either the
|
---|
2089 | .B width
|
---|
2090 | or
|
---|
2091 | .B prec
|
---|
2092 | specifications causes their values to be taken from
|
---|
2093 | the argument list to
|
---|
2094 | .B printf
|
---|
2095 | or
|
---|
2096 | .BR sprintf() .
|
---|
2097 | To use a positional specifier with a dynamic width or precision,
|
---|
2098 | supply the
|
---|
2099 | .IB count $
|
---|
2100 | after the
|
---|
2101 | .B *
|
---|
2102 | in the format string.
|
---|
2103 | For example, \fB"%3$*2$.*1$s"\fP.
|
---|
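A short sketch of the flags, width, and precision described above. The function name show_formats and the sample values are illustrative only; the conversions used here are the POSIX ones, so plain awk is used.

```shell
# show_formats: exercise a few printf flags, widths and precisions.
show_formats() {
    awk 'BEGIN {
        # width, left-justify, zero padding, forced sign
        printf "[%5d] [%-5d] [%05d] [%+d]\n", 42, 42, 42, 42
        # precision for %f, %e and %s
        printf "[%.2f] [%.3e] [%.3s]\n", 3.14159, 31415.9, "abcdef"
        # hexadecimal, alternate-form octal, first character of a string
        printf "[%x] [%X] [%#o] [%c]\n", 255, 255, 8, "hello"
    }'
}

show_formats
```

The brackets are there only to make the padding visible in the output.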
2104 | .SS Special File Names
|
---|
2105 | .PP
|
---|
2106 | When doing I/O redirection from either
|
---|
2107 | .B print
|
---|
2108 | or
|
---|
2109 | .B printf
|
---|
2110 | into a file,
|
---|
2111 | or via
|
---|
2112 | .B getline
|
---|
2113 | from a file,
|
---|
2114 | .I gawk
|
---|
2115 | recognizes certain special filenames internally. These filenames
|
---|
2116 | allow access to open file descriptors inherited from
|
---|
2117 | .IR gawk\^ "'s"
|
---|
2118 | parent process (usually the shell).
|
---|
2119 | These file names may also be used on the command line to name data files.
|
---|
2120 | The filenames are:
|
---|
2121 | .TP "\w'\fB/dev/stdout\fR'u+1n"
|
---|
2122 | .B /dev/stdin
|
---|
2123 | The standard input.
|
---|
2124 | .TP
|
---|
2125 | .B /dev/stdout
|
---|
2126 | The standard output.
|
---|
2127 | .TP
|
---|
2128 | .B /dev/stderr
|
---|
2129 | The standard error output.
|
---|
2130 | .TP
|
---|
2131 | .BI /dev/fd/\^ n
|
---|
2132 | The file associated with the open file descriptor
|
---|
2133 | .IR n .
|
---|
2134 | .PP
|
---|
2135 | These are particularly useful for error messages. For example:
|
---|
2136 | .PP
|
---|
2137 | .RS
|
---|
2138 | .ft B
|
---|
2139 | print "You blew it!" > "/dev/stderr"
|
---|
2140 | .ft R
|
---|
2141 | .RE
|
---|
2142 | .PP
|
---|
2143 | whereas you would otherwise have to use
|
---|
2144 | .PP
|
---|
2145 | .RS
|
---|
2146 | .ft B
|
---|
2147 | print "You blew it!" | "cat 1>&2"
|
---|
2148 | .ft R
|
---|
2149 | .RE
|
---|
2150 | .PP
|
---|
2151 | The following special filenames may be used with the
|
---|
2152 | .B |&
|
---|
2153 | co-process operator for creating TCP/IP network connections.
|
---|
2154 | .TP "\w'\fB/inet/tcp/\fIlport\fB/\fIrhost\fB/\fIrport\fR'u+2n"
|
---|
2155 | .BI /inet/tcp/ lport / rhost / rport
|
---|
2156 | File for TCP/IP connection on local port
|
---|
2157 | .I lport
|
---|
2158 | to
|
---|
2159 | remote host
|
---|
2160 | .I rhost
|
---|
2161 | on remote port
|
---|
2162 | .IR rport .
|
---|
2163 | Use a port of
|
---|
2164 | .B 0
|
---|
2165 | to have the system pick a port.
|
---|
2166 | .TP
|
---|
2167 | .BI /inet/udp/ lport / rhost / rport
|
---|
2168 | Similar, but use UDP/IP instead of TCP/IP.
|
---|
2169 | .TP
|
---|
2170 | .BI /inet/raw/ lport / rhost / rport
|
---|
2171 | .\" Similar, but use raw IP sockets.
|
---|
2172 | Reserved for future use.
|
---|
2173 | .PP
|
---|
2174 | Other special filenames provide access to information about the running
|
---|
2175 | .I gawk
|
---|
2176 | process.
|
---|
2177 | .B "These filenames are now obsolete."
|
---|
2178 | Use the
|
---|
2179 | .B PROCINFO
|
---|
2180 | array to obtain the information they provide.
|
---|
2181 | The filenames are:
|
---|
2182 | .TP "\w'\fB/dev/stdout\fR'u+1n"
|
---|
2183 | .B /dev/pid
|
---|
2184 | Reading this file returns the process ID of the current process,
|
---|
2185 | in decimal, terminated with a newline.
|
---|
2186 | .TP
|
---|
2187 | .B /dev/ppid
|
---|
2188 | Reading this file returns the parent process ID of the current process,
|
---|
2189 | in decimal, terminated with a newline.
|
---|
2190 | .TP
|
---|
2191 | .B /dev/pgrpid
|
---|
2192 | Reading this file returns the process group ID of the current process,
|
---|
2193 | in decimal, terminated with a newline.
|
---|
2194 | .TP
|
---|
2195 | .B /dev/user
|
---|
2196 | Reading this file returns a single record terminated with a newline.
|
---|
2197 | The fields are separated with spaces.
|
---|
2198 | .B $1
|
---|
2199 | is the value of the
|
---|
2200 | .IR getuid (2)
|
---|
2201 | system call,
|
---|
2202 | .B $2
|
---|
2203 | is the value of the
|
---|
2204 | .IR geteuid (2)
|
---|
2205 | system call,
|
---|
2206 | .B $3
|
---|
2207 | is the value of the
|
---|
2208 | .IR getgid (2)
|
---|
2209 | system call, and
|
---|
2210 | .B $4
|
---|
2211 | is the value of the
|
---|
2212 | .IR getegid (2)
|
---|
2213 | system call.
|
---|
2214 | If there are any additional fields, they are the group IDs returned by
|
---|
2215 | .IR getgroups (2).
|
---|
2216 | Multiple groups may not be supported on all systems.
|
---|
2217 | .SS Numeric Functions
|
---|
2218 | .PP
|
---|
2219 | \*(AK has the following built-in arithmetic functions:
|
---|
2220 | .PP
|
---|
2221 | .TP "\w'\fBsrand(\fR[\fIexpr\^\fR]\fB)\fR'u+1n"
|
---|
2222 | .BI atan2( y , " x" )
|
---|
2223 | Returns the arctangent of
|
---|
2224 | .I y/x
|
---|
2225 | in radians.
|
---|
2226 | .TP
|
---|
2227 | .BI cos( expr )
|
---|
2228 | Returns the cosine of
|
---|
2229 | .IR expr ,
|
---|
2230 | which is in radians.
|
---|
2231 | .TP
|
---|
2232 | .BI exp( expr )
|
---|
2233 | The exponential function.
|
---|
2234 | .TP
|
---|
2235 | .BI int( expr )
|
---|
2236 | Truncates to integer.
|
---|
2237 | .TP
|
---|
2238 | .BI log( expr )
|
---|
2239 | The natural logarithm function.
|
---|
2240 | .TP
|
---|
2241 | .B rand()
|
---|
2242 | Returns a random number
|
---|
2243 | .IR N ,
|
---|
2244 | between 0 and 1,
|
---|
2245 | such that 0 \(<= \fIN\fP < 1.
|
---|
2246 | .TP
|
---|
2247 | .BI sin( expr )
|
---|
2248 | Returns the sine of
|
---|
2249 | .IR expr ,
|
---|
2250 | which is in radians.
|
---|
2251 | .TP
|
---|
2252 | .BI sqrt( expr )
|
---|
2253 | The square root function.
|
---|
2254 | .TP
|
---|
2255 | \&\fBsrand(\fR[\fIexpr\^\fR]\fB)\fR
|
---|
2256 | Uses
|
---|
2257 | .I expr
|
---|
2258 | as a new seed for the random number generator. If no
|
---|
2259 | .I expr
|
---|
2260 | is provided, the time of day is used.
|
---|
2261 | The return value is the previous seed for the random
|
---|
2262 | number generator.
|
---|
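The arithmetic functions above can be tried with a sketch like the following. The function name numeric_demo and the sample values are assumptions for the example; only POSIX functions are used, so the command is plain awk.

```shell
# numeric_demo: a few of the built-in arithmetic functions.
numeric_demo() {
    awk 'BEGIN {
        printf "%.4f\n", atan2(0, -1)     # pi, to four decimals
        print int(3.7), int(-3.7)         # truncation is toward zero
        printf "%.0f\n", exp(log(8))      # exp() undoes log()
        print sqrt(16)
        srand(42)
        print srand(7)                    # returns the previous seed
        r = rand()
        print (r >= 0 && r < 1)           # 1 when r is in range
    }'
}

numeric_demo
```

Calling srand() with a fixed seed, as here, makes the sequence from rand() repeatable across runs of the same awk implementation.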
2263 | .SS String Functions
|
---|
2264 | .PP
|
---|
2265 | .I Gawk
|
---|
2266 | has the following built-in string functions:
|
---|
2267 | .PP
|
---|
2268 | .TP "\w'\fBsprintf(\^\fIfmt\fB\^, \fIexpr-list\^\fB)\fR'u+1n"
|
---|
2269 | \fBasort(\fIs \fR[\fB, \fId\fR]\fB)\fR
|
---|
2270 | Returns the number of elements in the source
|
---|
2271 | array
|
---|
2272 | .IR s .
|
---|
2273 | The contents of
|
---|
2274 | .I s
|
---|
2275 | are sorted using
|
---|
2276 | .IR gawk\^ "'s"
|
---|
2277 | normal rules for
|
---|
2278 | comparing values, and the indexes of the
|
---|
2279 | sorted values of
|
---|
2280 | .I s
|
---|
2281 | are replaced with sequential
|
---|
2282 | integers starting with 1. If the optional
|
---|
2283 | destination array
|
---|
2284 | .I d
|
---|
2285 | is specified, then
|
---|
2286 | .I s
|
---|
2287 | is first duplicated into
|
---|
2288 | .IR d ,
|
---|
2289 | and then
|
---|
2290 | .I d
|
---|
2291 | is sorted, leaving the indexes of the
|
---|
2292 | source array
|
---|
2293 | .I s
|
---|
2294 | unchanged.
|
---|
2295 | .TP "\w'\fBsprintf(\^\fIfmt\fB\^, \fIexpr-list\^\fB)\fR'u+1n"
|
---|
2296 | \fBasorti(\fIs \fR[\fB, \fId\fR]\fB)\fR
|
---|
2297 | Returns the number of elements in the source
|
---|
2298 | array
|
---|
2299 | .IR s .
|
---|
2300 | The behavior is the same as that of
|
---|
2301 | .BR asort() ,
|
---|
2302 | except that the array
|
---|
2303 | .I indices
|
---|
2304 | are used for sorting, not the array values.
|
---|
2305 | When done, the array is indexed numerically, and
|
---|
2306 | the values are those of the original indices.
|
---|
2307 | The original values are lost; thus provide
|
---|
2308 | a second array if you wish to preserve the original.
|
---|
2309 | .TP
|
---|
2310 | \fBgensub(\fIr\fB, \fIs\fB, \fIh \fR[\fB, \fIt\fR]\fB)\fR
|
---|
2311 | Search the target string
|
---|
2312 | .I t
|
---|
2313 | for matches of the regular expression
|
---|
2314 | .IR r .
|
---|
2315 | If
|
---|
2316 | .I h
|
---|
2317 | is a string beginning with
|
---|
2318 | .B g
|
---|
2319 | or
|
---|
2320 | .BR G ,
|
---|
2321 | then replace all matches of
|
---|
2322 | .I r
|
---|
2323 | with
|
---|
2324 | .IR s .
|
---|
2325 | Otherwise,
|
---|
2326 | .I h
|
---|
2327 | is a number indicating which match of
|
---|
2328 | .I r
|
---|
2329 | to replace.
|
---|
2330 | If
|
---|
2331 | .I t
|
---|
2332 | is not supplied,
|
---|
2333 | .B $0
|
---|
2334 | is used instead.
|
---|
2335 | Within the replacement text
|
---|
2336 | .IR s ,
|
---|
2337 | the sequence
|
---|
2338 | .BI \e n\fR,
|
---|
2339 | where
|
---|
2340 | .I n
|
---|
2341 | is a digit from 1 to 9, may be used to indicate just the text that
|
---|
2342 | matched the
|
---|
2343 | .IR n 'th
|
---|
2344 | parenthesized subexpression. The sequence
|
---|
2345 | .B \e0
|
---|
2346 | represents the entire matched text, as does the character
|
---|
2347 | .BR & .
|
---|
2348 | Unlike
|
---|
2349 | .B sub()
|
---|
2350 | and
|
---|
2351 | .BR gsub() ,
|
---|
2352 | the modified string is returned as the result of the function,
|
---|
2353 | and the original target string is
|
---|
2354 | .I not
|
---|
2355 | changed.
|
---|
2356 | .TP "\w'\fBsprintf(\^\fIfmt\fB\^, \fIexpr-list\^\fB)\fR'u+1n"
|
---|
2357 | \fBgsub(\fIr\fB, \fIs \fR[\fB, \fIt\fR]\fB)\fR
|
---|
2358 | For each substring matching the regular expression
|
---|
2359 | .I r
|
---|
2360 | in the string
|
---|
2361 | .IR t ,
|
---|
2362 | substitute the string
|
---|
2363 | .IR s ,
|
---|
2364 | and return the number of substitutions.
|
---|
2365 | If
|
---|
2366 | .I t
|
---|
2367 | is not supplied, use
|
---|
2368 | .BR $0 .
|
---|
2369 | An
|
---|
2370 | .B &
|
---|
2371 | in the replacement text is replaced with the text that was actually matched.
|
---|
2372 | Use
|
---|
2373 | .B \e&
|
---|
2374 | to get a literal
|
---|
2375 | .BR & .
|
---|
2376 | (This must be typed as \fB"\e\e&"\fP;
|
---|
2377 | see \*(EP
|
---|
2378 | for a fuller discussion of the rules for
|
---|
2379 | .BR &'s
|
---|
2380 | and backslashes in the replacement text of
|
---|
2381 | .BR sub() ,
|
---|
2382 | .BR gsub() ,
|
---|
2383 | and
|
---|
2384 | .BR gensub() .)
|
---|
2385 | .TP
|
---|
2386 | .BI index( s , " t" )
|
---|
2387 | Returns the index of the string
|
---|
2388 | .I t
|
---|
2389 | in the string
|
---|
2390 | .IR s ,
|
---|
2391 | or 0 if
|
---|
2392 | .I t
|
---|
2393 | is not present.
|
---|
2394 | (This implies that character indices start at one.)
|
---|
2395 | .TP
|
---|
2396 | \fBlength(\fR[\fIs\fR]\fB)\fR
|
---|
2397 | Returns the length of the string
|
---|
2398 | .IR s ,
|
---|
2399 | or the length of
|
---|
2400 | .B $0
|
---|
2401 | if
|
---|
2402 | .I s
|
---|
2403 | is not supplied.
|
---|
2404 | Starting with version 3.1.5,
|
---|
2405 | as a non-standard extension, with an array argument,
|
---|
2406 | .B length()
|
---|
2407 | returns the number of elements in the array.
|
---|
2408 | .TP
|
---|
2409 | \fBmatch(\fIs\fB, \fIr \fR[\fB, \fIa\fR]\fB)\fR
|
---|
2410 | Returns the position in
|
---|
2411 | .I s
|
---|
2412 | where the regular expression
|
---|
2413 | .I r
|
---|
2414 | occurs, or 0 if
|
---|
2415 | .I r
|
---|
2416 | is not present, and sets the values of
|
---|
2417 | .B RSTART
|
---|
2418 | and
|
---|
2419 | .BR RLENGTH .
|
---|
2420 | Note that the argument order is the same as for the
|
---|
2421 | .B ~
|
---|
2422 | operator:
|
---|
2423 | .IB str " ~"
|
---|
2424 | .IR re .
|
---|
2425 | .ft R
|
---|
2426 | If array
|
---|
2427 | .I a
|
---|
2428 | is provided,
|
---|
2429 | .I a
|
---|
2430 | is cleared and then elements 1 through
|
---|
2431 | .I n
|
---|
2432 | are filled with the portions of
|
---|
2433 | .I s
|
---|
2434 | that match the corresponding parenthesized
|
---|
2435 | subexpression in
|
---|
2436 | .IR r .
|
---|
2437 | The 0'th element of
|
---|
2438 | .I a
|
---|
2439 | contains the portion
|
---|
2440 | of
|
---|
2441 | .I s
|
---|
2442 | matched by the entire regular expression
|
---|
2443 | .IR r .
|
---|
2444 | Subscripts
|
---|
2445 | \fBa[\fIn\^\fB, "start"]\fR,
|
---|
2446 | and
|
---|
2447 | \fBa[\fIn\^\fB, "length"]\fR
|
---|
2448 | provide the starting index in the string and length
|
---|
2449 | respectively, of each matching substring.
|
---|
2450 | .TP
|
---|
2451 | \fBsplit(\fIs\fB, \fIa \fR[\fB, \fIr\fR]\fB)\fR
|
---|
2452 | Splits the string
|
---|
2453 | .I s
|
---|
2454 | into the array
|
---|
2455 | .I a
|
---|
2456 | on the regular expression
|
---|
2457 | .IR r ,
|
---|
2458 | and returns the number of fields. If
|
---|
2459 | .I r
|
---|
2460 | is omitted,
|
---|
2461 | .B FS
|
---|
2462 | is used instead.
|
---|
2463 | The array
|
---|
2464 | .I a
|
---|
2465 | is cleared first.
|
---|
2466 | Splitting behaves identically to field splitting, described above.
|
---|
2467 | .TP
|
---|
2468 | .BI sprintf( fmt , " expr-list" )
|
---|
2469 | Prints
|
---|
2470 | .I expr-list
|
---|
2471 | according to
|
---|
2472 | .IR fmt ,
|
---|
2473 | and returns the resulting string.
|
---|
2474 | .TP
|
---|
2475 | .BI strtonum( str )
|
---|
2476 | Examines
|
---|
2477 | .IR str ,
|
---|
2478 | and returns its numeric value.
|
---|
2479 | If
|
---|
2480 | .I str
|
---|
2481 | begins
|
---|
2482 | with a leading
|
---|
2483 | .BR 0 ,
|
---|
2484 | .B strtonum()
|
---|
2485 | assumes that
|
---|
2486 | .I str
|
---|
2487 | is an octal number.
|
---|
2488 | If
|
---|
2489 | .I str
|
---|
2490 | begins
|
---|
2491 | with a leading
|
---|
2492 | .B 0x
|
---|
2493 | or
|
---|
2494 | .BR 0X ,
|
---|
2495 | .B strtonum()
|
---|
2496 | assumes that
|
---|
2497 | .I str
|
---|
2498 | is a hexadecimal number.
|
---|
2499 | .TP
|
---|
2500 | \fBsub(\fIr\fB, \fIs \fR[\fB, \fIt\fR]\fB)\fR
|
---|
2501 | Just like
|
---|
2502 | .BR gsub() ,
|
---|
2503 | but only the first matching substring is replaced.
|
---|
2504 | .TP
|
---|
2505 | \fBsubstr(\fIs\fB, \fIi \fR[\fB, \fIn\fR]\fB)\fR
|
---|
2506 | Returns the at most
|
---|
2507 | .IR n -character
|
---|
2508 | substring of
|
---|
2509 | .I s
|
---|
2510 | starting at
|
---|
2511 | .IR i .
|
---|
2512 | If
|
---|
2513 | .I n
|
---|
2514 | is omitted, the rest of
|
---|
2515 | .I s
|
---|
2516 | is used.
|
---|
2517 | .TP
|
---|
2518 | .BI tolower( str )
|
---|
2519 | Returns a copy of the string
|
---|
2520 | .IR str ,
|
---|
2521 | with all the upper-case characters in
|
---|
2522 | .I str
|
---|
2523 | translated to their corresponding lower-case counterparts.
|
---|
2524 | Non-alphabetic characters are left unchanged.
|
---|
2525 | .TP
|
---|
2526 | .BI toupper( str )
|
---|
2527 | Returns a copy of the string
|
---|
2528 | .IR str ,
|
---|
2529 | with all the lower-case characters in
|
---|
2530 | .I str
|
---|
2531 | translated to their corresponding upper-case counterparts.
|
---|
2532 | Non-alphabetic characters are left unchanged.
|
---|
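A brief sketch of several of the string functions above. The function name string_demo and the sample text are made up for the example. It is restricted to the functions that POSIX awk also provides, so it does not cover gawk extensions such as gensub(), asort() or strtonum().

```shell
# string_demo: index, substr, length, match, split, gsub, sub, case mapping.
string_demo() {
    awk 'BEGIN {
        s = "The quick brown fox"
        print index(s, "quick")
        print substr(s, 5, 5)
        print length(s)
        if (match(s, /b[a-z]+/))
            print RSTART, RLENGTH
        n = split("a:b:c", parts, ":")
        print n, parts[1], parts[3]
        t = s
        count = gsub(/o/, "<&>", t)    # & stands for the matched text
        print count, t
        sub(/quick/, "\\&", t)         # "\\&" yields a literal &
        print t
        print toupper(substr(s, 1, 3)) tolower(" FOX")
    }'
}

string_demo
```

Note that sub() and gsub() modify their target in place and return a count, so the original string is copied into t first.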
2533 | .SS Time Functions
|
---|
2534 | Since one of the primary uses of \*(AK programs is processing log files
|
---|
2535 | that contain time stamp information,
|
---|
2536 | .I gawk
|
---|
2537 | provides the following functions for obtaining time stamps and
|
---|
2538 | formatting them.
|
---|
2539 | .PP
|
---|
2540 | .TP "\w'\fBsystime()\fR'u+1n"
|
---|
2541 | \fBmktime(\fIdatespec\fB)\fR
|
---|
2542 | Turns
|
---|
2543 | .I datespec
|
---|
2544 | into a time stamp of the same form as returned by
|
---|
2545 | .BR systime() .
|
---|
2546 | The
|
---|
2547 | .I datespec
|
---|
2548 | is a string of the form
|
---|
2549 | .IR "YYYY MM DD HH MM SS[ DST]" .
|
---|
2550 | The contents of the string are six or seven numbers representing respectively
|
---|
2551 | the full year including century,
|
---|
2552 | the month from 1 to 12,
|
---|
2553 | the day of the month from 1 to 31,
|
---|
2554 | the hour of the day from 0 to 23,
|
---|
2555 | the minute from 0 to 59,
|
---|
2556 | and the second from 0 to 60,
|
---|
2557 | and an optional daylight saving flag.
|
---|
2558 | The values of these numbers need not be within the ranges specified;
|
---|
2559 | for example, an hour of \-1 means 1 hour before midnight.
|
---|
2560 | The origin-zero Gregorian calendar is assumed,
|
---|
2561 | with year 0 preceding year 1 and year \-1 preceding year 0.
|
---|
2562 | The time is assumed to be in the local timezone.
|
---|
2563 | If the daylight saving flag is positive,
|
---|
2564 | the time is assumed to be daylight saving time;
|
---|
2565 | if zero, the time is assumed to be standard time;
|
---|
2566 | and if negative (the default),
|
---|
2567 | .B mktime()
|
---|
2568 | attempts to determine whether daylight saving time is in effect
|
---|
2569 | for the specified time.
|
---|
2570 | If
|
---|
2571 | .I datespec
|
---|
2572 | does not contain enough elements or if the resulting time
|
---|
2573 | is out of range,
|
---|
2574 | .B mktime()
|
---|
2575 | returns \-1.
|
---|
2576 | .TP
|
---|
2577 | \fBstrftime(\fR[\fIformat \fR[\fB, \fItimestamp\fR]]\fB)\fR
|
---|
2578 | Formats
|
---|
2579 | .I timestamp
|
---|
2580 | according to the specification in
|
---|
2581 | .IR format .
|
---|
2582 | The
|
---|
2583 | .I timestamp
|
---|
2584 | should be of the same form as returned by
|
---|
2585 | .BR systime() .
|
---|
2586 | If
|
---|
2587 | .I timestamp
|
---|
2588 | is missing, the current time of day is used.
|
---|
2589 | If
|
---|
2590 | .I format
|
---|
2591 | is missing, a default format equivalent to the output of
|
---|
2592 | .IR date (1)
|
---|
2593 | is used.
|
---|
2594 | See the specification for the
|
---|
2595 | .B strftime()
|
---|
2596 | function in \*(AN C for the format conversions that are
|
---|
2597 | guaranteed to be available.
|
---|
2598 | A public-domain version of
|
---|
2599 | .IR strftime (3)
|
---|
2600 | and a man page for it come with
|
---|
2601 | .IR gawk ;
|
---|
2602 | if that version was used to build
|
---|
2603 | .IR gawk ,
|
---|
2604 | then all of the conversions described in that man page are available to
|
---|
2605 | .IR gawk .
|
---|
2606 | .TP
|
---|
2607 | .B systime()
|
---|
2608 | Returns the current time of day as the number of seconds since the Epoch
|
---|
2609 | (1970-01-01 00:00:00 UTC on \*(PX systems).
|
---|
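The time functions can be sketched as below. The function name time_demo and the sample dates are assumptions for the example, and the time zone is pinned to UTC through the TZ environment variable so that the results do not depend on the local zone. These functions are extensions, so the command is written as gawk.

```shell
# time_demo: mktime() and strftime(), pinned to UTC for repeatability.
time_demo() {
    TZ=UTC gawk 'BEGIN {
        fmt = "%Y-%m-%d %H:%M:%S"
        t = mktime("2005 06 26 12 00 00")
        print t
        print strftime(fmt, t)
        # An hour of -1 is normalized to 23:00 on the previous day.
        print strftime(fmt, mktime("2005 06 26 -1 00 00"))
        # Too few fields in the datespec: mktime() returns -1.
        print mktime("2005 06")
    }'
}
```

Running time_demo prints the time stamp, the two formatted dates, and \-1 for the incomplete date specification.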
2610 | .SS Bit Manipulation Functions
|
---|
2611 | Starting with version 3.1 of
|
---|
2612 | .IR gawk ,
|
---|
2613 | the following bit manipulation functions are available.
|
---|
2614 | They work by converting double-precision floating point
|
---|
2615 | values to
|
---|
2616 | .B "unsigned long"
|
---|
2617 | integers, doing the operation, and then converting the
|
---|
2618 | result back to floating point.
|
---|
2619 | The functions are:
|
---|
2620 | .TP "\w'\fBrshift(\fIval\fB, \fIcount\fB)\fR'u+2n"
|
---|
2621 | \fBand(\fIv1\fB, \fIv2\fB)\fR
|
---|
2622 | Return the bitwise AND of the values provided by
|
---|
2623 | .I v1
|
---|
2624 | and
|
---|
2625 | .IR v2 .
|
---|
2626 | .TP
|
---|
2627 | \fBcompl(\fIval\fB)\fR
|
---|
2628 | Return the bitwise complement of
|
---|
2629 | .IR val .
|
---|
2630 | .TP
|
---|
2631 | \fBlshift(\fIval\fB, \fIcount\fB)\fR
|
---|
2632 | Return the value of
|
---|
2633 | .IR val ,
|
---|
2634 | shifted left by
|
---|
2635 | .I count
|
---|
2636 | bits.
|
---|
2637 | .TP
|
---|
2638 | \fBor(\fIv1\fB, \fIv2\fB)\fR
|
---|
2639 | Return the bitwise OR of the values provided by
|
---|
2640 | .I v1
|
---|
2641 | and
|
---|
2642 | .IR v2 .
|
---|
2643 | .TP
|
---|
2644 | \fBrshift(\fIval\fB, \fIcount\fB)\fR
|
---|
2645 | Return the value of
|
---|
2646 | .IR val ,
|
---|
2647 | shifted right by
|
---|
2648 | .I count
|
---|
2649 | bits.
|
---|
2650 | .TP
|
---|
2651 | \fBxor(\fIv1\fB, \fIv2\fB)\fR
|
---|
2652 | Return the bitwise XOR of the values provided by
|
---|
2653 | .I v1
|
---|
2654 | and
|
---|
2655 | .IR v2 .
|
---|
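A small sketch of the bit manipulation functions, using values whose binary forms are easy to check by hand (12 is 1100 and 10 is 1010). The function name bit_demo is made up for the example. compl() is left out because its result depends on the integer width that gawk uses internally.

```shell
# bit_demo: bitwise AND, OR, XOR and shifts on small values.
bit_demo() {
    gawk 'BEGIN {
        print and(12, 10), or(12, 10), xor(12, 10)
        print lshift(1, 4), rshift(256, 4)
    }'
}
```

Running bit_demo prints 8 14 6 on the first line and 16 16 on the second.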
2656 | .PP
|
---|
2657 | .SS Internationalization Functions
|
---|
2658 | Starting with version 3.1 of
|
---|
2659 | .IR gawk ,
|
---|
2660 | the following functions may be used from within your AWK program for
|
---|
2661 | translating strings at run-time.
|
---|
2662 | For full details, see \*(EP.
|
---|
2663 | .TP
|
---|
2664 | \fBbindtextdomain(\fIdirectory \fR[\fB, \fIdomain\fR]\fB)\fR
|
---|
2665 | Specifies the directory where
|
---|
2666 | .I gawk
|
---|
2667 | looks for the
|
---|
2668 | .B \&.mo
|
---|
2669 | files, in case they
|
---|
2670 | will not or cannot be placed in the ``standard'' locations
|
---|
2671 | (e.g., during testing).
|
---|
2672 | It returns the directory where
|
---|
2673 | .I domain
|
---|
2674 | is ``bound.''
|
---|
2675 | .sp .5
|
---|
2676 | The default
|
---|
2677 | .I domain
|
---|
2678 | is the value of
|
---|
2679 | .BR TEXTDOMAIN .
|
---|
2680 | If
|
---|
2681 | .I directory
|
---|
2682 | is the null string (\fB""\fR), then
|
---|
2683 | .B bindtextdomain()
|
---|
2684 | returns the current binding for the
|
---|
2685 | given
|
---|
2686 | .IR domain .
|
---|
2687 | .TP
|
---|
2688 | \fBdcgettext(\fIstring \fR[\fB, \fIdomain \fR[\fB, \fIcategory\fR]]\fB)\fR
|
---|
2689 | Returns the translation of
|
---|
2690 | .I string
|
---|
2691 | in
|
---|
2692 | text domain
|
---|
2693 | .I domain
|
---|
2694 | for locale category
|
---|
2695 | .IR category .
|
---|
2696 | The default value for
|
---|
2697 | .I domain
|
---|
2698 | is the current value of
|
---|
2699 | .BR TEXTDOMAIN .
|
---|
2700 | The default value for
|
---|
2701 | .I category
|
---|
2702 | is \fB"LC_MESSAGES"\fR.
|
---|
2703 | .sp .5
|
---|
2704 | If you supply a value for
|
---|
2705 | .IR category ,
|
---|
2706 | it must be a string equal to
|
---|
2707 | one of the known locale categories described
|
---|
2708 | in \*(EP.
|
---|
2709 | You must also supply a text domain. Use
|
---|
2710 | .B TEXTDOMAIN
|
---|
2711 | if you want to use the current domain.
|
---|
2712 | .TP
|
---|
2713 | \fBdcngettext(\fIstring1 \fR, \fIstring2 \fR, \fInumber \fR[\fB, \fIdomain \fR[\fB, \fIcategory\fR]]\fB)\fR
|
---|
2714 | Returns the plural form used for
|
---|
2715 | .I number
|
---|
2716 | of the translation of
|
---|
2717 | .I string1
|
---|
2718 | and
|
---|
2719 | .I string2
|
---|
2720 | in
|
---|
2721 | text domain
|
---|
2722 | .I domain
|
---|
2723 | for locale category
|
---|
2724 | .IR category .
|
---|
2725 | The default value for
|
---|
2726 | .I domain
|
---|
2727 | is the current value of
|
---|
2728 | .BR TEXTDOMAIN .
|
---|
2729 | The default value for
|
---|
2730 | .I category
|
---|
2731 | is \fB"LC_MESSAGES"\fR.
|
---|
2732 | .sp .5
|
---|
2733 | If you supply a value for
|
---|
2734 | .IR category ,
|
---|
2735 | it must be a string equal to
|
---|
2736 | one of the known locale categories described
|
---|
2737 | in \*(EP.
|
---|
2738 | You must also supply a text domain. Use
|
---|
2739 | .B TEXTDOMAIN
|
---|
2740 | if you want to use the current domain.
|
---|
2741 | .SH USER-DEFINED FUNCTIONS
|
---|
2742 | Functions in \*(AK are defined as follows:
|
---|
2743 | .PP
|
---|
2744 | .RS
|
---|
2745 | \fBfunction \fIname\fB(\fIparameter list\fB) { \fIstatements \fB}\fR
|
---|
2746 | .RE
|
---|
2747 | .PP
|
---|
2748 | Functions are executed when they are called from within expressions
|
---|
2749 | in either patterns or actions. Actual parameters supplied in the function
|
---|
2750 | call are used to instantiate the formal parameters declared in the function.
|
---|
2751 | Arrays are passed by reference; other variables are passed by value.
|
---|
2752 | .PP
|
---|
2753 | Since functions were not originally part of the \*(AK language, the provision
|
---|
2754 | for local variables is rather clumsy: They are declared as extra parameters
|
---|
2755 | in the parameter list. The convention is to separate local variables from
|
---|
2756 | real parameters by extra spaces in the parameter list. For example:
|
---|
2757 | .PP
|
---|
2758 | .RS
|
---|
2759 | .ft B
|
---|
2760 | .nf
|
---|
2761 | function f(p, q,     a, b)   # a and b are local
|
---|
2762 | {
|
---|
2763 | \&.\|.\|.
|
---|
2764 | }
|
---|
2765 |
|
---|
2766 | /abc/ { .\|.\|. ; f(1, 2) ; .\|.\|. }
|
---|
2767 | .fi
|
---|
2768 | .ft R
|
---|
2769 | .RE
|
---|
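The calling conventions described above can be sketched as follows. This is only an illustration: the function name bump and all variable names are hypothetical, and awk is assumed to be gawk or another POSIX-conforming awk.

```shell
out=$(awk '
# Hypothetical helper: n is passed by value, arr by reference,
# and i is a local variable declared as an extra parameter.
function bump(n, arr,    i) {
    n++                      # changes only the local copy of n
    for (i = 1; i <= 2; i++)
        arr[i] = i * 10      # visible to the caller after the call
    return n
}
BEGIN {
    x = 1
    y = bump(x, a)           # x is unchanged; a is filled in
    print x, y, a[1], a[2]
}')
printf '%s\n' "$out"
```

The scalar x keeps its value in the caller, while the array a is modified in place.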
2770 | .PP
|
---|
2771 | The left parenthesis in a function call is required
|
---|
2772 | to immediately follow the function name,
|
---|
2773 | without any intervening white space.
|
---|
2774 | This is to avoid a syntactic ambiguity with the concatenation operator.
|
---|
2775 | This restriction does not apply to the built-in functions listed above.
|
---|
2776 | .PP
|
---|
2777 | Functions may call each other and may be recursive.
|
---|
2778 | Function parameters used as local variables are initialized
|
---|
2779 | to the null string and the number zero upon function invocation.
|
---|
2780 | .PP
|
---|
2781 | Use
|
---|
2782 | .BI return " expr"
|
---|
2783 | to return a value from a function. The return value is undefined if no
|
---|
2784 | value is provided, or if the function returns by \*(lqfalling off\*(rq the
|
---|
2785 | end.
|
---|
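A small sketch of a recursive function that always returns a value explicitly. The function name fact is hypothetical, and awk is assumed to be gawk or another POSIX-conforming awk.

```shell
out=$(awk '
# Hypothetical recursive function; every path ends in an explicit return,
# so the result is never undefined.
function fact(n) {
    if (n <= 1)
        return 1
    return n * fact(n - 1)
}
BEGIN { print fact(5) }')
printf '%s\n' "$out"
```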
2786 | .PP
|
---|
2787 | If
|
---|
2788 | .B \-\^\-lint
|
---|
2789 | has been provided,
|
---|
2790 | .I gawk
|
---|
2791 | warns about calls to undefined functions at parse time,
|
---|
2792 | instead of at run time.
|
---|
2793 | Calling an undefined function at run time is a fatal error.
|
---|
2794 | .PP
|
---|
2795 | The word
|
---|
2796 | .B func
|
---|
2797 | may be used in place of
|
---|
2798 | .BR function .
|
---|
2799 | .SH DYNAMICALLY LOADING NEW FUNCTIONS
|
---|
2800 | Beginning with version 3.1 of
|
---|
2801 | .IR gawk ,
|
---|
2802 | you can dynamically add new built-in functions to the running
|
---|
2803 | .I gawk
|
---|
2804 | interpreter.
|
---|
2805 | The full details are beyond the scope of this manual page;
|
---|
2806 | see \*(EP.
|
---|
2807 | .PP
|
---|
2808 | .TP 8
|
---|
2809 | \fBextension(\fIobject\fB, \fIfunction\fB)\fR
|
---|
2810 | Dynamically link the shared object file named by
|
---|
2811 | .IR object ,
|
---|
2812 | and invoke
|
---|
2813 | .I function
|
---|
2814 | in that object, to perform initialization.
|
---|
2815 | These should both be provided as strings.
|
---|
2816 | Returns the value returned by
|
---|
2817 | .IR function .
|
---|
2818 | .PP
|
---|
2819 | .ft B
|
---|
2820 | This function is provided and documented in \*(EP,
|
---|
2821 | but everything about this feature is likely to change
|
---|
2822 | in the next release.
|
---|
2823 | We STRONGLY recommend that you do not use this feature
|
---|
2824 | for anything that you aren't willing to redo.
|
---|
2825 | .ft R
|
---|
2826 | .SH SIGNALS
|
---|
2827 | .I pgawk
|
---|
2828 | accepts two signals.
|
---|
2829 | .B SIGUSR1
|
---|
2830 | causes it to dump a profile and function call stack to the
|
---|
2831 | profile file, which is either
|
---|
2832 | .BR awkprof.out ,
|
---|
2833 | or whatever file was named with the
|
---|
2834 | .B \-\^\-profile
|
---|
2835 | option. It then continues to run.
|
---|
2836 | .B SIGHUP
|
---|
2837 | causes it to dump the profile and function call stack and then exit.
|
---|
2838 | .SH EXAMPLES
|
---|
2839 | .nf
|
---|
2840 | Print and sort the login names of all users:
|
---|
2841 |
|
---|
2842 | .ft B
|
---|
2843 | BEGIN { FS = ":" }
|
---|
2844 | { print $1 | "sort" }
|
---|
2845 |
|
---|
2846 | .ft R
|
---|
2847 | Count lines in a file:
|
---|
2848 |
|
---|
2849 | .ft B
|
---|
2850 | { nlines++ }
|
---|
2851 | END { print nlines }
|
---|
2852 |
|
---|
2853 | .ft R
|
---|
2854 | Precede each line by its number in the file:
|
---|
2855 |
|
---|
2856 | .ft B
|
---|
2857 | { print FNR, $0 }
|
---|
2858 |
|
---|
2859 | .ft R
|
---|
2860 | Concatenate and line number (a variation on a theme):
|
---|
2861 |
|
---|
2862 | .ft B
|
---|
2863 | { print NR, $0 }
|
---|
2864 | .ft R
|
---|
2865 | Run an external command for particular lines of data:
|
---|
2866 |
|
---|
2867 | .ft B
|
---|
2868 | tail -f access_log |
|
---|
2869 | awk '/myhome.html/ { system("nmap " $1 ">> logdir/myhome.html") }'
|
---|
2870 | .ft R
|
---|
2871 | .fi
|
---|
2872 | .SH INTERNATIONALIZATION
|
---|
2873 | .PP
|
---|
2874 | String constants are sequences of characters enclosed in double
|
---|
2875 | quotes. In non-English speaking environments, it is possible to mark
|
---|
2876 | strings in the \*(AK program as requiring translation to the native
|
---|
2877 | natural language. Such strings are marked in the \*(AK program with
|
---|
2878 | a leading underscore (\*(lq_\*(rq). For example,
|
---|
2879 | .sp
|
---|
2880 | .RS
|
---|
2881 | .ft B
|
---|
2882 | gawk 'BEGIN { print "hello, world" }'
|
---|
2883 | .RE
|
---|
2884 | .sp
|
---|
2885 | .ft R
|
---|
2886 | always prints
|
---|
2887 | .BR "hello, world" .
|
---|
2888 | But,
|
---|
2889 | .sp
|
---|
2890 | .RS
|
---|
2891 | .ft B
|
---|
2892 | gawk 'BEGIN { print _"hello, world" }'
|
---|
2893 | .RE
|
---|
2894 | .sp
|
---|
2895 | .ft R
|
---|
2896 | might print
|
---|
2897 | .B "bonjour, monde"
|
---|
2898 | in France.
|
---|
2899 | .PP
|
---|
2900 | There are several steps involved in producing and running a localizable
|
---|
2901 | \*(AK program.
|
---|
2902 | .TP "\w'4.'u+2n"
|
---|
2903 | 1.
|
---|
2904 | Add a
|
---|
2905 | .B BEGIN
|
---|
2906 | action to assign a value to the
|
---|
2907 | .B TEXTDOMAIN
|
---|
2908 | variable to set the text domain to a name associated with your program.
|
---|
2909 | .sp
|
---|
2910 | .ti +5n
|
---|
2911 | .ft B
|
---|
2912 | BEGIN { TEXTDOMAIN = "myprog" }
|
---|
2913 | .ft R
|
---|
2914 | .sp
|
---|
2915 | This allows
|
---|
2916 | .I gawk
|
---|
2917 | to find the
|
---|
2918 | .B \&.mo
|
---|
2919 | file associated with your program.
|
---|
2920 | Without this step,
|
---|
2921 | .I gawk
|
---|
2922 | uses the
|
---|
2923 | .B messages
|
---|
2924 | text domain,
|
---|
2925 | which likely does not contain translations for your program.
|
---|
2926 | .TP
|
---|
2927 | 2.
|
---|
2928 | Mark all strings that should be translated with leading underscores.
|
---|
2929 | .TP
|
---|
2930 | 3.
|
---|
2931 | If necessary, use the
|
---|
2932 | .B dcgettext()
|
---|
2933 | and/or
|
---|
2934 | .B bindtextdomain()
|
---|
2935 | functions in your program, as appropriate.
|
---|
2936 | .TP
|
---|
2937 | 4.
|
---|
2938 | Run
|
---|
2939 | .B "gawk \-\^\-gen\-po \-f myprog.awk > myprog.po"
|
---|
2940 | to generate a
|
---|
2941 | .B \&.po
|
---|
2942 | file for your program.
|
---|
2943 | .TP
|
---|
2944 | 5.
|
---|
2945 | Provide appropriate translations, and build and install a corresponding
|
---|
2946 | .B \&.mo
|
---|
2947 | file.
|
---|
2948 | .PP
|
---|
2949 | The internationalization features are described in full detail in \*(EP.
|
---|
2950 | .SH POSIX COMPATIBILITY
|
---|
2951 | A primary goal for
|
---|
2952 | .I gawk
|
---|
2953 | is compatibility with the \*(PX standard, as well as with the
|
---|
2954 | latest version of \*(UX
|
---|
2955 | .IR awk .
|
---|
2956 | To this end,
|
---|
2957 | .I gawk
|
---|
2958 | incorporates the following user visible
|
---|
2959 | features which are not described in the \*(AK book,
|
---|
2960 | but are part of the Bell Laboratories version of
|
---|
2961 | .IR awk ,
|
---|
2962 | and are in the \*(PX standard.
|
---|
2963 | .PP
|
---|
2964 | The book indicates that command line variable assignment happens when
|
---|
2965 | .I awk
|
---|
2966 | would otherwise open the argument as a file, which is after the
|
---|
2967 | .B BEGIN
|
---|
2968 | block is executed. However, in earlier implementations, when such an
|
---|
2969 | assignment appeared before any file names, the assignment would happen
|
---|
2970 | .I before
|
---|
2971 | the
|
---|
2972 | .B BEGIN
|
---|
2973 | block was run. Applications came to depend on this \*(lqfeature.\*(rq
|
---|
2974 | When
|
---|
2975 | .I awk
|
---|
2976 | was changed to match its documentation, the
|
---|
2977 | .B \-v
|
---|
2978 | option for assigning variables before program execution was added to
|
---|
2979 | accommodate applications that depended upon the old behavior.
|
---|
2980 | (This feature was agreed upon by both the Bell Laboratories and the \*(GN developers.)
|
---|
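The difference in timing can be seen with a short sketch. The variable names early and late are hypothetical, and awk is assumed to be gawk or another POSIX-conforming awk.

```shell
# early is assigned with -v, so it is already set inside BEGIN.
# late is a command line assignment operand, so it is set only
# after BEGIN has run, when awk would otherwise open it as a file.
out=$(printf 'one record\n' |
    awk -v early=1 '
        BEGIN { printf "BEGIN: early=%s late=%s\n", early, late }
              { printf "main: early=%s late=%s\n", early, late }
    ' late=2)
printf '%s\n' "$out"
```

In the BEGIN block late is still empty; it has the value 2 by the time the first record is processed.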
2981 | .PP
|
---|
2982 | The
|
---|
2983 | .B \-W
|
---|
2984 | option for implementation specific features is from the \*(PX standard.
|
---|
2985 | .PP
|
---|
2986 | When processing arguments,
|
---|
2987 | .I gawk
|
---|
2988 | uses the special option \*(lq\-\^\-\*(rq to signal the end of
|
---|
2989 | arguments.
|
---|
2990 | In compatibility mode, it warns about but otherwise ignores
|
---|
2991 | undefined options.
|
---|
2992 | In normal operation, such arguments are passed on to the \*(AK program for
|
---|
2993 | it to process.
|
---|
2994 | .PP
|
---|
2995 | The \*(AK book does not define the return value of
|
---|
2996 | .BR srand() .
|
---|
2997 | The \*(PX standard
|
---|
2998 | has it return the seed it was using, to allow keeping track
|
---|
2999 | of random number sequences. Therefore
|
---|
3000 | .B srand()
|
---|
3001 | in
|
---|
3002 | .I gawk
|
---|
3003 | also returns its current seed.
|
---|
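As a brief sketch of this behavior (the seed values are arbitrary, and awk is assumed to be gawk or another POSIX-conforming awk), the value returned is the seed that was in use when srand() was called:

```shell
out=$(awk 'BEGIN {
    srand(42)          # seed the generator with 42
    prev = srand(7)    # reseed; the return value is the seed being replaced
    print prev
}')
printf '%s\n' "$out"
```

Saving the returned seed makes it possible to reproduce a random number sequence later.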
3004 | .PP
|
---|
3005 | Other new features are:
|
---|
3006 | The use of multiple
|
---|
3007 | .B \-f
|
---|
3008 | options (from MKS
|
---|
3009 | .IR awk );
|
---|
3010 | the
|
---|
3011 | .B ENVIRON
|
---|
3012 | array; the
|
---|
3013 | .B \ea
|
---|
3014 | and
|
---|
3015 | .BR \ev
|
---|
3016 | escape sequences (done originally in
|
---|
3017 | .I gawk
|
---|
3018 | and fed back into the Bell Laboratories version); the
|
---|
3019 | .B tolower()
|
---|
3020 | and
|
---|
3021 | .B toupper()
|
---|
3022 | built-in functions (from the Bell Laboratories version); and the \*(AN C conversion specifications in
|
---|
3023 | .B printf
|
---|
3024 | (done first in the Bell Laboratories version).
|
---|
3025 | .SH HISTORICAL FEATURES
|
---|
3026 | There are two features of historical \*(AK implementations that
|
---|
3027 | .I gawk
|
---|
3028 | supports.
|
---|
3029 | First, it is possible to call the
|
---|
3030 | .B length()
|
---|
3031 | built-in function not only with no argument, but even without parentheses!
|
---|
3032 | Thus,
|
---|
3033 | .RS
|
---|
3034 | .PP
|
---|
3035 | .ft B
|
---|
3036 | a = length # Holy Algol 60, Batman!
|
---|
3037 | .ft R
|
---|
3038 | .RE
|
---|
3039 | .PP
|
---|
3040 | is the same as either of
|
---|
3041 | .RS
|
---|
3042 | .PP
|
---|
3043 | .ft B
|
---|
3044 | a = length()
|
---|
3045 | .br
|
---|
3046 | a = length($0)
|
---|
3047 | .ft R
|
---|
3048 | .RE
|
---|
3049 | .PP
|
---|
3050 | This feature is marked as \*(lqdeprecated\*(rq in the \*(PX standard, and
|
---|
3051 | .I gawk
|
---|
3052 | issues a warning about its use if
|
---|
3053 | .B \-\^\-lint
|
---|
3054 | is specified on the command line.
|
---|
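A quick sketch showing that the three spellings agree (the input line is arbitrary, and awk is assumed to be gawk or another POSIX-conforming awk):

```shell
# All three forms report the length of the current record, $0.
out=$(printf 'hello\n' |
    awk '{ a = length; b = length(); c = length($0); print a, b, c }')
printf '%s\n' "$out"
```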
3055 | .PP
|
---|
3056 | The other feature is the use of either the
|
---|
3057 | .B continue
|
---|
3058 | or the
|
---|
3059 | .B break
|
---|
3060 | statements outside the body of a
|
---|
3061 | .BR while ,
|
---|
3062 | .BR for ,
|
---|
3063 | or
|
---|
3064 | .B do
|
---|
3065 | loop. Traditional \*(AK implementations have treated such usage as
|
---|
3066 | equivalent to the
|
---|
3067 | .B next
|
---|
3068 | statement.
|
---|
3069 | .I Gawk
|
---|
3070 | supports this usage if
|
---|
3071 | .B \-\^\-traditional
|
---|
3072 | has been specified.
|
---|
3073 | .SH GNU EXTENSIONS
|
---|
3074 | .I Gawk
|
---|
3075 | has a number of extensions to \*(PX
|
---|
3076 | .IR awk .
|
---|
3077 | They are described in this section. All the extensions described here
|
---|
3078 | can be disabled by
|
---|
3079 | invoking
|
---|
3080 | .I gawk
|
---|
3081 | with the
|
---|
3082 | .B \-\^\-traditional
|
---|
3083 | option.
|
---|
3084 | .PP
|
---|
3085 | The following features of
|
---|
3086 | .I gawk
|
---|
3087 | are not available in
|
---|
3088 | \*(PX
|
---|
3089 | .IR awk .
|
---|
3090 | .\" Environment vars and startup stuff
|
---|
3091 | .TP "\w'\(bu'u+1n"
|
---|
3092 | \(bu
|
---|
3093 | A path search is performed for files named via the
|
---|
3094 | .B \-f
|
---|
3095 | option, using the directories listed in the
|
---|
3096 | .B AWKPATH
|
---|
3097 | environment variable.
|
---|
3098 | .\" POSIX and language recognition issues
|
---|
3099 | .TP
|
---|
3100 | \(bu
|
---|
3101 | The
|
---|
3102 | .B \ex
|
---|
3103 | escape sequence.
|
---|
3104 | (Disabled with
|
---|
3105 | .BR \-\^\-posix .)
|
---|
3106 | .TP
|
---|
3107 | \(bu
|
---|
3108 | The
|
---|
3109 | .B fflush()
|
---|
3110 | function.
|
---|
3111 | (Disabled with
|
---|
3112 | .BR \-\^\-posix .)
|
---|
3113 | .TP
|
---|
3114 | \(bu
|
---|
3115 | The ability to continue lines after
|
---|
3116 | .B ?
|
---|
3117 | and
|
---|
3118 | .BR : .
|
---|
3119 | (Disabled with
|
---|
3120 | .BR \-\^\-posix .)
|
---|
3121 | .TP
|
---|
3122 | \(bu
|
---|
3123 | Octal and hexadecimal constants in \*(AK programs.
|
---|
3124 | .\" Special variables
|
---|
3125 | .TP
|
---|
3126 | \(bu
|
---|
3127 | The
|
---|
3128 | .BR ARGIND ,
|
---|
3129 | .BR BINMODE ,
|
---|
3130 | .BR ERRNO ,
|
---|
3131 | .BR LINT ,
|
---|
3132 | .B RT
|
---|
3133 | and
|
---|
3134 | .B TEXTDOMAIN
|
---|
3135 | special variables.
|
---|
3136 | .TP
|
---|
3137 | \(bu
|
---|
3138 | The
|
---|
3139 | .B IGNORECASE
|
---|
3140 | variable and its side-effects.
|
---|
3141 | .TP
|
---|
3142 | \(bu
|
---|
3143 | The
|
---|
3144 | .B FIELDWIDTHS
|
---|
3145 | variable and fixed-width field splitting.
|
---|
3146 | .TP
|
---|
3147 | \(bu
|
---|
3148 | The
|
---|
3149 | .B PROCINFO
|
---|
3150 | array.
|
---|
3151 | .\" I/O stuff
|
---|
3152 | .TP
|
---|
3153 | \(bu
|
---|
3154 | The use of
|
---|
3155 | .B RS
|
---|
3156 | as a regular expression.
|
---|
3157 | .TP
|
---|
3158 | \(bu
|
---|
3159 | The special file names available for I/O redirection.
|
---|
3160 | .TP
|
---|
3161 | \(bu
|
---|
3162 | The
|
---|
3163 | .B |&
|
---|
3164 | operator for creating co-processes.
|
---|
3165 | .\" Changes to standard awk functions
|
---|
3166 | .TP
|
---|
3167 | \(bu
|
---|
3168 | The ability to split out individual characters using the null string
|
---|
3169 | as the value of
|
---|
3170 | .BR FS ,
|
---|
3171 | and as the third argument to
|
---|
3172 | .BR split() .
|
---|
3173 | .TP
|
---|
3174 | \(bu
|
---|
3175 | The optional second argument to the
|
---|
3176 | .B close()
|
---|
3177 | function.
|
---|
3178 | .TP
|
---|
3179 | \(bu
|
---|
3180 | The optional third argument to the
|
---|
3181 | .B match()
|
---|
3182 | function.
|
---|
3183 | .TP
|
---|
3184 | \(bu
|
---|
3185 | The ability to use positional specifiers with
|
---|
3186 | .B printf
|
---|
3187 | and
|
---|
3188 | .BR sprintf() .
|
---|
3189 | .\" New keywords or changes to keywords
|
---|
3190 | .TP
|
---|
3191 | \(bu
|
---|
3192 | The use of
|
---|
3193 | .BI delete " array"
|
---|
3194 | to delete the entire contents of an array.
|
---|
3195 | .TP
|
---|
3196 | \(bu
|
---|
3197 | The use of
|
---|
3198 | .B "nextfile"
|
---|
3199 | to abandon processing of the current input file.
|
---|
3200 | .\" New functions
|
---|
3201 | .TP
|
---|
3202 | \(bu
|
---|
3203 | The
|
---|
3204 | .BR and() ,
|
---|
3205 | .BR asort() ,
|
---|
3206 | .BR asorti() ,
|
---|
3207 | .BR bindtextdomain() ,
|
---|
3208 | .BR compl() ,
|
---|
3209 | .BR dcgettext() ,
|
---|
3210 | .BR dcngettext() ,
|
---|
3211 | .BR gensub() ,
|
---|
3212 | .BR lshift() ,
|
---|
3213 | .BR mktime() ,
|
---|
3214 | .BR or() ,
|
---|
3215 | .BR rshift() ,
|
---|
3216 | .BR strftime() ,
|
---|
3217 | .BR strtonum() ,
|
---|
3218 | .B systime()
|
---|
3219 | and
|
---|
3220 | .B xor()
|
---|
3221 | functions.
|
---|
3222 | .\" I18N stuff
|
---|
3223 | .TP
|
---|
3224 | \(bu
|
---|
3225 | Localizable strings.
|
---|
3226 | .\" Extending gawk
|
---|
3227 | .TP
|
---|
3228 | \(bu
|
---|
3229 | Adding new built-in functions dynamically with the
|
---|
3230 | .B extension()
|
---|
3231 | function.
|
---|
3232 | .PP
|
---|
3233 | The \*(AK book does not define the return value of the
|
---|
3234 | .B close()
|
---|
3235 | function.
|
---|
3236 | .IR Gawk\^ "'s"
|
---|
3237 | .B close()
|
---|
3238 | returns the value from
|
---|
3239 | .IR fclose (3),
|
---|
3240 | or
|
---|
3241 | .IR pclose (3),
|
---|
3242 | when closing an output file or pipe, respectively.
|
---|
3243 | It returns the process's exit status when closing an input pipe.
|
---|
3244 | The return value is \-1 if the named file, pipe
|
---|
3245 | or co-process was not opened with a redirection.
|
---|
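The return values can be observed with a small sketch. The name never-opened and the cat command are arbitrary choices for illustration, and awk is assumed to be gawk or a compatible awk.

```shell
out=$(awk 'BEGIN {
    r1 = close("never-opened")   # nothing was opened under this name
    cmd = "cat >/dev/null"
    print "data" | cmd           # open an output pipe
    r2 = close(cmd)              # status of the pipe once the command ends
    print r1, r2
}')
printf '%s\n' "$out"
```

Here the first call fails with \-1 because no redirection used that name, and the second reports the successful completion of the command.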
3246 | .PP
|
---|
3247 | When
|
---|
3248 | .I gawk
|
---|
3249 | is invoked with the
|
---|
3250 | .B \-\^\-traditional
|
---|
3251 | option,
|
---|
3252 | if the
|
---|
3253 | .I fs
|
---|
3254 | argument to the
|
---|
3255 | .B \-F
|
---|
3256 | option is \*(lqt\*(rq, then
|
---|
3257 | .B FS
|
---|
3258 | is set to the tab character.
|
---|
3259 | Note that typing
|
---|
3260 | .B "gawk \-F\et \&.\|.\|."
|
---|
3261 | simply causes the shell to quote the \*(lqt\*(rq and does not pass
|
---|
3262 | \*(lq\et\*(rq to the
|
---|
3263 | .B \-F
|
---|
3264 | option.
|
---|
3265 | Since this is a rather ugly special case, it is not the default behavior.
|
---|
3266 | This behavior also does not occur if
|
---|
3267 | .B \-\^\-posix
|
---|
3268 | has been specified.
|
---|
3269 | To really get a tab character as the field separator, it is best to use
|
---|
3270 | single quotes:
|
---|
3271 | .BR "gawk \-F'\et' \&.\|.\|." .
|
---|
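A minimal sketch of the quoted form (the input data is arbitrary, and awk is assumed to be gawk or another POSIX-conforming awk):

```shell
# The single quotes keep the backslash away from the shell, so awk
# receives \t and uses a tab as the field separator.
out=$(printf 'a\tb c\n' | awk -F'\t' '{ print $2 }')
printf '%s\n' "$out"
```

The second field keeps its embedded space, which shows that only the tab separates fields.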
3272 | .ig
|
---|
3273 | .PP
|
---|
3274 | If
|
---|
3275 | .I gawk
|
---|
3276 | was compiled for debugging, it
|
---|
3277 | accepts the following additional options:
|
---|
3278 | .TP
|
---|
3279 | .PD 0
|
---|
3280 | .B \-Wparsedebug
|
---|
3281 | .TP
|
---|
3282 | .PD
|
---|
3283 | .B \-\^\-parsedebug
|
---|
3284 | Turn on
|
---|
3285 | .IR yacc (1)
|
---|
3286 | or
|
---|
3287 | .IR bison (1)
|
---|
3288 | debugging output during program parsing.
|
---|
3289 | This option should only be of interest to the
|
---|
3290 | .I gawk
|
---|
3291 | maintainers, and may not even be compiled into
|
---|
3292 | .IR gawk .
|
---|
3293 | ..
|
---|
3294 | .PP
|
---|
3295 | If
|
---|
3296 | .I gawk
|
---|
3297 | is
|
---|
3298 | .I configured
|
---|
3299 | with the
|
---|
3300 | .B \-\^\-enable\-switch
|
---|
3301 | option to the
|
---|
3302 | .I configure
|
---|
3303 | command, then it accepts an additional control-flow statement:
|
---|
3304 | .RS
|
---|
3305 | .nf
|
---|
3306 | \fBswitch (\fIexpression\fB) {
|
---|
3307 | \fBcase \fIvalue\fB|\fIregex\fB : \fIstatement
|
---|
3308 | \&.\^.\^.
|
---|
3309 | \fR[ \fBdefault: \fIstatement \fR]
|
---|
3310 | \fB}\fR
|
---|
3311 | .fi
|
---|
3312 | .RE
|
---|
3313 | .SH ENVIRONMENT VARIABLES
|
---|
3314 | The
|
---|
3315 | .B AWKPATH
|
---|
3316 | environment variable can be used to provide a list of directories that
|
---|
3317 | .I gawk
|
---|
3318 | searches when looking for files named via the
|
---|
3319 | .B \-f
|
---|
3320 | and
|
---|
3321 | .B \-\^\-file
|
---|
3322 | options.
|
---|
3323 | .PP
|
---|
3324 | If
|
---|
3325 | .B POSIXLY_CORRECT
|
---|
3326 | exists in the environment, then
|
---|
3327 | .I gawk
|
---|
3328 | behaves exactly as if
|
---|
3329 | .B \-\^\-posix
|
---|
3330 | had been specified on the command line.
|
---|
3331 | If
|
---|
3332 | .B \-\^\-lint
|
---|
3333 | has been specified,
|
---|
3334 | .I gawk
|
---|
3335 | issues a warning message to this effect.
|
---|
3336 | .SH SEE ALSO
|
---|
3337 | .IR egrep (1),
|
---|
3338 | .IR getpid (2),
|
---|
3339 | .IR getppid (2),
|
---|
3340 | .IR getpgrp (2),
|
---|
3341 | .IR getuid (2),
|
---|
3342 | .IR geteuid (2),
|
---|
3343 | .IR getgid (2),
|
---|
3344 | .IR getegid (2),
|
---|
3345 | .IR getgroups (2)
|
---|
3346 | .PP
|
---|
3347 | .IR "The AWK Programming Language" ,
|
---|
3348 | Alfred V. Aho, Brian W. Kernighan, Peter J. Weinberger,
|
---|
3349 | Addison-Wesley, 1988. ISBN 0-201-07981-X.
|
---|
3350 | .PP
|
---|
3351 | \*(EP,
|
---|
3352 | Edition 3.0, published by the Free Software Foundation, 2001.
|
---|
3353 | .SH BUGS
|
---|
3354 | The
|
---|
3355 | .B \-F
|
---|
3356 | option is not necessary given the command line variable assignment feature;
|
---|
3357 | it remains only for backwards compatibility.
|
---|
3358 | .PP
|
---|
3359 | Syntactically invalid single character programs tend to overflow
|
---|
3360 | the parse stack, generating a rather unhelpful message. Such programs
|
---|
3361 | are surprisingly difficult to diagnose in the completely general case,
|
---|
3362 | and the effort to do so really is not worth it.
|
---|
3363 | .SH AUTHORS
|
---|
3364 | The original version of \*(UX
|
---|
3365 | .I awk
|
---|
3366 | was designed and implemented by Alfred Aho,
|
---|
3367 | Peter Weinberger, and Brian Kernighan of Bell Laboratories. Brian Kernighan
|
---|
3368 | continues to maintain and enhance it.
|
---|
3369 | .PP
|
---|
3370 | Paul Rubin and Jay Fenlason,
|
---|
3371 | of the Free Software Foundation, wrote
|
---|
3372 | .IR gawk ,
|
---|
3373 | to be compatible with the original version of
|
---|
3374 | .I awk
|
---|
3375 | distributed in Seventh Edition \*(UX.
|
---|
3376 | John Woods contributed a number of bug fixes.
|
---|
3377 | David Trueman, with contributions
|
---|
3378 | from Arnold Robbins, made
|
---|
3379 | .I gawk
|
---|
3380 | compatible with the new version of \*(UX
|
---|
3381 | .IR awk .
|
---|
3382 | Arnold Robbins is the current maintainer.
|
---|
3383 | .PP
|
---|
3384 | The initial DOS port was done by Conrad Kwok and Scott Garfinkle.
|
---|
3385 | Scott Deifik is the current DOS maintainer. Pat Rankin did the
|
---|
3386 | port to VMS, and Michal Jaegermann did the port to the Atari ST.
|
---|
3387 | The port to OS/2 was done by Kai Uwe Rommel, with contributions and
|
---|
3388 | help from Darrel Hankerson. Fred Fish supplied support for the Amiga,
|
---|
3389 | Stephen Davies provided the Tandem port,
|
---|
3390 | and Martin Brown provided the BeOS port.
|
---|
3391 | .SH VERSION INFORMATION
|
---|
3392 | This man page documents
|
---|
3393 | .IR gawk ,
|
---|
3394 | version 3.1.5.
|
---|
3395 | .SH BUG REPORTS
|
---|
3396 | If you find a bug in
|
---|
3397 | .IR gawk ,
|
---|
3398 | please send electronic mail to
|
---|
3399 | .BR bug-gawk@gnu.org .
|
---|
3400 | Please include your operating system and its revision, the version of
|
---|
3401 | .I gawk
|
---|
3402 | (from
|
---|
3403 | .BR "gawk \-\^\-version" ),
|
---|
3404 | what C compiler you used to compile it, and a test program
|
---|
3405 | and data that are as small as possible for reproducing the problem.
|
---|
3406 | .PP
|
---|
3407 | Before sending a bug report, please do two things. First, verify that
|
---|
3408 | you have the latest version of
|
---|
3409 | .IR gawk .
|
---|
3410 | Many bugs (usually subtle ones) are fixed at each release, and if
|
---|
3411 | yours is out of date, the problem may already have been solved.
|
---|
3412 | Second, please read this man page and the reference manual carefully to
|
---|
3413 | be sure that what you think is a bug really is, instead of just a quirk
|
---|
3414 | in the language.
|
---|
3415 | .PP
|
---|
3416 | Whatever you do, do
|
---|
3417 | .B NOT
|
---|
3418 | post a bug report in
|
---|
3419 | .BR comp.lang.awk .
|
---|
3420 | While the
|
---|
3421 | .I gawk
|
---|
3422 | developers occasionally read this newsgroup, posting bug reports there
|
---|
3423 | is an unreliable way to report bugs. Instead, please use the electronic mail
|
---|
3424 | address given above.
|
---|
3425 | .PP
|
---|
3426 | If you're using a GNU/Linux system or BSD-based system,
|
---|
3427 | you may wish to submit a bug report to the vendor of your distribution.
|
---|
3428 | That's fine, but please send a copy to the official email address as well,
|
---|
3429 | since there's no guarantee that the bug will be forwarded to the
|
---|
3430 | .I gawk
|
---|
3431 | maintainer.
|
---|
3432 | .SH ACKNOWLEDGEMENTS
|
---|
3433 | Brian Kernighan of Bell Laboratories
|
---|
3434 | provided valuable assistance during testing and debugging.
|
---|
3435 | We thank him.
|
---|
3436 | .SH COPYING PERMISSIONS
|
---|
3437 | Copyright \(co 1989, 1991, 1992, 1993, 1994, 1995, 1996,
|
---|
3438 | 1997, 1998, 1999, 2001, 2002, 2003, 2004, 2005 Free Software Foundation, Inc.
|
---|
3439 | .PP
|
---|
3440 | Permission is granted to make and distribute verbatim copies of
|
---|
3441 | this manual page provided the copyright notice and this permission
|
---|
3442 | notice are preserved on all copies.
|
---|
3443 | .ig
|
---|
3444 | Permission is granted to process this file through troff and print the
|
---|
3445 | results, provided the printed document carries copying permission
|
---|
3446 | notice identical to this one except for the removal of this paragraph
|
---|
3447 | (this paragraph not being relevant to the printed manual page).
|
---|
3448 | ..
|
---|
3449 | .PP
|
---|
3450 | Permission is granted to copy and distribute modified versions of this
|
---|
3451 | manual page under the conditions for verbatim copying, provided that
|
---|
3452 | the entire resulting derived work is distributed under the terms of a
|
---|
3453 | permission notice identical to this one.
|
---|
3454 | .PP
|
---|
3455 | Permission is granted to copy and distribute translations of this
|
---|
3456 | manual page into another language, under the above conditions for
|
---|
3457 | modified versions, except that this permission notice may be stated in
|
---|
3458 | a translation approved by the Foundation.
|
---|