1This is wget.info, produced by makeinfo version 4.8 from ./wget.texi.
2
3INFO-DIR-SECTION Network Applications
4START-INFO-DIR-ENTRY
5* Wget: (wget). The non-interactive network downloader.
6END-INFO-DIR-ENTRY
7
8 This file documents the GNU Wget utility for downloading network
9data.
10
11 Copyright (C) 1996-2005 Free Software Foundation, Inc.
12
13 Permission is granted to make and distribute verbatim copies of this
14manual provided the copyright notice and this permission notice are
15preserved on all copies.
16
17 Permission is granted to copy, distribute and/or modify this document
18under the terms of the GNU Free Documentation License, Version 1.2 or
19any later version published by the Free Software Foundation; with the
20Invariant Sections being "GNU General Public License" and "GNU Free
21Documentation License", with no Front-Cover Texts, and with no
22Back-Cover Texts. A copy of the license is included in the section
23entitled "GNU Free Documentation License".
24
25
26File: wget.info, Node: Top, Next: Overview, Up: (dir)
27
28Wget 1.10.2
29***********
30
31This manual documents version 1.10.2 of GNU Wget, the freely available
32utility for network downloads.
33
34 Copyright (C) 1996-2005 Free Software Foundation, Inc.
35
36* Menu:
37
38* Overview:: Features of Wget.
39* Invoking:: Wget command-line arguments.
40* Recursive Download:: Downloading interlinked pages.
41* Following Links:: The available methods of chasing links.
42* Time-Stamping:: Mirroring according to time-stamps.
43* Startup File:: Wget's initialization file.
44* Examples:: Examples of usage.
45* Various:: The stuff that doesn't fit anywhere else.
46* Appendices:: Some useful references.
47* Copying:: You may give out copies of Wget and of this manual.
48* Concept Index:: Topics covered by this manual.
49
50
51File: wget.info, Node: Overview, Next: Invoking, Prev: Top, Up: Top
52
531 Overview
54**********
55
56GNU Wget is a free utility for non-interactive download of files from
57the Web. It supports HTTP, HTTPS, and FTP protocols, as well as
58retrieval through HTTP proxies.
59
60 This chapter is a partial overview of Wget's features.
61
62 * Wget is non-interactive, meaning that it can work in the
63 background, while the user is not logged on. This allows you to
64 start a retrieval and disconnect from the system, letting Wget
65 finish the work. By contrast, most Web browsers require the
66 user's constant presence, which can be a great hindrance when
67 transferring a lot of data.
68
69 * Wget can follow links in HTML and XHTML pages and create local
70 versions of remote web sites, fully recreating the directory
71 structure of the original site. This is sometimes referred to as
72 "recursive downloading." While doing that, Wget respects the
73 Robot Exclusion Standard (`/robots.txt'). Wget can be instructed
74 to convert the links in downloaded HTML files to the local files
75 for offline viewing.
76
77 * File name wildcard matching and recursive mirroring of directories
78 are available when retrieving via FTP. Wget can read the
79 time-stamp information given by both HTTP and FTP servers, and
80 store it locally. Thus Wget can see if the remote file has
81 changed since last retrieval, and automatically retrieve the new
82 version if it has. This makes Wget suitable for mirroring of FTP
83 sites, as well as home pages.
84
85 * Wget has been designed for robustness over slow or unstable network
86 connections; if a download fails due to a network problem, it will
87 keep retrying until the whole file has been retrieved. If the
88 server supports regetting, it will instruct the server to continue
89 the download from where it left off.
90
91 * Wget supports proxy servers, which can lighten the network load,
92 speed up retrieval and provide access behind firewalls. However,
93 if you are behind a firewall that requires that you use a socks
94 style gateway, you can get the socks library and build Wget with
95 support for socks. Wget uses passive FTP downloading by
96 default, with active FTP available as an option.
97
98 * Wget supports IP version 6, the next generation of IP. IPv6 is
99 autodetected at compile-time, and can be disabled at either build
100 or run time. Binaries built with IPv6 support work well in both
101 IPv4-only and dual family environments.
102
103 * Built-in features offer mechanisms to tune which links you wish to
104 follow (*note Following Links::).
105
106 * The progress of individual downloads is traced using a progress
107 gauge. Interactive downloads are tracked using a
108 "thermometer"-style gauge, whereas non-interactive ones are traced
109 with dots, each dot representing a fixed amount of data received
110 (1KB by default). Either gauge can be customized to your
111 preferences.
112
113 * Most of the features are fully configurable, either through
114 command line options, or via the initialization file `.wgetrc'
115 (*note Startup File::). Wget allows you to define "global"
116 startup files (`/usr/local/etc/wgetrc' by default) for site
117 settings.
118
119 * Finally, GNU Wget is free software. This means that everyone may
120 use it, redistribute it and/or modify it under the terms of the
121 GNU General Public License, as published by the Free Software
122 Foundation (*note Copying::).
123
124
125File: wget.info, Node: Invoking, Next: Recursive Download, Prev: Overview, Up: Top
126
1272 Invoking
128**********
129
130By default, Wget is very simple to invoke. The basic syntax is:
131
132 wget [OPTION]... [URL]...
133
134 Wget will simply download all the URLs specified on the command
135line. URL is a "Uniform Resource Locator", as defined below.
136
137 However, you may wish to change some of the default parameters of
138Wget. You can do this in two ways: permanently, by adding the
139appropriate command to `.wgetrc' (*note Startup File::), or by
140specifying it on the command line.
141
142* Menu:
143
144* URL Format::
145* Option Syntax::
146* Basic Startup Options::
147* Logging and Input File Options::
148* Download Options::
149* Directory Options::
150* HTTP Options::
151* HTTPS (SSL/TLS) Options::
152* FTP Options::
153* Recursive Retrieval Options::
154* Recursive Accept/Reject Options::
155
156
157File: wget.info, Node: URL Format, Next: Option Syntax, Up: Invoking
158
1592.1 URL Format
160==============
161
162"URL" is an acronym for Uniform Resource Locator. A uniform resource
163locator is a compact string representation for a resource available via
164the Internet. Wget recognizes the URL syntax as per RFC1738. This is
165the most widely used form (square brackets denote optional parts):
166
167 http://host[:port]/directory/file
168 ftp://host[:port]/directory/file
169
170 You can also encode your username and password within a URL:
171
172 ftp://user:password@host/path
173 http://user:password@host/path
174
175 Either USER or PASSWORD, or both, may be left out. If you leave out
176either the HTTP username or password, no authentication will be sent.
177If you leave out the FTP username, `anonymous' will be used. If you
178leave out the FTP password, your email address will be supplied as a
179default password.(1)
180
181 *Important Note*: if you specify a password-containing URL on the
182command line, the username and password will be plainly visible to all
183users on the system, by way of `ps'. On multi-user systems, this is a
184big security risk. To work around it, use `wget -i -' and feed the
185URLs to Wget's standard input, each on a separate line, terminated by
186`C-d'.
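   For example, the following keeps the password off the command line
by feeding the URL on standard input; the username, password, and host
name below are placeholders:

     # luser, secret and ftp.example.com are placeholders
     echo 'ftp://luser:secret@ftp.example.com/path/file' | wget -i -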
187
188 You can encode unsafe characters in a URL as `%xy', `xy' being the
189hexadecimal representation of the character's ASCII value. Some common
190unsafe characters include `%' (quoted as `%25'), `:' (quoted as `%3A'),
191and `@' (quoted as `%40'). Refer to RFC1738 for a comprehensive list
192of unsafe characters.
193
194 Wget also supports the `type' feature for FTP URLs. By default, FTP
195documents are retrieved in the binary mode (type `i'), which means that
196they are downloaded unchanged. Another useful mode is the `a'
197("ASCII") mode, which converts the line delimiters between the
198different operating systems, and is thus useful for text files. Here
199is an example:
200
201 ftp://host/directory/file;type=a
202
203 Two alternative variants of URL specification are also supported,
204because of historical (hysterical?) reasons and their widespread use.
205
206 FTP-only syntax (supported by `NcFTP'):
207 host:/dir/file
208
209 HTTP-only syntax (introduced by `Netscape'):
210 host[:port]/dir/file
211
212 These two alternative forms are deprecated, and may cease being
213supported in the future.
214
215 If you do not understand the difference between these notations, or
216do not know which one to use, just use the plain ordinary format you use
217with your favorite browser, like `Lynx' or `Netscape'.
218
219 ---------- Footnotes ----------
220
221 (1) If you have a `.netrc' file in your home directory, password
222will also be searched for there.
223
224
225File: wget.info, Node: Option Syntax, Next: Basic Startup Options, Prev: URL Format, Up: Invoking
226
2272.2 Option Syntax
228=================
229
230Since Wget uses GNU getopt to process command-line arguments, every
231option has a long form along with the short one. Long options are more
232convenient to remember, but take time to type. You may freely mix
233different option styles, or specify options after the command-line
234arguments. Thus you may write:
235
236 wget -r --tries=10 http://fly.srk.fer.hr/ -o log
237
238 The space between the option accepting an argument and the argument
239may be omitted. Instead of `-o log' you can write `-olog'.
240
241 You may put several options that do not require arguments together,
242like:
243
244 wget -drc URL
245
246 This is completely equivalent to:
247
248 wget -d -r -c URL
249
250 Since the options can be specified after the arguments, you may
251terminate them with `--'. So the following will try to download URL
252`-x', reporting failure to `log':
253
254 wget -o log -- -x
255
256 The options that accept comma-separated lists all respect the
257convention that specifying an empty list clears its value. This can be
258useful to clear the `.wgetrc' settings. For instance, if your `.wgetrc'
259sets `exclude_directories' to `/cgi-bin', the following example will
260first reset it, and then set it to exclude `/~nobody' and `/~somebody'.
261You can also clear the lists in `.wgetrc' (*note Wgetrc Syntax::).
262
263 wget -X '' -X /~nobody,/~somebody
264
265 Most options that do not accept arguments are "boolean" options, so
266named because their state can be captured with a yes-or-no ("boolean")
267variable. For example, `--follow-ftp' tells Wget to follow FTP links
268from HTML files and, on the other hand, `--no-glob' tells it not to
269perform file globbing on FTP URLs. A boolean option is either
270"affirmative" or "negative" (beginning with `--no'). All such options
271share several properties.
272
273 Unless stated otherwise, it is assumed that the default behavior is
274the opposite of what the option accomplishes. For example, the
275documented existence of `--follow-ftp' assumes that the default is to
276_not_ follow FTP links from HTML pages.
277
278 Affirmative options can be negated by prepending `--no-' to the
279option name; negative options can be negated by omitting the `--no-'
280prefix. This might seem superfluous--if the default for an affirmative
281option is to not do something, then why provide a way to explicitly
282turn it off? But the startup file may in fact change the default. For
283 instance, using `follow_ftp = on' in `.wgetrc' makes Wget follow FTP
284 links by default, and using `--no-follow-ftp' is the only way to
285restore the factory default from the command line.
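   For example, assuming a `.wgetrc' that contains `follow_ftp = on',
the following command disables FTP-link following again for a single
run (the URL is a placeholder):

     # www.example.com is a placeholder
     wget --no-follow-ftp -r http://www.example.com/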
286
287
288File: wget.info, Node: Basic Startup Options, Next: Logging and Input File Options, Prev: Option Syntax, Up: Invoking
289
2902.3 Basic Startup Options
291=========================
292
293`-V'
294`--version'
295 Display the version of Wget.
296
297`-h'
298`--help'
299 Print a help message describing all of Wget's command-line options.
300
301`-b'
302`--background'
303 Go to background immediately after startup. If no output file is
304 specified via `-o', output is redirected to `wget-log'.
305
306`-e COMMAND'
307`--execute COMMAND'
308 Execute COMMAND as if it were a part of `.wgetrc' (*note Startup
309 File::). A command thus invoked will be executed _after_ the
310 commands in `.wgetrc', thus taking precedence over them. If you
311 need to specify more than one wgetrc command, use multiple
312 instances of `-e'.
313
314
315
316File: wget.info, Node: Logging and Input File Options, Next: Download Options, Prev: Basic Startup Options, Up: Invoking
317
3182.4 Logging and Input File Options
319==================================
320
321`-o LOGFILE'
322`--output-file=LOGFILE'
323 Log all messages to LOGFILE. The messages are normally reported
324 to standard error.
325
326`-a LOGFILE'
327`--append-output=LOGFILE'
328 Append to LOGFILE. This is the same as `-o', only it appends to
329 LOGFILE instead of overwriting the old log file. If LOGFILE does
330 not exist, a new file is created.
331
332`-d'
333`--debug'
334 Turn on debug output, meaning various information important to the
335 developers of Wget if it does not work properly. Your system
336 administrator may have chosen to compile Wget without debug
337 support, in which case `-d' will not work. Please note that
338 compiling with debug support is always safe--Wget compiled with
339 the debug support will _not_ print any debug info unless requested
340 with `-d'. *Note Reporting Bugs::, for more information on how to
341 use `-d' for sending bug reports.
342
343`-q'
344`--quiet'
345 Turn off Wget's output.
346
347`-v'
348`--verbose'
349 Turn on verbose output, with all the available data. The default
350 output is verbose.
351
352`-nv'
353`--no-verbose'
354 Turn off verbose without being completely quiet (use `-q' for
355 that), which means that error messages and basic information still
356 get printed.
357
358`-i FILE'
359`--input-file=FILE'
360 Read URLs from FILE. If `-' is specified as FILE, URLs are read
361 from the standard input. (Use `./-' to read from a file literally
362 named `-'.)
363
364 If this function is used, no URLs need be present on the command
365 line. If there are URLs both on the command line and in an input
366 file, those on the command line will be the first ones to be
367 retrieved. The FILE need not be an HTML document (but no harm if
368 it is)--it is enough if the URLs are just listed sequentially.
369
370 However, if you specify `--force-html', the document will be
371 regarded as `html'. In that case you may have problems with
372 relative links, which you can solve either by adding `<base
373 href="URL">' to the documents or by specifying `--base=URL' on the
374 command line.
375
376`-F'
377`--force-html'
378 When input is read from a file, force it to be treated as an HTML
379 file. This enables you to retrieve relative links from existing
380 HTML files on your local disk, by adding `<base href="URL">' to
381 HTML, or using the `--base' command-line option.
382
383`-B URL'
384`--base=URL'
385 Prepends URL to relative links read from the file specified with
386 the `-i' option.
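   For example, relative links listed in a local file can be resolved
against a base URL like this (the file name and URL are placeholders):

     # links.html and www.example.com are placeholders
     wget -i links.html --force-html --base=http://www.example.com/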
387
388
389File: wget.info, Node: Download Options, Next: Directory Options, Prev: Logging and Input File Options, Up: Invoking
390
3912.5 Download Options
392====================
393
394`--bind-address=ADDRESS'
395 When making client TCP/IP connections, bind to ADDRESS on the
396 local machine. ADDRESS may be specified as a hostname or IP
397 address. This option can be useful if your machine is bound to
398 multiple IPs.
399
400`-t NUMBER'
401`--tries=NUMBER'
402 Set number of retries to NUMBER. Specify 0 or `inf' for infinite
403 retrying. The default is to retry 20 times, with the exception of
404 fatal errors like "connection refused" or "not found" (404), which
405 are not retried.
406
407`-O FILE'
408`--output-document=FILE'
409 The documents will not be written to the appropriate files, but all
410 will be concatenated together and written to FILE. If `-' is used
411 as FILE, documents will be printed to standard output, disabling
412 link conversion. (Use `./-' to print to a file literally named
413 `-'.)
414
415 Note that a combination with `-k' is only well-defined for
416 downloading a single document.
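   For example, the first command below saves the document under a name
of your choosing, and the second pipes it straight to another program
(the file names and URLs are placeholders):

     # file names and www.example.com are placeholders
     wget -O latest.html http://www.example.com/news.html
     wget -O - http://www.example.com/list.txt | grep interesting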
417
418`-nc'
419`--no-clobber'
420 If a file is downloaded more than once in the same directory,
421 Wget's behavior depends on a few options, including `-nc'. In
422 certain cases, the local file will be "clobbered", or overwritten,
423 upon repeated download. In other cases it will be preserved.
424
425 When running Wget without `-N', `-nc', or `-r', downloading the
426 same file in the same directory will result in the original copy
427 of FILE being preserved and the second copy being named `FILE.1'.
428 If that file is downloaded yet again, the third copy will be named
429 `FILE.2', and so on. When `-nc' is specified, this behavior is
430 suppressed, and Wget will refuse to download newer copies of
431 `FILE'. Therefore, "`no-clobber'" is actually a misnomer in this
432 mode--it's not clobbering that's prevented (as the numeric
433 suffixes were already preventing clobbering), but rather the
434 multiple version saving that's prevented.
435
436 When running Wget with `-r', but without `-N' or `-nc',
437 re-downloading a file will result in the new copy simply
438 overwriting the old. Adding `-nc' will prevent this behavior,
439 instead causing the original version to be preserved and any newer
440 copies on the server to be ignored.
441
442 When running Wget with `-N', with or without `-r', the decision as
443 to whether or not to download a newer copy of a file depends on
444 the local and remote timestamp and size of the file (*note
445 Time-Stamping::). `-nc' may not be specified at the same time as
446 `-N'.
447
448 Note that when `-nc' is specified, files with the suffixes `.html'
449 or `.htm' will be loaded from the local disk and parsed as if they
450 had been retrieved from the Web.
451
452`-c'
453`--continue'
454 Continue getting a partially-downloaded file. This is useful when
455 you want to finish up a download started by a previous instance of
456 Wget, or by another program. For instance:
457
458 wget -c ftp://sunsite.doc.ic.ac.uk/ls-lR.Z
459
460 If there is a file named `ls-lR.Z' in the current directory, Wget
461 will assume that it is the first portion of the remote file, and
462 will ask the server to continue the retrieval from an offset equal
463 to the length of the local file.
464
465 Note that you don't need to specify this option if you just want
466 the current invocation of Wget to retry downloading a file should
467 the connection be lost midway through. This is the default
468 behavior. `-c' only affects resumption of downloads started
469 _prior_ to this invocation of Wget, and whose local files are
470 still sitting around.
471
472 Without `-c', the previous example would just download the remote
473 file to `ls-lR.Z.1', leaving the truncated `ls-lR.Z' file alone.
474
475 Beginning with Wget 1.7, if you use `-c' on a non-empty file, and
476 it turns out that the server does not support continued
477 downloading, Wget will refuse to start the download from scratch,
478 which would effectively ruin existing contents. If you really
479 want the download to start from scratch, remove the file.
480
481 Also beginning with Wget 1.7, if you use `-c' on a file which is of
482 equal size as the one on the server, Wget will refuse to download
483 the file and print an explanatory message. The same happens when
484 the file is smaller on the server than locally (presumably because
485 it was changed on the server since your last download
486 attempt)--because "continuing" is not meaningful, no download
487 occurs.
488
489 On the other side of the coin, while using `-c', any file that's
490 bigger on the server than locally will be considered an incomplete
491 download and only `(length(remote) - length(local))' bytes will be
492 downloaded and tacked onto the end of the local file. This
493 behavior can be desirable in certain cases--for instance, you can
494 use `wget -c' to download just the new portion that's been
495 appended to a data collection or log file.
496
497 However, if the file is bigger on the server because it's been
498 _changed_, as opposed to just _appended_ to, you'll end up with a
499 garbled file. Wget has no way of verifying that the local file is
500 really a valid prefix of the remote file. You need to be
501 especially careful of this when using `-c' in conjunction with
502 `-r', since every file will be considered as an "incomplete
503 download" candidate.
504
505 Another instance where you'll get a garbled file if you try to use
506 `-c' is if you have a lame HTTP proxy that inserts a "transfer
507 interrupted" string into the local file. In the future a
508 "rollback" option may be added to deal with this case.
509
510 Note that `-c' only works with FTP servers and with HTTP servers
511 that support the `Range' header.
512
513`--progress=TYPE'
514 Select the type of the progress indicator you wish to use. Legal
515 indicators are "dot" and "bar".
516
517 The "bar" indicator is used by default. It draws an ASCII progress
518 bar graphic (a.k.a. "thermometer" display) indicating the status of
519 retrieval. If the output is not a TTY, the "dot" indicator will be
520 used by default.
521
522 Use `--progress=dot' to switch to the "dot" display. It traces
523 the retrieval by printing dots on the screen, each dot
524 representing a fixed amount of downloaded data.
525
526 When using the dotted retrieval, you may also set the "style" by
527 specifying the type as `dot:STYLE'. Different styles assign
528 different meaning to one dot. With the `default' style each dot
529 represents 1K, there are ten dots in a cluster and 50 dots in a
530 line. The `binary' style has a more "computer"-like
531 orientation--each dot represents 8K, with 16-dot clusters and 48
532 dots per line (making 384K per line). The `mega' style is suitable for
533 downloading very large files--each dot represents 64K retrieved,
534 there are eight dots in a cluster, and 48 dots on each line (so
535 each line contains 3M).
536
537 Note that you can set the default style using the `progress'
538 command in `.wgetrc'. That setting may be overridden from the
539 command line. The exception is that, when the output is not a
540 TTY, the "dot" progress will be favored over "bar". To force the
541 bar output, use `--progress=bar:force'.
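   For example, the `mega' dot style suits a very large download, and
`bar:force' keeps the bar even when output goes to a log file (the
URLs are placeholders):

     # ftp.example.com and www.example.com are placeholders
     wget --progress=dot:mega ftp://ftp.example.com/big.iso
     wget --progress=bar:force -o log http://www.example.com/file.zip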
542
543`-N'
544`--timestamping'
545 Turn on time-stamping. *Note Time-Stamping::, for details.
546
547`-S'
548`--server-response'
549 Print the headers sent by HTTP servers and responses sent by FTP
550 servers.
551
552`--spider'
553 When invoked with this option, Wget will behave as a Web "spider",
554 which means that it will not download the pages, just check that
555 they are there. For example, you can use Wget to check your
556 bookmarks:
557
558 wget --spider --force-html -i bookmarks.html
559
560 This feature needs much more work for Wget to get close to the
561 functionality of real web spiders.
562
563`-T SECONDS'
564`--timeout=SECONDS'
565 Set the network timeout to SECONDS seconds. This is equivalent to
566 specifying `--dns-timeout', `--connect-timeout', and
567 `--read-timeout', all at the same time.
568
569 When interacting with the network, Wget can check for timeout and
570 abort the operation if it takes too long. This prevents anomalies
571 like hanging reads and infinite connects. The only timeout
572 enabled by default is a 900-second read timeout. Setting a
573 timeout to 0 disables it altogether. Unless you know what you are
574 doing, it is best not to change the default timeout settings.
575
576 All timeout-related options accept decimal values, as well as
577 subsecond values. For example, `0.1' seconds is a legal (though
578 unwise) choice of timeout. Subsecond timeouts are useful for
579 checking server response times or for testing network latency.
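   For example, the single `-T' below is equivalent to setting the
three individual timeouts to the same value (the URL is a placeholder):

     # www.example.com is a placeholder
     wget -T 60 http://www.example.com/file
     wget --dns-timeout=60 --connect-timeout=60 --read-timeout=60 \
          http://www.example.com/file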
580
581`--dns-timeout=SECONDS'
582 Set the DNS lookup timeout to SECONDS seconds. DNS lookups that
583 don't complete within the specified time will fail. By default,
584 there is no timeout on DNS lookups, other than that implemented by
585 system libraries.
586
587`--connect-timeout=SECONDS'
588 Set the connect timeout to SECONDS seconds. TCP connections that
589 take longer to establish will be aborted. By default, there is no
590 connect timeout, other than that implemented by system libraries.
591
592`--read-timeout=SECONDS'
593 Set the read (and write) timeout to SECONDS seconds. The "time"
594 of this timeout refers to "idle time": if, at any point in the
595 download, no data is received for more than the specified number
596 of seconds, reading fails and the download is restarted. This
597 option does not directly affect the duration of the entire
598 download.
599
600 Of course, the remote server may choose to terminate the connection
601 sooner than this option requires. The default read timeout is 900
602 seconds.
603
604`--limit-rate=AMOUNT'
605 Limit the download speed to AMOUNT bytes per second. Amount may
606 be expressed in bytes, kilobytes with the `k' suffix, or megabytes
607 with the `m' suffix. For example, `--limit-rate=20k' will limit
608 the retrieval rate to 20KB/s. This is useful when, for whatever
609 reason, you don't want Wget to consume the entire available
610 bandwidth.
611
612 This option allows the use of decimal numbers, usually in
613 conjunction with power suffixes; for example, `--limit-rate=2.5k'
614 is a legal value.
615
616 Note that Wget implements the limiting by sleeping the appropriate
617 amount of time after a network read that took less time than
618 specified by the rate. Eventually this strategy causes the TCP
619 transfer to slow down to approximately the specified rate.
620 However, it may take some time for this balance to be achieved, so
621 don't be surprised if limiting the rate doesn't work well with
622 very small files.
623
624`-w SECONDS'
625`--wait=SECONDS'
626 Wait the specified number of seconds between the retrievals. Use
627 of this option is recommended, as it lightens the server load by
628 making the requests less frequent. Instead of in seconds, the
629 time can be specified in minutes using the `m' suffix, in hours
630 using the `h' suffix, or in days using the `d' suffix.
631
632 Specifying a large value for this option is useful if the network
633 or the destination host is down, so that Wget can wait long enough
634 to reasonably expect the network error to be fixed before the
635 retry.
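   For example, a polite recursive retrieval might pause two seconds
between requests, and a batch run might wait a full minute (the URL
and file name are placeholders):

     # www.example.com and urls.txt are placeholders
     wget -w 2 -r http://www.example.com/
     wget --wait=1m -i urls.txt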
636
637`--waitretry=SECONDS'
638 If you don't want Wget to wait between _every_ retrieval, but only
639 between retries of failed downloads, you can use this option.
640 Wget will use "linear backoff", waiting 1 second after the first
641 failure on a given file, then waiting 2 seconds after the second
642 failure on that file, up to the maximum number of SECONDS you
643 specify. Therefore, a value of 10 will actually make Wget wait up
644 to (1 + 2 + ... + 10) = 55 seconds per file.
645
646 Note that this option is turned on by default in the global
647 `wgetrc' file.
648
649`--random-wait'
650 Some web sites may perform log analysis to identify retrieval
651 programs such as Wget by looking for statistically significant
652 similarities in the time between requests. This option causes the
653 time between requests to vary between 0 and 2 * WAIT seconds,
654 where WAIT was specified using the `--wait' option, in order to
655 mask Wget's presence from such analysis.
656
657 A recent article in a publication devoted to development on a
658 popular consumer platform provided code to perform this analysis
659 on the fly. Its author suggested blocking at the class C address
660 level to ensure automated retrieval programs were blocked despite
661 changing DHCP-supplied addresses.
662
663 The `--random-wait' option was inspired by this ill-advised
664 recommendation to block many unrelated users from a web site due
665 to the actions of one.
666
667`--no-proxy'
668 Don't use proxies, even if the appropriate `*_proxy' environment
669 variable is defined.
670
671 For more information about the use of proxies with Wget, *Note
672 Proxies::.
673
674`-Q QUOTA'
675`--quota=QUOTA'
676 Specify download quota for automatic retrievals. The value can be
677 specified in bytes (default), kilobytes (with `k' suffix), or
678 megabytes (with `m' suffix).
679
680 Note that quota will never affect downloading a single file. So
681 if you specify `wget -Q10k ftp://wuarchive.wustl.edu/ls-lR.gz',
682 all of the `ls-lR.gz' will be downloaded. The same goes even when
683 several URLs are specified on the command-line. However, quota is
684 respected when retrieving either recursively, or from an input
685 file. Thus you may safely type `wget -Q2m -i sites'--download
686 will be aborted when the quota is exceeded.
687
688 Setting quota to 0 or to `inf' unlimits the download quota.
689
690`--no-dns-cache'
691 Turn off caching of DNS lookups. Normally, Wget remembers the IP
692 addresses it looked up from DNS so it doesn't have to repeatedly
693 contact the DNS server for the same (typically small) set of hosts
694 it retrieves from. This cache exists in memory only; a new Wget
695 run will contact DNS again.
696
697 However, it has been reported that in some situations it is not
698 desirable to cache host names, even for the duration of a
699 short-running application like Wget. With this option Wget issues
700 a new DNS lookup (more precisely, a new call to `gethostbyname' or
701 `getaddrinfo') each time it makes a new connection. Please note
702 that this option will _not_ affect caching that might be performed
703 by the resolving library or by an external caching layer, such as
704 NSCD.
705
706 If you don't understand exactly what this option does, you probably
707 won't need it.
708
709`--restrict-file-names=MODE'
710 Change which characters found in remote URLs may show up in local
711 file names generated from those URLs. Characters that are
712 "restricted" by this option are escaped, i.e. replaced with `%HH',
713 where `HH' is the hexadecimal number that corresponds to the
714 restricted character.
715
716 By default, Wget escapes the characters that are not valid as part
717 of file names on your operating system, as well as control
718 characters that are typically unprintable. This option is useful
719 for changing these defaults, either because you are downloading to
720 a non-native partition, or because you want to disable escaping of
721 the control characters.
722
723 When mode is set to "unix", Wget escapes the character `/' and the
724 control characters in the ranges 0-31 and 128-159. This is the
725 default on Unix-like OS'es.
726
727 When mode is set to "windows", Wget escapes the characters `\',
728 `|', `/', `:', `?', `"', `*', `<', `>', and the control characters
729 in the ranges 0-31 and 128-159. In addition to this, Wget in
730 Windows mode uses `+' instead of `:' to separate host and port in
731 local file names, and uses `@' instead of `?' to separate the
732 query portion of the file name from the rest. Therefore, a URL
733 that would be saved as `www.xemacs.org:4300/search.pl?input=blah'
734 in Unix mode would be saved as
735 `www.xemacs.org+4300/search.pl@input=blah' in Windows mode. This
736 mode is the default on Windows.
737
738 If you append `,nocontrol' to the mode, as in `unix,nocontrol',
739 escaping of the control characters is also switched off. You can
740 use `--restrict-file-names=nocontrol' to turn off escaping of
741 control characters without affecting the choice of the OS to use
742 as the file name restriction mode.
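   For example, the first command below keeps Windows-unsafe characters
escaped even on a Unix host, and the second disables escaping of
control characters only (the URL is a placeholder):

     # www.example.com is a placeholder
     wget -r --restrict-file-names=windows http://www.example.com/
     wget -r --restrict-file-names=nocontrol http://www.example.com/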
743
744`-4'
745`--inet4-only'
746`-6'
747`--inet6-only'
748 Force connecting to IPv4 or IPv6 addresses. With `--inet4-only'
749 or `-4', Wget will only connect to IPv4 hosts, ignoring AAAA
750 records in DNS, and refusing to connect to IPv6 addresses
751 specified in URLs. Conversely, with `--inet6-only' or `-6', Wget
752 will only connect to IPv6 hosts and ignore A records and IPv4
753 addresses.
754
755 Neither option should be needed normally. By default, an
756 IPv6-aware Wget will use the address family specified by the
757 host's DNS record. If the DNS responds with both IPv4 and IPv6
758 addresses, Wget will try them in sequence until it finds one it can
759 connect to. (Also see `--prefer-family' option described below.)
760
761 These options can be used to deliberately force the use of IPv4 or
762 IPv6 address families on dual family systems, usually to aid
763 debugging or to deal with broken network configuration. Only one
764 of `--inet6-only' and `--inet4-only' may be specified at the same
765 time. Neither option is available in Wget compiled without IPv6
766 support.
767
768`--prefer-family=IPv4/IPv6/none'
769 When given a choice of several addresses, connect to the addresses
770 with the specified address family first. IPv4 addresses are preferred
771 by default.
772
773 This avoids spurious errors and connect attempts when accessing
774 hosts that resolve to both IPv6 and IPv4 addresses from IPv4
775 networks. For example, `www.kame.net' resolves to
776 `2001:200:0:8002:203:47ff:fea5:3085' and to `203.178.141.194'.
777 When the preferred family is `IPv4', the IPv4 address is used
778 first; when the preferred family is `IPv6', the IPv6 address is
779 used first; if the specified value is `none', the address order
780 returned by DNS is used without change.
781
782 Unlike `-4' and `-6', this option doesn't inhibit access to any
783 address family; it only changes the _order_ in which the addresses
784 are accessed. Also note that the reordering performed by this
785 option is "stable"--it doesn't affect the order of addresses of the
786 same family. That is, the relative order of all IPv4 addresses
787 and of all IPv6 addresses remains intact in all cases.
788
789`--retry-connrefused'
790 Consider "connection refused" a transient error and try again.
791 Normally Wget gives up on a URL when it is unable to connect to the
792 site because failure to connect is taken as a sign that the server
793 is not running at all and that retries would not help. This
794 option is for mirroring unreliable sites whose servers tend to
795 disappear for short periods of time.
796
797`--user=USER'
798`--password=PASSWORD'
799 Specify the username USER and password PASSWORD for both FTP and
800 HTTP file retrieval. These parameters can be overridden using the
801 `--ftp-user' and `--ftp-password' options for FTP connections and
802 the `--http-user' and `--http-password' options for HTTP
803 connections.
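   For example (the credentials and host name are placeholders):

     # jane, s3cret and www.example.com are placeholders
     wget --user=jane --password=s3cret \
          http://www.example.com/private/report.pdf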
804
805
806File: wget.info, Node: Directory Options, Next: HTTP Options, Prev: Download Options, Up: Invoking
807
8082.6 Directory Options
809=====================
810
811`-nd'
812`--no-directories'
813 Do not create a hierarchy of directories when retrieving
814 recursively. With this option turned on, all files will get saved
815 to the current directory, without clobbering (if a name shows up
816 more than once, the filenames will get extensions `.n').
817
818`-x'
819`--force-directories'
820 The opposite of `-nd'--create a hierarchy of directories, even if
821 one would not have been created otherwise. E.g. `wget -x
822 http://fly.srk.fer.hr/robots.txt' will save the downloaded file to
823 `fly.srk.fer.hr/robots.txt'.
824
825`-nH'
826`--no-host-directories'
827 Disable generation of host-prefixed directories. By default,
828 invoking Wget with `-r http://fly.srk.fer.hr/' will create a
829 structure of directories beginning with `fly.srk.fer.hr/'. This
830 option disables such behavior.
831
832`--protocol-directories'
833 Use the protocol name as a directory component of local file
834 names. For example, with this option, `wget -r http://HOST' will
835 save to `http/HOST/...' rather than just to `HOST/...'.
836
837`--cut-dirs=NUMBER'
838 Ignore NUMBER directory components. This is useful for getting
839 fine-grained control over the directory where recursive retrieval
840 will be saved.
841
842 Take, for example, the directory at
843 `ftp://ftp.xemacs.org/pub/xemacs/'. If you retrieve it with `-r',
844 it will be saved locally under `ftp.xemacs.org/pub/xemacs/'.
845 While the `-nH' option can remove the `ftp.xemacs.org/' part, you
846 are still stuck with `pub/xemacs'. This is where `--cut-dirs'
847 comes in handy; it makes Wget not "see" NUMBER remote directory
848 components. Here are several examples of how `--cut-dirs' option
849 works.
850
851 No options -> ftp.xemacs.org/pub/xemacs/
852 -nH -> pub/xemacs/
853 -nH --cut-dirs=1 -> xemacs/
854 -nH --cut-dirs=2 -> .
855
856 --cut-dirs=1 -> ftp.xemacs.org/xemacs/
857 ...
858
859 If you just want to get rid of the directory structure, this
860 option is similar to a combination of `-nd' and `-P'. However,
861 unlike `-nd', `--cut-dirs' does not lose subdirectories--for
862 instance, with `-nH --cut-dirs=1', a `beta/' subdirectory will be
863 placed in `xemacs/beta', as one would expect.
864
865`-P PREFIX'
866`--directory-prefix=PREFIX'
867 Set directory prefix to PREFIX. The "directory prefix" is the
868 directory where all other files and subdirectories will be saved
869 to, i.e. the top of the retrieval tree. The default is `.' (the
870 current directory).
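   For example, building on the `ftp.xemacs.org' example above, the
following saves the tree under an arbitrary local prefix, here
`downloads/', so the files end up in `downloads/xemacs/...':

     # `downloads' is an arbitrary local prefix
     wget -r -nH --cut-dirs=1 -P downloads ftp://ftp.xemacs.org/pub/xemacs/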
871
872
873File: wget.info, Node: HTTP Options, Next: HTTPS (SSL/TLS) Options, Prev: Directory Options, Up: Invoking
874
8752.7 HTTP Options
876================
877
878`-E'
879`--html-extension'
880 If a file of type `application/xhtml+xml' or `text/html' is
881 downloaded and the URL does not end with the regexp
882 `\.[Hh][Tt][Mm][Ll]?', this option will cause the suffix `.html'
883 to be appended to the local filename. This is useful, for
884 instance, when you're mirroring a remote site that uses `.asp'
885 pages, but you want the mirrored pages to be viewable on your
886 stock Apache server. Another good use for this is when you're
887 downloading CGI-generated materials. A URL like
888 `http://site.com/article.cgi?25' will be saved as
889 `article.cgi?25.html'.
890
891 Note that filenames changed in this way will be re-downloaded
892 every time you re-mirror a site, because Wget can't tell that the
893 local `X.html' file corresponds to remote URL `X' (since it
894 doesn't yet know that the URL produces output of type `text/html'
895 or `application/xhtml+xml'). To prevent this re-downloading, you
896 must use `-k' and `-K' so that the original version of the file
897 will be saved as `X.orig' (*note Recursive Retrieval Options::).
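   For example, a mirror of an `.asp' site might be retrieved like
this; `-k' and `-K' preserve the original files as described above, so
unchanged pages are not re-downloaded on later runs (the URL is a
placeholder):

     # www.example.com is a placeholder
     wget -r -E -k -K http://www.example.com/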
898
899`--http-user=USER'
900`--http-password=PASSWORD'
901 Specify the username USER and password PASSWORD on an HTTP server.
902 According to the type of the challenge, Wget will encode them
903 using either the `basic' (insecure) or the `digest' authentication
904 scheme.
905
906 Another way to specify username and password is in the URL itself
907 (*note URL Format::). Either method reveals your password to
908 anyone who bothers to run `ps'. To prevent the passwords from
909 being seen, store them in `.wgetrc' or `.netrc', and make sure to
910 protect those files from other users with `chmod'. If the
911 passwords are really important, do not leave them lying in those
912 files either--edit the files and delete them after Wget has
913 started the download.
914
915`--no-cache'
916 Disable server-side cache. In this case, Wget will send the remote
917 server an appropriate directive (`Pragma: no-cache') to get the
918 file from the remote service, rather than returning the cached
919 version. This is especially useful for retrieving and flushing
920 out-of-date documents on proxy servers.
921
922 Caching is allowed by default.
923
924`--no-cookies'
925 Disable the use of cookies. Cookies are a mechanism for
926 maintaining server-side state. The server sends the client a
927 cookie using the `Set-Cookie' header, and the client responds with
928 the same cookie upon further requests. Since cookies allow the
929 server owners to keep track of visitors and for sites to exchange
930 this information, some consider them a breach of privacy. The
931 default is to use cookies; however, _storing_ cookies is not on by
932 default.
933
934`--load-cookies FILE'
935 Load cookies from FILE before the first HTTP retrieval. FILE is a
936 textual file in the format originally used by Netscape's
937 `cookies.txt' file.
938
939 You will typically use this option when mirroring sites that
940 require that you be logged in to access some or all of their
941 content. The login process typically works by the web server
942 issuing an HTTP cookie upon receiving and verifying your
943 credentials. The cookie is then resent by the browser when
944 accessing that part of the site, and so proves your identity.
945
946 Mirroring such a site requires Wget to send the same cookies your
947 browser sends when communicating with the site. This is achieved
948 by `--load-cookies'--simply point Wget to the location of the
949 `cookies.txt' file, and it will send the same cookies your browser
950 would send in the same situation. Different browsers keep textual
951 cookie files in different locations:
952
953 Netscape 4.x.
954 The cookies are in `~/.netscape/cookies.txt'.
955
956 Mozilla and Netscape 6.x.
957 Mozilla's cookie file is also named `cookies.txt', located
958 somewhere under `~/.mozilla', in the directory of your
959 profile. The full path usually ends up looking somewhat like
960 `~/.mozilla/default/SOME-WEIRD-STRING/cookies.txt'.
961
962 Internet Explorer.
963 You can produce a cookie file Wget can use by using the File
964 menu, Import and Export, Export Cookies. This has been
965 tested with Internet Explorer 5; it is not guaranteed to work
966 with earlier versions.
967
968 Other browsers.
969 If you are using a different browser to create your cookies,
970 `--load-cookies' will only work if you can locate or produce a
971 cookie file in the Netscape format that Wget expects.
972
973 If you cannot use `--load-cookies', there might still be an
974 alternative. If your browser supports a "cookie manager", you can
975 use it to view the cookies used when accessing the site you're
976 mirroring. Write down the name and value of the cookie, and
977 manually instruct Wget to send those cookies, bypassing the
978 "official" cookie support:
979
980 wget --no-cookies --header "Cookie: NAME=VALUE"
981
982`--save-cookies FILE'
983 Save cookies to FILE before exiting. This will not save cookies
984 that have expired or that have no expiry time (so-called "session
985 cookies"), but also see `--keep-session-cookies'.
986
987`--keep-session-cookies'
988 When specified, causes `--save-cookies' to also save session
989 cookies. Session cookies are normally not saved because they are
990 meant to be kept in memory and forgotten when you exit the browser.
991 Saving them is useful on sites that require you to log in or to
992 visit the home page before you can access some pages. With this
993 option, multiple Wget runs are considered a single browser session
994 as far as the site is concerned.
995
996 Since the cookie file format does not normally carry session
997 cookies, Wget marks them with an expiry timestamp of 0. Wget's
998 `--load-cookies' recognizes those as session cookies, but it might
999 confuse other browsers. Also note that cookies so loaded will be
1000 treated as other session cookies, which means that if you want
1001 `--save-cookies' to preserve them again, you must use
1002 `--keep-session-cookies' again.
1003
1004`--ignore-length'
1005 Unfortunately, some HTTP servers (CGI programs, to be more
1006 precise) send out bogus `Content-Length' headers, which makes Wget
1007 go wild, as it thinks not all the document was retrieved. You can
1008 spot this syndrome if Wget retries getting the same document again
1009 and again, each time claiming that the (otherwise normal)
1010 connection has closed on the very same byte.
1011
1012 With this option, Wget will ignore the `Content-Length' header--as
1013 if it never existed.
1014
1015`--header=HEADER-LINE'
1016 Send HEADER-LINE along with the rest of the headers in each HTTP
1017 request. The supplied header is sent as-is, which means it must
1018 contain the name and value separated by a colon, and must not contain
1019 newlines.
1020
1021 You may define more than one additional header by specifying
1022 `--header' more than once.
1023
1024 wget --header='Accept-Charset: iso-8859-2' \
1025 --header='Accept-Language: hr' \
1026 http://fly.srk.fer.hr/
1027
1028 Specification of an empty string as the header value will clear all
1029 previous user-defined headers.
1030
1031 As of Wget 1.10, this option can be used to override headers
1032 otherwise generated automatically. This example instructs Wget to
1033 connect to localhost, but to specify `foo.bar' in the `Host'
1034 header:
1035
1036 wget --header="Host: foo.bar" http://localhost/
1037
1038 In versions of Wget prior to 1.10 such use of `--header' caused
1039 sending of duplicate headers.
1040
1041`--proxy-user=USER'
1042`--proxy-password=PASSWORD'
1043 Specify the username USER and password PASSWORD for authentication
1044 on a proxy server. Wget will encode them using the `basic'
1045 authentication scheme.
1046
1047 Security considerations similar to those with `--http-password'
1048 pertain here as well.
1049
1050`--referer=URL'
1051 Include `Referer: URL' header in HTTP request. Useful for
1052 retrieving documents with server-side processing that assume they
1053 are always being retrieved by interactive web browsers and only
1054 come out properly when Referer is set to one of the pages that
1055 point to them.
1056
1057`--save-headers'
1058 Save the headers sent by the HTTP server to the file, preceding the
1059 actual contents, with an empty line as the separator.
1060
1061`-U AGENT-STRING'
1062`--user-agent=AGENT-STRING'
1063 Identify as AGENT-STRING to the HTTP server.
1064
1065 The HTTP protocol allows the clients to identify themselves using a
1066 `User-Agent' header field. This enables distinguishing the WWW
1067 software, usually for statistical purposes or for tracing of
1068 protocol violations. Wget normally identifies as `Wget/VERSION',
1069 VERSION being the current version number of Wget.
1070
1071 However, some sites have been known to impose the policy of
1072 tailoring the output according to the `User-Agent'-supplied
1073 information. While this is not such a bad idea in theory, it has
1074 been abused by servers denying information to clients other than
1075 (historically) Netscape or, more frequently, Microsoft Internet
1076 Explorer. This option allows you to change the `User-Agent' line
1077 issued by Wget. Use of this option is discouraged, unless you
1078 really know what you are doing.
1079
1080 Specifying an empty user agent with `--user-agent=""' instructs Wget
1081 not to send the `User-Agent' header in HTTP requests.
1082
1083`--post-data=STRING'
1084`--post-file=FILE'
1085 Use POST as the method for all HTTP requests and send the
1086 specified data in the request body. `--post-data' sends STRING as
1087 data, whereas `--post-file' sends the contents of FILE. Other than
1088 that, they work in exactly the same way.
1089
1090 Please be aware that Wget needs to know the size of the POST data
1091 in advance. Therefore the argument to `--post-file' must be a
1092 regular file; specifying a FIFO or something like `/dev/stdin'
1093 won't work. It's not quite clear how to work around this
1094 limitation inherent in HTTP/1.0. Although HTTP/1.1 introduces
1095 "chunked" transfer that doesn't require knowing the request length
1096 in advance, a client can't use chunked unless it knows it's
1097 talking to an HTTP/1.1 server. And it can't know that until it
1098 receives a response, which in turn requires the request to have
1099 been completed - a chicken-and-egg problem.
1100
1101 Note: if Wget is redirected after the POST request is completed, it
1102 will not send the POST data to the redirected URL. This is because
1103 URLs that process POST often respond with a redirection to a
1104 regular page, which does not desire or accept POST. It is not
1105 completely clear that this behavior is optimal; if it doesn't work
1106 out, it might be changed in the future.
1107
1108 This example shows how to log to a server using POST and then
1109 proceed to download the desired pages, presumably only accessible
1110 to authorized users:
1111
1112 # Log in to the server. This can be done only once.
1113 wget --save-cookies cookies.txt \
1114 --post-data 'user=foo&password=bar' \
1115 http://server.com/auth.php
1116
1117 # Now grab the page or pages we care about.
1118 wget --load-cookies cookies.txt \
1119 -p http://server.com/interesting/article.php
1120
1121 If the server is using session cookies to track user
1122 authentication, the above will not work because `--save-cookies'
1123 will not save them (and neither will browsers) and the
1124 `cookies.txt' file will be empty. In that case use
1125 `--keep-session-cookies' along with `--save-cookies' to force
1126 saving of session cookies.
1127
1128
1129File: wget.info, Node: HTTPS (SSL/TLS) Options, Next: FTP Options, Prev: HTTP Options, Up: Invoking
1130
11312.8 HTTPS (SSL/TLS) Options
1132===========================
1133
1134To support encrypted HTTP (HTTPS) downloads, Wget must be compiled with
1135an external SSL library, currently OpenSSL. If Wget is compiled
1136without SSL support, none of these options are available.
1137
1138`--secure-protocol=PROTOCOL'
1139 Choose the secure protocol to be used. Legal values are `auto',
1140 `SSLv2', `SSLv3', and `TLSv1'. If `auto' is used, the SSL library
1141 is given the liberty of choosing the appropriate protocol
1142 automatically, which is achieved by sending an SSLv2 greeting and
1143 announcing support for SSLv3 and TLSv1. This is the default.
1144
1145 Specifying `SSLv2', `SSLv3', or `TLSv1' forces the use of the
1146 corresponding protocol. This is useful when talking to old and
1147 buggy SSL server implementations that make it hard for OpenSSL to
1148 choose the correct protocol version. Fortunately, such servers are
1149 quite rare.
1150
1151`--no-check-certificate'
1152 Don't check the server certificate against the available
1153 certificate authorities. Also don't require the URL host name to
1154 match the common name presented by the certificate.
1155
1156 As of Wget 1.10, the default is to verify the server's certificate
1157 against the recognized certificate authorities, breaking the SSL
1158 handshake and aborting the download if the verification fails.
1159 Although this provides more secure downloads, it does break
1160 interoperability with some sites that worked with previous Wget
1161 versions, particularly those using self-signed, expired, or
1162 otherwise invalid certificates. This option forces an "insecure"
1163 mode of operation that turns the certificate verification errors
1164 into warnings and allows you to proceed.
1165
1166 If you encounter "certificate verification" errors or ones saying
1167 that "common name doesn't match requested host name", you can use
1168 this option to bypass the verification and proceed with the
1169 download. _Only use this option if you are otherwise convinced of
1170 the site's authenticity, or if you really don't care about the
1171 validity of its certificate._ It is almost always a bad idea not
1172 to check the certificates when transmitting confidential or
1173 important data.
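   For example, to retrieve a document from a host whose self-signed
certificate you have already decided to trust (the host name is a
placeholder):

     # intranet.example.com is a placeholder
     wget --no-check-certificate https://intranet.example.com/report.pdf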
1174
1175`--certificate=FILE'
1176 Use the client certificate stored in FILE. This is needed for
1177 servers that are configured to require certificates from the
1178 clients that connect to them. Normally a certificate is not
1179 required and this switch is optional.
1180
1181`--certificate-type=TYPE'
1182 Specify the type of the client certificate. Legal values are
1183 `PEM' (assumed by default) and `DER', also known as `ASN1'.
1184
1185`--private-key=FILE'
1186 Read the private key from FILE. This allows you to provide the
1187 private key in a file separate from the certificate.
1188
1189`--private-key-type=TYPE'
1190 Specify the type of the private key. Accepted values are `PEM'
1191 (the default) and `DER'.
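   A minimal sketch of client-certificate authentication, assuming a
PEM certificate and key stored in the placeholder files below:

     # client.pem, client.key and secure.example.com are placeholders
     wget --certificate=client.pem --private-key=client.key \
          https://secure.example.com/protected/file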
1192
1193`--ca-certificate=FILE'
1194 Use FILE as the file with the bundle of certificate authorities
1195 ("CA") to verify the peers. The certificates must be in PEM
1196 format.
1197
1198 Without this option Wget looks for CA certificates at the
1199 system-specified locations, chosen at OpenSSL installation time.
1200
1201`--ca-directory=DIRECTORY'
1202 Specifies directory containing CA certificates in PEM format. Each
1203 file contains one CA certificate, and the file name is based on a
1204 hash value derived from the certificate. This is achieved by
1205 processing a certificate directory with the `c_rehash' utility
1206 supplied with OpenSSL. Using `--ca-directory' is more efficient
1207 than `--ca-certificate' when many certificates are installed
1208 because it allows Wget to fetch certificates on demand.
1209
1210 Without this option Wget looks for CA certificates at the
1211 system-specified locations, chosen at OpenSSL installation time.
1212
1213`--random-file=FILE'
1214 Use FILE as the source of random data for seeding the
1215 pseudo-random number generator on systems without `/dev/random'.
1216
1217 On such systems the SSL library needs an external source of
1218 randomness to initialize. Randomness may be provided by EGD (see
1219 `--egd-file' below) or read from an external source specified by
1220 the user. If this option is not specified, Wget looks for random
1221 data in `$RANDFILE' or, if that is unset, in `$HOME/.rnd'. If
1222 none of those are available, it is likely that SSL encryption will
1223 not be usable.
1224
1225 If you're getting the "Could not seed OpenSSL PRNG; disabling SSL."
1226 error, you should provide random data using some of the methods
1227 described above.
1228
1229`--egd-file=FILE'
1230 Use FILE as the EGD socket. EGD stands for "Entropy Gathering
1231 Daemon", a user-space program that collects data from various
1232 unpredictable system sources and makes it available to other
1233 programs that might need it. Encryption software, such as the SSL
1234 library, needs sources of non-repeating randomness to seed the
1235 random number generator used to produce cryptographically strong
1236 keys.
1237
1238 OpenSSL allows the user to specify his own source of entropy using
1239 the `RAND_FILE' environment variable. If this variable is unset,
1240 or if the specified file does not produce enough randomness,
1241 OpenSSL will read random data from EGD socket specified using this
1242 option.
1243
1244 If this option is not specified (and the equivalent startup
1245 command is not used), EGD is never contacted. EGD is not needed
1246 on modern Unix systems that support `/dev/random'.
1247
1248
1249File: wget.info, Node: FTP Options, Next: Recursive Retrieval Options, Prev: HTTPS (SSL/TLS) Options, Up: Invoking
1250
12512.9 FTP Options
1252===============
1253
1254`--ftp-user=USER'
1255`--ftp-password=PASSWORD'
1256 Specify the username USER and password PASSWORD on an FTP server.
1257 Without this, or the corresponding startup option, the password
1258 defaults to `-wget@', normally used for anonymous FTP.
1259
1260 Another way to specify username and password is in the URL itself
1261 (*note URL Format::). Either method reveals your password to
1262 anyone who bothers to run `ps'. To prevent the passwords from
1263 being seen, store them in `.wgetrc' or `.netrc', and make sure to
1264 protect those files from other users with `chmod'. If the
1265 passwords are really important, do not leave them lying in those
1266 files either--edit the files and delete them after Wget has
1267 started the download.
1268
1269`--no-remove-listing'
1270 Don't remove the temporary `.listing' files generated by FTP
1271 retrievals. Normally, these files contain the raw directory
1272 listings received from FTP servers. Not removing them can be
1273 useful for debugging purposes, or when you want to be able to
1274 easily check on the contents of remote server directories (e.g. to
1275 verify that a mirror you're running is complete).
1276
1277 Note that even though Wget writes to a known filename for this
1278 file, this is not a security hole in the scenario of a user making
1279 `.listing' a symbolic link to `/etc/passwd' or something and
1280 asking `root' to run Wget in his or her directory. Depending on
1281 the options used, either Wget will refuse to write to `.listing',
1282 making the globbing/recursion/time-stamping operation fail, or the
1283 symbolic link will be deleted and replaced with the actual
1284 `.listing' file, or the listing will be written to a
1285 `.listing.NUMBER' file.
1286
1287     Even though this situation isn't a problem, `root' should
1288 never run Wget in a non-trusted user's directory. A user could do
1289 something as simple as linking `index.html' to `/etc/passwd' and
1290 asking `root' to run Wget with `-N' or `-r' so the file will be
1291 overwritten.
1292
1293`--no-glob'
1294 Turn off FTP globbing. Globbing refers to the use of shell-like
1295 special characters ("wildcards"), like `*', `?', `[' and `]' to
1296 retrieve more than one file from the same directory at once, like:
1297
1298 wget ftp://gnjilux.srk.fer.hr/*.msg
1299
1300 By default, globbing will be turned on if the URL contains a
1301 globbing character. This option may be used to turn globbing on
1302 or off permanently.
1303
1304 You may have to quote the URL to protect it from being expanded by
1305 your shell. Globbing makes Wget look for a directory listing,
1306 which is system-specific. This is why it currently works only
1307 with Unix FTP servers (and the ones emulating Unix `ls' output).
1308
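     For example, if a remote file name itself contains wildcard
     characters (the host below is a placeholder), you can turn globbing
     off so that they are taken literally:

          wget --no-glob "ftp://ftp.example.com/reports/summary[1].txt"
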
1309`--no-passive-ftp'
1310 Disable the use of the "passive" FTP transfer mode. Passive FTP
1311 mandates that the client connect to the server to establish the
1312 data connection rather than the other way around.
1313
1314 If the machine is connected to the Internet directly, both passive
1315 and active FTP should work equally well. Behind most firewall and
1316 NAT configurations passive FTP has a better chance of working.
1317 However, in some rare firewall configurations, active FTP actually
1318 works when passive FTP doesn't. If you suspect this to be the
1319 case, use this option, or set `passive_ftp=off' in your init file.
1320
1321`--retr-symlinks'
1322 Usually, when retrieving FTP directories recursively and a symbolic
1323 link is encountered, the linked-to file is not downloaded.
1324 Instead, a matching symbolic link is created on the local
1325 filesystem. The pointed-to file will not be downloaded unless
1326 this recursive retrieval would have encountered it separately and
1327 downloaded it anyway.
1328
1329 When `--retr-symlinks' is specified, however, symbolic links are
1330 traversed and the pointed-to files are retrieved. At this time,
1331 this option does not cause Wget to traverse symlinks to
1332 directories and recurse through them, but in the future it should
1333 be enhanced to do this.
1334
1335 Note that when retrieving a file (not a directory) because it was
1336 specified on the command-line, rather than because it was recursed
1337 to, this option has no effect. Symbolic links are always
1338 traversed in this case.
1339
1340`--no-http-keep-alive'
1341 Turn off the "keep-alive" feature for HTTP downloads. Normally,
1342 Wget asks the server to keep the connection open so that, when you
1343 download more than one document from the same server, they get
1344     transferred over the same TCP connection.  This saves time and
1345     reduces the load on the server.
1346
1347 This option is useful when, for some reason, persistent
1348 (keep-alive) connections don't work for you, for example due to a
1349 server bug or due to the inability of server-side scripts to cope
1350 with the connections.
1351
1352
1353File: wget.info, Node: Recursive Retrieval Options, Next: Recursive Accept/Reject Options, Prev: FTP Options, Up: Invoking
1354
13552.10 Recursive Retrieval Options
1356================================
1357
1358`-r'
1359`--recursive'
1360 Turn on recursive retrieving. *Note Recursive Download::, for more
1361 details.
1362
1363`-l DEPTH'
1364`--level=DEPTH'
1365 Specify recursion maximum depth level DEPTH (*note Recursive
1366 Download::). The default maximum depth is 5.
1367
1368`--delete-after'
1369 This option tells Wget to delete every single file it downloads,
1370 _after_ having done so. It is useful for pre-fetching popular
1371 pages through a proxy, e.g.:
1372
1373 wget -r -nd --delete-after http://whatever.com/~popular/page/
1374
1375 The `-r' option is to retrieve recursively, and `-nd' to not
1376 create directories.
1377
1378 Note that `--delete-after' deletes files on the local machine. It
1379 does not issue the `DELE' command to remote FTP sites, for
1380 instance. Also note that when `--delete-after' is specified,
1381 `--convert-links' is ignored, so `.orig' files are simply not
1382 created in the first place.
1383
1384`-k'
1385`--convert-links'
1386 After the download is complete, convert the links in the document
1387 to make them suitable for local viewing. This affects not only
1388 the visible hyperlinks, but any part of the document that links to
1389 external content, such as embedded images, links to style sheets,
1390 hyperlinks to non-HTML content, etc.
1391
1392     Each link will be changed in one of two ways:
1393
1394 * The links to files that have been downloaded by Wget will be
1395 changed to refer to the file they point to as a relative link.
1396
1397 Example: if the downloaded file `/foo/doc.html' links to
1398 `/bar/img.gif', also downloaded, then the link in `doc.html'
1399 will be modified to point to `../bar/img.gif'. This kind of
1400 transformation works reliably for arbitrary combinations of
1401 directories.
1402
1403 * The links to files that have not been downloaded by Wget will
1404 be changed to include host name and absolute path of the
1405 location they point to.
1406
1407 Example: if the downloaded file `/foo/doc.html' links to
1408 `/bar/img.gif' (or to `../bar/img.gif'), then the link in
1409 `doc.html' will be modified to point to
1410 `http://HOSTNAME/bar/img.gif'.
1411
1412 Because of this, local browsing works reliably: if a linked file
1413 was downloaded, the link will refer to its local name; if it was
1414 not downloaded, the link will refer to its full Internet address
1415 rather than presenting a broken link. The fact that the former
1416 links are converted to relative links ensures that you can move
1417 the downloaded hierarchy to another directory.
1418
1419 Note that only at the end of the download can Wget know which
1420 links have been downloaded. Because of that, the work done by
1421 `-k' will be performed at the end of all the downloads.
1422
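     A typical invocation (the host is a placeholder) that retrieves a
     small hierarchy and then rewrites its links for local viewing might
     look like this:

          wget -r -l 2 -k http://www.example.com/docs/
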
1423`-K'
1424`--backup-converted'
1425 When converting a file, back up the original version with a `.orig'
1426 suffix. Affects the behavior of `-N' (*note HTTP Time-Stamping
1427 Internals::).
1428
1429`-m'
1430`--mirror'
1431 Turn on options suitable for mirroring. This option turns on
1432 recursion and time-stamping, sets infinite recursion depth and
1433 keeps FTP directory listings. It is currently equivalent to `-r
1434 -N -l inf --no-remove-listing'.
1435
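     In other words, the two commands below (the host is a placeholder)
     are currently interchangeable:

          wget -m http://www.example.com/
          wget -r -N -l inf --no-remove-listing http://www.example.com/
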
1436`-p'
1437`--page-requisites'
1438 This option causes Wget to download all the files that are
1439 necessary to properly display a given HTML page. This includes
1440 such things as inlined images, sounds, and referenced stylesheets.
1441
1442 Ordinarily, when downloading a single HTML page, any requisite
1443 documents that may be needed to display it properly are not
1444 downloaded. Using `-r' together with `-l' can help, but since
1445 Wget does not ordinarily distinguish between external and inlined
1446 documents, one is generally left with "leaf documents" that are
1447 missing their requisites.
1448
1449 For instance, say document `1.html' contains an `<IMG>' tag
1450 referencing `1.gif' and an `<A>' tag pointing to external document
1451 `2.html'. Say that `2.html' is similar but that its image is
1452 `2.gif' and it links to `3.html'. Say this continues up to some
1453 arbitrarily high number.
1454
1455 If one executes the command:
1456
1457 wget -r -l 2 http://SITE/1.html
1458
1459 then `1.html', `1.gif', `2.html', `2.gif', and `3.html' will be
1460 downloaded. As you can see, `3.html' is without its requisite
1461 `3.gif' because Wget is simply counting the number of hops (up to
1462 2) away from `1.html' in order to determine where to stop the
1463 recursion. However, with this command:
1464
1465 wget -r -l 2 -p http://SITE/1.html
1466
1467 all the above files _and_ `3.html''s requisite `3.gif' will be
1468 downloaded. Similarly,
1469
1470 wget -r -l 1 -p http://SITE/1.html
1471
1472 will cause `1.html', `1.gif', `2.html', and `2.gif' to be
1473 downloaded. One might think that:
1474
1475 wget -r -l 0 -p http://SITE/1.html
1476
1477 would download just `1.html' and `1.gif', but unfortunately this
1478 is not the case, because `-l 0' is equivalent to `-l inf'--that
1479 is, infinite recursion. To download a single HTML page (or a
1480 handful of them, all specified on the command-line or in a `-i'
1481 URL input file) and its (or their) requisites, simply leave off
1482 `-r' and `-l':
1483
1484 wget -p http://SITE/1.html
1485
1486 Note that Wget will behave as if `-r' had been specified, but only
1487 that single page and its requisites will be downloaded. Links
1488 from that page to external documents will not be followed.
1489 Actually, to download a single page and all its requisites (even
1490 if they exist on separate websites), and make sure the lot
1491 displays properly locally, this author likes to use a few options
1492 in addition to `-p':
1493
1494 wget -E -H -k -K -p http://SITE/DOCUMENT
1495
1496 To finish off this topic, it's worth knowing that Wget's idea of an
1497 external document link is any URL specified in an `<A>' tag, an
1498 `<AREA>' tag, or a `<LINK>' tag other than `<LINK
1499 REL="stylesheet">'.
1500
1501`--strict-comments'
1502 Turn on strict parsing of HTML comments. The default is to
1503 terminate comments at the first occurrence of `-->'.
1504
1505 According to specifications, HTML comments are expressed as SGML
1506     "declarations".  A declaration is special markup that begins with
1507 `<!' and ends with `>', such as `<!DOCTYPE ...>', that may contain
1508 comments between a pair of `--' delimiters. HTML comments are
1509 "empty declarations", SGML declarations without any non-comment
1510 text. Therefore, `<!--foo-->' is a valid comment, and so is
1511 `<!--one-- --two-->', but `<!--1--2-->' is not.
1512
1513 On the other hand, most HTML writers don't perceive comments as
1514 anything other than text delimited with `<!--' and `-->', which is
1515 not quite the same. For example, something like `<!------------>'
1516 works as a valid comment as long as the number of dashes is a
1517 multiple of four (!). If not, the comment technically lasts until
1518 the next `--', which may be at the other end of the document.
1519 Because of this, many popular browsers completely ignore the
1520 specification and implement what users have come to expect:
1521 comments delimited with `<!--' and `-->'.
1522
1523 Until version 1.9, Wget interpreted comments strictly, which
1524 resulted in missing links in many web pages that displayed fine in
1525 browsers, but had the misfortune of containing non-compliant
1526 comments. Beginning with version 1.9, Wget has joined the ranks
1527     of clients that implement "naive" comments, terminating each
1528 comment at the first occurrence of `-->'.
1529
1530 If, for whatever reason, you want strict comment parsing, use this
1531 option to turn it on.
1532
1533
1534File: wget.info, Node: Recursive Accept/Reject Options, Prev: Recursive Retrieval Options, Up: Invoking
1535
15362.11 Recursive Accept/Reject Options
1537====================================
1538
1539`-A ACCLIST --accept ACCLIST'
1540`-R REJLIST --reject REJLIST'
1541 Specify comma-separated lists of file name suffixes or patterns to
1542 accept or reject (*note Types of Files:: for more details).
1543
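     For example, to retrieve a directory tree but keep only images (the
     host is a placeholder), the lists are usually combined with `-r':

          wget -r -A jpg,jpeg,gif,png http://www.example.com/gallery/
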
1544`-D DOMAIN-LIST'
1545`--domains=DOMAIN-LIST'
1546 Set domains to be followed. DOMAIN-LIST is a comma-separated list
1547 of domains. Note that it does _not_ turn on `-H'.
1548
1549`--exclude-domains DOMAIN-LIST'
1550 Specify the domains that are _not_ to be followed. (*note
1551 Spanning Hosts::).
1552
1553`--follow-ftp'
1554 Follow FTP links from HTML documents. Without this option, Wget
1555 will ignore all the FTP links.
1556
1557`--follow-tags=LIST'
1558 Wget has an internal table of HTML tag / attribute pairs that it
1559 considers when looking for linked documents during a recursive
1560 retrieval. If a user wants only a subset of those tags to be
1561     considered, however, he or she should specify such tags in a
1562 comma-separated LIST with this option.
1563
1564`--ignore-tags=LIST'
1565 This is the opposite of the `--follow-tags' option. To skip
1566 certain HTML tags when recursively looking for documents to
1567 download, specify them in a comma-separated LIST.
1568
1569 In the past, this option was the best bet for downloading a single
1570 page and its requisites, using a command-line like:
1571
1572 wget --ignore-tags=a,area -H -k -K -r http://SITE/DOCUMENT
1573
1574 However, the author of this option came across a page with tags
1575 like `<LINK REL="home" HREF="/">' and came to the realization that
1576 specifying tags to ignore was not enough. One can't just tell
1577 Wget to ignore `<LINK>', because then stylesheets will not be
1578 downloaded. Now the best bet for downloading a single page and
1579 its requisites is the dedicated `--page-requisites' option.
1580
1581`-H'
1582`--span-hosts'
1583 Enable spanning across hosts when doing recursive retrieving
1584 (*note Spanning Hosts::).
1585
1586`-L'
1587`--relative'
1588 Follow relative links only. Useful for retrieving a specific home
1589 page without any distractions, not even those from the same hosts
1590 (*note Relative Links::).
1591
1592`-I LIST'
1593`--include-directories=LIST'
1594 Specify a comma-separated list of directories you wish to follow
1595 when downloading (*note Directory-Based Limits:: for more
1596 details.) Elements of LIST may contain wildcards.
1597
1598`-X LIST'
1599`--exclude-directories=LIST'
1600 Specify a comma-separated list of directories you wish to exclude
1601 from download (*note Directory-Based Limits:: for more details.)
1602 Elements of LIST may contain wildcards.
1603
1604`-np'
1606`--no-parent'
1607 Do not ever ascend to the parent directory when retrieving
1608 recursively. This is a useful option, since it guarantees that
1609 only the files _below_ a certain hierarchy will be downloaded.
1610 *Note Directory-Based Limits::, for more details.
1611
1612
1613File: wget.info, Node: Recursive Download, Next: Following Links, Prev: Invoking, Up: Top
1614
16153 Recursive Download
1616********************
1617
1618GNU Wget is capable of traversing parts of the Web (or a single HTTP or
1619FTP server), following links and directory structure. We refer to this
1620as to "recursive retrieval", or "recursion".
1621
1622   With HTTP URLs, Wget retrieves and parses the HTML document at the
1623given URL, retrieving the files that document refers to, through markup
1624like `href' or `src'.  If the freshly downloaded file is also of type
1625`text/html' or `application/xhtml+xml', it will be parsed and followed
1626further.
1627
1628 Recursive retrieval of HTTP and HTML content is "breadth-first".
1629This means that Wget first downloads the requested HTML document, then
1630the documents linked from that document, then the documents linked by
1631them, and so on. In other words, Wget first downloads the documents at
1632depth 1, then those at depth 2, and so on until the specified maximum
1633depth.
1634
1635 The maximum "depth" to which the retrieval may descend is specified
1636with the `-l' option. The default maximum depth is five layers.
1637
1638 When retrieving an FTP URL recursively, Wget will retrieve all the
1639data from the given directory tree (including the subdirectories up to
1640the specified depth) on the remote server, creating its mirror image
1641locally. FTP retrieval is also limited by the `depth' parameter.
1642Unlike HTTP recursion, FTP recursion is performed depth-first.
1643
1644 By default, Wget will create a local directory tree, corresponding to
1645the one found on the remote server.
1646
1647   Recursive retrieval has a number of applications, the most important
1648of which is mirroring.  It is also useful for WWW presentations, and
1649for any other situation where a slow network connection can be bypassed
1650by storing the files locally.
1651
1652 You should be warned that recursive downloads can overload the remote
1653servers. Because of that, many administrators frown upon them and may
1654ban access from your site if they detect very fast downloads of big
1655amounts of content. When downloading from Internet servers, consider
1656using the `-w' option to introduce a delay between accesses to the
1657server. The download will take a while longer, but the server
1658administrator will not be alarmed by your rudeness.
1659
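   For example, a polite recursive retrieval (the host is a placeholder)
that pauses two seconds between requests might look like this:

     wget -r -w 2 http://www.example.com/
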
1660 Of course, recursive download may cause problems on your machine. If
1661left to run unchecked, it can easily fill up the disk. If downloading
1662from a local network, it can also take up bandwidth on that network,
1663as well as consume memory and CPU.
1664
1665 Try to specify the criteria that match the kind of download you are
1666trying to achieve. If you want to download only one page, use
1667`--page-requisites' without any additional recursion. If you want to
1668download things under one directory, use `-np' to avoid downloading
1669things from other directories. If you want to download all the files
1670from one directory, use `-l 1' to make sure the recursion depth never
1671exceeds one. *Note Following Links::, for more information about this.
1672
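   As rough sketches (hosts and paths are placeholders), those three
cases might look like this:

     wget -p http://www.example.com/page.html
     wget -r -np http://www.example.com/docs/manual/
     wget -r -l 1 http://www.example.com/files/
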
1673 Recursive retrieval should be used with care. Don't say you were not
1674warned.
1675
1676
1677File: wget.info, Node: Following Links, Next: Time-Stamping, Prev: Recursive Download, Up: Top
1678
16794 Following Links
1680*****************
1681
1682When retrieving recursively, one does not wish to retrieve loads of
1683unnecessary data.  Most of the time users know exactly what they want
1684to download, and want Wget to follow only specific links.
1685
1686 For example, if you wish to download the music archive from
1687`fly.srk.fer.hr', you will not want to download all the home pages that
1688happen to be referenced by an obscure part of the archive.
1689
1690   Wget possesses several mechanisms that allow you to fine-tune which
1691links it will follow.
1692
1693* Menu:
1694
1695* Spanning Hosts:: (Un)limiting retrieval based on host name.
1696* Types of Files:: Getting only certain files.
1697* Directory-Based Limits:: Getting only certain directories.
1698* Relative Links:: Follow relative links only.
1699* FTP Links:: Following FTP links.
1700
1701
1702File: wget.info, Node: Spanning Hosts, Next: Types of Files, Up: Following Links
1703
17044.1 Spanning Hosts
1705==================
1706
1707Wget's recursive retrieval normally refuses to visit hosts different
1708from the one you specified on the command line.  This is a reasonable
1709default; without it, every retrieval would have the potential to turn
1710your Wget into a small version of Google.
1711
1712 However, visiting different hosts, or "host spanning," is sometimes
1713a useful option. Maybe the images are served from a different server.
1714Maybe you're mirroring a site that consists of pages interlinked between
1715three servers. Maybe the server has two equivalent names, and the HTML
1716pages refer to both interchangeably.
1717
1718Span to any host--`-H'
1719 The `-H' option turns on host spanning, thus allowing Wget's
1720 recursive run to visit any host referenced by a link. Unless
1721     sufficient recursion-limiting criteria are applied, these
1722 foreign hosts will typically link to yet more hosts, and so on
1723 until Wget ends up sucking up much more data than you have
1724 intended.
1725
1726Limit spanning to certain domains--`-D'
1727 The `-D' option allows you to specify the domains that will be
1728 followed, thus limiting the recursion only to the hosts that
1729 belong to these domains. Obviously, this makes sense only in
1730 conjunction with `-H'. A typical example would be downloading the
1731 contents of `www.server.com', but allowing downloads from
1732 `images.server.com', etc.:
1733
1734 wget -rH -Dserver.com http://www.server.com/
1735
1736     You can specify more than one domain by separating them with a
1737 comma, e.g. `-Ddomain1.com,domain2.com'.
1738
1739Keep download off certain domains--`--exclude-domains'
1740 If there are domains you want to exclude specifically, you can do
1741 it with `--exclude-domains', which accepts the same type of
1742     arguments as `-D', but will _exclude_ all the listed domains.  For
1743     example, if you want to download all the hosts from the `foo.edu'
1744     domain, with the exception of `sunsite.foo.edu', you can do it like
1745 this:
1746
1747 wget -rH -Dfoo.edu --exclude-domains sunsite.foo.edu \
1748 http://www.foo.edu/
1749
1750
1751
1752File: wget.info, Node: Types of Files, Next: Directory-Based Limits, Prev: Spanning Hosts, Up: Following Links
1753
17544.2 Types of Files
1755==================
1756
1757When downloading material from the web, you will often want to restrict
1758the retrieval to only certain file types. For example, if you are
1759interested in downloading GIFs, you will not be overjoyed to get loads
1760of PostScript documents, and vice versa.
1761
1762 Wget offers two options to deal with this problem. Each option
1763description lists a short name, a long name, and the equivalent command
1764in `.wgetrc'.
1765
1766`-A ACCLIST'
1767`--accept ACCLIST'
1768`accept = ACCLIST'
1769 The argument to `--accept' option is a list of file suffixes or
1770 patterns that Wget will download during recursive retrieval. A
1771 suffix is the ending part of a file, and consists of "normal"
1772 letters, e.g. `gif' or `.jpg'. A matching pattern contains
1773 shell-like wildcards, e.g. `books*' or `zelazny*196[0-9]*'.
1774
1775 So, specifying `wget -A gif,jpg' will make Wget download only the
1776 files ending with `gif' or `jpg', i.e. GIFs and JPEGs. On the
1777 other hand, `wget -A "zelazny*196[0-9]*"' will download only files
1778 beginning with `zelazny' and containing numbers from 1960 to 1969
1779 anywhere within. Look up the manual of your shell for a
1780 description of how pattern matching works.
1781
1782 Of course, any number of suffixes and patterns can be combined
1783 into a comma-separated list, and given as an argument to `-A'.
1784
1785`-R REJLIST'
1786`--reject REJLIST'
1787`reject = REJLIST'
1788 The `--reject' option works the same way as `--accept', only its
1789 logic is the reverse; Wget will download all files _except_ the
1790 ones matching the suffixes (or patterns) in the list.
1791
1792 So, if you want to download a whole page except for the cumbersome
1793 MPEGs and .AU files, you can use `wget -R mpg,mpeg,au'.
1794 Analogously, to download all files except the ones beginning with
1795 `bjork', use `wget -R "bjork*"'. The quotes are to prevent
1796 expansion by the shell.
1797
1798 The `-A' and `-R' options may be combined to achieve even better
1799fine-tuning of which files to retrieve. E.g. `wget -A "*zelazny*" -R
1800.ps' will download all the files having `zelazny' as a part of their
1801name, but _not_ the PostScript files.
1802
1803 Note that these two options do not affect the downloading of HTML
1804files; Wget must load all the HTMLs to know where to go at
1805all--recursive retrieval would make no sense otherwise.
1806
1807
1808File: wget.info, Node: Directory-Based Limits, Next: Relative Links, Prev: Types of Files, Up: Following Links
1809
18104.3 Directory-Based Limits
1811==========================
1812
1813Regardless of other link-following facilities, it is often useful to
1814restrict which files are retrieved based on the directories
1815those files are placed in. There can be many reasons for this--the
1816home pages may be organized in a reasonable directory structure; or some
1817directories may contain useless information, e.g. `/cgi-bin' or `/dev'
1818directories.
1819
1820 Wget offers three different options to deal with this requirement.
1821Each option description lists a short name, a long name, and the
1822equivalent command in `.wgetrc'.
1823
1824`-I LIST'
1825`--include LIST'
1826`include_directories = LIST'
1827     The `-I' option accepts a comma-separated list of directories
1828     included in the retrieval.  Any other directories will simply be
1829     ignored.  The directories are absolute paths.
1830
1831 So, if you wish to download from `http://host/people/bozo/'
1832 following only links to bozo's colleagues in the `/people'
1833 directory and the bogus scripts in `/cgi-bin', you can specify:
1834
1835 wget -I /people,/cgi-bin http://host/people/bozo/
1836
1837`-X LIST'
1838`--exclude LIST'
1839`exclude_directories = LIST'
1840     The `-X' option is exactly the reverse of `-I'--this is a list of
1841 directories _excluded_ from the download. E.g. if you do not want
1842 Wget to download things from `/cgi-bin' directory, specify `-X
1843 /cgi-bin' on the command line.
1844
1845 The same as with `-A'/`-R', these two options can be combined to
1846 get a better fine-tuning of downloading subdirectories. E.g. if
1847 you want to load all the files from `/pub' hierarchy except for
1848 `/pub/worthless', specify `-I/pub -X/pub/worthless'.
1849
1850`-np'
1851`--no-parent'
1852`no_parent = on'
1853     The simplest, and often very useful, way of limiting directories is
1854     disallowing retrieval of the links that refer to the hierarchy
1855     "above" the beginning directory, i.e. disallowing ascent to
1856 the parent directory/directories.
1857
1858 The `--no-parent' option (short `-np') is useful in this case.
1859 Using it guarantees that you will never leave the existing
1860 hierarchy. Supposing you issue Wget with:
1861
1862 wget -r --no-parent http://somehost/~luzer/my-archive/
1863
1864 You may rest assured that none of the references to
1865 `/~his-girls-homepage/' or `/~luzer/all-my-mpegs/' will be
1866 followed. Only the archive you are interested in will be
1867 downloaded. Essentially, `--no-parent' is similar to
1868 `-I/~luzer/my-archive', only it handles redirections in a more
1869 intelligent fashion.
1870
1871
1872File: wget.info, Node: Relative Links, Next: FTP Links, Prev: Directory-Based Limits, Up: Following Links
1873
18744.4 Relative Links
1875==================
1876
1877When `-L' is turned on, only the relative links are ever followed.
1878Relative links are here defined as those that do not refer to the web
1879server root. For example, these links are relative:
1880
1881 <a href="foo.gif">
1882 <a href="foo/bar.gif">
1883 <a href="../foo/bar.gif">
1884
1885 These links are not relative:
1886
1887 <a href="/foo.gif">
1888 <a href="/foo/bar.gif">
1889 <a href="http://www.server.com/foo/bar.gif">
1890
1891 Using this option guarantees that recursive retrieval will not span
1892hosts, even without `-H'. In simple cases it also allows downloads to
1893"just work" without having to convert links.
1894
1895 This option is probably not very useful and might be removed in a
1896future release.
1897
1898
1899File: wget.info, Node: FTP Links, Prev: Relative Links, Up: Following Links
1900
19014.5 Following FTP Links
1902=======================
1903
1904The rules for FTP are somewhat specific, as it is necessary for them to
1905be. FTP links in HTML documents are often included for purposes of
1906reference, and it is often inconvenient to download them by default.
1907
1908 To have FTP links followed from HTML documents, you need to specify
1909the `--follow-ftp' option. Having done that, FTP links will span hosts
1910regardless of `-H' setting. This is logical, as FTP links rarely point
1911to the same host where the HTTP server resides. For similar reasons,
1912the `-L' option has no effect on such downloads.  On the other hand,
1913domain acceptance (`-D') and suffix rules (`-A' and `-R') apply
1914normally.
1915
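   As a minimal sketch (the host is a placeholder), the following lets a
recursive HTTP retrieval also fetch the FTP files it links to:

     wget -r --follow-ftp http://www.example.com/downloads.html
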
1916 Also note that followed links to FTP directories will not be
1917retrieved recursively further.
1918
1919
1920File: wget.info, Node: Time-Stamping, Next: Startup File, Prev: Following Links, Up: Top
1921
19225 Time-Stamping
1923***************
1924
1925One of the most important aspects of mirroring information from the
1926Internet is updating your archives.
1927
1928 Downloading the whole archive again and again, just to replace a few
1929changed files is expensive, both in terms of wasted bandwidth and money,
1930and the time to do the update. This is why all the mirroring tools
1931offer the option of incremental updating.
1932
1933 Such an updating mechanism means that the remote server is scanned in
1934search of "new" files. Only those new files will be downloaded in the
1935place of the old ones.
1936
1937   A file is considered new if one of these two conditions is met:
1938
1939 1. A file of that name does not already exist locally.
1940
1941 2. A file of that name does exist, but the remote file was modified
1942 more recently than the local file.
1943
1944 To implement this, the program needs to be aware of the time of last
1945modification of both local and remote files. We call this information
1946the "time-stamp" of a file.
1947
1948 The time-stamping in GNU Wget is turned on using `--timestamping'
1949(`-N') option, or through `timestamping = on' directive in `.wgetrc'.
1950With this option, for each file it intends to download, Wget will check
1951whether a local file of the same name exists. If it does, and the
1952remote file is older, Wget will not download it.
1953
1954 If the local file does not exist, or the sizes of the files do not
1955match, Wget will download the remote file no matter what the time-stamps
1956say.
1957
1958* Menu:
1959
1960* Time-Stamping Usage::
1961* HTTP Time-Stamping Internals::
1962* FTP Time-Stamping Internals::
1963
1964
1965File: wget.info, Node: Time-Stamping Usage, Next: HTTP Time-Stamping Internals, Up: Time-Stamping
1966
19675.1 Time-Stamping Usage
1968=======================
1969
1970The usage of time-stamping is simple. Say you would like to download a
1971file so that it keeps its date of modification.
1972
1973 wget -S http://www.gnu.ai.mit.edu/
1974
1975   A simple `ls -l' shows that the time stamp on the local file matches
1976the `Last-Modified' header, as returned by the server.  As
1977you can see, the time-stamping info is preserved locally, even without
1978`-N' (at least for HTTP).
1979
1980 Several days later, you would like Wget to check if the remote file
1981has changed, and download it if it has.
1982
1983 wget -N http://www.gnu.ai.mit.edu/
1984
1985 Wget will ask the server for the last-modified date. If the local
1986file has the same timestamp as the server, or a newer one, the remote
1987file will not be re-fetched. However, if the remote file is more
1988recent, Wget will proceed to fetch it.
1989
1990 The same goes for FTP. For example:
1991
1992 wget "ftp://ftp.ifi.uio.no/pub/emacs/gnus/*"
1993
1994 (The quotes around that URL are to prevent the shell from trying to
1995interpret the `*'.)
1996
1997 After download, a local directory listing will show that the
1998timestamps match those on the remote server. Reissuing the command
1999with `-N' will make Wget re-fetch _only_ the files that have been
2000modified since the last download.
2001
2002 If you wished to mirror the GNU archive every week, you would use a
2003command like the following, weekly:
2004
2005 wget --timestamping -r ftp://ftp.gnu.org/pub/gnu/
2006
2007 Note that time-stamping will only work for files for which the server
2008gives a timestamp. For HTTP, this depends on getting a `Last-Modified'
2009header. For FTP, this depends on getting a directory listing with
2010dates in a format that Wget can parse (*note FTP Time-Stamping
2011Internals::).
2012
2013
2014File: wget.info, Node: HTTP Time-Stamping Internals, Next: FTP Time-Stamping Internals, Prev: Time-Stamping Usage, Up: Time-Stamping
2015
20165.2 HTTP Time-Stamping Internals
2017================================
2018
2019Time-stamping in HTTP is implemented by checking the `Last-Modified'
2020header. If you wish to retrieve the file `foo.html' through HTTP, Wget
2021will check whether `foo.html' exists locally. If it doesn't,
2022`foo.html' will be retrieved unconditionally.
2023
2024 If the file does exist locally, Wget will first check its local
2025time-stamp (similar to the way `ls -l' checks it), and then send a
2026`HEAD' request to the remote server, demanding the information on the
2027remote file.
2028
2029 The `Last-Modified' header is examined to find which file was
2030modified more recently (which makes it "newer"). If the remote file is
2031newer, it will be downloaded; if it is older, Wget will give up.(1)
2032
2033 When `--backup-converted' (`-K') is specified in conjunction with
2034`-N', server file `X' is compared to local file `X.orig', if extant,
2035rather than being compared to local file `X', which will always differ
2036if it's been converted by `--convert-links' (`-k').
2037
2038 Arguably, HTTP time-stamping should be implemented using the
2039`If-Modified-Since' request.
2040
2041 ---------- Footnotes ----------
2042
2043 (1) As an additional check, Wget will look at the `Content-Length'
2044header, and compare the sizes; if they are not the same, the remote
2045file will be downloaded no matter what the time-stamp says.
2046
2047
2048File: wget.info, Node: FTP Time-Stamping Internals, Prev: HTTP Time-Stamping Internals, Up: Time-Stamping
2049
20505.3 FTP Time-Stamping Internals
2051===============================
2052
2053In theory, FTP time-stamping works much the same as HTTP, only FTP has
2054no headers--time-stamps must be ferreted out of directory listings.
2055
2056 If an FTP download is recursive or uses globbing, Wget will use the
2057FTP `LIST' command to get a file listing for the directory containing
2058the desired file(s). It will try to analyze the listing, treating it
2059like Unix `ls -l' output, extracting the time-stamps. The rest is
2060exactly the same as for HTTP. Note that when retrieving individual
2061files from an FTP server without using globbing or recursion, listing
2062files will not be downloaded (and thus files will not be time-stamped)
2063unless `-N' is specified.
2064
2065   The assumption that every directory listing is a Unix-style listing may
2066sound extremely constraining, but in practice it is not, as many
2067non-Unix FTP servers use the Unixoid listing format because most (all?)
2068of the clients understand it. Bear in mind that RFC959 defines no
2069standard way to get a file list, let alone the time-stamps. We can
2070only hope that a future standard will define this.
2071
2072   Another non-standard solution is the use of the `MDTM' command,
2073supported by some FTP servers (including the popular
2074`wu-ftpd'), which returns the exact time of the specified file. Wget
2075may support this command in the future.
2076
2077
2078File: wget.info, Node: Startup File, Next: Examples, Prev: Time-Stamping, Up: Top
2079
20806 Startup File
2081**************
2082
2083Once you know how to change default settings of Wget through command
2084line arguments, you may wish to make some of those settings permanent.
2085You can do that in a convenient way by creating the Wget startup
2086file--`.wgetrc'.
2087
2088   While `.wgetrc' is the "main" initialization file, it is also
2089convenient to have a special facility for storing passwords.  Thus Wget
2090reads and interprets the contents of `$HOME/.netrc', if it finds it.
2091You can find the `.netrc' format described in your system manuals.
2092
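   As a sketch, a `.netrc' entry usually looks like the following (the
host and credentials are placeholders; keep the file readable only by
its owner):

     machine ftp.example.com
     login bozo
     password secret
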
2093 Wget reads `.wgetrc' upon startup, recognizing a limited set of
2094commands.
2095
2096* Menu:
2097
2098* Wgetrc Location:: Location of various wgetrc files.
2099* Wgetrc Syntax:: Syntax of wgetrc.
2100* Wgetrc Commands:: List of available commands.
2101* Sample Wgetrc:: A wgetrc example.
2102
2103
2104File: wget.info, Node: Wgetrc Location, Next: Wgetrc Syntax, Up: Startup File
2105
21066.1 Wgetrc Location
2107===================
2108
2109When initializing, Wget will look for a "global" startup file,
2110`/usr/local/etc/wgetrc' by default (or some prefix other than
2111`/usr/local', if Wget was not installed there) and read commands from
2112there, if it exists.
2113
2114   Then it will look for the user's file.  If the environment variable
2115`WGETRC' is set, Wget will try to load that file. Failing that, no
2116further attempts will be made.
2117
2118 If `WGETRC' is not set, Wget will try to load `$HOME/.wgetrc'.
2119
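   For example, to use an alternate startup file for a single run (the
path and host are placeholders), you could say:

     WGETRC=/home/bozo/wgetrc-mirror wget -m http://www.example.com/
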
2120   The fact that the user's settings are loaded after the system-wide
2121ones means that in case of collision the user's wgetrc _overrides_ the
2122system-wide wgetrc (in `/usr/local/etc/wgetrc' by default). Fascist
2123admins, away!
2124
2125
2126File: wget.info, Node: Wgetrc Syntax, Next: Wgetrc Commands, Prev: Wgetrc Location, Up: Startup File
2127
21286.2 Wgetrc Syntax
2129=================
2130
2131The syntax of a wgetrc command is simple:
2132
2133 variable = value
2134
2135 The "variable" will also be called "command". Valid "values" are
2136different for different commands.
2137
2138 The commands are case-insensitive and underscore-insensitive. Thus
2139`DIr__PrefiX' is the same as `dirprefix'. Empty lines, lines beginning
2140with `#' and lines containing white-space only are discarded.
2141
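   For instance, assuming a hypothetical download directory, all of the
following lines set the same command:

     DIr__PrefiX = /tmp/downloads
     dir_prefix = /tmp/downloads
     dirprefix = /tmp/downloads
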
2142 Commands that expect a comma-separated list will clear the list on an
2143empty command. So, if you wish to reset the rejection list specified in
2144global `wgetrc', you can do it with:
2145
2146 reject =
2147
2148
2149File: wget.info, Node: Wgetrc Commands, Next: Sample Wgetrc, Prev: Wgetrc Syntax, Up: Startup File
2150
21516.3 Wgetrc Commands
2152===================
2153
2154The complete set of commands is listed below. Legal values are listed
2155after the `='. Simple Boolean values can be set or unset using `on'
2156and `off' or `1' and `0'. A fancier kind of Boolean allowed in some
2157cases is the "lockable Boolean", which may be set to `on', `off',
2158`always', or `never'. If an option is set to `always' or `never', that
2159value will be locked in for the duration of the Wget
2160invocation--command-line options will not override.
2161
2162 Some commands take pseudo-arbitrary values. ADDRESS values can be
2163hostnames or dotted-quad IP addresses. N can be any positive integer,
2164or `inf' for infinity, where appropriate. STRING values can be any
2165non-empty string.
2166
2167 Most of these commands have direct command-line equivalents. Also,
2168any wgetrc command can be specified on the command line using the
2169`--execute' switch (*note Basic Startup Options::.)
2170
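   For example (the proxy host is a placeholder), a wgetrc command can
be supplied for a single run with the `-e' (`--execute') switch:

     wget -e http_proxy=http://proxy.example.com:3128/ \
          http://www.example.com/
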
2171accept/reject = STRING
2172 Same as `-A'/`-R' (*note Types of Files::).
2173
2174add_hostdir = on/off
2175 Enable/disable host-prefixed file names. `-nH' disables it.
2176
2177continue = on/off
2178 If set to on, force continuation of preexistent partially retrieved
2179 files. See `-c' before setting it.
2180
2181background = on/off
2182 Enable/disable going to background--the same as `-b' (which
2183 enables it).
2184
2185backup_converted = on/off
2186 Enable/disable saving pre-converted files with the suffix
2187 `.orig'--the same as `-K' (which enables it).
2188
2189base = STRING
2190     Consider relative URLs in input files interpreted as HTML (see
2191     `-F') as being relative to STRING--the same as `--base=STRING'.
2192
2193bind_address = ADDRESS
2194     Bind to ADDRESS, the same as `--bind-address=ADDRESS'.
2195
2196ca_certificate = FILE
2197 Set the certificate authority bundle file to FILE. The same as
2198 `--ca-certificate=FILE'.
2199
2200ca_directory = DIRECTORY
2201 Set the directory used for certificate authorities. The same as
2202 `--ca-directory=DIRECTORY'.
2203
2204cache = on/off
2205 When set to off, disallow server-caching. See the `--no-cache'
2206 option.
2207
2208certificate = FILE
2209 Set the client certificate file name to FILE. The same as
2210 `--certificate=FILE'.
2211
2212certificate_type = STRING
2213 Specify the type of the client certificate, legal values being
2214 `PEM' (the default) and `DER' (aka ASN1). The same as
2215 `--certificate-type=STRING'.
2216
2217check_certificate = on/off
2218 If this is set to off, the server certificate is not checked
2219 against the specified client authorities. The default is "on".
2220 The same as `--check-certificate'.
2221
2222convert_links = on/off
2223 Convert non-relative links locally. The same as `-k'.
2224
2225cookies = on/off
2226     When set to off, disallow cookies.  See the `--no-cookies' option.
2227
2228connect_timeout = N
2229 Set the connect timeout--the same as `--connect-timeout'.
2230
2231cut_dirs = N
2232 Ignore N remote directory components. Equivalent to
2233 `--cut-dirs=N'.
2234
2235debug = on/off
2236 Debug mode, same as `-d'.
2237
2238delete_after = on/off
2239 Delete after download--the same as `--delete-after'.
2240
2241dir_prefix = STRING
2242 Top of directory tree--the same as `-P STRING'.
2243
2244dirstruct = on/off
2245 Turning dirstruct on or off--the same as `-x' or `-nd',
2246 respectively.
2247
2248dns_cache = on/off
2249 Turn DNS caching on/off. Since DNS caching is on by default, this
2250 option is normally used to turn it off and is equivalent to
2251 `--no-dns-cache'.
2252
2253dns_timeout = N
2254 Set the DNS timeout--the same as `--dns-timeout'.
2255
2256domains = STRING
2257 Same as `-D' (*note Spanning Hosts::).
2258
2259dot_bytes = N
2260 Specify the number of bytes "contained" in a dot, as seen
2261 throughout the retrieval (1024 by default). You can postfix the
2262 value with `k' or `m', representing kilobytes and megabytes,
2263 respectively. With dot settings you can tailor the dot retrieval
2264 to suit your needs, or you can use the predefined "styles" (*note
2265 Download Options::).
2266
2267dots_in_line = N
2268 Specify the number of dots that will be printed in each line
2269 throughout the retrieval (50 by default).
2270
2271dot_spacing = N
2272 Specify the number of dots in a single cluster (10 by default).
2273
2274egd_file = FILE
2275     Use FILE as the EGD socket file name.  The same as
2276 `--egd-file=FILE'.
2277
2278exclude_directories = STRING
2279 Specify a comma-separated list of directories you wish to exclude
2280 from download--the same as `-X STRING' (*note Directory-Based
2281 Limits::).
2282
2283exclude_domains = STRING
2284 Same as `--exclude-domains=STRING' (*note Spanning Hosts::).
2285
2286follow_ftp = on/off
2287 Follow FTP links from HTML documents--the same as `--follow-ftp'.
2288
2289follow_tags = STRING
2290 Only follow certain HTML tags when doing a recursive retrieval,
2291 just like `--follow-tags=STRING'.
2292
2293force_html = on/off
2294 If set to on, force the input filename to be regarded as an HTML
2295 document--the same as `-F'.
2296
2297ftp_password = STRING
2298 Set your FTP password to STRING. Without this setting, the
2299 password defaults to `-wget@', which is a useful default for
2300 anonymous FTP access.
2301
2302 This command used to be named `passwd' prior to Wget 1.10.
2303
2304ftp_proxy = STRING
2305 Use STRING as FTP proxy, instead of the one specified in
2306 environment.
2307
2308ftp_user = STRING
2309 Set FTP user to STRING.
2310
2311 This command used to be named `login' prior to Wget 1.10.
2312
2313glob = on/off
2314 Turn globbing on/off--the same as `--glob' and `--no-glob'.
2315
2316header = STRING
2317     Define a header for HTTP downloads, like using `--header=STRING'.
2318
2319html_extension = on/off
2320 Add a `.html' extension to `text/html' or `application/xhtml+xml'
2321 files without it, like `-E'.
2322
2323http_keep_alive = on/off
2324 Turn the keep-alive feature on or off (defaults to on). Turning it
2325 off is equivalent to `--no-http-keep-alive'.
2326
2327http_password = STRING
2328 Set HTTP password, equivalent to `--http-password=STRING'.
2329
2330http_proxy = STRING
2331 Use STRING as HTTP proxy, instead of the one specified in
2332 environment.
2333
2334http_user = STRING
2335 Set HTTP user to STRING, equivalent to `--http-user=STRING'.
2336
2337ignore_length = on/off
2338 When set to on, ignore `Content-Length' header; the same as
2339 `--ignore-length'.
2340
2341ignore_tags = STRING
2342 Ignore certain HTML tags when doing a recursive retrieval, like
2343 `--ignore-tags=STRING'.
2344
2345include_directories = STRING
2346 Specify a comma-separated list of directories you wish to follow
2347 when downloading--the same as `-I STRING'.
2348
2349inet4_only = on/off
2350 Force connecting to IPv4 addresses, off by default. You can put
2351 this in the global init file to disable Wget's attempts to resolve
2352 and connect to IPv6 hosts. Available only if Wget was compiled
2353 with IPv6 support. The same as `--inet4-only' or `-4'.
2354
2355inet6_only = on/off
2356 Force connecting to IPv6 addresses, off by default. Available
2357 only if Wget was compiled with IPv6 support. The same as
2358 `--inet6-only' or `-6'.
2359
2360input = FILE
2361     Read the URLs from FILE, like `-i FILE'.
2362
2363limit_rate = RATE
2364 Limit the download speed to no more than RATE bytes per second.
2365 The same as `--limit-rate=RATE'.
2366
2367load_cookies = FILE
2368 Load cookies from FILE. See `--load-cookies FILE'.
2369
2370logfile = FILE
2371 Set logfile to FILE, the same as `-o FILE'.
2372
2373mirror = on/off
2374 Turn mirroring on/off. The same as `-m'.
2375
2376netrc = on/off
2377 Turn reading netrc on or off.
2378
2379noclobber = on/off
2380 Same as `-nc'.
2381
2382no_parent = on/off
2383 Disallow retrieving outside the directory hierarchy, like
2384 `--no-parent' (*note Directory-Based Limits::).
2385
2386no_proxy = STRING
2387 Use STRING as the comma-separated list of domains to avoid in
2388 proxy loading, instead of the one specified in environment.
2389
2390output_document = FILE
2391 Set the output filename--the same as `-O FILE'.
2392
2393page_requisites = on/off
2394 Download all ancillary documents necessary for a single HTML page
2395 to display properly--the same as `-p'.
2396
2397passive_ftp = on/off/always/never
2398 Change setting of passive FTP, equivalent to the `--passive-ftp'
2399 option. Some scripts and `.pm' (Perl module) files download files
2400 using `wget --passive-ftp'. If your firewall does not allow this,
2401 you can set `passive_ftp = never' to override the command-line.
2402
2403password = STRING
2404 Specify password STRING for both FTP and HTTP file retrieval.
2405 This command can be overridden using the `ftp_password' and
2406 `http_password' command for FTP and HTTP respectively.
2407
2408post_data = STRING
2409 Use POST as the method for all HTTP requests and send STRING in
2410 the request body. The same as `--post-data=STRING'.
2411
2412post_file = FILE
2413 Use POST as the method for all HTTP requests and send the contents
2414 of FILE in the request body. The same as `--post-file=FILE'.
2415
2416prefer_family = IPv4/IPv6/none
2417 When given a choice of several addresses, connect to the addresses
2418 with specified address family first. IPv4 addresses are preferred
2419 by default. The same as `--prefer-family', which see for a
2420 detailed discussion of why this is useful.
2421
2422private_key = FILE
2423 Set the private key file to FILE. The same as
2424 `--private-key=FILE'.
2425
2426private_key_type = STRING
2427 Specify the type of the private key, legal values being `PEM' (the
2428 default) and `DER' (aka ASN1). The same as
2429     `--private-key-type=STRING'.
2430
2431progress = STRING
2432 Set the type of the progress indicator. Legal types are `dot' and
2433 `bar'. Equivalent to `--progress=STRING'.
2434
2435protocol_directories = on/off
2436 When set, use the protocol name as a directory component of local
2437 file names. The same as `--protocol-directories'.
2438
2439proxy_user = STRING
2440 Set proxy authentication user name to STRING, like
2441 `--proxy-user=STRING'.
2442
2443proxy_password = STRING
2444 Set proxy authentication password to STRING, like
2445 `--proxy-password=STRING'.
2446
2447quiet = on/off
2448 Quiet mode--the same as `-q'.
2449
2450quota = QUOTA
2451 Specify the download quota, which is useful to put in the global
2452 `wgetrc'. When download quota is specified, Wget will stop
2453 retrieving after the download sum has become greater than quota.
2454     The quota can be specified in bytes (default), kbytes (`k'
2455 appended) or mbytes (`m' appended). Thus `quota = 5m' will set
2456 the quota to 5 megabytes. Note that the user's startup file
2457 overrides system settings.
2458
2459random_file = FILE
2460 Use FILE as a source of randomness on systems lacking
2461 `/dev/random'.
2462
2463read_timeout = N
2464 Set the read (and write) timeout--the same as `--read-timeout=N'.
2465
2466reclevel = N
2467 Recursion level (depth)--the same as `-l N'.
2468
2469recursive = on/off
2470 Recursive on/off--the same as `-r'.
2471
2472referer = STRING
2473 Set HTTP `Referer:' header just like `--referer=STRING'. (Note it
2474 was the folks who wrote the HTTP spec who got the spelling of
2475 "referrer" wrong.)
2476
2477relative_only = on/off
2478 Follow only relative links--the same as `-L' (*note Relative
2479 Links::).
2480
2481remove_listing = on/off
2482 If set to on, remove FTP listings downloaded by Wget. Setting it
2483 to off is the same as `--no-remove-listing'.
2484
2485restrict_file_names = unix/windows
2486 Restrict the file names generated by Wget from URLs. See
2487 `--restrict-file-names' for a more detailed description.
2488
2489retr_symlinks = on/off
2490 When set to on, retrieve symbolic links as if they were plain
2491 files; the same as `--retr-symlinks'.
2492
2493retry_connrefused = on/off
2494 When set to on, consider "connection refused" a transient
2495 error--the same as `--retry-connrefused'.
2496
2497robots = on/off
2498 Specify whether the norobots convention is respected by Wget, "on"
2499 by default. This switch controls both the `/robots.txt' and the
2500 `nofollow' aspect of the spec. *Note Robot Exclusion::, for more
2501 details about this. Be sure you know what you are doing before
2502 turning this off.
2503
2504save_cookies = FILE
2505 Save cookies to FILE. The same as `--save-cookies FILE'.
2506
2507secure_protocol = STRING
2508 Choose the secure protocol to be used. Legal values are `auto'
2509 (the default), `SSLv2', `SSLv3', and `TLSv1'. The same as
2510 `--secure-protocol=STRING'.
2511
2512server_response = on/off
2513 Choose whether or not to print the HTTP and FTP server
2514 responses--the same as `-S'.
2515
2516span_hosts = on/off
2517 Same as `-H'.
2518
2519strict_comments = on/off
2520 Same as `--strict-comments'.
2521
2522timeout = N
2523 Set all applicable timeout values to N, the same as `-T N'.
2524
2525timestamping = on/off
2526 Turn timestamping on/off. The same as `-N' (*note
2527 Time-Stamping::).
2528
2529tries = N
2530 Set number of retries per URL--the same as `-t N'.
2531
2532use_proxy = on/off
2533 When set to off, don't use proxy even when proxy-related
2534 environment variables are set. In that case it is the same as
2535 using `--no-proxy'.
2536
2537user = STRING
2538 Specify username STRING for both FTP and HTTP file retrieval.
2539 This command can be overridden using the `ftp_user' and
2540 `http_user' command for FTP and HTTP respectively.
2541
2542verbose = on/off
2543 Turn verbose on/off--the same as `-v'/`-nv'.
2544
2545wait = N
2546 Wait N seconds between retrievals--the same as `-w N'.
2547
2548waitretry = N
2549 Wait up to N seconds between retries of failed retrievals
2550 only--the same as `--waitretry=N'. Note that this is turned on by
2551 default in the global `wgetrc'.
2552
2553randomwait = on/off
2554 Turn random between-request wait times on or off. The same as
2555 `--random-wait'.
2556
2557
2558File: wget.info, Node: Sample Wgetrc, Prev: Wgetrc Commands, Up: Startup File
2559
25606.4 Sample Wgetrc
2561=================
2562
2563This is the sample initialization file, as given in the distribution.
2564It is divided into two sections--one for global usage (suitable for a global
2565startup file), and one for local usage (suitable for `$HOME/.wgetrc').
2566Be careful about the things you change.
2567
2568 Note that almost all the lines are commented out. For a command to
2569have any effect, you must remove the `#' character at the beginning of
2570its line.
2571
2572 ###
2573 ### Sample Wget initialization file .wgetrc
2574 ###
2575
2576 ## You can use this file to change the default behaviour of wget or to
2577 ## avoid having to type many many command-line options. This file does
2578 ## not contain a comprehensive list of commands -- look at the manual
2579 ## to find out what you can put into this file.
2580 ##
2581 ## Wget initialization file can reside in /usr/local/etc/wgetrc
2582 ## (global, for all users) or $HOME/.wgetrc (for a single user).
2583 ##
2584 ## To use the settings in this file, you will have to uncomment them,
2585 ## as well as change them, in most cases, as the values on the
2586 ## commented-out lines are the default values (e.g. "off").
2587
2588
2589 ##
2590 ## Global settings (useful for setting up in /usr/local/etc/wgetrc).
2591 ## Think well before you change them, since they may reduce wget's
2592 ## functionality, and make it behave contrary to the documentation:
2593 ##
2594
2595 # You can set retrieve quota for beginners by specifying a value
2596 # optionally followed by 'K' (kilobytes) or 'M' (megabytes). The
2597 # default quota is unlimited.
2598 #quota = inf
2599
2600 # You can lower (or raise) the default number of retries when
2601 # downloading a file (default is 20).
2602 #tries = 20
2603
2604 # Lowering the maximum depth of the recursive retrieval is handy to
2605 # prevent newbies from going too "deep" when they unwittingly start
2606 # the recursive retrieval. The default is 5.
2607 #reclevel = 5
2608
2609 # By default Wget uses "passive FTP" transfer where the client
2610 # initiates the data connection to the server rather than the other
2611 # way around. That is required on systems behind NAT where the client
2612 # computer cannot be easily reached from the Internet. However, some
2613     # firewall software explicitly supports active FTP and in fact has
2614 # problems supporting passive transfer. If you are in such
2615 # environment, use "passive_ftp = off" to revert to active FTP.
2616 #passive_ftp = off
2617
2618 # The "wait" command below makes Wget wait between every connection.
2619 # If, instead, you want Wget to wait only between retries of failed
2620 # downloads, set waitretry to maximum number of seconds to wait (Wget
2621 # will use "linear backoff", waiting 1 second after the first failure
2622 # on a file, 2 seconds after the second failure, etc. up to this max).
2623 waitretry = 10
2624
2625
2626 ##
2627 ## Local settings (for a user to set in his $HOME/.wgetrc). It is
2628 ## *highly* undesirable to put these settings in the global file, since
2629 ## they are potentially dangerous to "normal" users.
2630 ##
2631 ## Even when setting up your own ~/.wgetrc, you should know what you
2632 ## are doing before doing so.
2633 ##
2634
2635 # Set this to on to use timestamping by default:
2636 #timestamping = off
2637
2638 # It is a good idea to make Wget send your email address in a `From:'
2639 # header with your request (so that server administrators can contact
2640 # you in case of errors). Wget does *not* send `From:' by default.
2641 #header = From: Your Name <username@site.domain>
2642
2643 # You can set up other headers, like Accept-Language. Accept-Language
2644 # is *not* sent by default.
2645 #header = Accept-Language: en
2646
2647 # You can set the default proxies for Wget to use for http and ftp.
2648 # They will override the value in the environment.
2649 #http_proxy = http://proxy.yoyodyne.com:18023/
2650 #ftp_proxy = http://proxy.yoyodyne.com:18023/
2651
2652 # If you do not want to use proxy at all, set this to off.
2653 #use_proxy = on
2654
2655 # You can customize the retrieval outlook. Valid options are default,
2656 # binary, mega and micro.
2657 #dot_style = default
2658
2659 # Setting this to off makes Wget not download /robots.txt. Be sure to
2660 # know *exactly* what /robots.txt is and how it is used before changing
2661 # the default!
2662 #robots = on
2663
2664 # It can be useful to make Wget wait between connections. Set this to
2665 # the number of seconds you want Wget to wait.
2666 #wait = 0
2667
2668 # You can force creating the directory structure, even if a single
2669 # file is being retrieved, by setting this to on.
2670 #dirstruct = off
2671
2672 # You can turn on recursive retrieving by default (don't do this if
2673 # you are not sure you know what it means) by setting this to on.
2674 #recursive = off
2675
2676 # To always back up file X as X.orig before converting its links (due
2677 # to -k / --convert-links / convert_links = on having been specified),
2678 # set this variable to on:
2679 #backup_converted = off
2680
2681 # To have Wget follow FTP links from HTML files by default, set this
2682 # to on:
2683 #follow_ftp = off
2684
2685
2686File: wget.info, Node: Examples, Next: Various, Prev: Startup File, Up: Top
2687
26887 Examples
2689**********
2690
2691The examples are divided into three sections loosely based on their
2692complexity.
2693
2694* Menu:
2695
2696* Simple Usage:: Simple, basic usage of the program.
2697* Advanced Usage:: Advanced tips.
2698* Very Advanced Usage:: The hairy stuff.
2699
2700
2701File: wget.info, Node: Simple Usage, Next: Advanced Usage, Up: Examples
2702
27037.1 Simple Usage
2704================
2705
2706 * Say you want to download a URL. Just type:
2707
2708 wget http://fly.srk.fer.hr/
2709
2710 * But what will happen if the connection is slow, and the file is
2711 lengthy?  The connection will probably fail, perhaps more than
2712 once, before the whole file is retrieved.  In this case, Wget will
2713 try getting the file until it either gets the whole of it, or
2714 exceeds the default number of retries (this being 20).  It is easy
2715 to change the number of tries to 45, to ensure that the whole file
2716 will arrive safely:
2717
2718 wget --tries=45 http://fly.srk.fer.hr/jpg/flyweb.jpg
2719
2720 * Now let's leave Wget to work in the background, and write its
2721 progress to log file `log'. It is tiring to type `--tries', so we
2722 shall use `-t'.
2723
2724 wget -t 45 -o log http://fly.srk.fer.hr/jpg/flyweb.jpg &
2725
2726 The ampersand at the end of the line makes sure that Wget works in
2727 the background. To unlimit the number of retries, use `-t inf'.
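
     For instance, the same background download with an unlimited
     number of retries might look like this:

          wget -t inf -o log http://fly.srk.fer.hr/jpg/flyweb.jpg &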
2728
2729 * Using FTP is just as simple.  Wget will take care of the login
2730 and password.
2731
2732 wget ftp://gnjilux.srk.fer.hr/welcome.msg
2733
2734 * If you specify a directory, Wget will retrieve the directory
2735 listing, parse it and convert it to HTML. Try:
2736
2737 wget ftp://ftp.gnu.org/pub/gnu/
2738 links index.html
2739
2740
2741File: wget.info, Node: Advanced Usage, Next: Very Advanced Usage, Prev: Simple Usage, Up: Examples
2742
27437.2 Advanced Usage
2744==================
2745
2746 * You have a file that contains the URLs you want to download? Use
2747 the `-i' switch:
2748
2749 wget -i FILE
2750
2751 If you specify `-' as file name, the URLs will be read from
2752 standard input.
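
     For instance, assuming some other program (the hypothetical
     `generate-urls' below) writes one URL per line to its standard
     output, the two can be combined like this:

          generate-urls | wget -i -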
2753
2754 * Create a five levels deep mirror image of the GNU web site, with
2755 the same directory structure the original has, with only one try
2756 per document, saving the log of the activities to `gnulog':
2757
2758 wget -r -t 1 http://www.gnu.org/ -o gnulog
2759
2760 * The same as the above, but convert the links in the HTML files to
2761 point to local files, so you can view the documents off-line:
2762
2763 wget --convert-links -r -t 1 http://www.gnu.org/ -o gnulog
2764
2765 * Retrieve only one HTML page, but make sure that all the elements
2766 needed for the page to be displayed, such as inline images and
2767 external style sheets, are also downloaded. Also make sure the
2768 downloaded page references the downloaded links.
2769
2770 wget -p --convert-links http://www.server.com/dir/page.html
2771
2772 The HTML page will be saved to `www.server.com/dir/page.html', and
2773 the images, stylesheets, etc., somewhere under `www.server.com/',
2774 depending on where they were on the remote server.
2775
2776 * The same as the above, but without the `www.server.com/' directory.
2777 In fact, I don't want to have all those random server directories
2778 anyway--just save _all_ those files under a `download/'
2779 subdirectory of the current directory.
2780
2781 wget -p --convert-links -nH -nd -Pdownload \
2782 http://www.server.com/dir/page.html
2783
2784 * Retrieve the index.html of `www.lycos.com', showing the original
2785 server headers:
2786
2787 wget -S http://www.lycos.com/
2788
2789 * Save the server headers with the file, perhaps for post-processing.
2790
2791 wget --save-headers http://www.lycos.com/
2792 more index.html
2793
2794 * Retrieve the first two levels of `wuarchive.wustl.edu', saving them
2795 to `/tmp'.
2796
2797 wget -r -l2 -P/tmp ftp://wuarchive.wustl.edu/
2798
2799 * You want to download all the GIFs from a directory on an HTTP
2800 server. You tried `wget http://www.server.com/dir/*.gif', but that
2801 didn't work because HTTP retrieval does not support globbing. In
2802 that case, use:
2803
2804 wget -r -l1 --no-parent -A.gif http://www.server.com/dir/
2805
2806 More verbose, but the effect is the same. `-r -l1' means to
2807 retrieve recursively (*note Recursive Download::), with maximum
2808 depth of 1. `--no-parent' means that references to the parent
2809 directory are ignored (*note Directory-Based Limits::), and
2810 `-A.gif' means to download only the GIF files. `-A "*.gif"' would
2811 have worked too.
2812
2813 * Suppose you were in the middle of downloading when Wget was
2814 interrupted.  Now you do not want to clobber the files already
2815 present.  In that case, use:
2816
2817 wget -nc -r http://www.gnu.org/
2818
2819 * If you want to encode your own username and password to HTTP or
2820 FTP, use the appropriate URL syntax (*note URL Format::).
2821
2822 wget ftp://hniksic:mypassword@unix.server.com/.emacs
2823
2824 Note, however, that this usage is not advisable on multi-user
2825 systems because it reveals your password to anyone who looks at
2826 the output of `ps'.
2827
2828 * You would like the output documents to go to standard output
2829 instead of to files?
2830
2831 wget -O - http://jagor.srce.hr/ http://www.srce.hr/
2832
2833 You can also combine the two options and make pipelines to
2834 retrieve the documents from remote hotlists:
2835
2836 wget -O - http://cool.list.com/ | wget --force-html -i -
2837
2838
2839File: wget.info, Node: Very Advanced Usage, Prev: Advanced Usage, Up: Examples
2840
28417.3 Very Advanced Usage
2842=======================
2843
2844 * If you wish Wget to keep a mirror of a page (or FTP
2845 subdirectories), use `--mirror' (`-m'), which is the shorthand for
2846 `-r -l inf -N'. You can put Wget in the crontab file asking it to
2847 recheck a site each Sunday:
2848
2849 crontab
2850 0 0 * * 0 wget --mirror http://www.gnu.org/ -o /home/me/weeklog
2851
2852 * In addition to the above, you want the links to be converted for
2853 local viewing. But, after having read this manual, you know that
2854 link conversion doesn't play well with timestamping, so you also
2855 want Wget to back up the original HTML files before the
2856 conversion. Wget invocation would look like this:
2857
2858 wget --mirror --convert-links --backup-converted \
2859 http://www.gnu.org/ -o /home/me/weeklog
2860
2861 * But you've also noticed that local viewing doesn't work all that
2862 well when HTML files are saved under extensions other than `.html',
2863 perhaps because they were served as `index.cgi'. So you'd like
2864 Wget to rename all the files served with content-type `text/html'
2865 or `application/xhtml+xml' to `NAME.html'.
2866
2867 wget --mirror --convert-links --backup-converted \
2868 --html-extension -o /home/me/weeklog \
2869 http://www.gnu.org/
2870
2871 Or, with less typing:
2872
2873 wget -m -k -K -E http://www.gnu.org/ -o /home/me/weeklog
2874
2875
2876File: wget.info, Node: Various, Next: Appendices, Prev: Examples, Up: Top
2877
28788 Various
2879*********
2880
2881This chapter contains all the stuff that could not fit anywhere else.
2882
2883* Menu:
2884
2885* Proxies:: Support for proxy servers
2886* Distribution:: Getting the latest version.
2887* Mailing List:: Wget mailing list for announcements and discussion.
2888* Reporting Bugs:: How and where to report bugs.
2889* Portability:: The systems Wget works on.
2890* Signals:: Signal-handling performed by Wget.
2891
2892
2893File: wget.info, Node: Proxies, Next: Distribution, Up: Various
2894
28958.1 Proxies
2896===========
2897
2898"Proxies" are special-purpose HTTP servers designed to transfer data
2899from remote servers to local clients. One typical use of proxies is
2900lightening network load for users behind a slow connection. This is
2901achieved by channeling all HTTP and FTP requests through the proxy
2902 which caches the transferred data.  When a cached resource is requested
2903 again, the proxy will return the data from its cache.  Another use for
2904 proxies is for companies that separate (for security reasons) their
2905 internal networks from the rest of the Internet.  In order to obtain information
2906from the Web, their users connect and retrieve remote data using an
2907authorized proxy.
2908
2909 Wget supports proxies for both HTTP and FTP retrievals. The
2910 standard way to specify the proxy location, which Wget recognizes, is using
2911the following environment variables:
2912
2913`http_proxy'
2914 This variable should contain the URL of the proxy for HTTP
2915 connections.
2916
2917`ftp_proxy'
2918 This variable should contain the URL of the proxy for FTP
2919 connections. It is quite common that HTTP_PROXY and FTP_PROXY are
2920 set to the same URL.
2921
2922`no_proxy'
2923 This variable should contain a comma-separated list of domain
2924 extensions the proxy should _not_ be used for.  For instance, if the
2925 value of `no_proxy' is `.mit.edu', the proxy will not be used to
2926 retrieve documents from MIT.
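
   For example, in a Bourne-compatible shell the variables might be set
like this before running Wget (the proxy host, port, and domain below
are only placeholders, reusing values shown elsewhere in this manual):

     http_proxy=http://proxy.yoyodyne.com:18023/
     ftp_proxy=$http_proxy
     no_proxy=.mit.edu
     export http_proxy ftp_proxy no_proxy
     wget http://www.gnu.org/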
2927
2928 In addition to the environment variables, proxy location and settings
2929may be specified from within Wget itself.
2930
2931`--no-proxy'
2932`proxy = on/off'
2933 This option and the corresponding command may be used to suppress
2934 the use of a proxy, even if the appropriate environment variables
2935 are set.
2936
2937`http_proxy = URL'
2938`ftp_proxy = URL'
2939`no_proxy = STRING'
2940 These startup file variables allow you to override the proxy
2941 settings specified by the environment.
2942
2943 Some proxy servers require authorization to enable you to use them.
2944The authorization consists of "username" and "password", which must be
2945sent by Wget. As with HTTP authorization, several authentication
2946schemes exist. For proxy authorization only the `Basic' authentication
2947scheme is currently implemented.
2948
2949 You may specify your username and password either through the proxy
2950URL or through the command-line options. Assuming that the company's
2951proxy is located at `proxy.company.com' at port 8001, a proxy URL
2952location containing authorization data might look like this:
2953
2954 http://hniksic:mypassword@proxy.company.com:8001/
2955
2956 Alternatively, you may use the `proxy-user' and `proxy-password'
2957options, and the equivalent `.wgetrc' settings `proxy_user' and
2958`proxy_password' to set the proxy username and password.
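
   For instance, a command line using those options might look like
this (the username and password are the same placeholders as above):

     wget --proxy-user=hniksic --proxy-password=mypassword \
          http://www.gnu.org/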
2959
2960
2961File: wget.info, Node: Distribution, Next: Mailing List, Prev: Proxies, Up: Various
2962
29638.2 Distribution
2964================
2965
2966Like all GNU utilities, the latest version of Wget can be found at the
2967master GNU archive site ftp.gnu.org, and its mirrors. For example,
2968Wget 1.10.2 can be found at
2969`ftp://ftp.gnu.org/pub/gnu/wget/wget-1.10.2.tar.gz'
2970
2971
2972File: wget.info, Node: Mailing List, Next: Reporting Bugs, Prev: Distribution, Up: Various
2973
29748.3 Mailing List
2975================
2976
2977There are several Wget-related mailing lists, all hosted by SunSITE.dk.
2978The general discussion list is at <wget@sunsite.dk>. It is the
2979preferred place for bug reports and suggestions, as well as for
2980discussion of development. You are invited to subscribe.
2981
2982 To subscribe, simply send mail to <wget-subscribe@sunsite.dk> and
2983follow the instructions. Unsubscribe by mailing to
2984<wget-unsubscribe@sunsite.dk>. The mailing list is archived at
2985`http://www.mail-archive.com/wget%40sunsite.dk/' and at
2986`http://news.gmane.org/gmane.comp.web.wget.general'.
2987
2988 The second mailing list is at <wget-patches@sunsite.dk>, and is used
2989to submit patches for review by Wget developers. A "patch" is a
2990 textual representation of a change to source code, readable by both
2991 humans and programs.  The file `PATCHES' that comes with Wget covers
2992 creating and submitting patches in detail.  Please don't send
2993general suggestions or bug reports to `wget-patches'; use it only for
2994patch submissions.
2995
2996 To subscribe, simply send mail to <wget-subscribe@sunsite.dk> and
2997follow the instructions. Unsubscribe by mailing to
2998<wget-unsubscribe@sunsite.dk>. The mailing list is archived at
2999`http://news.gmane.org/gmane.comp.web.wget.patches'.
3000
3001
3002File: wget.info, Node: Reporting Bugs, Next: Portability, Prev: Mailing List, Up: Various
3003
30048.4 Reporting Bugs
3005==================
3006
3007You are welcome to send bug reports about GNU Wget to
3008<bug-wget@gnu.org>.
3009
3010 Before actually submitting a bug report, please try to follow a few
3011simple guidelines.
3012
3013 1. Please try to ascertain that the behavior you see really is a bug.
3014 If Wget crashes, it's a bug. If Wget does not behave as
3015 documented, it's a bug.  If things work strangely, but you are not
3016 sure about the way they are supposed to work, it might well be a
3017 bug.
3018
3019 2. Try to repeat the bug in as simple circumstances as possible.
3020 E.g. if Wget crashes when run as `wget -rl0 -kKE -t5 -Y0
3021 http://yoyodyne.com -o /tmp/log', you should try to see if the
3022 crash is repeatable, and if it will occur with a simpler set of
3023 options. You might even try to start the download at the page
3024 where the crash occurred to see if that page somehow triggered the
3025 crash.
3026
3027 Also, while I will probably be interested to know the contents of
3028 your `.wgetrc' file, just dumping it into the debug message is
3029 probably a bad idea. Instead, you should first try to see if the
3030 bug repeats with `.wgetrc' moved out of the way.  Only if it turns
3031 out that the `.wgetrc' settings affect the bug should you mail me
3032 the relevant parts of the file.
3033
3034 3. Please start Wget with `-d' option and send us the resulting
3035 output (or relevant parts thereof). If Wget was compiled without
3036 debug support, recompile it--it is _much_ easier to trace bugs
3037 with debug support on.
3038
3039 Note: please make sure to remove any potentially sensitive
3040 information from the debug log before sending it to the bug
3041 address. The `-d' won't go out of its way to collect sensitive
3042 information, but the log _will_ contain a fairly complete
3043 transcript of Wget's communication with the server, which may
3044 include passwords and pieces of downloaded data. Since the bug
3045 address is publicly archived, you may assume that all bug
3046 reports are visible to the public.
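
     For example, a minimal invocation that captures the debug output
     to a file for later review might look like this (the URL and log
     file name are only placeholders):

          wget -d -o /tmp/wget-debug.log http://www.example.com/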
3047
3048 4. If Wget has crashed, try to run it in a debugger, e.g. `gdb `which
3049 wget` core' and type `where' to get the backtrace. This may not
3050 work if the system administrator has disabled core files, but it is
3051 safe to try.
3052
3053
3054File: wget.info, Node: Portability, Next: Signals, Prev: Reporting Bugs, Up: Various
3055
30568.5 Portability
3057===============
3058
3059Like all GNU software, Wget works on the GNU system. However, since it
3060uses GNU Autoconf for building and configuring, and mostly avoids using
3061"special" features of any particular Unix, it should compile (and work)
3062on all common Unix flavors.
3063
3064 Various Wget versions have been compiled and tested under many kinds
3065of Unix systems, including GNU/Linux, Solaris, SunOS 4.x, OSF (aka
3066Digital Unix or Tru64), Ultrix, *BSD, IRIX, AIX, and others. Some of
3067those systems are no longer in widespread use and may not be able to
3068support recent versions of Wget. If Wget fails to compile on your
3069system, we would like to know about it.
3070
3071 Thanks to kind contributors, this version of Wget compiles and works
3072on 32-bit Microsoft Windows platforms. It has been compiled
3073successfully using MS Visual C++ 6.0, Watcom, Borland C, and GCC
3074 compilers.  Naturally, it lacks some of the features available on
3075Unix, but it should work as a substitute for people stuck with Windows.
3076Note that Windows-specific portions of Wget are not guaranteed to be
3077supported in the future, although this has been the case in practice
3078for many years now. All questions and problems in Windows usage should
3079 be reported to the Wget mailing list at <wget@sunsite.dk> where the
3080volunteers who maintain the Windows-related features might look at them.
3081
3082
3083File: wget.info, Node: Signals, Prev: Portability, Up: Various
3084
30858.6 Signals
3086===========
3087
3088 Since the purpose of Wget is background work, it catches the hangup
3089 signal (`SIGHUP') and handles it specially.  If the output was going to
3090 standard output, it will be redirected to a file named `wget-log'.
3091 Otherwise, `SIGHUP' is ignored.  This is convenient when you wish to redirect the
3092output of Wget after having started it.
3093
3094 $ wget http://www.gnus.org/dist/gnus.tar.gz &
3095 ...
3096 $ kill -HUP %%
3097 SIGHUP received, redirecting output to `wget-log'.
3098
3099 Other than that, Wget will not try to interfere with signals in any
3100way. `C-c', `kill -TERM' and `kill -KILL' should kill it alike.
3101
3102
3103File: wget.info, Node: Appendices, Next: Copying, Prev: Various, Up: Top
3104
31059 Appendices
3106************
3107
3108This chapter contains some references I consider useful.
3109
3110* Menu:
3111
3112* Robot Exclusion:: Wget's support for RES.
3113* Security Considerations:: Security with Wget.
3114* Contributors:: People who helped.
3115
3116
3117File: wget.info, Node: Robot Exclusion, Next: Security Considerations, Up: Appendices
3118
31199.1 Robot Exclusion
3120===================
3121
3122It is extremely easy to make Wget wander aimlessly around a web site,
3123 sucking all the available data in the process.  `wget -r SITE', and you're
3124set. Great? Not for the server admin.
3125
3126 As long as Wget is only retrieving static pages, and doing it at a
3127reasonable rate (see the `--wait' option), there's not much of a
3128problem. The trouble is that Wget can't tell the difference between the
3129smallest static page and the most demanding CGI. A site I know has a
3130section handled by a CGI Perl script that converts Info files to HTML on
3131the fly. The script is slow, but works well enough for human users
3132viewing an occasional Info file. However, when someone's recursive Wget
3133download stumbles upon the index page that links to all the Info files
3134through the script, the system is brought to its knees without providing
3135 anything useful to the user.  (This task of converting Info files could
3136 be done locally; access to Info documentation for all installed GNU
3137 software on a system is available from the `info' command.)
3138
3139 To avoid this kind of accident, as well as to preserve privacy for
3140documents that need to be protected from well-behaved robots, the
3141concept of "robot exclusion" was invented. The idea is that the server
3142administrators and document authors can specify which portions of the
3143 site they wish to protect from robots and those to which they will permit access.
3144
3145 The most popular mechanism, and the de facto standard supported by
3146all the major robots, is the "Robots Exclusion Standard" (RES) written
3147by Martijn Koster et al. in 1994. It specifies the format of a text
3148file containing directives that instruct the robots which URL paths to
3149avoid. To be found by the robots, the specifications must be placed in
3150`/robots.txt' in the server root, which the robots are expected to
3151download and parse.
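
   For illustration, a minimal `/robots.txt' might look something like
this (the directory below is only an example):

     User-agent: *
     Disallow: /cgi-bin/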
3152
3153 Although Wget is not a web robot in the strictest sense of the word,
3154 it can download large parts of a site without the user's having to
3155 intervene to download each individual page.  Because of that, Wget
3156honors RES when downloading recursively. For instance, when you issue:
3157
3158 wget -r http://www.server.com/
3159
3160 First, the index of `www.server.com' will be downloaded.  If Wget
3161 finds that it wants to download more documents from that server, it
3162 will request `http://www.server.com/robots.txt' and, if found, use it
3163 for further downloads.  `robots.txt' is loaded only once per server.
3164
3165 Until version 1.8, Wget supported the first version of the standard,
3166written by Martijn Koster in 1994 and available at
3167`http://www.robotstxt.org/wc/norobots.html'. As of version 1.8, Wget
3168has supported the additional directives specified in the internet draft
3169`<draft-koster-robots-00.txt>' titled "A Method for Web Robots
3170Control". The draft, which has as far as I know never made to an RFC,
3171is available at `http://www.robotstxt.org/wc/norobots-rfc.txt'.
3172
3173 This manual no longer includes the text of the Robot Exclusion
3174Standard.
3175
3176 The second, less-known mechanism enables the author of an individual
3177document to specify whether they want the links from the file to be
3178followed by a robot. This is achieved using the `META' tag, like this:
3179
3180 <meta name="robots" content="nofollow">
3181
3182 This is explained in some detail at
3183`http://www.robotstxt.org/wc/meta-user.html'. Wget supports this
3184method of robot exclusion in addition to the usual `/robots.txt'
3185exclusion.
3186
3187 If you know what you are doing and really really wish to turn off the
3188robot exclusion, set the `robots' variable to `off' in your `.wgetrc'.
3189You can achieve the same effect from the command line using the `-e'
3190switch, e.g. `wget -e robots=off URL...'.
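
   In `.wgetrc' the corresponding line would simply be:

     robots = off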
3191
3192
3193File: wget.info, Node: Security Considerations, Next: Contributors, Prev: Robot Exclusion, Up: Appendices
3194
31959.2 Security Considerations
3196===========================
3197
3198When using Wget, you must be aware that it sends unencrypted passwords
3199through the network, which may present a security problem. Here are the
3200main issues, and some solutions.
3201
3202 1. The passwords on the command line are visible using `ps'. The best
3203 way around it is to use `wget -i -' and feed the URLs to Wget's
3204 standard input, each on a separate line, terminated by `C-d'.
3205 Another workaround is to use `.netrc' to store passwords; however,
3206 storing unencrypted passwords is also considered a security risk.
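
     For example, a session that keeps the password off the command
     line (and thus out of `ps' output) might look like this, reusing
     the placeholder URL from the examples chapter:

          $ wget -i -
          ftp://hniksic:mypassword@unix.server.com/.emacs
          C-d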
3207
3208 2. With the insecure "basic" authentication scheme, unencrypted
3209 passwords are transmitted through network routers and gateways.
3210
3211 3. The FTP passwords are also in no way encrypted. There is no good
3212 solution for this at the moment.
3213
3214 4. Although the "normal" output of Wget tries to hide the passwords,
3215 debugging logs show them, in all forms. This problem is avoided by
3216 being careful when you send debug logs (yes, even when you send
3217 them to me).
3218
3219
3220File: wget.info, Node: Contributors, Prev: Security Considerations, Up: Appendices
3221
32229.3 Contributors
3223================
3224
3225GNU Wget was written by Hrvoje Niksic <hniksic@xemacs.org>. However,
3226its development could never have gone as far as it has, were it not for
3227the help of many people, either with bug reports, feature proposals,
3228patches, or letters saying "Thanks!".
3229
3230 Special thanks goes to the following people (in no particular order):
3231
3232 * Karsten Thygesen--donated system resources such as the mailing
3233 list, web space, and FTP space, along with a lot of time to make
3234 these actually work.
3235
3236 * Shawn McHorse--bug reports and patches.
3237
3238 * Kaveh R. Ghazi--on-the-fly `ansi2knr'-ization. Lots of
3239 portability fixes.
3240
3241 * Gordon Matzigkeit--`.netrc' support.
3242
3243 * Zlatko Calusic, Tomislav Vujec and Drazen Kacar--feature
3244 suggestions and "philosophical" discussions.
3245
3246 * Darko Budor--initial port to Windows.
3247
3248 * Antonio Rosella--help and suggestions, plus the Italian
3249 translation.
3250
3251 * Tomislav Petrovic, Mario Mikocevic--many bug reports and
3252 suggestions.
3253
3254 * Francois Pinard--many thorough bug reports and discussions.
3255
3256 * Karl Eichwalder--lots of help with internationalization and other
3257 things.
3258
3259 * Junio Hamano--donated support for Opie and HTTP `Digest'
3260 authentication.
3261
3262 * The people who provided donations for development, including Brian
3263 Gough.
3264
3265 The following people have provided patches, bug/build reports, useful
3266suggestions, beta testing services, fan mail and all the other things
3267that make maintenance so much fun:
3268
3269 Ian Abbott, Tim Adam, Adrian Aichner, Martin Baehr, Dieter Baron,
3270Roger Beeman, Dan Berger, T. Bharath, Christian Biere, Paul Bludov,
3271Daniel Bodea, Mark Boyns, John Burden, Wanderlei Cavassin, Gilles Cedoc,
3272Tim Charron, Noel Cragg, Kristijan Conkas, John Daily, Andreas Damm,
3273Ahmon Dancy, Andrew Davison, Bertrand Demiddelaer, Andrew Deryabin,
3274Ulrich Drepper, Marc Duponcheel, Damir Dzeko, Alan Eldridge,
3275Hans-Andreas Engel, Aleksandar Erkalovic, Andy Eskilsson, Christian
3276Fraenkel, David Fritz, Charles C. Fu, FUJISHIMA Satsuki, Masashi Fujita,
3277Howard Gayle, Marcel Gerrits, Lemble Gregory, Hans Grobler, Mathieu
3278Guillaume, Dan Harkless, Aaron Hawley, Herold Heiko, Jochen Hein, Karl
3279Heuer, HIROSE Masaaki, Ulf Harnhammar, Gregor Hoffleit, Erik Magnus
3280Hulthen, Richard Huveneers, Jonas Jensen, Larry Jones, Simon Josefsson,
3281Mario Juric, Hack Kampbjorn, Const Kaplinsky, Goran Kezunovic, Igor
3282Khristophorov, Robert Kleine, KOJIMA Haime, Fila Kolodny, Alexander
3283Kourakos, Martin Kraemer, Sami Krank, Simos KSenitellis, Christian
3284Lackas, Hrvoje Lacko, Daniel S. Lewart, Nicolas Lichtmeier, Dave Love,
3285Alexander V. Lukyanov, Thomas Lussnig, Andre Majorel, Aurelien Marchand,
3286Matthew J. Mellon, Jordan Mendelson, Lin Zhe Min, Jan Minar, Tim Mooney,
3287Keith Moore, Adam D. Moss, Simon Munton, Charlie Negyesi, R. K. Owen,
3288Leonid Petrov, Simone Piunno, Andrew Pollock, Steve Pothier, Jan
3289Prikryl, Marin Purgar, Csaba Raduly, Keith Refson, Bill Richardson,
3290Tyler Riddle, Tobias Ringstrom, Juan Jose Rodriguez, Maciej W. Rozycki,
3291Edward J. Sabol, Heinz Salzmann, Robert Schmidt, Nicolas Schodet,
3292Andreas Schwab, Chris Seawood, Dennis Smit, Toomas Soome, Tage
3293Stabell-Kulo, Philip Stadermann, Daniel Stenberg, Sven Sternberger,
3294Markus Strasser, John Summerfield, Szakacsits Szabolcs, Mike Thomas,
3295Philipp Thomas, Mauro Tortonesi, Dave Turner, Gisle Vanem, Russell
3296Vincent, Zeljko Vrba, Charles G Waldman, Douglas E. Wegscheid, YAMAZAKI
3297Makoto, Jasmin Zainul, Bojan Zdrnja, Kristijan Zimmer.
3298
3299 Apologies to all who I accidentally left out, and many thanks to all
3300the subscribers of the Wget mailing list.
3301
3302
3303File: wget.info, Node: Copying, Next: Concept Index, Prev: Appendices, Up: Top
3304
330510 Copying
3306**********
3307
3308GNU Wget is licensed under the GNU General Public License (GNU GPL),
3309which makes it "free software". Please note that "free" in "free
3310software" refers to liberty, not price. As some people like to point
3311out, it's the "free" of "free speech", not the "free" of "free beer".
3312
3313 The exact and legally binding distribution terms are spelled out
3314below. The GPL guarantees that you have the right (freedom) to run and
3315change GNU Wget and distribute it to others, and even--if you
3316want--charge money for doing any of those things. With these rights
3317comes the obligation to distribute the source code along with the
3318software and to grant your recipients the same rights and impose the
3319same restrictions.
3320
3321 This licensing model is also known as "open source" because it,
3322among other things, makes sure that all recipients will receive the
3323source code along with the program, and be able to improve it. The GNU
3324project prefers the term "free software" for reasons outlined at
3325`http://www.gnu.org/philosophy/free-software-for-freedom.html'.
3326
3327 The exact license terms are defined by this paragraph and the GNU
3328General Public License it refers to:
3329
3330 GNU Wget is free software; you can redistribute it and/or modify it
3331 under the terms of the GNU General Public License as published by
3332 the Free Software Foundation; either version 2 of the License, or
3333 (at your option) any later version.
3334
3335 GNU Wget is distributed in the hope that it will be useful, but
3336 WITHOUT ANY WARRANTY; without even the implied warranty of
3337 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
3338 General Public License for more details.
3339
3340 A copy of the GNU General Public License is included as part of
3341 this manual; if you did not receive it, write to the Free Software
3342 Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
3343
3344 In addition to this, this manual is free in the same sense:
3345
3346 Permission is granted to copy, distribute and/or modify this
3347 document under the terms of the GNU Free Documentation License,
3348 Version 1.2 or any later version published by the Free Software
3349 Foundation; with the Invariant Sections being "GNU General Public
3350 License" and "GNU Free Documentation License", with no Front-Cover
3351 Texts, and with no Back-Cover Texts. A copy of the license is
3352 included in the section entitled "GNU Free Documentation License".
3353
3354 The full texts of the GNU General Public License and of the GNU Free
3355Documentation License are available below.
3356
3357* Menu:
3358
3359* GNU General Public License::
3360* GNU Free Documentation License::
3361
3362
3363File: wget.info, Node: GNU General Public License, Next: GNU Free Documentation License, Up: Copying
3364
336510.1 GNU General Public License
3366===============================
3367
3368 Version 2, June 1991
3369
3370 Copyright (C) 1989, 1991 Free Software Foundation, Inc.
3371 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA
3372
3373 Everyone is permitted to copy and distribute verbatim copies
3374 of this license document, but changing it is not allowed.
3375
3376Preamble
3377========
3378
3379The licenses for most software are designed to take away your freedom
3380to share and change it. By contrast, the GNU General Public License is
3381intended to guarantee your freedom to share and change free
3382software--to make sure the software is free for all its users. This
3383General Public License applies to most of the Free Software
3384Foundation's software and to any other program whose authors commit to
3385using it. (Some other Free Software Foundation software is covered by
3386the GNU Lesser General Public License instead.) You can apply it to
3387your programs, too.
3388
3389 When we speak of free software, we are referring to freedom, not
3390price. Our General Public Licenses are designed to make sure that you
3391have the freedom to distribute copies of free software (and charge for
3392this service if you wish), that you receive source code or can get it
3393if you want it, that you can change the software or use pieces of it in
3394new free programs; and that you know you can do these things.
3395
3396 To protect your rights, we need to make restrictions that forbid
3397anyone to deny you these rights or to ask you to surrender the rights.
3398These restrictions translate to certain responsibilities for you if you
3399distribute copies of the software, or if you modify it.
3400
3401 For example, if you distribute copies of such a program, whether
3402gratis or for a fee, you must give the recipients all the rights that
3403you have. You must make sure that they, too, receive or can get the
3404source code. And you must show them these terms so they know their
3405rights.
3406
3407 We protect your rights with two steps: (1) copyright the software,
3408and (2) offer you this license which gives you legal permission to copy,
3409distribute and/or modify the software.
3410
3411 Also, for each author's protection and ours, we want to make certain
3412that everyone understands that there is no warranty for this free
3413software. If the software is modified by someone else and passed on, we
3414want its recipients to know that what they have is not the original, so
3415that any problems introduced by others will not reflect on the original
3416authors' reputations.
3417
3418 Finally, any free program is threatened constantly by software
3419patents. We wish to avoid the danger that redistributors of a free
3420program will individually obtain patent licenses, in effect making the
3421program proprietary. To prevent this, we have made it clear that any
3422patent must be licensed for everyone's free use or not licensed at all.
3423
3424 The precise terms and conditions for copying, distribution and
3425modification follow.
3426
3427 TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
3428 0. This License applies to any program or other work which contains a
3429 notice placed by the copyright holder saying it may be distributed
3430 under the terms of this General Public License. The "Program",
3431 below, refers to any such program or work, and a "work based on
3432 the Program" means either the Program or any derivative work under
3433 copyright law: that is to say, a work containing the Program or a
3434 portion of it, either verbatim or with modifications and/or
3435 translated into another language. (Hereinafter, translation is
3436 included without limitation in the term "modification".) Each
3437 licensee is addressed as "you".
3438
3439 Activities other than copying, distribution and modification are
3440 not covered by this License; they are outside its scope. The act
3441 of running the Program is not restricted, and the output from the
3442 Program is covered only if its contents constitute a work based on
3443 the Program (independent of having been made by running the
3444 Program). Whether that is true depends on what the Program does.
3445
3446 1. You may copy and distribute verbatim copies of the Program's
3447 source code as you receive it, in any medium, provided that you
3448 conspicuously and appropriately publish on each copy an appropriate
3449 copyright notice and disclaimer of warranty; keep intact all the
3450 notices that refer to this License and to the absence of any
3451 warranty; and give any other recipients of the Program a copy of
3452 this License along with the Program.
3453
3454 You may charge a fee for the physical act of transferring a copy,
3455 and you may at your option offer warranty protection in exchange
3456 for a fee.
3457
3458 2. You may modify your copy or copies of the Program or any portion
3459 of it, thus forming a work based on the Program, and copy and
3460 distribute such modifications or work under the terms of Section 1
3461 above, provided that you also meet all of these conditions:
3462
3463 a. You must cause the modified files to carry prominent notices
3464 stating that you changed the files and the date of any change.
3465
3466 b. You must cause any work that you distribute or publish, that
3467 in whole or in part contains or is derived from the Program
3468 or any part thereof, to be licensed as a whole at no charge
3469 to all third parties under the terms of this License.
3470
3471 c. If the modified program normally reads commands interactively
3472 when run, you must cause it, when started running for such
3473 interactive use in the most ordinary way, to print or display
3474 an announcement including an appropriate copyright notice and
3475 a notice that there is no warranty (or else, saying that you
3476 provide a warranty) and that users may redistribute the
3477 program under these conditions, and telling the user how to
3478 view a copy of this License. (Exception: if the Program
3479 itself is interactive but does not normally print such an
3480 announcement, your work based on the Program is not required
3481 to print an announcement.)
3482
3483 These requirements apply to the modified work as a whole. If
3484 identifiable sections of that work are not derived from the
3485 Program, and can be reasonably considered independent and separate
3486 works in themselves, then this License, and its terms, do not
3487 apply to those sections when you distribute them as separate
3488 works. But when you distribute the same sections as part of a
3489 whole which is a work based on the Program, the distribution of
3490 the whole must be on the terms of this License, whose permissions
3491 for other licensees extend to the entire whole, and thus to each
3492 and every part regardless of who wrote it.
3493
3494 Thus, it is not the intent of this section to claim rights or
3495 contest your rights to work written entirely by you; rather, the
3496 intent is to exercise the right to control the distribution of
3497 derivative or collective works based on the Program.
3498
3499 In addition, mere aggregation of another work not based on the
3500 Program with the Program (or with a work based on the Program) on
3501 a volume of a storage or distribution medium does not bring the
3502 other work under the scope of this License.
3503
3504 3. You may copy and distribute the Program (or a work based on it,
3505 under Section 2) in object code or executable form under the terms
3506 of Sections 1 and 2 above provided that you also do one of the
3507 following:
3508
3509 a. Accompany it with the complete corresponding machine-readable
3510 source code, which must be distributed under the terms of
3511 Sections 1 and 2 above on a medium customarily used for
3512 software interchange; or,
3513
3514 b. Accompany it with a written offer, valid for at least three
3515 years, to give any third party, for a charge no more than your
3516 cost of physically performing source distribution, a complete
3517 machine-readable copy of the corresponding source code, to be
3518 distributed under the terms of Sections 1 and 2 above on a
3519 medium customarily used for software interchange; or,
3520
3521 c. Accompany it with the information you received as to the offer
3522 to distribute corresponding source code. (This alternative is
3523 allowed only for noncommercial distribution and only if you
3524 received the program in object code or executable form with
3525 such an offer, in accord with Subsection b above.)
3526
3527 The source code for a work means the preferred form of the work for
3528 making modifications to it. For an executable work, complete
3529 source code means all the source code for all modules it contains,
3530 plus any associated interface definition files, plus the scripts
3531 used to control compilation and installation of the executable.
3532 However, as a special exception, the source code distributed need
3533 not include anything that is normally distributed (in either
3534 source or binary form) with the major components (compiler,
3535 kernel, and so on) of the operating system on which the executable
3536 runs, unless that component itself accompanies the executable.
3537
3538 If distribution of executable or object code is made by offering
3539 access to copy from a designated place, then offering equivalent
3540 access to copy the source code from the same place counts as
3541 distribution of the source code, even though third parties are not
3542 compelled to copy the source along with the object code.
3543
3544 4. You may not copy, modify, sublicense, or distribute the Program
3545 except as expressly provided under this License. Any attempt
3546 otherwise to copy, modify, sublicense or distribute the Program is
3547 void, and will automatically terminate your rights under this
3548 License. However, parties who have received copies, or rights,
3549 from you under this License will not have their licenses
3550 terminated so long as such parties remain in full compliance.
3551
3552 5. You are not required to accept this License, since you have not
3553 signed it. However, nothing else grants you permission to modify
3554 or distribute the Program or its derivative works. These actions
3555 are prohibited by law if you do not accept this License.
3556 Therefore, by modifying or distributing the Program (or any work
3557 based on the Program), you indicate your acceptance of this
3558 License to do so, and all its terms and conditions for copying,
3559 distributing or modifying the Program or works based on it.
3560
3561 6. Each time you redistribute the Program (or any work based on the
3562 Program), the recipient automatically receives a license from the
3563 original licensor to copy, distribute or modify the Program
3564 subject to these terms and conditions. You may not impose any
3565 further restrictions on the recipients' exercise of the rights
3566 granted herein. You are not responsible for enforcing compliance
3567 by third parties to this License.
3568
3569 7. If, as a consequence of a court judgment or allegation of patent
3570 infringement or for any other reason (not limited to patent
3571 issues), conditions are imposed on you (whether by court order,
3572 agreement or otherwise) that contradict the conditions of this
3573 License, they do not excuse you from the conditions of this
3574 License. If you cannot distribute so as to satisfy simultaneously
3575 your obligations under this License and any other pertinent
3576 obligations, then as a consequence you may not distribute the
3577 Program at all. For example, if a patent license would not permit
3578 royalty-free redistribution of the Program by all those who
3579 receive copies directly or indirectly through you, then the only
3580 way you could satisfy both it and this License would be to refrain
3581 entirely from distribution of the Program.
3582
3583 If any portion of this section is held invalid or unenforceable
3584 under any particular circumstance, the balance of the section is
3585 intended to apply and the section as a whole is intended to apply
3586 in other circumstances.
3587
3588 It is not the purpose of this section to induce you to infringe any
3589 patents or other property right claims or to contest validity of
3590 any such claims; this section has the sole purpose of protecting
3591 the integrity of the free software distribution system, which is
3592 implemented by public license practices. Many people have made
3593 generous contributions to the wide range of software distributed
3594 through that system in reliance on consistent application of that
3595 system; it is up to the author/donor to decide if he or she is
3596 willing to distribute software through any other system and a
3597 licensee cannot impose that choice.
3598
3599 This section is intended to make thoroughly clear what is believed
3600 to be a consequence of the rest of this License.
3601
3602 8. If the distribution and/or use of the Program is restricted in
3603 certain countries either by patents or by copyrighted interfaces,
3604 the original copyright holder who places the Program under this
3605 License may add an explicit geographical distribution limitation
3606 excluding those countries, so that distribution is permitted only
3607 in or among countries not thus excluded. In such case, this
3608 License incorporates the limitation as if written in the body of
3609 this License.
3610
3611 9. The Free Software Foundation may publish revised and/or new
3612 versions of the General Public License from time to time. Such
3613 new versions will be similar in spirit to the present version, but
3614 may differ in detail to address new problems or concerns.
3615
3616 Each version is given a distinguishing version number. If the
3617 Program specifies a version number of this License which applies
3618 to it and "any later version", you have the option of following
3619 the terms and conditions either of that version or of any later
3620 version published by the Free Software Foundation. If the Program
3621 does not specify a version number of this License, you may choose
3622 any version ever published by the Free Software Foundation.
3623
3624 10. If you wish to incorporate parts of the Program into other free
3625 programs whose distribution conditions are different, write to the
3626 author to ask for permission. For software which is copyrighted
3627 by the Free Software Foundation, write to the Free Software
3628 Foundation; we sometimes make exceptions for this. Our decision
3629 will be guided by the two goals of preserving the free status of
3630 all derivatives of our free software and of promoting the sharing
3631 and reuse of software generally.
3632
3633 NO WARRANTY
3634 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO
3635 WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE
3636 LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT
3637 HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT
3638 WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT
3639 NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
3640 FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE
3641 QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
3642 PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY
3643 SERVICING, REPAIR OR CORRECTION.
3644
3645 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN
3646 WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY
3647 MODIFY AND/OR REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE
3648 LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL,
3649 INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR
3650 INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF
3651 DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU
3652 OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY
3653 OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN
3654 ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
3655
3656 END OF TERMS AND CONDITIONS
3657Appendix: How to Apply These Terms to Your New Programs
3658=======================================================
3659
3660If you develop a new program, and you want it to be of the greatest
3661possible use to the public, the best way to achieve this is to make it
3662free software which everyone can redistribute and change under these
3663terms.
3664
3665 To do so, attach the following notices to the program. It is safest
3666to attach them to the start of each source file to most effectively
3667convey the exclusion of warranty; and each file should have at least
3668the "copyright" line and a pointer to where the full notice is found.
3669
3670 ONE LINE TO GIVE THE PROGRAM'S NAME AND A BRIEF IDEA OF WHAT IT DOES.
3671 Copyright (C) YYYY NAME OF AUTHOR
3672
3673 This program is free software; you can redistribute it and/or modify
3674 it under the terms of the GNU General Public License as published by
3675 the Free Software Foundation; either version 2 of the License, or
3676 (at your option) any later version.
3677
3678 This program is distributed in the hope that it will be useful,
3679 but WITHOUT ANY WARRANTY; without even the implied warranty of
3680 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
3681 GNU General Public License for more details.
3682
3683 You should have received a copy of the GNU General Public License
3684 along with this program; if not, write to the Free Software
3685 Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
3686
3687 Also add information on how to contact you by electronic and paper
3688mail.
3689
3690 If the program is interactive, make it output a short notice like
3691this when it starts in an interactive mode:
3692
3693 Gnomovision version 69, Copyright (C) 19YY NAME OF AUTHOR
3694 Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
3695 This is free software, and you are welcome to redistribute it
3696 under certain conditions; type `show c' for details.
3697
3698 The hypothetical commands `show w' and `show c' should show the
3699appropriate parts of the General Public License. Of course, the
3700commands you use may be called something other than `show w' and `show
3701c'; they could even be mouse-clicks or menu items--whatever suits your
3702program.
3703
3704 You should also get your employer (if you work as a programmer) or
3705your school, if any, to sign a "copyright disclaimer" for the program,
3706if necessary. Here is a sample; alter the names:
3707
3708 Yoyodyne, Inc., hereby disclaims all copyright interest in the program
3709 `Gnomovision' (which makes passes at compilers) written by James Hacker.
3710
3711 SIGNATURE OF TY COON, 1 April 1989
3712 Ty Coon, President of Vice
3713
3714 This General Public License does not permit incorporating your
3715program into proprietary programs. If your program is a subroutine
3716library, you may consider it more useful to permit linking proprietary
3717applications with the library. If this is what you want to do, use the
3718GNU Lesser General Public License instead of this License.
3719
3720
3721File: wget.info, Node: GNU Free Documentation License, Prev: GNU General Public License, Up: Copying
3722
372310.2 GNU Free Documentation License
3724===================================
3725
3726 Version 1.2, November 2002
3727
3728 Copyright (C) 2000,2001,2002 Free Software Foundation, Inc.
3729 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA
3730
3731 Everyone is permitted to copy and distribute verbatim copies
3732 of this license document, but changing it is not allowed.
3733
3734 0. PREAMBLE
3735
3736 The purpose of this License is to make a manual, textbook, or other
3737 functional and useful document "free" in the sense of freedom: to
3738 assure everyone the effective freedom to copy and redistribute it,
3739 with or without modifying it, either commercially or
3740 noncommercially. Secondarily, this License preserves for the
3741 author and publisher a way to get credit for their work, while not
3742 being considered responsible for modifications made by others.
3743
3744 This License is a kind of "copyleft", which means that derivative
3745 works of the document must themselves be free in the same sense.
3746 It complements the GNU General Public License, which is a copyleft
3747 license designed for free software.
3748
3749 We have designed this License in order to use it for manuals for
3750 free software, because free software needs free documentation: a
3751 free program should come with manuals providing the same freedoms
3752 that the software does. But this License is not limited to
3753 software manuals; it can be used for any textual work, regardless
3754 of subject matter or whether it is published as a printed book.
3755 We recommend this License principally for works whose purpose is
3756 instruction or reference.
3757
3758 1. APPLICABILITY AND DEFINITIONS
3759
3760 This License applies to any manual or other work, in any medium,
3761 that contains a notice placed by the copyright holder saying it
3762 can be distributed under the terms of this License. Such a notice
3763 grants a world-wide, royalty-free license, unlimited in duration,
3764 to use that work under the conditions stated herein. The
3765 "Document", below, refers to any such manual or work. Any member
3766 of the public is a licensee, and is addressed as "you". You
3767 accept the license if you copy, modify or distribute the work in a
3768 way requiring permission under copyright law.
3769
3770 A "Modified Version" of the Document means any work containing the
3771 Document or a portion of it, either copied verbatim, or with
3772 modifications and/or translated into another language.
3773
3774 A "Secondary Section" is a named appendix or a front-matter section
3775 of the Document that deals exclusively with the relationship of the
3776 publishers or authors of the Document to the Document's overall
3777 subject (or to related matters) and contains nothing that could
3778 fall directly within that overall subject. (Thus, if the Document
3779 is in part a textbook of mathematics, a Secondary Section may not
3780 explain any mathematics.) The relationship could be a matter of
3781 historical connection with the subject or with related matters, or
3782 of legal, commercial, philosophical, ethical or political position
3783 regarding them.
3784
3785 The "Invariant Sections" are certain Secondary Sections whose
3786 titles are designated, as being those of Invariant Sections, in
3787 the notice that says that the Document is released under this
3788 License. If a section does not fit the above definition of
3789 Secondary then it is not allowed to be designated as Invariant.
3790 The Document may contain zero Invariant Sections. If the Document
3791 does not identify any Invariant Sections then there are none.
3792
3793 The "Cover Texts" are certain short passages of text that are
3794 listed, as Front-Cover Texts or Back-Cover Texts, in the notice
3795 that says that the Document is released under this License. A
3796 Front-Cover Text may be at most 5 words, and a Back-Cover Text may
3797 be at most 25 words.
3798
3799 A "Transparent" copy of the Document means a machine-readable copy,
3800 represented in a format whose specification is available to the
3801 general public, that is suitable for revising the document
3802 straightforwardly with generic text editors or (for images
3803 composed of pixels) generic paint programs or (for drawings) some
3804 widely available drawing editor, and that is suitable for input to
3805 text formatters or for automatic translation to a variety of
3806 formats suitable for input to text formatters. A copy made in an
3807 otherwise Transparent file format whose markup, or absence of
3808 markup, has been arranged to thwart or discourage subsequent
3809 modification by readers is not Transparent. An image format is
3810 not Transparent if used for any substantial amount of text. A
3811 copy that is not "Transparent" is called "Opaque".
3812
3813 Examples of suitable formats for Transparent copies include plain
3814 ASCII without markup, Texinfo input format, LaTeX input format,
3815 SGML or XML using a publicly available DTD, and
3816 standard-conforming simple HTML, PostScript or PDF designed for
3817 human modification. Examples of transparent image formats include
3818 PNG, XCF and JPG. Opaque formats include proprietary formats that
3819 can be read and edited only by proprietary word processors, SGML or
3820 XML for which the DTD and/or processing tools are not generally
3821 available, and the machine-generated HTML, PostScript or PDF
3822 produced by some word processors for output purposes only.
3823
3824 The "Title Page" means, for a printed book, the title page itself,
3825 plus such following pages as are needed to hold, legibly, the
3826 material this License requires to appear in the title page. For
3827 works in formats which do not have any title page as such, "Title
3828 Page" means the text near the most prominent appearance of the
3829 work's title, preceding the beginning of the body of the text.
3830
3831 A section "Entitled XYZ" means a named subunit of the Document
3832 whose title either is precisely XYZ or contains XYZ in parentheses
3833 following text that translates XYZ in another language. (Here XYZ
3834 stands for a specific section name mentioned below, such as
3835 "Acknowledgements", "Dedications", "Endorsements", or "History".)
3836 To "Preserve the Title" of such a section when you modify the
3837 Document means that it remains a section "Entitled XYZ" according
3838 to this definition.
3839
3840 The Document may include Warranty Disclaimers next to the notice
3841 which states that this License applies to the Document. These
3842 Warranty Disclaimers are considered to be included by reference in
3843 this License, but only as regards disclaiming warranties: any other
3844 implication that these Warranty Disclaimers may have is void and
3845 has no effect on the meaning of this License.
3846
  2. VERBATIM COPYING

     You may copy and distribute the Document in any medium, either
     commercially or noncommercially, provided that this License, the
     copyright notices, and the license notice saying this License
     applies to the Document are reproduced in all copies, and that you
     add no other conditions whatsoever to those of this License. You
     may not use technical measures to obstruct or control the reading
     or further copying of the copies you make or distribute. However,
     you may accept compensation in exchange for copies. If you
     distribute a large enough number of copies you must also follow
     the conditions in section 3.

     You may also lend copies, under the same conditions stated above,
     and you may publicly display copies.

  3. COPYING IN QUANTITY

     If you publish printed copies (or copies in media that commonly
     have printed covers) of the Document, numbering more than 100, and
     the Document's license notice requires Cover Texts, you must
     enclose the copies in covers that carry, clearly and legibly, all
     these Cover Texts: Front-Cover Texts on the front cover, and
     Back-Cover Texts on the back cover. Both covers must also clearly
     and legibly identify you as the publisher of these copies. The
     front cover must present the full title with all words of the
     title equally prominent and visible. You may add other material
     on the covers in addition. Copying with changes limited to the
     covers, as long as they preserve the title of the Document and
     satisfy these conditions, can be treated as verbatim copying in
     other respects.

     If the required texts for either cover are too voluminous to fit
     legibly, you should put the first ones listed (as many as fit
     reasonably) on the actual cover, and continue the rest onto
     adjacent pages.

     If you publish or distribute Opaque copies of the Document
     numbering more than 100, you must either include a
     machine-readable Transparent copy along with each Opaque copy, or
     state in or with each Opaque copy a computer-network location from
     which the general network-using public has access to download
     using public-standard network protocols a complete Transparent
     copy of the Document, free of added material. If you use the
     latter option, you must take reasonably prudent steps, when you
     begin distribution of Opaque copies in quantity, to ensure that
     this Transparent copy will remain thus accessible at the stated
     location until at least one year after the last time you
     distribute an Opaque copy (directly or through your agents or
     retailers) of that edition to the public.

     It is requested, but not required, that you contact the authors of
     the Document well before redistributing any large number of
     copies, to give them a chance to provide you with an updated
     version of the Document.

  4. MODIFICATIONS

     You may copy and distribute a Modified Version of the Document
     under the conditions of sections 2 and 3 above, provided that you
     release the Modified Version under precisely this License, with
     the Modified Version filling the role of the Document, thus
     licensing distribution and modification of the Modified Version to
     whoever possesses a copy of it. In addition, you must do these
     things in the Modified Version:

       A. Use in the Title Page (and on the covers, if any) a title
          distinct from that of the Document, and from those of
          previous versions (which should, if there were any, be listed
          in the History section of the Document). You may use the
          same title as a previous version if the original publisher of
          that version gives permission.

       B. List on the Title Page, as authors, one or more persons or
          entities responsible for authorship of the modifications in
          the Modified Version, together with at least five of the
          principal authors of the Document (all of its principal
          authors, if it has fewer than five), unless they release you
          from this requirement.

       C. State on the Title page the name of the publisher of the
          Modified Version, as the publisher.

       D. Preserve all the copyright notices of the Document.

       E. Add an appropriate copyright notice for your modifications
          adjacent to the other copyright notices.

       F. Include, immediately after the copyright notices, a license
          notice giving the public permission to use the Modified
          Version under the terms of this License, in the form shown in
          the Addendum below.

       G. Preserve in that license notice the full lists of Invariant
          Sections and required Cover Texts given in the Document's
          license notice.

       H. Include an unaltered copy of this License.

       I. Preserve the section Entitled "History", Preserve its Title,
          and add to it an item stating at least the title, year, new
          authors, and publisher of the Modified Version as given on
          the Title Page. If there is no section Entitled "History" in
          the Document, create one stating the title, year, authors,
          and publisher of the Document as given on its Title Page,
          then add an item describing the Modified Version as stated in
          the previous sentence.

       J. Preserve the network location, if any, given in the Document
          for public access to a Transparent copy of the Document, and
          likewise the network locations given in the Document for
          previous versions it was based on. These may be placed in
          the "History" section. You may omit a network location for a
          work that was published at least four years before the
          Document itself, or if the original publisher of the version
          it refers to gives permission.

       K. For any section Entitled "Acknowledgements" or "Dedications",
          Preserve the Title of the section, and preserve in the
          section all the substance and tone of each of the contributor
          acknowledgements and/or dedications given therein.

       L. Preserve all the Invariant Sections of the Document,
          unaltered in their text and in their titles. Section numbers
          or the equivalent are not considered part of the section
          titles.

       M. Delete any section Entitled "Endorsements". Such a section
          may not be included in the Modified Version.

       N. Do not retitle any existing section to be Entitled
          "Endorsements" or to conflict in title with any Invariant
          Section.

       O. Preserve any Warranty Disclaimers.

     If the Modified Version includes new front-matter sections or
     appendices that qualify as Secondary Sections and contain no
     material copied from the Document, you may at your option
     designate some or all of these sections as invariant. To do this,
     add their titles to the list of Invariant Sections in the Modified
     Version's license notice. These titles must be distinct from any
     other section titles.

     You may add a section Entitled "Endorsements", provided it contains
     nothing but endorsements of your Modified Version by various
     parties--for example, statements of peer review or that the text
     has been approved by an organization as the authoritative
     definition of a standard.

     You may add a passage of up to five words as a Front-Cover Text,
     and a passage of up to 25 words as a Back-Cover Text, to the end
     of the list of Cover Texts in the Modified Version. Only one
     passage of Front-Cover Text and one of Back-Cover Text may be
     added by (or through arrangements made by) any one entity. If the
     Document already includes a cover text for the same cover,
     previously added by you or by arrangement made by the same entity
     you are acting on behalf of, you may not add another; but you may
     replace the old one, on explicit permission from the previous
     publisher that added the old one.

     The author(s) and publisher(s) of the Document do not by this
     License give permission to use their names for publicity for or to
     assert or imply endorsement of any Modified Version.

  5. COMBINING DOCUMENTS

     You may combine the Document with other documents released under
     this License, under the terms defined in section 4 above for
     modified versions, provided that you include in the combination
     all of the Invariant Sections of all of the original documents,
     unmodified, and list them all as Invariant Sections of your
     combined work in its license notice, and that you preserve all
     their Warranty Disclaimers.

     The combined work need only contain one copy of this License, and
     multiple identical Invariant Sections may be replaced with a single
     copy. If there are multiple Invariant Sections with the same name
     but different contents, make the title of each such section unique
     by adding at the end of it, in parentheses, the name of the
     original author or publisher of that section if known, or else a
     unique number. Make the same adjustment to the section titles in
     the list of Invariant Sections in the license notice of the
     combined work.

     In the combination, you must combine any sections Entitled
     "History" in the various original documents, forming one section
     Entitled "History"; likewise combine any sections Entitled
     "Acknowledgements", and any sections Entitled "Dedications". You
     must delete all sections Entitled "Endorsements."

  6. COLLECTIONS OF DOCUMENTS

     You may make a collection consisting of the Document and other
     documents released under this License, and replace the individual
     copies of this License in the various documents with a single copy
     that is included in the collection, provided that you follow the
     rules of this License for verbatim copying of each of the
     documents in all other respects.

     You may extract a single document from such a collection, and
     distribute it individually under this License, provided you insert
     a copy of this License into the extracted document, and follow
     this License in all other respects regarding verbatim copying of
     that document.

  7. AGGREGATION WITH INDEPENDENT WORKS

     A compilation of the Document or its derivatives with other
     separate and independent documents or works, in or on a volume of
     a storage or distribution medium, is called an "aggregate" if the
     copyright resulting from the compilation is not used to limit the
     legal rights of the compilation's users beyond what the individual
     works permit. When the Document is included in an aggregate, this
     License does not apply to the other works in the aggregate which
     are not themselves derivative works of the Document.

     If the Cover Text requirement of section 3 is applicable to these
     copies of the Document, then if the Document is less than one half
     of the entire aggregate, the Document's Cover Texts may be placed
     on covers that bracket the Document within the aggregate, or the
     electronic equivalent of covers if the Document is in electronic
     form. Otherwise they must appear on printed covers that bracket
     the whole aggregate.

  8. TRANSLATION

     Translation is considered a kind of modification, so you may
     distribute translations of the Document under the terms of section
     4. Replacing Invariant Sections with translations requires special
     permission from their copyright holders, but you may include
     translations of some or all Invariant Sections in addition to the
     original versions of these Invariant Sections. You may include a
     translation of this License, and all the license notices in the
     Document, and any Warranty Disclaimers, provided that you also
     include the original English version of this License and the
     original versions of those notices and disclaimers. In case of a
     disagreement between the translation and the original version of
     this License or a notice or disclaimer, the original version will
     prevail.

     If a section in the Document is Entitled "Acknowledgements",
     "Dedications", or "History", the requirement (section 4) to
     Preserve its Title (section 1) will typically require changing the
     actual title.

  9. TERMINATION

     You may not copy, modify, sublicense, or distribute the Document
     except as expressly provided for under this License. Any other
     attempt to copy, modify, sublicense or distribute the Document is
     void, and will automatically terminate your rights under this
     License. However, parties who have received copies, or rights,
     from you under this License will not have their licenses
     terminated so long as such parties remain in full compliance.

 10. FUTURE REVISIONS OF THIS LICENSE

     The Free Software Foundation may publish new, revised versions of
     the GNU Free Documentation License from time to time. Such new
     versions will be similar in spirit to the present version, but may
     differ in detail to address new problems or concerns. See
     `http://www.gnu.org/copyleft/'.

     Each version of the License is given a distinguishing version
     number. If the Document specifies that a particular numbered
     version of this License "or any later version" applies to it, you
     have the option of following the terms and conditions either of
     that specified version or of any later version that has been
     published (not as a draft) by the Free Software Foundation. If
     the Document does not specify a version number of this License,
     you may choose any version ever published (not as a draft) by the
     Free Software Foundation.

10.2.1 ADDENDUM: How to use this License for your documents
-----------------------------------------------------------

To use this License in a document you have written, include a copy of
the License in the document and put the following copyright and license
notices just after the title page:

     Copyright (C) YEAR YOUR NAME.
     Permission is granted to copy, distribute and/or modify this document
     under the terms of the GNU Free Documentation License, Version 1.2
     or any later version published by the Free Software Foundation;
     with no Invariant Sections, no Front-Cover Texts, and no Back-Cover
     Texts. A copy of the license is included in the section entitled ``GNU
     Free Documentation License''.

   If you have Invariant Sections, Front-Cover Texts and Back-Cover
Texts, replace the "with...Texts." line with this:

     with the Invariant Sections being LIST THEIR TITLES, with
     the Front-Cover Texts being LIST, and with the Back-Cover Texts
     being LIST.

   If you have Invariant Sections without Cover Texts, or some other
combination of the three, merge those two alternatives to suit the
situation.

   If your document contains nontrivial examples of program code, we
recommend releasing these examples in parallel under your choice of
free software license, such as the GNU General Public License, to
permit their use in free software.

File: wget.info, Node: Concept Index, Prev: Copying, Up: Top

Concept Index
*************

[index]
* Menu:

* .html extension: HTTP Options. (line 6)
* .listing files, removing: FTP Options. (line 21)
* .netrc: Startup File. (line 6)
* .wgetrc: Startup File. (line 6)
* accept directories: Directory-Based Limits. (line 17)
* accept suffixes: Types of Files. (line 15)
* accept wildcards: Types of Files. (line 15)
* append to log: Logging and Input File Options. (line 11)
* arguments: Invoking. (line 6)
* authentication <1>: HTTP Options. (line 27)
* authentication: Download Options. (line 409)
* backing up converted files: Recursive Retrieval Options. (line 71)
* bandwidth, limit: Download Options. (line 216)
* base for relative links in input file: Logging and Input File Options. (line 68)
* bind address: Download Options. (line 6)
* bug reports: Reporting Bugs. (line 6)
* bugs: Reporting Bugs. (line 6)
* cache: HTTP Options. (line 43)
* caching of DNS lookups: Download Options. (line 302)
* client IP address: Download Options. (line 6)
* clobbering, file: Download Options. (line 30)
* command line: Invoking. (line 6)
* comments, HTML: Recursive Retrieval Options. (line 149)
* connect timeout: Download Options. (line 199)
* Content-Length, ignore: HTTP Options. (line 132)
* continue retrieval: Download Options. (line 64)
* contributors: Contributors. (line 6)
* conversion of links: Recursive Retrieval Options. (line 32)
* cookies: HTTP Options. (line 52)
* cookies, loading: HTTP Options. (line 62)
* cookies, saving: HTTP Options. (line 110)
* cookies, session: HTTP Options. (line 115)
* copying: Copying. (line 6)
* cut directories: Directory Options. (line 32)
* debug: Logging and Input File Options. (line 17)
* delete after retrieval: Recursive Retrieval Options. (line 16)
* directories: Directory-Based Limits. (line 6)
* directories, exclude: Directory-Based Limits. (line 30)
* directories, include: Directory-Based Limits. (line 17)
* directory limits: Directory-Based Limits. (line 6)
* directory prefix: Directory Options. (line 60)
* DNS cache: Download Options. (line 302)
* DNS timeout: Download Options. (line 193)
* dot style: Download Options. (line 125)
* downloading multiple times: Download Options. (line 30)
* EGD: HTTPS (SSL/TLS) Options. (line 101)
* entropy, specifying source of: HTTPS (SSL/TLS) Options. (line 85)
* examples: Examples. (line 6)
* exclude directories: Directory-Based Limits. (line 30)
* execute wgetrc command: Basic Startup Options. (line 19)
* FDL, GNU Free Documentation License: GNU Free Documentation License. (line 6)
* features: Overview. (line 6)
* file names, restrict: Download Options. (line 321)
* filling proxy cache: Recursive Retrieval Options. (line 16)
* follow FTP links: Recursive Accept/Reject Options. (line 20)
* following ftp links: FTP Links. (line 6)
* following links: Following Links. (line 6)
* force html: Logging and Input File Options. (line 61)
* free software: Copying. (line 6)
* ftp authentication: FTP Options. (line 6)
* ftp password: FTP Options. (line 6)
* ftp time-stamping: FTP Time-Stamping Internals. (line 6)
* ftp user: FTP Options. (line 6)
* GFDL: Copying. (line 6)
* globbing, toggle: FTP Options. (line 45)
* GPL: Copying. (line 6)
* hangup: Signals. (line 6)
* header, add: HTTP Options. (line 143)
* hosts, spanning: Spanning Hosts. (line 6)
* HTML comments: Recursive Retrieval Options. (line 149)
* http password: HTTP Options. (line 27)
* http referer: HTTP Options. (line 178)
* http time-stamping: HTTP Time-Stamping Internals. (line 6)
* http user: HTTP Options. (line 27)
* ignore length: HTTP Options. (line 132)
* include directories: Directory-Based Limits. (line 17)
* incomplete downloads: Download Options. (line 64)
* incremental updating: Time-Stamping. (line 6)
* input-file: Logging and Input File Options. (line 43)
* invoking: Invoking. (line 6)
* IP address, client: Download Options. (line 6)
* IPv6: Download Options. (line 356)
* Keep-Alive, turning off: FTP Options. (line 92)
* latest version: Distribution. (line 6)
* limit bandwidth: Download Options. (line 216)
* link conversion: Recursive Retrieval Options. (line 32)
* links: Following Links. (line 6)
* list: Mailing List. (line 6)
* loading cookies: HTTP Options. (line 62)
* location of wgetrc: Wgetrc Location. (line 6)
* log file: Logging and Input File Options. (line 6)
* mailing list: Mailing List. (line 6)
* mirroring: Very Advanced Usage. (line 6)
* no parent: Directory-Based Limits. (line 43)
* no-clobber: Download Options. (line 30)
* nohup: Invoking. (line 6)
* number of retries: Download Options. (line 12)
* operating systems: Portability. (line 6)
* option syntax: Option Syntax. (line 6)
* output file: Logging and Input File Options. (line 6)
* overview: Overview. (line 6)
* page requisites: Recursive Retrieval Options. (line 84)
* passive ftp: FTP Options. (line 61)
* password: Download Options. (line 409)
* pause: Download Options. (line 236)
* Persistent Connections, disabling: FTP Options. (line 92)
* portability: Portability. (line 6)
* POST: HTTP Options. (line 211)
* progress indicator: Download Options. (line 125)
* proxies: Proxies. (line 6)
* proxy <1>: HTTP Options. (line 43)
* proxy: Download Options. (line 279)
* proxy authentication: HTTP Options. (line 169)
* proxy filling: Recursive Retrieval Options. (line 16)
* proxy password: HTTP Options. (line 169)
* proxy user: HTTP Options. (line 169)
* quiet: Logging and Input File Options. (line 28)
* quota: Download Options. (line 286)
* random wait: Download Options. (line 261)
* randomness, specifying source of: HTTPS (SSL/TLS) Options. (line 85)
* rate, limit: Download Options. (line 216)
* read timeout: Download Options. (line 204)
* recursion: Recursive Download. (line 6)
* recursive download: Recursive Download. (line 6)
* redirecting output: Advanced Usage. (line 88)
* referer, http: HTTP Options. (line 178)
* reject directories: Directory-Based Limits. (line 30)
* reject suffixes: Types of Files. (line 34)
* reject wildcards: Types of Files. (line 34)
* relative links: Relative Links. (line 6)
* reporting bugs: Reporting Bugs. (line 6)
* required images, downloading: Recursive Retrieval Options. (line 84)
* resume download: Download Options. (line 64)
* retries: Download Options. (line 12)
* retries, waiting between: Download Options. (line 249)
* retrieving: Recursive Download. (line 6)
* robot exclusion: Robot Exclusion. (line 6)
* robots.txt: Robot Exclusion. (line 6)
* sample wgetrc: Sample Wgetrc. (line 6)
* saving cookies: HTTP Options. (line 110)
* security: Security Considerations. (line 6)
* server maintenance: Robot Exclusion. (line 6)
* server response, print: Download Options. (line 159)
* server response, save: HTTP Options. (line 185)
* session cookies: HTTP Options. (line 115)
* signal handling: Signals. (line 6)
* spanning hosts: Spanning Hosts. (line 6)
* spider: Download Options. (line 164)
* SSL: HTTPS (SSL/TLS) Options. (line 6)
* SSL certificate: HTTPS (SSL/TLS) Options. (line 47)
* SSL certificate authority: HTTPS (SSL/TLS) Options. (line 73)
* SSL certificate type, specify: HTTPS (SSL/TLS) Options. (line 53)
* SSL certificate, check: HTTPS (SSL/TLS) Options. (line 23)
* SSL protocol, choose: HTTPS (SSL/TLS) Options. (line 10)
* startup: Startup File. (line 6)
* startup file: Startup File. (line 6)
* suffixes, accept: Types of Files. (line 15)
* suffixes, reject: Types of Files. (line 34)
* symbolic links, retrieving: FTP Options. (line 73)
* syntax of options: Option Syntax. (line 6)
* syntax of wgetrc: Wgetrc Syntax. (line 6)
* tag-based recursive pruning: Recursive Accept/Reject Options. (line 24)
* time-stamping: Time-Stamping. (line 6)
* time-stamping usage: Time-Stamping Usage. (line 6)
* timeout: Download Options. (line 175)
* timeout, connect: Download Options. (line 199)
* timeout, DNS: Download Options. (line 193)
* timeout, read: Download Options. (line 204)
* timestamping: Time-Stamping. (line 6)
* tries: Download Options. (line 12)
* types of files: Types of Files. (line 6)
* updating the archives: Time-Stamping. (line 6)
* URL: URL Format. (line 6)
* URL syntax: URL Format. (line 6)
* usage, time-stamping: Time-Stamping Usage. (line 6)
* user: Download Options. (line 409)
* user-agent: HTTP Options. (line 189)
* various: Various. (line 6)
* verbose: Logging and Input File Options. (line 32)
* wait: Download Options. (line 236)
* wait, random: Download Options. (line 261)
* waiting between retries: Download Options. (line 249)
* Wget as spider: Download Options. (line 164)
* wgetrc: Startup File. (line 6)
* wgetrc commands: Wgetrc Commands. (line 6)
* wgetrc location: Wgetrc Location. (line 6)
* wgetrc syntax: Wgetrc Syntax. (line 6)
* wildcards, accept: Types of Files. (line 15)
* wildcards, reject: Types of Files. (line 34)
* Windows file names: Download Options. (line 321)

Tag Table:
Node: Top974
Node: Overview1845
Node: Invoking5380
Node: URL Format6217
Ref: URL Format-Footnote-18790
Node: Option Syntax8892
Node: Basic Startup Options11571
Node: Logging and Input File Options12376
Node: Download Options15029
Node: Directory Options34648
Node: HTTP Options37353
Node: HTTPS (SSL/TLS) Options49262
Node: FTP Options54937
Node: Recursive Retrieval Options59990
Node: Recursive Accept/Reject Options67858
Node: Recursive Download70857
Node: Following Links73968
Node: Spanning Hosts74905
Node: Types of Files77078
Node: Directory-Based Limits79538
Node: Relative Links82188
Node: FTP Links83025
Node: Time-Stamping83892
Node: Time-Stamping Usage85536
Node: HTTP Time-Stamping Internals87362
Ref: HTTP Time-Stamping Internals-Footnote-188638
Node: FTP Time-Stamping Internals88837
Node: Startup File90303
Node: Wgetrc Location91177
Node: Wgetrc Syntax91976
Node: Wgetrc Commands92696
Node: Sample Wgetrc106122
Node: Examples111324
Node: Simple Usage111664
Node: Advanced Usage113068
Node: Very Advanced Usage116769
Node: Various118264
Node: Proxies118789
Node: Distribution121516
Node: Mailing List121862
Node: Reporting Bugs123219
Node: Portability125576
Node: Signals127018
Node: Appendices127701
Node: Robot Exclusion128023
Node: Security Considerations131800
Node: Contributors132984
Node: Copying136681
Node: GNU General Public License139393
Node: GNU Free Documentation License158604
Node: Concept Index181038

End Tag Table