source: vendor/wget/current/NEWS@ 3442

Last change on this file since 3442 was 3440, checked in by bird, 18 years ago

wget 1.10.2

1GNU Wget NEWS -- history of user-visible changes.
2
3Copyright (C) 2005 Free Software Foundation, Inc.
4See the end for copying conditions.
5
6Please send GNU Wget bug reports to <bug-wget@gnu.org>.
7
8
9* Wget 1.10.1 is a bugfix release with no user-visible changes.
10
11
12* Changes in Wget 1.10.
13
14** Downloading files larger than 2GB, sometimes referred to as "large
15files", now works on systems that support them. This includes the
16majority of modern Unixes, as well as MS Windows.
17
18** IPv6 is now supported by Wget. Unlike the experimental code in
191.9, this version supports dual-family systems. The new flags
20`--inet4' and `--inet6' (or `-4' and `-6' for short) force the use of
21IPv4 and IPv6 respectively. Note that IPv6 support has not yet been
22tested on Windows.
23
24** Microsoft's proprietary "NTLM" method of HTTP authentication is now
25supported. This authentication method is undocumented and only used
26by IIS. Note that *proxy* authentication is not supported in this
27release; you can only authenticate to the target web site.
28
29** Wget no longer truncates partially downloaded files when download
30has to start over because the server doesn't support Range. Instead,
31with such servers Wget now simply ignores the data up to the byte
32where the last attempt left off, and only then continues appending to
33the file. That way the downloaded file never shrinks, and download
34retries from servers without support for partial downloads work even
35when downloading to stdout.
36
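The retry behavior above can be pictured with a small Python sketch. It is not Wget's code: the function name, its parameters, and the chunk size are made up for illustration, and it assumes the server restarts the body from byte 0.

```python
import io

def resume_without_range(body, out, already_saved, chunk_size=8192):
    """Append to `out` only the part of `body` that was not saved before.

    body          -- readable binary stream that restarts at byte 0
                     (the server ignored the Range request)
    out           -- writable binary stream positioned at its end
    already_saved -- number of bytes kept from the previous attempt
    """
    to_skip = already_saved
    while True:
        chunk = body.read(chunk_size)
        if not chunk:
            break
        if to_skip >= len(chunk):
            # Entire chunk was already downloaded: discard it.
            to_skip -= len(chunk)
            continue
        # Write only the part past the point where the last attempt stopped.
        out.write(chunk[to_skip:])
        to_skip = 0

# Example: 6 bytes were saved earlier, the server resends everything.
saved = io.BytesIO()
saved.write(b"hello ")
resume_without_range(io.BytesIO(b"hello world"), saved, 6, chunk_size=4)
```

Because nothing is ever rewound or truncated, the same loop works when `out` is a pipe or stdout.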
37** SSL/TLS changes:
38
39*** SSL/TLS downloads now attempt to verify the server's certificate
40against the recognized certificate authorities. This requires CA
41certificates to have been installed in a location visible to the
42OpenSSL library. If this is not the case, you can get the bundle
43yourself from a source you trust (for example, the bundle extracted
44from Mozilla available at http://curl.haxx.se/docs/caextract.html),
45and point Wget to the PEM file using the `--ca-certificate'
46command-line option or the corresponding `.wgetrc' command.
47
48*** Secure downloads now verify that the host name in the URL matches
49the "common name" in the certificate presented by the server.
50
51*** Although the above checks provide more secure downloads, they
52unavoidably break interoperability with some sites that worked with
53previous versions, particularly those using self-signed, expired, or
54otherwise invalid certificates. If you encounter "certificate
55verification" errors or complaints that "common name doesn't match
56requested host name" and are convinced of the site's authenticity, you
57can use `--no-check-certificate' to bypass both checks.
58
59*** Talking to SSL/TLS servers over proxies now actually works.
60Previous versions of Wget erroneously sent GET requests for https
61URLs. Wget 1.10 utilizes the CONNECT method designed for this
62purpose.
63
64*** The SSL/TLS-related options have been redesigned and, for the
65first time, documented in the manual. The old, undocumented, options
66are no longer supported.
67
68** Passive FTP is now the default FTP transfer mode. Use
69`--no-passive-ftp' or specify `passive_ftp = off' in your init file to
70revert to the old behavior.
71
72** The `--header' option can now be used to override generated
73headers. For example, `wget --header="Host: foo.bar"
74http://127.0.0.1' tells Wget to connect to localhost, but to specify
75"foo.bar" in the `Host' header. In previous versions such use of
76`--header' led to duplicate headers in HTTP requests.
77
78** Responses without headers, aka "HTTP 0.9" responses, are
79detected and handled. Although HTTP 0.9 has long been obsolete, it is
80still occasionally used, sometimes by accident.
81
82** The progress bar is now updated regularly even when no data is
83arriving from the network.
84
85** Wget no longer preserves permissions of files retrieved by FTP by
86default. Anonymous FTP servers frequently use permissions like "664",
87which might not be what the user wants. The new option
88`--preserve-permissions' and the corresponding `.wgetrc' variable can
89be used to revert to the old behavior.
90
91** The new option `--protocol-directories' instructs Wget to also use
92the protocol name as a directory component of local file names.
93
94** Options that previously unconditionally set or unset various flags
95are now boolean options that can be invoked as either `--OPTION' or
96`--no-OPTION'. Options that required an argument "on" or "off" have
97also been changed this way, but they still accept the old syntax for
98backward compatibility. For example, instead of `--glob=off' you can
99write `--no-glob'.
100
101Allowing `--no-OPTION' for every `--OPTION' and the other way around
102is useful because it allows the user to override non-default behavior
103specified via `.wgetrc'.
104
105** The new option `--keep-session-cookies' causes `--save-cookies' to
106save session cookies (normally only kept in memory) along with the
107permanent ones. This is useful because many sites track important
108information, such as whether the user has authenticated, in session
109cookies. With this option multiple Wget runs are treated as a single
110browser session.
111
112** Wget now supports the --ftp-user and --ftp-password command
113switches to set username and password for FTP, and the --user and
114--password command switches to set username and password for both FTP
115and HTTP. The --http-passwd and --proxy-passwd command switches have
116been renamed to --http-password and --proxy-password respectively, and
117the related http_passwd and proxy_passwd .wgetrc commands to
118http_password and proxy_password respectively. The login and passwd
119.wgetrc commands have been deprecated.
120
121* `wget -b' now works correctly under Windows.
122
123
124* Wget 1.9.1 is a bugfix release with no user-visible changes.
125
126
127* Changes in Wget 1.9.
128
129** It is now possible to specify that POST method be used for HTTP
130requests. For example, `wget --post-data="id=foo&data=bar" URL' will
131send a POST request with the specified contents.
132
133** IPv6 support is available, although it's still experimental.
134
135** The `--timeout' option now also affects DNS lookup and establishing
136the TCP connection. Previously it only affected reading and writing
137data. Those three timeouts can be set separately using
138`--dns-timeout', `--connection-timeout', and `--read-timeout',
139respectively.
140
141** Download speed shown by the progress bar is based on the data
142recently read, rather than the average speed of the entire download.
143The ETA projection is still based on the overall average.
144
145** It is now possible to connect to FTP servers through FWTK
146firewalls. Set ftp_proxy to an FTP URL, and Wget will automatically
147log on to the proxy as "username@host".
148
149** The new option `--retry-connrefused' makes Wget retry downloads
150even in the face of refused connections, which are otherwise
151considered a fatal error.
152
153** The new option `--no-dns-cache' may be used to prevent Wget from
154caching DNS lookups.
155
156** Wget no longer escapes characters in local file names based on
157whether they're appropriate in URLs. Escaping can still occur for
158nonprintable characters or for '/', but no longer for frequent
159characters such as space. You can use the new option
160--restrict-file-names to relax or strengthen these rules, which can be
161useful if you dislike the default or if you're downloading to
162non-native partitions.
163
164** Handling of HTML comments has been dumbed down to conform to what
165users expect and other browsers do: instead of being treated as SGML
166declaration, a comment is terminated at the first occurrence of "-->".
167Use `--strict-comments' to revert to the old behavior.
168
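As a rough illustration of the simplified comment rule (a Python sketch, not Wget's parser; the function name is hypothetical), a comment is taken to end at the first "-->" that follows its "<!--":

```python
def strip_comments(html):
    """Remove HTML comments, ending each one at the first "-->"."""
    out = []
    pos = 0
    while True:
        start = html.find("<!--", pos)
        if start == -1:
            out.append(html[pos:])      # no more comments
            break
        out.append(html[pos:start])     # keep the text before the comment
        end = html.find("-->", start + 4)
        if end == -1:
            break                       # unterminated: this sketch drops the rest
        pos = end + 3
    return "".join(out)
```

With this rule, `strip_comments("a<!-- x -- y -->b")` gives `"ab"`; the "--" inside the comment has no special meaning.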
169** Wget now correctly handles relative URIs that begin with "//", such
170as "//img.foo.com/foo.jpg".
171
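For readers unfamiliar with this form: a reference starting with "//" keeps the scheme of the referring page and replaces everything else. Python's standard library resolves it the same way; the base URL below is invented for the example.

```python
from urllib.parse import urljoin

# Hypothetical page that contains the "//" reference.
base = "http://www.foo.com/dir/page.html"
resolved = urljoin(base, "//img.foo.com/foo.jpg")
print(resolved)  # http://img.foo.com/foo.jpg
```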
172** Boolean options in `.wgetrc' and on the command line now accept
173values "yes" and "no" along with the traditional "on" and "off".
174
175** It is now possible to specify decimal values for timeouts, waiting
176periods, and download rate. For instance, `--wait=0.5' now works as
177expected, as does `--dns-timeout=0.5' and even `--limit-rate=2.5k'.
178
179
180* Wget 1.8.2 is a bugfix release with no user-visible changes.
181
182
183* Wget 1.8.1 is a bugfix release with no user-visible changes.
184
185
186* Changes in Wget 1.8.
187
188** A new progress indicator is now available and used by default.
189You can choose the progress bar type with `--progress=TYPE'. Two
190types are available, "bar" (the new default), and "dot" (the old
191dotted indicator). You can permanently revert to the old progress
192indicator by putting `progress = dot' in your `.wgetrc'.
193
194** You can limit the download rate of the retrieval using the
195`--limit-rate' option. For example, `wget --limit-rate=15k URL' will
196tell Wget not to download the body of the URL faster than 15 kilobytes
197per second.
198
199** Recursive retrieval and link conversion have been revamped:
200
201*** Wget now traverses links breadth-first. This makes the
202calculation of depth much more reliable than before. Also, recursive
203downloads are faster and consume *significantly* less memory than
204before.
205
206*** Links are converted only when the entire retrieval is complete.
207This is the only safe thing to do, as only then is it known what URLs
208have been downloaded.
209
210*** BASE tags are handled correctly when converting links. Since Wget
211already resolves <base href="..."> when resolving URLs, link
212conversion now makes the BASE tags point to an empty string.
213
214*** HTML anchors are now handled correctly. Links to an anchor in the
215same document (<a href="#anchorname">), which used to confuse Wget,
216are now converted correctly.
217
218*** When in page-requisites (-p) mode, no-parent (-np) is ignored when
219retrieving inline images, stylesheets, and other documents needed
220to display the page.
221
222*** Page-requisites (-p) mode now works with frames. In other words,
223`wget -p URL-THAT-USES-FRAMES' will now download the frame HTML files,
224and all the files that they need to be displayed properly.
225
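Why breadth-first traversal makes depth reliable can be seen in a minimal Python sketch. It is not Wget's implementation: `links` is a hypothetical in-memory map standing in for fetching and parsing pages, and the function name is made up.

```python
from collections import deque

def crawl_order(links, start, max_depth):
    """Return (url, depth) pairs in breadth-first order."""
    seen = {start}
    queue = deque([(start, 0)])
    order = []
    while queue:
        url, depth = queue.popleft()
        order.append((url, depth))
        if depth == max_depth:
            continue                    # do not follow links any deeper
        for child in links.get(url, []):
            if child not in seen:       # first visit is via a shortest path
                seen.add(child)
                queue.append((child, depth + 1))
    return order

links = {"a": ["b", "c"], "b": ["d"], "c": ["d", "a"], "d": ["e"]}
print(crawl_order(links, "a", 2))
```

Each URL is first reached along a shortest chain of links, so the depth recorded for it does not depend on the order in which links appear in a page.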
226** `--base' now works in conjunction with `--input-file', providing a
227base for each URL and thereby allowing the URLs in the file to be
228relative.
229
230** If a host has more than one IP address, Wget uses the other
231addresses when accessing the first one fails.
232
233** Host directories now contain port information if the URL is at a
234non-standard port.
235
236** Wget now supports the robots.txt directives specified in
237<http://www.robotstxt.org/wc/norobots-rfc.txt>.
238
239** URL parser has been fixed, especially the infamous overzealous
240quoting. Wget no longer dequotes reserved characters, e.g. `%3F' is
241no longer translated to `?', nor `%2B' to `+'. Unsafe characters
242which are not reserved are still escaped, of course.
243
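The idea of leaving reserved characters quoted can be sketched in Python. This is an illustration only: it takes the conservative route of decoding just the RFC 3986 unreserved characters, whereas Wget's own character tables may differ, and the function name is hypothetical.

```python
import re
import string

# Characters that never need quoting in a URL (RFC 3986 "unreserved").
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")

def decode_unreserved(url):
    """Decode %XX escapes only when the result is an unreserved character."""
    def repl(match):
        char = chr(int(match.group(1), 16))
        return char if char in UNRESERVED else match.group(0)
    return re.sub(r"%([0-9A-Fa-f]{2})", repl, url)
```

Here `decode_unreserved("/a%3Fb%2Bc%41")` gives `"/a%3Fb%2BcA"`: `%3F` and `%2B` stay quoted, while `%41` becomes `A`.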
244** No more than 20 successive redirections are allowed.
245
246
247* Wget 1.7.1 is a bugfix release with no user-visible changes.
248
249
250* Changes in Wget 1.7.
251
252** SSL (`https') pages now work if you compile Wget with SSL support;
253use the `--with-ssl' configure flag. You need to have OpenSSL
254installed.
255
256** Cookies are now supported. Wget will accept cookies sent by the
257server and return them in later requests. Additionally, it can load
258and save cookies to disk, in the same format that Netscape uses.
259
260** "Keep-alive" (persistent) HTTP connections are now supported.
261Using keep-alive allows Wget to share one TCP/IP connection for
262many retrievals, making multiple-file downloads faster and less
263stressing for the server and the network.
264
265** Wget now recognizes FTP directory listings generated by NT and VMS
266servers.
267
268** It is now possible to recurse through FTP sites where logging in
269puts you in some directory other than '/'.
270
271** You may now use `~' to mean home directory in `.wgetrc'. For
272example, `load_cookies = ~/.netscape/cookies.txt' works as you would
273expect.
274
275** The HTML parser has been rewritten. The new one works more
276reliably, allows finer-grained control over which tags and attributes
277are detected, and has better support for some features like correctly
278skipping comments and declarations, decoding entities, etc. It is
279also more general.
280
281** <meta name="robots"> tags are now respected.
282
283** Wget's internal tables now use hash tables instead of linked lists
284where appropriate. This results in huge speedups when retrieving
285large sites (thousands of documents).
286
287** Wget now has a man page, automatically generated from the Texinfo
288documentation. (The last version that shipped with a man page was
2891.4.5). To get this, you need to have pod2man from the Perl
290distribution installed on your system.
291
292
293* Changes in Wget 1.6
294
295** Administrative changes.
296
297*** Maintainership. Due to Hrvoje being plagued with a "real job",
298Dan Harkless is the most active maintainer (not that he doesn't have a
299real job as well). Hrvoje still participates occasionally, and both
300are being helped by many other people.
301
302*** Web page. Thanks to Jan Prikryl, Wget has an "official" web page.
303Take a look at:
304
305 http://sunsite.dk/wget/
306
307*** Anonymous CVS. Thanks to ever-helpful Karsten Thygesen, Wget
308sources are now available at an anonymous CVS server. Take a look at
309the web page for downloading instructions.
310
311** New -K / --backup-converted / backup_converted = on option causes files
312modified due to -k to be saved with a .orig suffix before being changed. When
313using -N as well, it is these .orig files that are compared against the server.
314
315** New --follow-tags / follow_tags = ... option allows you to restrict
316Wget to following only certain HTML tags when doing a recursive
317retrieval. -G / --ignore-tags / ignore_tags = ... is just the
318opposite -- all tags but the ones you specify will be followed.
319
320** New --waitretry / waitretry = SECONDS option allows waiting between retries
321of failed downloads. Wget will use "linear" backoff, waiting 1 second after the
322first failure, 2 after the second, up to SECONDS. waitretry is set to 10 by
323default in the system wgetrc.
324
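The "linear" backoff amounts to the following arithmetic, shown here as a Python sketch (the function name is made up; the default of 10 mirrors the system wgetrc value mentioned above):

```python
def retry_wait(failure_count, waitretry=10):
    """Seconds to wait after the given number of consecutive failures."""
    return min(failure_count, waitretry)

# With waitretry = 3 the waits are 1, 2, 3, 3, 3, ...
waits = [retry_wait(n, 3) for n in range(1, 6)]
```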
325** New -p / --page-requisites / page_requisites = on option causes
326Wget to download all ancillary files necessary to display a given HTML
327page properly (e.g. inlined images).
328
329** New -E / --html-extension / html_extension = on option causes Wget
330to append ".html" to text/html filenames not ending in regexp
331"\.[Hh][Tt][Mm][Ll]?".
332
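A Python sketch of the naming rule, using the regexp quoted above anchored at the end of the name. The function name is hypothetical, and a real Content-Type header may carry parameters such as a charset, which this sketch ignores.

```python
import re

# The regexp from the entry above, anchored to the end of the file name.
HTML_SUFFIX = re.compile(r"\.[Hh][Tt][Mm][Ll]?$")

def local_name(name, content_type):
    """Append ".html" to text/html files that lack an HTML suffix."""
    if content_type == "text/html" and not HTML_SUFFIX.search(name):
        return name + ".html"
    return name
```

For example, `local_name("index.cgi", "text/html")` gives `"index.cgi.html"`, while `"page.HTM"` is left alone.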
333** New type of .wgetrc command -- "lockable Boolean". Can be set to on, off,
334always, or never. This allows the .wgetrc to override the commandline. So far,
335passive_ftp is the only .wgetrc command which takes a lockable Boolean.
336
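The entry does not spell out the precedence rules, so the following Python sketch rests on an assumption: "always" and "never" win over the command line, while "on" and "off" act as ordinary defaults that the command line may override. The function name and argument convention are made up.

```python
def resolve_lockable(rc_value, cli_value=None):
    """Combine a lockable Boolean from .wgetrc with a command-line choice.

    rc_value  -- "on", "off", "always" or "never"
    cli_value -- True, False, or None when the option was not given
    """
    if rc_value not in ("on", "off", "always", "never"):
        raise ValueError("invalid lockable Boolean: %r" % (rc_value,))
    if rc_value == "always":
        return True                 # locked on (assumed semantics)
    if rc_value == "never":
        return False                # locked off (assumed semantics)
    if cli_value is not None:
        return cli_value            # the command line overrides on/off
    return rc_value == "on"
```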
337** A number of new translation files have been added.
338
339** New --bind-address / bind_address = <address> option for people on hosts
340bound to multiple IP addresses.
341
342** wget now accepts (illegal per HTTP spec) relative URLs in HTTP redirects.
343
344
345* Wget 1.5.3 is a bugfix release with no user-visible changes.
346
347
348* Wget 1.5.2 is a bugfix release with no user-visible changes.
349
350
351* Wget 1.5.1 is a bugfix release with no user-visible changes.
352
353
354* Changes in Wget 1.5.0
355
356** Wget speaks many languages!
357
358On systems with gettext(), Wget will output messages in the language
359set by the current locale, if available. At this time we support
360Czech, German, Croatian, Italian, Norwegian and Portuguese.
361
362** Opie (Skey) is now supported with FTP.
363
364** HTTP Digest Access Authentication (RFC2069) is now supported.
365
366** The new `-b' option makes Wget go to background automatically.
367
368** The `-I' and `-X' options now accept wildcard arguments.
369
370** The `-w' option now accepts suffixes `s' for seconds, `m' for
371minutes, `h' for hours, `d' for days and `w' for weeks.
372
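The suffixes translate to seconds as in this Python sketch (not Wget's parser; the names are hypothetical, and a bare number is taken to mean seconds):

```python
# Seconds per unit for the suffixes listed above.
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}

def parse_wait(value):
    """Convert a `-w' argument such as "90", "2m" or "1w" to seconds."""
    unit = value[-1:]
    if unit in UNIT_SECONDS:
        return int(value[:-1]) * UNIT_SECONDS[unit]
    return int(value)               # no suffix: plain seconds
```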
373** Upon getting SIGHUP, the whole previous log is now copied to
374`wget-log'.
375
376** Wget now understands proxy settings with explicit usernames and
377passwords, e.g. `http://user:password@proxy.foo.com/'.
378
379** You can use the new `--cut-dirs' option to make Wget create fewer
380directories.
381
382** The `;type=a' appendix to FTP URLs is now recognized. For
383instance, the following command will retrieve the welcoming message in
384ASCII type transfer:
385
386 wget "ftp://ftp.somewhere.com/welcome.msg;type=a"
387
388** `--help' and `--version' options have been redone to conform to
389standards set by other GNU utilities.
390
391** Wget should now be compilable under MS Windows environment. MS
392Visual C++ and Watcom C have been used successfully.
393
394** If the file length is known, percentages are displayed during
395download.
396
397** The manual page, now hopelessly out of date, is no longer
398distributed with Wget.
399
400
401* Wget 1.4.5 is a bugfix release with no user-visible changes.
402
403
404* Wget 1.4.4 is a bugfix release with no user-visible changes.
405
406
407* Changes in Wget 1.4.3
408
409** Wget is now a GNU utility.
410
411** Can do passive FTP.
412
413** Reads .netrc.
414
415** Info documentation expanded.
416
417** Compiles on pre-ANSI compilers.
418
419** Global wgetrc now goes to /usr/local/etc (i.e. $sysconfdir).
420
421** Lots of bugfixes.
422
423
424* Changes in Wget 1.4.2
425
426** New mirror site at ftp://sunsite.auc.dk/pub/infosystems/wget/,
427thanks to Karsten Thygesen.
428
429** Mailing list! Mail to wget-request@sunsite.auc.dk to subscribe.
430
431** New option --delete-after for proxy prefetching.
432
433** New option --retr-symlinks to retrieve symbolic links like plain
434files.
435
436** rmold.pl -- script to remove files deleted on the remote server
437
438** --convert-links should work now.
439
440** Minor bugfixes.
441
442
443* Changes in Wget 1.4.1
444
445** Minor bugfixes.
446
447** Added -I (the opposite of -X).
448
449** Dot tracing is now customizable; try wget --dot-style=binary
450
451
452* Changes in Wget 1.4.0
453
454** Wget 1.4.0 [formerly known as Geturl] is an extensive rewrite of
455Geturl. Although many things look suspiciously similar, most of the
456stuff was rewritten, like recursive retrieval, HTTP, FTP and mostly
457everything else. Wget should now be easier to debug, maintain and,
458most importantly, use.
459
460** Recursive HTTP should now work without glitches, even with Location
461changes, server-generated directory listings and other naughty stuff.
462
463** HTTP regetting is supported on servers that support Range
464specification. WWW authorization is supported -- try
465wget http://user:password@hostname/
466
467** FTP support was rewritten and widely enhanced. Globbing should now
468work flawlessly. Symbolic links are created locally. All the
469information the Unix-style ls listing can give is now recognized.
470
471** Recursive FTP is supported, e.g.
472 wget -r ftp://gnjilux.cc.fer.hr/pub/unix/util/
473
474** You can specify "rejected" directories, which you do not want to
475enter, e.g. with wget -X /pub
476
477** Time-stamping is supported, with both HTTP and FTP. Try wget -N URL.
478
479** A new texinfo reference manual is provided. It can be read with
480Emacs, standalone info, or converted to HTML, dvi or postscript.
481
482** Fixed a long-standing bug, so that Wget now works over SLIP
483connections.
484
485** You can have a system-wide wgetrc (/usr/local/lib/wgetrc by
486default). Settings in $HOME/.wgetrc override the global ones, of
487course :-)
488
489** You can set up quota in .wgetrc to prevent sucking too much
490data. Try `quota = 5M' in .wgetrc (or quota = 100K if you want your
491sysadmin to like you).
492
493** Download rate is printed after retrieval.
494
495** Wget now sends the `Referer' header when retrieving
496recursively.
497
498** With the new --no-parent option Wget can retrieve FTP recursively
499through a proxy server.
500
501** HTML parser, as well as the whole of Wget, was rewritten to be much
502faster and less memory-consuming (yes, both).
503
504** Absolute links can be converted to relative links locally. Check
505wget -k.
506
507** Wget catches hangup, filtering the output to a log file and
508resuming work. Try kill -HUP %?wget.
509
510** User-defined headers can be sent. Try
511
512 wget http://fly.cc.her.hr/ --header='Accept-Charset: iso-8859-2'
513
514** Acceptance/Rejection lists may contain wildcards.
515
516** Wget can display HTTP headers and/or FTP server response with the
517new `-S' option. It can save the original HTTP headers with `-s'.
518
519** socks library is now supported (thanks to Antonio Rosella
520<Antonio.Rosella@agip.it>). Configure with --with-socks.
521
522** There is a nicer display of REST-ed output.
523
524** Many new options (like -x to force directory hierarchy, or -m to
525turn on mirroring options).
526
527** Wget is now distributed under GNU General Public License (GPL).
528
529** Lots of small features I can't remember. :-)
530
531** A host of bugfixes.
532
533
534* Changes in Geturl 1.3
535
536** Added FTP globbing support (ftp://fly.cc.fer.hr/*)
537
538** Added support for no_proxy
539
540** Added support for ftp://user:password@host/
541
542** Added support for %xx in URL syntax
543
544** More natural command-line options
545
546** Added -e switch to execute .geturlrc commands from the command-line
547
548** Added support for robots.txt
549
550** Fixed some minor bugs
551
552
553* Geturl 1.2 is a bugfix release with no user-visible changes.
554
555
556* Changes in Geturl 1.1
557
558** REST supported in FTP
559
560** Proxy servers supported
561
562** GNU getopt used, which enables command-line arguments to be ordered
563as you wish, e.g. geturl http://fly.cc.fer.hr/ -vo log is the same as
564geturl -vo log http://fly.cc.fer.hr/
565
566** Netscape-compatible URL syntax for HTTP supported: host[:port]/dir/file
567
568** NcFTP-compatible colon URL syntax for FTP supported: host:/dir/file
569
570** <base href="xxx"> supported
571
572** autoconf supported
573
574
575----------------------------------------------------------------------
576Copyright information:
577
578Copyright (C) 2005 Free Software Foundation, Inc.
579
580 Permission is granted to anyone to make or distribute verbatim
581 copies of this document as received, in any medium, provided that
582 the copyright notice and this permission notice are preserved, thus
583 giving the recipient permission to redistribute in turn.
584
585 Permission is granted to distribute modified versions of this
586 document, or of portions of it, under the above conditions,
587 provided also that they carry prominent notices stating who last
588 changed them.