Very Advanced Usage (GNU Wget 1.21.1-dirty Manual)

7.3 Very Advanced Usage

  • If you wish Wget to keep a mirror of a page (or FTP subdirectories), use ‘--mirror’ (‘-m’), which is the shorthand for ‘-r -l inf -N’. You can put Wget in the crontab file asking it to recheck a site each Sunday:

    crontab
    0 0 * * 0 wget --mirror https://www.gnu.org/ -o /home/me/weeklog
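
The five leading fields of a crontab entry are minute, hour, day of month, month, and day of week, so ‘0 0 * * 0’ means midnight every Sunday. As a minimal sketch, the entry can be generated rather than typed by hand. The function name mirror_cron_line is hypothetical, and the URL and log path are only placeholders to replace with your own:

```shell
#!/bin/sh
# Hypothetical helper (not part of Wget): print a weekly cron entry
# for a mirror job.
# Cron fields: minute hour day-of-month month day-of-week command
mirror_cron_line() {
    url=$1   # site to mirror (placeholder supplied by the caller)
    log=$2   # log file handed to wget -o
    printf '0 0 * * 0 wget --mirror %s -o %s\n' "$url" "$log"
}

# Print the entry for review; paste it into the editor opened by `crontab -e`.
mirror_cron_line "https://www.gnu.org/" "/home/me/weeklog"
```

Printing the line instead of installing it keeps the existing crontab untouched until you have checked the entry.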

  • In addition to the above, you want the links to be converted for local viewing. But, after having read this manual, you know that link conversion doesn’t play well with timestamping, so you also want Wget to back up the original HTML files before the conversion. The Wget invocation would look like this:

    wget --mirror --convert-links --backup-converted \
         https://www.gnu.org/ -o /home/me/weeklog
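
With ‘--backup-converted’, Wget keeps the pre-conversion copy of each converted file next to it under a ‘.orig’ suffix. The following sketch lists those backups after a run. The function name list_orig_backups is hypothetical, and the directory name www.gnu.org is an assumption: it is the host-named directory a recursive download creates by default, which will differ if you changed the directory options.

```shell
#!/bin/sh
# Hypothetical helper (not part of Wget): list the ".orig" backups
# that -K / --backup-converted leaves beside converted files.
list_orig_backups() {
    find "$1" -type f -name '*.orig' | sort
}

# Assumed mirror directory; only inspect it if it exists.
if [ -d www.gnu.org ]; then
    list_orig_backups www.gnu.org
fi
```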

  • But you’ve also noticed that local viewing doesn’t work all that well when HTML files are saved under extensions other than ‘.html’, perhaps because they were served as index.cgi. So you’d like Wget to rename all the files served with content-type ‘text/html’ or ‘application/xhtml+xml’ to name.html. (‘--html-extension’ is the older name of ‘--adjust-extension’; Wget still accepts it, and the short form ‘-E’ is the same for both.)

    wget --mirror --convert-links --backup-converted \
         --html-extension -o /home/me/weeklog        \
         https://www.gnu.org/

    Or, with less typing:

    wget -m -k -K -E https://www.gnu.org/ -o /home/me/weeklog
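
If you run this mirror job for more than one site, the short-option form can be wrapped in a small function. This is only a sketch: mirror_site and the DRY_RUN variable are hypothetical names, not Wget features. By default the function prints the command it would run, so you can check it before setting DRY_RUN=0.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of Wget) around the short-option form:
#   -m  --mirror            (-r -l inf -N)
#   -k  --convert-links
#   -K  --backup-converted
#   -E  --html-extension / --adjust-extension
mirror_site() {
    url=$1
    log=${2:-weeklog}        # default log name is an assumption
    set -- wget -m -k -K -E "$url" -o "$log"
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"            # display only; quoting is not preserved
    else
        "$@"                 # run wget with the exact argument list
    fi
}

mirror_site "https://www.gnu.org/" "/home/me/weeklog"
```

Building the argument list once with ‘set --’ means the dry-run output and the real invocation cannot drift apart.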