Downloading files with wget

The wget command is an internet file downloader that can download anything from files and web pages all the way through to entire websites.

Basic Usage

The wget command is in the format of:

wget [options] url

For example, in its most basic form, the command looks like this:

wget http://www.domain.com/filename.zip

This will download the filename.zip file from www.domain.com and place it in your current directory.

Redirecting Output

The -O option sets the output file name. If the file was called filename-4.0.1.zip and you wanted to save it directly to filename.zip you would use a command like this:

wget -O filename.zip http://www.domain.com/filename-4.0.1.zip

The wget program supports several protocols, the most common being http://, https:// and ftp://.

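Downloading in the background

If you want to download a large file and then close your connection to the server, use the -b option to run wget in the background. Progress is written to a log file (wget-log by default) instead of the terminal: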

wget -b url
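
If you would rather choose the log file yourself, the -o option names it. The following is a minimal sketch: background_download is a hypothetical helper name, download.log is an assumed default, and the WGET variable is an illustrative addition so that the command can be previewed with WGET=echo instead of actually run.

```shell
# WGET is an illustrative override: set WGET=echo to print the
# arguments instead of starting a real download.
WGET=${WGET:-wget}

# background_download URL [LOGFILE]   (hypothetical helper name)
# Starts the download in the background and writes progress to LOGFILE.
background_download() {
    url=$1
    logfile=${2:-download.log}   # assumed default log name
    "$WGET" -b -o "$logfile" "$url"
}

# Example:
#   background_download http://www.domain.com/filename.zip
#   tail -f download.log   # follow progress; Ctrl-C stops tail, not wget
```

Because the download keeps running after you log out, the log file is the only place to see whether it finished.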

Downloading Multiple Files

If you want to download multiple files, you can create a text file listing the target URLs, one per line. You would then run the command:

wget -i filename.txt
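
As a rough sketch, the list file can be created straight from the shell. The three URLs below are placeholders, not real files, and the wget call itself is left as a comment so the snippet can be tried without downloading anything.

```shell
# Write a hypothetical list of targets, one URL per line.
cat > filename.txt <<'EOF'
http://www.domain.com/file1.zip
http://www.domain.com/file2.zip
http://www.domain.com/file3.zip
EOF

# Sanity check: count the non-empty lines before starting the download.
url_count=$(grep -c . filename.txt)
echo "filename.txt lists $url_count URLs"

# The download itself would then be:
#   wget -i filename.txt
```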

You can also do this with an HTML file. If you have an HTML file on your server and you want to download all the links within that page, you need to add --force-html to your command.

To use this, all the links in the file must be full (absolute) links. If they are relative links, you will need to add a <base> tag such as <base href="/support/knowledge_base/"> to the HTML file before running the command:

wget --force-html -i filename.html

Limiting the download speed

Usually, you want your downloads to be as fast as possible. However, if you want to continue working while downloading, you want the speed to be throttled.

To do this use the --limit-rate option. You would use it like this:

wget --limit-rate=200k http://www.domain.com/filename.tar.gz
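
To get a feel for what a limit means in practice, the arithmetic can be sketched in the shell. The 1 GiB file size is an assumed example, and the sketch assumes the k suffix stands for 1024 bytes.

```shell
# Assumed inputs: a 1 GiB file and the 200k limit used above.
size_kib=$((1024 * 1024))   # 1 GiB expressed in KiB
rate_kib=200                # --limit-rate=200k, assuming k = 1024 bytes

# Integer division, so both results are rounded down.
seconds=$((size_kib / rate_kib))
minutes=$((seconds / 60))
echo "roughly $minutes minutes at ${rate_kib}K per second"
```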

Continuing a failed download

If you are downloading a large file and it fails part way through, you can continue the download in most cases by using the -c option.

For example:

wget -c http://www.domain.com/filename.tar.gz

Without the -c option, restarting a download of the same file starts again from the beginning and saves the new copy with a number, starting with .1, appended to the file name.

Checking if remote files exist before a scheduled download

If you want to schedule a large download ahead of time, it is worth checking that the remote files exist. The option to run a check on files is --spider.

In this situation you will usually have a text file containing the list of files to download. To check every file in that list, run:

wget --spider -i filename.txt

However, if it is just a single file you want to check, use this form:

wget --spider http://www.domain.com/filename.tar.gz
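
Because wget exits with status 0 on success and a non-zero status on failure, the check can be used in a script ahead of the real download. The following is a minimal sketch: remote_exists and check_list are hypothetical helper names, and the WGET variable is an illustrative addition that lets you substitute another command while testing.

```shell
# WGET is an illustrative override, e.g. WGET=echo for a dry run.
WGET=${WGET:-wget}

# remote_exists URL   (hypothetical helper)
# Succeeds when wget --spider finds the URL; -q silences the output.
remote_exists() {
    "$WGET" -q --spider "$1"
}

# check_list FILE   (hypothetical helper)
# Prints every URL in FILE that fails the check; blank lines are skipped.
check_list() {
    while IFS= read -r url; do
        [ -n "$url" ] || continue
        remote_exists "$url" || echo "missing: $url"
    done < "$1"
}

# Example:
#   check_list filename.txt
#   remote_exists http://www.domain.com/filename.tar.gz && echo "ready"
```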

Copy an entire website

If you want to copy an entire website, use the --mirror option. Because mirroring can be a complicated task, you may also need other options such as -p, -P, --convert-links, --reject and --user-agent.

-p This option downloads all the additional files needed to view the page, such as CSS files and images.
-P This option sets the download directory. Example: -P downloaded
--convert-links This option fixes the links in the downloaded files. For example, links that refer to other downloaded files are changed to point to the local copies.
--reject This option prevents certain file types from downloading. For instance, if you wanted all files except Flash video files (flv), you would use --reject=flv
--user-agent This option is for when a site has protection in place to prevent scraping. It sets your user agent so that the request looks like it comes from a normal web browser and not from wget.

Using all these options to download a website would look like this:

wget --mirror -p --convert-links -P ./local-dir --user-agent="Mozilla/5.0 (Windows NT 6.3; WOW64; rv:40.0)" http://www.domain.com/

TIP: Being Nice

It is always best to ask permission before downloading a site belonging to someone else. Even if you have permission, it is good to play nice with their server. These two additional options will help ensure you don’t harm their server while downloading.

--wait=15 --limit-rate=50K

This will wait 15 seconds between retrievals and limit the download speed to 50 KB per second.
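
Combined with the mirror options from the previous section, a polite mirror could look like the sketch below. polite_mirror is a hypothetical helper name, and the WGET variable is an illustrative addition so the full command can be previewed with WGET=echo before it touches anyone's server.

```shell
# WGET is an illustrative override: WGET=echo prints the arguments
# instead of starting the mirror.
WGET=${WGET:-wget}

# polite_mirror URL DIR   (hypothetical helper)
# Mirrors URL into DIR, pausing between requests and capping bandwidth.
polite_mirror() {
    "$WGET" --mirror -p --convert-links \
        --wait=15 --limit-rate=50K \
        -P "$2" "$1"
}

# Example:
#   polite_mirror http://www.domain.com/ ./local-dir
```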

Downloading using FTP

If you want to download a file via FTP and a username and password are required, then you will need to use the --ftp-user and --ftp-password options.

An example of this might look like:

wget --ftp-user=USERNAME --ftp-password=PASSWORD ftp://ftp.domain.com/filename.tar.gz

Retry

If you are getting failures during a download, you can use the -t option to set the number of retries. Such a command may look like this:

wget -t 50 http://www.domain.com/filename.tar.gz

You could also set it to infinite retries using -t inf.
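
Retries pair naturally with the -c option described earlier, so that rerunning the command later resumes the partial file. The sketch below also uses --waitretry, which makes wget pause between retries of a failed download, backing off up to the given number of seconds. resilient_download is a hypothetical helper name, the default of 50 tries and the 10 second ceiling are assumed values, and WGET is an illustrative override for dry runs.

```shell
# WGET is an illustrative override, e.g. WGET=echo for a dry run.
WGET=${WGET:-wget}

# resilient_download URL [TRIES]   (hypothetical helper)
# TRIES defaults to 50; pass "inf" to retry indefinitely.
resilient_download() {
    "$WGET" -c -t "${2:-50}" --waitretry=10 "$1"
}

# Example:
#   resilient_download http://www.domain.com/filename.tar.gz
#   resilient_download http://www.domain.com/filename.tar.gz inf
```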

Recursive down to level X

The -r option downloads a site recursively, and the -l option sets how many levels deep wget will follow links.

For example, if you wanted only the first level of a website you would use:

wget -r -l1 http://www.example.com/
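
For other values of X, the depth can be passed as a parameter. This is a minimal sketch: fetch_levels is a hypothetical helper name, the whole-number check is an added safeguard rather than anything wget requires, and WGET is an illustrative override for dry runs.

```shell
# WGET is an illustrative override, e.g. WGET=echo for a dry run.
WGET=${WGET:-wget}

# fetch_levels URL DEPTH   (hypothetical helper)
# Recursively downloads URL, following links at most DEPTH levels deep.
fetch_levels() {
    case $2 in
        ''|*[!0-9]*)
            echo "depth must be a whole number" >&2
            return 1
            ;;
    esac
    "$WGET" -r -l "$2" "$1"
}

# Example:
#   fetch_levels http://www.example.com/ 2
```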

Setting the username and password for authentication

If you need to authenticate an HTTP request, use the --http-user and --http-password options:

wget --http-user=USERNAME --http-password=PASSWORD http://domain.com/filename.html

wget is a very complete downloading utility with many more options, which can be combined in multiple ways to achieve a specific task. For more details, use the man wget command to bring up the wget manual. The wget manual is also available online in webpage format.

Updated on November 27, 2018
