The process for viewing webpages is a matter of downloading files that a browser then interprets. From that perspective, HTTP is just another file transfer function, like FTP. If that's true, what other tools support similar file transfers from web servers or other remote systems? The answer includes cURL and Wget.
This article examines cURL and Wget, provides syntax examples, suggests use cases and notes the benefits of each. These tools are mostly used on Linux systems, but network administrators can also use them on Windows and macOS.
What is cURL?
The client URL (cURL) tool is flexible, enabling file transfers to and from remote systems using a wide variety of protocols.
Before looking at basic syntax, let's start with installing cURL.
Many Linux distributions ship with cURL pre-installed. Use the system's package manager to add it if necessary.
On Red Hat and similar distributions, use the following command:
dnf install curl
Users of Debian-derived distributions can run the following command:
apt install curl
Most Macs already have cURL, but users can add or upgrade it with Brew by typing the following command:
brew install curl
Windows users can add cURL to their systems by installing one of the Windows builds published by the cURL project. Note that Windows 10 and 11 include cURL already.
Users might also need to add the libcurl library to these systems if cURL is not installed by default.
Basic cURL syntax
Most command-line users are familiar with the basic cURL syntax. Type the command, followed by one or more options, and the path to the file to download.
curl -options /path/to/file
Users can also specify a protocol, as seen in a later section. The default protocol is HTTP.
Download a file using cURL
The two primary options for file downloads using the curl command are -o and -O. The -o option saves the file to a location and name the user specifies. The -O option saves the file to the present working directory using the remote file's name.
For example, suppose a user wants to download a file named projects.pdf from a fictitious website called project-products.com. The PDF is stored in a folder named 2023. The initial command looks like this:
curl -O http://project-products.com/2023/projects.pdf
CURL downloads the file to the user's present working directory -- type the pwd command to check the location -- and names it projects.pdf.
However, the user could rename the file during the transfer and place it in a specific directory using the -o option. Here's what the command might look like if a user wanted to give the file a new name and save it to the goals directory in a home folder:
curl -o /home/user/goals/new-projects.pdf http://project-products.com/2023/projects.pdf
Users likely want to download more than one file from this site. Suppose a user has the original scenario but needs to grab two PDFs instead of projects.pdf. The command might look like the following:
curl -O http://project-products.com/2023/projects1.pdf -O http://project-products.com/2023/projects2.pdf
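When the list of files grows, a short loop keeps the command readable. The following is a minimal sketch rather than a definitive script: it reuses the fictitious project-products.com site, the goals directory is a placeholder, and the run helper and DRY_RUN variable are illustrative names that print each command by default instead of contacting the server.

```shell
#!/bin/sh
# Sketch only: run() and DRY_RUN are illustrative helpers, not part of cURL.
# With DRY_RUN=1 (the default here) each command is printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

BASE_URL="http://project-products.com/2023"   # the article's fictitious site
DEST_DIR="${DEST_DIR:-/home/user/goals}"      # placeholder destination

for name in projects1.pdf projects2.pdf; do
    # -o sets the destination directory and file name for each download
    run curl -o "$DEST_DIR/$name" "$BASE_URL/$name"
done
```

Run it once to review the printed commands, and then rerun it with DRY_RUN=0 to perform the downloads.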
One of cURL's strengths is the wide variety of protocols it supports. Some administrators prefer FTP for transferring large files, so users might want to use it rather than HTTP. Just specify the protocol in the URL, as in the following command:
curl -u username:password -O ftp://project-products.com/2023/projects.pdf
FTP might require authentication, so use the -u option to specify a username and password the remote system recognizes.
Upload a file with cURL
Users can also upload files to a remote location with cURL. This might be helpful with log files or other transfers.
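The -d option sends data in an HTTP POST request, and the @ prefix tells cURL to read that data from a local file. Because -d strips carriage returns and newlines from the file, use the related --data-binary option for files that must arrive unmodified, such as PDFs. Add the local path to the file being sent to the remote server, and then add the remote server's address. The following command sends projects.pdf to the specified location on the HTTP server:
curl --data-binary @projects.pdf http://project-products.com/uploads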
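The -d option applies to HTTP POST requests rather than FTP. For an FTP upload, use the -T option with the local file name, and end the URL with a slash so cURL keeps that name on the server, as seen here:
curl -u username:password -T projects.pdf ftp://project-products.com/uploads/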
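The right upload option depends on the protocol. As a rough sketch, the hypothetical upload_cmd function below picks --data-binary for HTTP and -T for FTP, and prints the resulting command. The function name and the URLs are assumptions for illustration only.

```shell
#!/bin/sh
# Sketch only: upload_cmd is a hypothetical helper that prints a curl
# command suited to the URL scheme. It does not contact any server.
upload_cmd() {
    file=$1
    url=$2
    case "$url" in
        http://*|https://*)
            # POST the file byte for byte as the request body
            echo "curl --data-binary @$file $url" ;;
        ftp://*|ftps://*)
            # -T uploads the file to the FTP server
            echo "curl -T $file $url" ;;
        *)
            echo "unsupported URL: $url" >&2
            return 1 ;;
    esac
}

upload_cmd projects.pdf http://project-products.com/uploads
upload_cmd projects.pdf ftp://project-products.com/uploads/
```

Whether the HTTP server accepts a raw POST body depends on how the receiving application is written, so treat the HTTP form as a starting point.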
Additional cURL options
CURL is a flexible tool that has many options to cover users' needs. Here are a few more examples of less common features.
The --compressed option requests that the transfer be compressed at the source and uncompressed at the destination. Use this option when bandwidth is limited.
Use the -C option to resume a file transfer that was interrupted by a network or power outage or something similar. The option takes a byte offset; specify -C - to have cURL work out the offset from the partially downloaded file. The transfer then continues into the existing file rather than starting over.
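Resuming fits naturally into a retry loop. The sketch below assumes a hypothetical fetch_with_resume function and the CURL, MAX_TRIES and RETRY_DELAY variables, none of which come from cURL itself. It reruns the download with -C - until it succeeds or the attempts run out. cURL also has a built-in --retry option that might be enough in simpler cases.

```shell
#!/bin/sh
# Sketch only: all names below are illustrative.
CURL="${CURL:-curl}"              # command to run; can be overridden
MAX_TRIES="${MAX_TRIES:-5}"       # give up after this many attempts
RETRY_DELAY="${RETRY_DELAY:-10}"  # seconds to wait after a failed attempt

fetch_with_resume() {
    url=$1
    attempt=1
    while [ "$attempt" -le "$MAX_TRIES" ]; do
        # "-C -" lets curl work out the resume offset from the partial file
        if "$CURL" -C - -O "$url"; then
            return 0
        fi
        echo "attempt $attempt failed" >&2
        attempt=$((attempt + 1))
        sleep "$RETRY_DELAY"
    done
    return 1
}

# Example call (commented out because the site is fictitious):
# fetch_with_resume http://project-products.com/2023/projects.pdf
```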
CURL's benefits include the following:
- Supports many protocols.
- Available for many platforms.
- Transfers files up and down.
- Supports compression options.
Manual file transfers, like the ones described above, are certainly an option. Imagine, however, the potential of adding the curl command to a script.
Maybe an orchestrated development process includes automated nightly builds at the headquarters office that must be transferred to remote offices. A curl command-based script could streamline this process. Another use case involves VM images that must be distributed to other sites. Or perhaps the remote servers need to transfer log files or other data to a central storage repository at the HQ site. CURL is an option for those scenarios, too.
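As a rough illustration of the nightly build scenario, the following sketch prints one upload command per remote office. Everything specific in it is an assumption: the build file name, the office host names, the /builds/ path and the plan_uploads function are placeholders, and FTP is only one of the protocols that could be used.

```shell
#!/bin/sh
# Sketch only: the build file, office hosts and /builds/ path are
# placeholders, not values from any real environment.
BUILD_FILE="${BUILD_FILE:-nightly-build.tar.gz}"
OFFICES="${OFFICES:-office1.example.com office2.example.com}"

# Print one upload command per remote office.
# -T uploads the file; the trailing slash keeps the local file name.
plan_uploads() {
    for host in $OFFICES; do
        printf 'curl -T %s ftp://%s/builds/\n' "$BUILD_FILE" "$host"
    done
}

plan_uploads
```

The printed commands can be reviewed and then run from a scheduler, such as cron. Credentials would still need to be supplied, for example with the -u option.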
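The following table reviews the cURL options discussed above.
|Option |Description
|-O |Download resources to the current directory.
|-o |Download resources to a specified directory.
|-u |Specify a username and password for FTP.
|-d (HTTP POST) or -T (FTP) |Upload a file.
|--compressed |Compress the transfer.
|-C |Resume an interrupted transfer.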
What is Wget?
The Wget utility has a more specific purpose compared with cURL. It's meant for downloading files from web servers using the most common protocols: HTTP, HTTPS, FTP and FTPS. Users might retrieve single files, groups of files or even tar archives.
Wget is less likely to be pre-installed on a Linux system than cURL. Windows and macOS don't include it by default, either. It's easy to add Wget to these and other platforms.
On a Linux computer, use the system's package manager to add Wget.
On Red Hat and similar distributions, use the following command:
dnf install wget
Users of Debian-derived distributions can type the following command instead:
apt install wget
Add or upgrade Wget on a Mac using Brew by typing the following:
brew install wget
Windows users can download the Wget executable file (wget.exe). To run the application from any command prompt, place wget.exe in a folder of choice, and add that folder to the PATH environment variable. An alternative is to put it in the C:\Windows\System32 directory, though this is not considered a best practice.
Basic Wget command syntax
Let's start with the simple goal of downloading a file from a website -- perhaps the latest version of an application, VM image, Dockerfile, configuration file or archive. Currently, many vendors make the newest version of their products available as a simple download along a consistent path.
The syntax is as follows:
command -options path
A basic download from a pretend site looks like the following:
wget http://project-products.com/2023/projects.pdf
Wget downloads the PDF to the current directory.
By default, Wget pulls files from the specified URL and places them in the current working directory. Users can specify a different destination location by using the -P option followed by the folder to store the downloaded file.
For example, to retrieve a file from a site and place it in the /projects/updates directory, type the following:
wget -P /projects/updates http://project-products.com/2023/projects.pdf
This option is helpful if users script Wget to pull multiple files and place them in a different location from where the script itself resides.
Wget normally uses the target file's original name, but users can rename the file during the download by using the -O option. Here's what that command looks like:
wget -O 2023projects.pdf http://project-products.com/2023/projects.pdf
This option makes it easier to track the contents of retrieved files when vendors use generic names, such as latest.doc or current.zip, for their resources.
One particularly useful Wget feature is the ability to specify multiple target sites in a text file. Wget can then process the file and grab the resources from each site. Use the -i option to designate the input file.
The retrieve-resources.txt text file lists the resources to download, one URL per line.
Use the following command to reference the file and download the specified resources:
wget -i retrieve-resources.txt
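The input file can be created by hand or by a script. The following minimal sketch writes a two-line list and counts its entries before Wget is run. The URLs reuse the article's fictitious site as placeholders, and the LIST_FILE and url_count names are illustrative.

```shell
#!/bin/sh
# Sketch only: the URLs are placeholders on the article's fictitious site.
LIST_FILE="${LIST_FILE:-retrieve-resources.txt}"

# Write one URL per line; wget -i reads them in order.
cat > "$LIST_FILE" <<'EOF'
http://project-products.com/2023/projects1.pdf
http://project-products.com/2023/projects2.pdf
EOF

# Count the entries before handing the list to wget
url_count=$(grep -c '^http' "$LIST_FILE")
echo "$url_count URLs listed in $LIST_FILE"

# Then run: wget -i "$LIST_FILE"
```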
Use Wget with FTP
While the assumed protocol is HTTP or HTTPS, users can specify FTP to connect to FTP servers for file downloads. Recall that FTP might require authentication, so users might need to provide a username and password recognized by the remote FTP server, as seen here:
wget --ftp-user=NAME --ftp-password=PASSWORD ftp://project-products.com/2023/projects.pdf
Be careful about putting sensitive authentication information in a text file. Carefully consider the options before calling FTP downloads in text files using the -i option mentioned above.
One of Wget's strengths is its power and flexibility to retrieve files recursively from the source web server. It has many options and combinations of flags. The basic recursive command -r is below -- assume the 2023 directory contains many files:
wget -r http://project-products.com/2023/
Users likely want to add the -np (no parent) option; otherwise, Wget can follow links up into the parent directory and download its contents. Modify the above command to the following:
wget -np -r http://project-products.com/2023/
Additional options exist to specify the recursive retrieval level, or the number of subdirectories to retrieve. Other flags enable users to exclude files they don't want to download.
|Option |Description
|-R (file) |Specify a file to exclude from the retrieval.
|-l (number) |Specify a number of levels to recursively retrieve from by replacing (number) with a quantity of levels.
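The recursive options are usually combined. The sketch below prints one such combination, using Wget's -l option to limit the depth and -R to reject files by pattern. The depth of 2, the *.iso pattern and the mirror_cmd function are assumptions chosen for illustration.

```shell
#!/bin/sh
# Sketch only: the depth, the reject pattern and the URL are examples.
DEPTH="${DEPTH:-2}"
REJECT="${REJECT:-*.iso}"
URL="http://project-products.com/2023/"

# -r  recurse             -np  never ascend to the parent directory
# -l  limit the depth     -R   skip files matching the pattern
mirror_cmd() {
    printf "wget -r -np -l %s -R '%s' %s\n" "$DEPTH" "$REJECT" "$URL"
}

mirror_cmd
```

The pattern is quoted in the printed command so that the shell passes it to Wget unchanged instead of expanding it against local files.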
The following table reviews the primary options discussed above.
|Option |Description
|(none) |Download the specified file to the current directory.
|-P |Specify a destination directory different from the current directory.
|-O |Specify a new file name for the retrieved file.
|-i |Get a list of files to retrieve from a specified text file.
Additional Wget options
Users might find the following additional options helpful in specific cases.
Set the Wget file transfer to run in the background using the -b option. If it's a time-consuming download, check the log file with the second command shown below:
wget -b http://project-products.com/2023/vm-image-latest.iso
tail -f wget-log
Wget is an older utility from the days of less reliable network connections. One feature is its ability to retry interrupted download attempts. The default retry value is 20, but users can adjust this value with the --tries= option. Specify a number of attempts, or use 0 or inf (infinity) to retry indefinitely. The following example sets 42 attempts:
wget --tries=42 http://project-products.com/2023/projects.pdf
Similarly, use the -c option to continue an interrupted download from where it stopped. Without -c, restarting the download creates a new file with .1 appended to the file name because the partially downloaded file already exists.
wget -c http://project-products.com/2023/vm-image-latest.iso
Another concern with large files is controlling bandwidth consumption. Set Wget to consume only a portion of available bandwidth by using the --limit-rate= option. A limit of 750 KB per second looks like this:
wget --limit-rate=750k http://project-products.com/2023/vm-image-latest.iso
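These options can be combined for a large download. The following sketch prints one such command. The RATE, TRIES and LOG_FILE variables and the background_download_cmd function are illustrative names, and the URL is the article's fictitious one.

```shell
#!/bin/sh
# Sketch only: variable and function names are illustrative.
RATE="${RATE:-750k}"
TRIES="${TRIES:-42}"
LOG_FILE="${LOG_FILE:-download.log}"

# -b  run in the background      -c  continue a partial download
# -o  write messages to LOG_FILE instead of the default wget-log
background_download_cmd() {
    printf 'wget -b -c --tries=%s --limit-rate=%s -o %s %s\n' \
        "$TRIES" "$RATE" "$LOG_FILE" "$1"
}

background_download_cmd http://project-products.com/2023/vm-image-latest.iso
```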
Various other Wget options exist, but these should be plenty to start.
Common use cases for both tools
Both cURL and Wget are useful for file transfers. Other use cases include the following:
- Download the latest image of containers or VMs.
- Download specific OS ISO images.
- Download software packages, such as Microsoft .msi, Brew .sh or Linux .rpm/.deb packages.
- Download configuration files from a central repository.
- Mirror important websites for air-gapped or isolated networks.
- Upload -- in the case of cURL -- any of the above examples to remote destination servers.
Any of these examples might use a public internet-based web server as a source or an internal private web server. In other words, users can easily integrate cURL or Wget into their organizations' file distribution system.
So, how do these two tools compare with each other? Here are a few key points:
- CURL supports more protocols and runs on more host platforms.
- CURL uploads and downloads resources, whereas Wget primarily downloads files.
- Wget is a simple executable on the system, whereas cURL is a more complete application with a supporting library.
- Wget only uploads using a limited HTTP POST feature, while cURL is a better choice for pushing files to remote locations.
- Wget's ability to retrieve resources recursively is beneficial and something cURL lacks.
It's worth noting that cURL uses a license similar to the MIT license and is free and open source software. Wget is a GNU utility and relies on the GNU General Public License for licensing.
Wget is oriented more toward retrieving webpages or websites, while cURL is a do-it-all file transfer tool. Evaluate the above comparison, and pick the right tool for the job. Also, consider how to integrate the tools into an automation scheme for greater efficiency.