URL (Uniform Resource Locator)
What is a URL?
A URL (Uniform Resource Locator) is a unique identifier used to locate a resource on the Internet. It is also referred to as a web address. URLs consist of multiple parts -- including a protocol and domain name -- that tell a web browser how and where to retrieve a resource.
End users use URLs by typing them directly into a browser's address bar or by clicking a hyperlink on a webpage, in a bookmark list, in an email or in another application.
How is a URL structured?
The URL contains the name of the protocol needed to access a resource, as well as a resource name. The first part of a URL identifies what protocol to use as the primary access medium. The second part identifies the IP address or domain name -- and possibly subdomain -- where the resource is located.
URL protocols include HTTP (Hypertext Transfer Protocol) and HTTPS (HTTP Secure) for web resources, mailto for email addresses, FTP for files on a File Transfer Protocol (FTP) server, and telnet for a session to access remote computers. Most URL protocols are followed by a colon and two forward slashes; mailto is followed only by a colon.
Optionally, after the domain, a URL can also specify a port, a path to a specific file or location on the server, a query string with parameters, and a fragment that points to a section within the page.
Importance of URL design
URLs can only be sent over the internet using the ASCII character set. Because URLs often contain characters outside that set, they must be converted into a valid ASCII format. URL encoding replaces non-ASCII and unsafe ASCII characters with one or more escapes, each consisting of a "%" followed by two hexadecimal digits. URLs cannot contain spaces; a space is typically encoded as %20 or, in query strings, as a plus sign (+).
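The encoding step described above can be sketched with Python's standard urllib.parse module. The file name used here is a hypothetical example, not something taken from a real site; it was chosen only because it contains both a space and a non-ASCII character.

```python
from urllib.parse import quote, unquote

# Hypothetical path segment with a space and a non-ASCII character.
raw_segment = "café menu.pdf"

# quote() keeps letters, digits and a few safe marks as they are and
# percent-encodes everything else. Non-ASCII characters are first
# converted to UTF-8 bytes, so one character can become several escapes.
encoded = quote(raw_segment)
print(encoded)  # caf%C3%A9%20menu.pdf

# unquote() reverses the transformation.
decoded = unquote(encoded)
print(decoded == raw_segment)  # True
```

Note that the single character é becomes two escapes (%C3%A9) because it occupies two bytes in UTF-8, while the space becomes %20.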
When designing URLs, there are different theories about how to make the syntax most usable for readers and archivists. For example, in the URL's path, dates, authors, and topics can be included in a section referred to as the "slug." Consider, for example, the URL for this definition:
https://www.techtarget.com/searchnetworking/definition/URL
Look past the protocol (identified as HTTPS) and the host name (www.techtarget.com), and the file path includes two directories (searchnetworking and definition) followed by the title of the definition (URL).
Additionally, some URL designers choose to include the date of the post in the path, typically in the form YYYY/MM/DD.
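As a rough illustration of this kind of URL design, the sketch below builds a path that contains a date and a slug. The function names (slugify, dated_path), the "blog" section and the sample title are all hypothetical; real publishing systems apply their own rules.

```python
import re
from datetime import date

def slugify(title: str) -> str:
    """Lowercase the title and join its alphanumeric runs with hyphens."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)

def dated_path(section: str, published: date, title: str) -> str:
    """Build a path of the form /section/YYYY/MM/DD/slug."""
    return f"/{section}/{published:%Y/%m/%d}/{slugify(title)}"

# Hypothetical post used only to show the resulting shape.
path = dated_path("blog", date(2024, 3, 5), "What Is a URL?")
print(path)  # /blog/2024/03/05/what-is-a-url
```

Keeping the slug to lowercase ASCII letters, digits and hyphens avoids the need for percent-encoding and keeps the URL readable.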
Parts of a URL
Using the URL https://www.techtarget.com/whatis/search/query?q=URL as an example, components of a URL can include:
- The protocol or scheme. Used to access a resource on the internet. Protocols include http, https, ftps, mailto and file. The resource is reached through the domain name system (DNS) name. In this example, the protocol is https.
- Host name or domain name. The unique reference that identifies the server hosting the webpage. For this example, www.techtarget.com.
- Port number. Usually not visible in URLs, but necessary. The port always follows a colon after the host name. Port 80 is the default for HTTP web servers and port 443 is the default for HTTPS, but other ports can be used. For example, :80.
- Path. A path refers to a file or location on the web server. For this example, /whatis/search/query.
- Query. Found in the URL of dynamic pages. The query consists of a question mark, followed by parameters. For this example, ?q=URL.
- Parameters. Pieces of information in a query string of a URL. Multiple parameters can be separated by ampersands (&). For this example, q=URL.
- Fragment. This is an internal page reference, which refers to a section within the webpage. It appears at the end of a URL and begins with a hash sign (#). Although not in the example above, an example could be #History in the URL https://en.wikipedia.org/wiki/Internet#History.
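The components listed above can be inspected programmatically. The following is a minimal sketch using urlsplit and parse_qs from Python's standard urllib.parse module, applied to the two example URLs from this section; it is one way to take a URL apart, not the only one.

```python
from urllib.parse import urlsplit, parse_qs

url = "https://www.techtarget.com/whatis/search/query?q=URL"
parts = urlsplit(url)

print(parts.scheme)    # https
print(parts.hostname)  # www.techtarget.com
print(parts.port)      # None: no explicit port, so the scheme's default applies
print(parts.path)      # /whatis/search/query
print(parts.query)     # q=URL

# parse_qs() splits the query string into parameters; each name maps
# to a list because a parameter may appear more than once.
params = parse_qs(parts.query)
print(params)          # {'q': ['URL']}

# The fragment is whatever follows the hash sign.
wiki = urlsplit("https://en.wikipedia.org/wiki/Internet#History")
print(wiki.fragment)   # History
```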
Other examples of parts of a URL can include:
- The URL mailto:president@whitehouse.gov initiates a new email addressed to the mailbox president in the domain whitehouse.gov.
- The URL ftp://www.companyname.com/whitepapers/widgets.ps specifies the use of the FTP protocol to download a file.
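The difference between schemes that are followed by two slashes and those followed only by a colon shows up when such URLs are parsed. The sketch below uses Python's urlsplit on a mailto URL addressed to the mailbox president at whitehouse.gov and on the FTP URL from the list above.

```python
from urllib.parse import urlsplit

# mailto has no "//" after the colon, so there is no network location;
# the address ends up in the path component.
mail = urlsplit("mailto:president@whitehouse.gov")
print(mail.scheme)  # mailto
print(mail.netloc)  # (empty string)
print(mail.path)    # president@whitehouse.gov

# ftp uses "//", so the host and the file path are separate components.
ftp = urlsplit("ftp://www.companyname.com/whitepapers/widgets.ps")
print(ftp.scheme)    # ftp
print(ftp.hostname)  # www.companyname.com
print(ftp.path)      # /whitepapers/widgets.ps
```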
HTTP vs. HTTPS
Both HTTP and HTTPS are used to retrieve data from a web server to view content in a browser. The difference between them is that HTTPS encrypts the connection between the end user and the server using Transport Layer Security (TLS), the successor to Secure Sockets Layer (SSL), together with a certificate that identifies the server.
HTTPS is vital to protecting sensitive information, such as passwords, credit card numbers and identity data, from unauthorized access.
HTTPS uses TCP port 443 by default, whereas HTTP uses TCP port 80.
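Because the default port is usually left out of the URL, software has to infer it from the scheme. The following is a minimal sketch of that lookup; the helper name effective_port and the example.com URLs are placeholders introduced for illustration, and only the two defaults named above are covered.

```python
from typing import Optional
from urllib.parse import urlsplit

# Default ports for the two schemes discussed in this section.
DEFAULT_PORTS = {"http": 80, "https": 443}

def effective_port(url: str) -> Optional[int]:
    """Return the explicit port if the URL has one, else the scheme default."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    # Schemes not listed in DEFAULT_PORTS yield None.
    return DEFAULT_PORTS.get(parts.scheme)

print(effective_port("http://example.com/"))        # 80
print(effective_port("https://example.com/"))       # 443
print(effective_port("https://example.com:8443/"))  # 8443
```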
URL vs. URI
A URL is the most common type of Uniform Resource Identifier (URI). URIs are strings of characters used to identify a resource over a network. URLs are essential to navigating the internet.
URL shorteners
URL shortening is a technique in which a URL is made substantially shorter while still directing to the required page. A shortener achieves this by issuing a redirect from a short domain name to the original URL.
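The idea can be sketched in a few lines. This is a toy, in-memory model under stated assumptions: the Shortener class, the base-62 counter scheme and the domain short.example are all hypothetical, and commercial services may generate codes and store mappings quite differently. A real service would answer a request for the short URL with an HTTP redirect to the stored target; here resolve() simply returns that target.

```python
import string
from typing import Optional
from urllib.parse import urlsplit

ALPHABET = string.digits + string.ascii_letters  # 62 characters

def encode_base62(n: int) -> str:
    """Encode a non-negative integer as a compact base-62 string."""
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n > 0:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))

class Shortener:
    """Toy in-memory shortener; the short domain is a placeholder."""

    def __init__(self, short_domain: str) -> None:
        self.short_domain = short_domain
        self._targets = {}   # maps short code -> original URL
        self._next_id = 1

    def shorten(self, long_url: str) -> str:
        code = encode_base62(self._next_id)
        self._next_id += 1
        self._targets[code] = long_url
        return f"https://{self.short_domain}/{code}"

    def resolve(self, short_url: str) -> Optional[str]:
        """Return the URL a redirect would point to, or None if unknown."""
        code = urlsplit(short_url).path.lstrip("/")
        return self._targets.get(code)

shortener = Shortener("short.example")
short_url = shortener.shorten("https://www.techtarget.com/whatis/search/query?q=URL")
print(short_url)                     # https://short.example/1
print(shortener.resolve(short_url))  # the original long URL
```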
There are many URL shortener services available. While many are free, those that offer capabilities such as web analytics charge a fee. Companies that offer URL shorteners include Rebrandly, Bitly, Ow.ly, clicky.me and Budurl.com.
Some website hosts, such as GoDaddy.com, offer URL shorteners. Other service providers, including search engines, have begun turning away from URL shorteners because they are often subject to abuse by spammers, who hide links to malware behind shortened URLs.
The retention of data related to web usage has become a major privacy concern. There has been increased public demand for search engine and application service providers to be transparent about what information they collect, retain and sell.
Google, for example, collects and retains data for various lengths of time. Some data can be deleted whenever a person wants, some data is deleted automatically, and some data Google retains for longer periods of time when necessary.