file extension (file format)

What is a file extension (file format)?

In computing, a file extension is a suffix added to the name of a file to indicate the file's layout, in terms of how the data within the file is organized. A file's data must be organized in the correct format to ensure that it can be accessed by the software program associated with the specific file type. File extensions also provide users with quick insight into the types of files they're working with.

A file extension comes after the period in a filename and is typically made up of three or four alphanumeric characters that identify the file's format. For example, in a file named testfile1.txt, the extension is txt, which indicates that the underlying file is a plain text document. Similarly, in a file named testfile2.jpeg, the extension is jpeg, which indicates that this is a graphic file that conforms to the Joint Photographic Experts Group (JPEG) format.

A filename can include multiple periods, as in testfile.3.2.csv. In most cases, the extension includes only the characters after the final period. There are exceptions, however, such as the extension tar.gz, which is used for a certain type of compressed archive file. Sometimes, a file might appear to have a two-part extension, as in testfile4.xlsx.exe, but this is often a ploy used by hackers to send what appears to be a legitimate file that is actually an executable file whose purpose is to damage or infiltrate a system.
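
The rules above can be sketched in a few lines of Python. This is only an illustration, not how any particular OS parses filenames: the function names and the three sets of extensions are hypothetical and deliberately incomplete. It takes the characters after the final period as the extension, treats a few multi-part extensions such as tar.gz as a unit, and flags names that pair a document-style extension with an executable one.

```python
from pathlib import PurePath

# Hypothetical, non-exhaustive sets chosen for this example only.
COMPOUND_EXTENSIONS = {".tar.gz", ".tar.bz2", ".tar.xz"}
EXECUTABLE_EXTENSIONS = {".exe", ".bat", ".cmd", ".com", ".vbs"}
DOCUMENT_EXTENSIONS = {".txt", ".pdf", ".docx", ".xlsx", ".jpg", ".jpeg"}


def get_extension(filename: str) -> str:
    """Return the extension with its leading period, or '' if there is none."""
    name = PurePath(filename).name.lower()
    # Check the known multi-part extensions before falling back to the
    # "everything after the final period" rule.
    for compound in COMPOUND_EXTENSIONS:
        if name.endswith(compound) and len(name) > len(compound):
            return compound
    return PurePath(name).suffix


def looks_disguised(filename: str) -> bool:
    """Flag names like testfile4.xlsx.exe: a document-style extension
    immediately followed by an executable one."""
    suffixes = [s.lower() for s in PurePath(filename).suffixes]
    return (
        len(suffixes) >= 2
        and suffixes[-1] in EXECUTABLE_EXTENSIONS
        and suffixes[-2] in DOCUMENT_EXTENSIONS
    )


for name in ("testfile1.txt", "testfile.3.2.csv", "archive.tar.gz", "testfile4.xlsx.exe"):
    print(name, get_extension(name), looks_disguised(name))
```

A check like looks_disguised is a heuristic at best. It catches the obvious pattern described above, but it says nothing about what the file actually contains.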

A file extension can be as short as one or two characters, or it can be much longer than average, such as the .catproduct extension. Whatever the extension, an operating system must be able to recognize it in order to associate it with the correct program. If the OS cannot determine the correct program, the user must specify which one to use.

Operating systems and file extensions

An operating system might rely solely on the file extension to determine which application to use, or it might also rely on file metadata. Each OS varies in terms of how it uses extensions when matching files to applications and the degree to which it uses them. Windows, for example, relies heavily on file extensions: when a file has none, Windows generally cannot choose an application on its own and asks the user to pick one. Linux relies on extensions when they're available, but it can also use the Multipurpose Internet Mail Extensions (MIME) identifier for a file, which Linux desktop environments can typically work out by inspecting the file's contents.

MIME provides a system for identifying different file formats so the files can be exchanged across the internet and opened on different systems. For instance, when a web browser accesses a document, it can tell from the MIME type how to display that document even if the file was created by an application running on another OS.

The MIME identifiers make it possible for Linux to open a file in the appropriate application even if the filename lacks an extension. For example, the MIME identifier for a text document is text/plain. If Linux determines that a file without an extension has this type, the OS knows to open the file in the default text editor. If an extension is provided, however, Linux will generally use that when determining which application to use, rather than the MIME type.
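
To make the link between extensions and MIME identifiers concrete, here is a small sketch using Python's standard mimetypes module. Note the assumption: this module guesses purely from the filename, so it shows only the extension-to-MIME mapping, not the content inspection that lets a Linux desktop handle a file with no extension. The helper name describe is hypothetical.

```python
import mimetypes
from typing import Optional, Tuple


def describe(filename: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (MIME type, encoding) guessed from the filename alone."""
    return mimetypes.guess_type(filename)


# The last name has no extension, so a name-based lookup has nothing to go on.
for name in ("testfile1.txt", "testfile2.jpeg", "archive.tar.gz", "testfile5"):
    print(name, describe(name))
```

For testfile1.txt the guess is text/plain, matching the example above. The name without an extension yields no type at all, which is the gap that content-based detection fills.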

Figure: How the email delivery process works. Standard email protocols such as SMTP, POP and IMAP can carry MIME identifiers with messages and attachments, so a receiving system can tell what type of file it has even when the filename doesn't have a file extension.

The macOS operating system takes a similar approach to Linux but adds another layer: the Uniform Type Identifier (UTI) framework. The UTI framework provides a system for uniquely identifying each file type and mapping each type to MIME identifiers. It also helps to address issues that come with handling files created under legacy file-tagging systems. Like Linux, macOS still relies on file extensions to a certain degree, but not to the extent that Windows relies on them, which means that macOS can also open files without extensions.

Regardless of how an OS handles file extensions, the extensions themselves do nothing more than indicate what a file's underlying format is supposed to be. An extension does not guarantee a file's actual format, nor does changing the extension affect that format. If the name of a PDF file contains a .pdf extension, the OS will open that file in the default PDF viewer. If the file's extension is then changed to .txt, the OS will instead try to open the file in a text editor. Even if it succeeds, however, most of the file's content will be displayed as gibberish.
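
The point that renaming a file does not change its format can be shown with a short Python sketch. It assumes a tiny, hand-picked table of well-known leading byte signatures (real detection tools use far larger tables), and the names SIGNATURES, sniff_format and demo are hypothetical. The demo writes a few stand-in bytes that begin like a PDF, renames the file to .txt, and inspects the contents rather than the name.

```python
import os
import tempfile
from typing import Optional

# Small, non-exhaustive table of well-known leading byte signatures.
SIGNATURES = [
    (b"%PDF-", "application/pdf"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"PK\x03\x04", "application/zip"),
]


def sniff_format(path: str) -> Optional[str]:
    """Guess a MIME type from the first bytes of the file, ignoring its name."""
    with open(path, "rb") as handle:
        header = handle.read(16)
    for signature, mime in SIGNATURES:
        if header.startswith(signature):
            return mime
    return None


def demo() -> Optional[str]:
    with tempfile.TemporaryDirectory() as folder:
        pdf_path = os.path.join(folder, "report.pdf")
        with open(pdf_path, "wb") as handle:
            # Stand-in bytes that start like a PDF; not a complete document.
            handle.write(b"%PDF-1.7\n% stand-in content\n")
        txt_path = os.path.join(folder, "report.txt")
        os.rename(pdf_path, txt_path)  # only the name changes
        return sniff_format(txt_path)


print(demo())  # application/pdf
```

The extension now says txt, but the bytes still identify the file as a PDF, which is why a text editor would show mostly gibberish.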

Types of file extensions

The world of computing is full of file extensions, too many to list in a single article. Each one attempts to telegraph the format of the underlying file so the OS knows how to handle that file. Here is just a small sampling of some of the more common file extensions:

  • Text and word processing files. doc, docx, odt, pages, rtf, txt, wpd, wps.
  • Spreadsheet files. csv, numbers, ods, xls, xlsx.
  • Web-related files. asp, aspx, css, htm, html, jsp, php, xml.
  • Image files. bmp, gif, ico, jpeg, jpg, png, raw, tif, tiff.
  • Audio and video files. aif, mov, mp3, mp4, mpg, wav, wma, wmv.
  • Draw program files. afdesign, ai, cad, cdr, drw, dwg, eps, odg, svg, vsdx.
  • Page layout files. afpub, indd, pdf, pdfxml, pmd, pub, qxp.
  • Programming files. c, cpp, cs, java, js, json, py, sql, swift, vb.
  • Compression and archive files. 7z, rar, tar, tar.gz, zip.
  • System files. bak, cfg, conf, ini, msi, sys, tmp.
  • Executable program files. app, bat, bin, cmd, com, exe, vbs, x86.
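
A grouping like the one above maps naturally onto a lookup table. The following Python sketch copies a few of the categories from the list and is purely illustrative: the names CATEGORIES, EXTENSION_TO_CATEGORY and categorize are hypothetical, and the table is far from complete. It tries the longest candidate extension first so that tar.gz is matched as a unit.

```python
from pathlib import PurePath

# A subset of the categories listed above; extensions are stored without the period.
CATEGORIES = {
    "Text and word processing": ["doc", "docx", "odt", "pages", "rtf", "txt", "wpd", "wps"],
    "Spreadsheet": ["csv", "numbers", "ods", "xls", "xlsx"],
    "Image": ["bmp", "gif", "ico", "jpeg", "jpg", "png", "raw", "tif", "tiff"],
    "Compression and archive": ["7z", "rar", "tar", "tar.gz", "zip"],
    "Executable program": ["app", "bat", "bin", "cmd", "com", "exe", "vbs", "x86"],
}

# Invert the table so each extension points at its category.
EXTENSION_TO_CATEGORY = {
    ext: category for category, exts in CATEGORIES.items() for ext in exts
}


def categorize(filename: str) -> str:
    """Return the category for a filename, or 'Unknown' if nothing matches."""
    parts = PurePath(filename).name.lower().split(".")
    # Try the longest candidate first: "tar.gz" before "gz".
    for start in range(1, len(parts)):
        candidate = ".".join(parts[start:])
        if candidate in EXTENSION_TO_CATEGORY:
            return EXTENSION_TO_CATEGORY[candidate]
    return "Unknown"


for name in ("testfile.3.2.csv", "backup.tar.gz", "testfile4.xlsx.exe", "README"):
    print(name, "->", categorize(name))
```

A one-to-one table like this is a simplification, since a single extension can be associated with more than one format or application.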

There are thousands of other file extensions as well. They're used for databases, vector images, disk images, presentation software, email programs, virtual environments, file encoding, GPS software and a variety of other purposes. FileInfo.com maintains a searchable database that contains over 10,000 file extensions. Developers can register their file extensions on this site if they're building applications that require unique file formats.

There are also thousands of software programs, so it's not surprising that some file extensions are associated with multiple file formats and applications. For instance, the .prf extension might be used for Microsoft Outlook, Windows system files, QuarkXPress, Apple ClarisWorks, IBM FileNet eForms or another type of software.

This was last updated in May 2023
