What is Uuencode (Unix-to-Unix encoding)?
Uuencode (also called Uuencode/Uudecode) is a popular utility for encoding and decoding files exchanged between users or systems in a network. It originated for use between users of UNIX systems (its name stood for "Unix-to-Unix encoding").
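The uuencode command reads a file (standard input by default) and writes an encoded version to standard output. The encoded output consists of ASCII characters and is more than one-third larger than the original binary file, because every three input bytes become four output characters and every line adds a length character and a newline.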
When a file or email attachment (image, text file or program) is transmitted over a network, nonprintable characters may be interpreted by the network as commands. This may cause unintended consequences. That's why it's not safe to transmit files containing nonprintable characters.
Uuencode resolves this problem by converting a file or email attachment from its binary or bit-stream representation into the 7-bit ASCII set of text characters. The resulting text can be handled by older systems that may not handle binary files well. Furthermore, larger files can be more easily divided into multi-part transmissions.
The encoding uses only printable ASCII characters. The encoded output also records the file mode and the file name that Uudecode uses when it converts the data back into its original form.
Uuencode is supported in several programming languages, including:
- Python: Using the codecs module with the codec "uu"
- Perl: Supports uuencoding natively with the pack() and unpack() operators, and the format string "u"
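As a minimal sketch of the Python route, the snippet below round-trips a short byte string through the "uu" codec. It assumes Python 3, where "uu" is a bytes-to-bytes transform and therefore has to be called through codecs.encode() and codecs.decode() rather than through str.encode(). The sample input b"Cat" is arbitrary.

```python
import codecs

original = b"Cat"  # arbitrary sample data

# "uu" is a bytes-to-bytes codec, so use codecs.encode/codecs.decode.
encoded = codecs.encode(original, "uu")
print(encoded.decode("ascii"))

# Decoding finds the begin line, decodes the data lines and stops at "end".
decoded = codecs.decode(encoded, "uu")
assert decoded == original
```

The printed text starts with a begin line, carries the three input bytes on the data line #0V%T, and finishes with an end line.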
Uuencode and Uudecode
The uuencode and uudecode commands work together.
Uuencode converts a binary file to ASCII data.
Uudecode converts the encoded file containing ASCII data back into its original binary file.
The decoded file is named name, the file name recorded in the encoded data (or outfile, if the -o option is given). It keeps the mode of the original file, except that the setuid (set user ID) and execute bits are not retained.
Importance of Uuencode
Email messages often go to (or through) computers with different character sets. Sometimes they're handled by programs that are not 8-bit clean. An 8-bit clean system can deal correctly with extended character sets that use all 8 bits of a byte, which is what differentiates these sets from ASCII.
A system that is not 8-bit clean, by contrast, assumes all characters have codes in the range 0 to 127, leaving the top bit of each byte free for use as a parity bit or flag bit. This assumption works with English, but not with other languages that have larger alphabets.
If a binary file (e.g., an email attachment) is sent via a system or communications link that is not 8-bit clean, it will get corrupted. This is where the Uuencode command plays a critical role: its output uses only 7-bit ASCII characters, so clearing the top bit changes nothing.
Encoding binary files with Uuencode protects them from corruption. Uudecode then reverses the effects of Uuencode, so files arrive at their destination intact and unchanged.
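The effect can be illustrated with a small simulation. This is only a sketch: strip_top_bit is a hypothetical helper that stands in for a link that is not 8-bit clean by clearing the top bit of every byte, and the sample bytes are arbitrary. The encoding itself uses binascii.b2a_uu and binascii.a2b_uu from the Python standard library, which handle one uuencoded line of up to 45 input bytes.

```python
import binascii

def strip_top_bit(data: bytes) -> bytes:
    """Simulate a link that is not 8-bit clean by clearing bit 7 of every byte."""
    return bytes(b & 0x7F for b in data)

binary = bytes([0x89, 0x50, 0x4E, 0x47, 0xFF, 0x00, 0xC3])  # arbitrary sample bytes

# Sent raw, the bytes above 127 are altered in transit.
assert strip_top_bit(binary) != binary

# Sent uuencoded, every byte is already 7-bit ASCII, so the link changes nothing
# and the receiver recovers the original bytes.
line = binascii.b2a_uu(binary)
assert strip_top_bit(line) == line
assert binascii.a2b_uu(strip_top_bit(line)) == binary
```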
Uuencode syntax and operation
Uuencoded data starts with a line that takes the form:
begin <mode> <file>
<mode> is the file's read/write/execute permissions
<file> is the name to be used when recreating the binary data
Example: begin 644 myfile.zip
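A decoder has to pick this header apart before it can recreate the file. The sketch below shows one way to do that; parse_begin_line is a hypothetical helper, not part of any standard tool, and it assumes the mode is written in octal, as in the example above.

```python
import stat

def parse_begin_line(line):
    """Split a 'begin <mode> <file>' header into (mode, file name)."""
    # Split at most twice so that a file name containing spaces stays whole.
    keyword, mode_text, name = line.rstrip("\n").split(" ", 2)
    if keyword != "begin":
        raise ValueError("not a uuencode header")
    return int(mode_text, 8), name  # the mode is an octal number

mode, name = parse_begin_line("begin 644 myfile.zip")
print(stat.filemode(stat.S_IFREG | mode), name)  # -rw-r--r-- myfile.zip
```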
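The uuencode command takes the following parameters:
- SourceFile: Specifies the name of the input binary file; the default is standard input.
- RemoteFile: Specifies the name to give the decoded file. This name is written into the begin line.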
How does Uuencode work?
Uuencode takes the input three bytes (24 bits) at a time and splits each group into four 6-bit values, each in the range 0-63. If fewer than three bytes remain at the end of the input, the group is padded with trailing zero bits.
- Decimal 32 is added to each 6-bit value.
- The new numbers are output as ASCII characters from 32 (space) to 95 (underscore). Many implementations write a backtick (code 96) in place of the space.
- Each group of 60 output characters (45 input bytes) is output as a separate line.
- Each line starts with a length character whose code is 32 plus the number of input bytes encoded on that line. A full line therefore starts with an M (32 + 45 = 77).
- If N input bytes remain after the last full line and N>0, they are output as a final, shorter line that starts with the character with code 32+N.
- The encoded lines are preceded by the begin line, which carries the file mode and file name, and are followed by one line containing just a single space (a zero-length line) and one line containing the word "end".
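The steps above can be sketched in a few lines of Python. This is an illustrative implementation, not the source of any particular uuencode tool: the function names uuencode_line and uuencode, and the default file name data.bin, are made up for the example. It writes a space for a zero value, and the length character at the start of each line counts the input bytes on that line.

```python
import binascii

def uuencode_line(chunk: bytes) -> bytes:
    """Encode 1 to 45 input bytes as one uuencoded line (without the newline)."""
    if not 0 < len(chunk) <= 45:
        raise ValueError("a line carries 1 to 45 input bytes")
    out = bytearray([32 + len(chunk)])          # length character, 'M' for 45 bytes
    padded = chunk + b"\0" * (-len(chunk) % 3)  # pad to a multiple of three bytes
    for i in range(0, len(padded), 3):
        group = int.from_bytes(padded[i:i + 3], "big")  # 24 bits
        for shift in (18, 12, 6, 0):
            out.append(32 + ((group >> shift) & 0x3F))  # 6-bit value plus 32
    return bytes(out)

def uuencode(data: bytes, mode: int = 0o644, name: str = "data.bin") -> bytes:
    """Wrap the encoded lines in the begin line and the closing space/end lines."""
    lines = [f"begin {mode:o} {name}".encode("ascii")]
    lines += [uuencode_line(data[i:i + 45]) for i in range(0, len(data), 45)]
    lines += [b" ", b"end"]
    return b"\n".join(lines) + b"\n"

# Cross-check one line against the standard library's encoder.
assert uuencode_line(b"Cat") + b"\n" == binascii.b2a_uu(b"Cat")
print(uuencode(b"Cat").decode("ascii"))
```

For the input b"Cat", the three bytes 0x43 0x61 0x74 split into the 6-bit values 16, 54, 5 and 52. Adding 32 gives the characters 0, V, % and T, and the length character is # (32 + 3), so the data line reads #0V%T.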
History of Uuencode
The uuencode command first appeared in Berkeley Software Distribution (BSD) 4.0. BSD was an operating system (OS) based on the source code of the Research Unix OS developed at Bell Labs in the 1970s. Although the original BSD was eventually discontinued, descendants such as FreeBSD, NetBSD and OpenBSD are still developed, and BSD-derived code is used by many current proprietary OSes, including Apple's macOS and iOS.
The name Uuencode stood for Unix-to-Unix encoding, since it was originally meant to be used by users of Unix systems. The idea was to use a safe encoding method to transfer files from one Unix system to another. It was used with UUCP (Unix to Unix Copy Protocol) -- a Unix utility that copies files from one computer to another -- to transfer binary files over serial lines that did not preserve the top bit of characters.
Today, Uuencode is still used to send binary files in ASCII format over the internet, for example by email or in posts to USENET newsgroups, although MIME with Base64 encoding has largely replaced it for email attachments.