Tar is an immensely useful archival tool available on most Linux/Unix-based systems. It is used regularly in Linux systems administration at all skill levels. This article tackles a number of common use cases and questions regarding tar. When finished, you will have learned how to use tar to create, compress, extract, modify, and otherwise operate on archive files in your day-to-day workflows. Before we begin, let's review the full list of questions this article will answer.
- What is a tar file?
- What is the tar syntax?
- How to create a new tar archive?
- How to work with existing archives?
- How to add new/replace existing files inside an archive?
- How to delete/remove files from an archive?
- How to list files inside a tar archive?
- How to find files in tar? How to search for files inside of tar?
- How to extract files from an existing tar archive?
- How to decompress a tarball?
What is a tar file?
The standard tar file is an archive format common to Linux and Unix-based operating systems. These files take the place of the more widely-known zip file format and are used for both storage and transportation of groups of related files and/or directories between devices.
The tar archive format, just like its zip file cousin, contains an array of individual files and/or directories as well as metadata about them such as attributes, path, ownership, and permissions. All files contained within a tar archive can be easily listed, searched, added, deleted, compressed, and extracted directly on the command line using the tar binary.
What is a tarball?
In the default format, a tar file contains an uncompressed data stream. However, it is generally more common for these archives to be compressed by an external compression program such as: gzip, bzip2, lzip, lzma, lzop, zstd, compress, and more.
What is the difference between a tar file and a tarball?
A standard tar file uses the .tar extension and stores data in an uncompressed data stream. The term tarball specifically refers to a compressed tar archive. Tarballs have a file extension based on the compression used to create them (e.g. .tgz for gzip, the most common). These file extensions simplify identification of the compression software needed to decompress the file. They come in both long and short formats, and some compression tools support multiple extensions. The following quick reference table helps to quickly identify the necessary tool and associated arguments for each of the common compression types.
Tar Compression Quick Reference
Tool | gzip | bzip2 | xz | lzip | lzma | lzop | zstd | compress |
---|---|---|---|---|---|---|---|---|
Option | --gzip | --bzip2 | --xz | --lzip | --lzma | --lzop | --zstd | --compress |
Flag | -z | -j | -J | – | – | – | – | -Z |
Extension | .gz | .bz2, .tz2 | .xz | .lz | .lzma | .lzo | .zst | .Z |
Tarball | .tgz, .taz | .tbz, .tbz2 | .txz | – | .tlz | – | .tzst | .taZ |
When in doubt while creating an archive, try --auto-compress or -a to select the compression program from the archive's file extension. When reading an archive, GNU tar detects the compression type automatically.
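As a minimal sketch of the table in practice, the following assumes GNU tar and gzip are installed. The scratch directory and the names notes.txt and demo.tgz are hypothetical and exist only for this demonstration.

```shell
# Hypothetical scratch directory and file names, used only for this demo.
workdir=$(mktemp -d)
cd "$workdir"
echo "hello" > notes.txt

# -a (--auto-compress) picks gzip here because the archive name ends in .tgz.
tar -caf demo.tgz notes.txt

# gzip -t exits with status 0 only if the file is a valid gzip stream.
gzip -t demo.tgz && echo "demo.tgz is gzip-compressed"

# No compression flag is needed to read the archive back.
listing=$(tar -tf demo.tgz)
echo "$listing"
```

Renaming the archive to end in .tar.gz should select gzip in the same way, since both extensions appear in the gzip column of the table.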
What is the tar syntax?
When working on the command line it is important to understand all command line arguments being executed. Otherwise we can end up with unexpected and/or potentially damaging results. Tar is no different.
The general syntax for tar is similar to many command line tools and accepts multiple types of arguments. These arguments vary depending on the operation mode being executed. The Tar General Syntax table below gives a basic overview of the syntax tar expects. We will use this key throughout the article as we go over each operation mode.
🛈 Examples in this document employ the -v (--verbose) flag to better illustrate the results of the command line executed. This flag is entirely optional.
Tar General Syntax
tar OPERATION [OPTIONS…] ARCHIVE ARCHIVE |
tar OPERATION [-f|--file ARCHIVE] [OPTIONS…] [FILES…] |
tar OPERATION [-f|--file ARCHIVE] [OPTIONS…] [MEMBERS…] |
Legend | |
OPTION | An argument that is not one of the other types: OPERATION, FILE, MEMBER, or ARCHIVE. |
OPERATION | The operation tar will perform, exactly one of the following: CATENATE, CREATE, DIFF, DELETE, LIST, UPDATE, or EXTRACT. |
FILE | A path targeting one or more files/dirs on the system outside of tar. Supports: relative & full pathing, file globs, and wildcards. |
MEMBER | A path targeting one or more files/dirs inside of a tar ARCHIVE. Supports: relative pathing only, file globs, and wildcards. |
ARCHIVE | A path to a tar file on the file system outside of tar. |
What are long & short form command line arguments?
Command line arguments come in both long and short flavors. Throughout this article we will cover syntax and examples of both format types. To differentiate them, we will call the long argument format options, while the short form are called flags. These formats are technically synonyms and are interchangeable in most cases, the one exception being flag concatenation.
How to use command line options?
The long format options use a double hyphen prefix (--) coupled with one or more lowercase English words strung together with single hyphens (-). These words are descriptive and easy to recall, making it possible to remember them without referencing the manual.
Consider these examples to illustrate the point:
- --create is the operation mode to CREATE a new archive.
- --list is the operation mode to LIST contents of an archive.
- --no-recursion is the option to DISABLE RECURSION when processing directories.
How to use command line flags?
Short form flags are case-sensitive. They consist of a single character prefixed with a single hyphen (-). Ideally, a flag is the first letter of its longer cousin. However, because the number of available characters is limited and conflicts arise, some options have to use a different character or capitalization instead. Additionally, some potentially dangerous and/or lesser used options have only a long form.
Examples to consider:
- -c is the flag for --create, which makes sense.
- -t is the flag for --list, not the expected -l or -L.
- --delete has no short flag format and only has its long form.
How to use flag concatenation?
Flags are the preferred format when passing multiple arguments on the command line. Unlike long options, flags can be strung together behind a single hyphen (-) prefix to further shorten the overall command line.
Consider combining the short forms of --create (-c) and --file (-f). You can merge these into the single concatenated form -cf ARCHIVE, and tar will parse this into its individual short form components.
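To illustrate that the three argument styles are interchangeable, here is a small sketch. The scratch directory and the archive names long.tar, short.tar, and concat.tar are made up for this example.

```shell
# Hypothetical scratch directory and file names, used only for this demo.
workdir=$(mktemp -d)
cd "$workdir"
echo "data" > file1

tar --create --file long.tar file1   # long options
tar -c -f short.tar file1            # separate short flags
tar -cf concat.tar file1             # concatenated short flags

# Each archive should list the same single member.
long_list=$(tar --list --file long.tar)
short_list=$(tar -t -f short.tar)
concat_list=$(tar -tf concat.tar)
echo "$long_list" "$short_list" "$concat_list"
```

Note that f comes last in the concatenated group because it takes the ARCHIVE path as its argument. Written as -fc, tar would treat c as the archive name.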
How to specify the working tar archive?
All tar operation modes require a target working archive. This is supplied using the FILE option (--file or -f), which must be immediately followed by the path of the archive to work on. The basic syntax for the FILE option is as follows:
Option Syntax: FILE
Long | --file ARCHIVE |
Short | -f ARCHIVE |
Description | Specify the working archive being operated on. |
How to create a new tar archive?
When creating a new archive, in addition to the previously mentioned --file option, we must also pass the CREATE operation and supply it with one or more FILE paths to be added to the new ARCHIVE. By default, tar refuses to create an empty archive, so you must supply at least one valid FILE path in order to successfully create the archive.
When specifying FILES to add to your ARCHIVE, any full paths will be stripped of their leading forward slash (/). This converts each one to a relative path inside the archive, which is an important distinction for operations that target MEMBERS inside an archive. This behavior is a safety precaution: it prevents an archive from overwriting files in another location when it is extracted.
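The stripping of the leading slash can be observed directly. This is a small sketch using a hypothetical scratch directory and the made-up archive name abs.tar.

```shell
# Hypothetical scratch directory and file names, used only for this demo.
workdir=$(mktemp -d)
echo "data" > "$workdir/file1"

# Pass an absolute path. tar reports the stripping on stderr, silenced here.
tar -cf "$workdir/abs.tar" "$workdir/file1" 2>/dev/null

# The stored member name no longer starts with "/".
member=$(tar -tf "$workdir/abs.tar")
echo "$member"
```

The printed member path is the absolute path minus its first character, which is the form you would later pass as a MEMBER argument.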
If the target archive you wish to create already exists, the original archive will be squashed, which means overwritten without confirmation. So be sure to save any needed existing archives before squashing them with a new archive.
Now let's review the syntax and examples of the CREATE operation mode.
Operation Mode: CREATE
Long | tar --create [--file ARCHIVE] [OPTIONS…] FILES… |
Short | tar -cf ARCHIVE [OPTIONS…] FILES… |
Description | Create new ARCHIVE containing all specified FILES… Directories are added recursively unless --no-recursion is supplied. |
Example 1A – Create myarchive.tar, populated with dir1 dir2 file1 and file2
    bash-4.2$ tar -cvf myarchive.tar dir1 dir2 file1 file2
    dir1/
    dir1/file3
    dir2/
    dir2/file4
    file1
    file2
The example shows the creation of a new tar file named myarchive.tar in the current working directory, populated with dir1, dir2, file1, and file2. We can see from the verbose output that file3 and file4 were also added. This is due to directory recursion being enabled by default, so everything looks as expected here and we have successfully created the archive.
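For contrast, the following sketch shows what happens when recursion is switched off with --no-recursion. The scratch directory and the archive names recursive.tar and flat.tar are hypothetical.

```shell
# Hypothetical scratch directory and file names, used only for this demo.
workdir=$(mktemp -d)
cd "$workdir"
mkdir dir1
echo "data" > dir1/file3

tar -cf recursive.tar dir1                  # default: descend into dir1
tar -cf flat.tar --no-recursion dir1        # store only the directory entry

recursive_list=$(tar -tf recursive.tar)
flat_list=$(tar -tf flat.tar)
echo "recursive: $recursive_list"
echo "flat: $flat_list"
```

The option is placed before the FILE argument here because some tar versions apply --no-recursion only to the paths that follow it.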
How to work with existing archives?
There are a number of additional operations that can be performed when working with existing archives. They range from adding new files, to removing or replacing existing files, or listing files contained within an archive, and (most commonly) extracting files from archives. We will go over each of these operation modes, their syntax, and complete an example of each to further drive these lessons home.
How to add new/replace existing files inside an archive?
The UPDATE operation is used to add new files and replace existing files contained within a target archive. When applying the UPDATE operation, you will need to also supply one or more FILE paths that will be added to the specified ARCHIVE. If a FILE path already exists as a MEMBER within the archive, and the file you are adding is newer, tar appends the newer copy to the end of the archive. The older copy remains stored, but the newer one supersedes it on extraction.
Operation Mode: UPDATE
Long | tar --update [--file ARCHIVE] [OPTIONS…] [FILES…] |
Short | tar -u [-f ARCHIVE] [OPTIONS…] [FILES…] |
Description | Add all FILES… to ARCHIVE replacing any existing files if newer. Directories are added recursively unless --no-recursion is supplied. |
Example 2 – Archive Contents Before
    dir1/
    dir1/file3
    dir2/
    dir2/file4
    file1
    file2
Example 2 – Add dir3/file5 to myarchive.tar
    bash-4.2$ tar -uvf myarchive.tar dir3/file5
    dir3/file5
Example 2 – Archive Contents After
    dir1/
    dir1/file3
    dir2/
    dir2/file4
    file1
    file2
    dir3/file5
Continuing with our previous example, we take our freshly minted myarchive.tar file and use the UPDATE operation to add a single file (file5) contained within dir3. We have specifically added file5 without adding the whole dir3 directory, which is a noteworthy difference.
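The sketch below exercises both halves of UPDATE: adding a new file and re-adding a file that has changed since it was archived. It assumes GNU tar. The scratch directory is hypothetical, and the backdating with touch is only there to make the second copy of file1 unambiguously newer.

```shell
# Hypothetical scratch directory, used only for this demo.
workdir=$(mktemp -d)
cd "$workdir"
mkdir dir3
echo "v1" > file1
echo "new" > dir3/file5

# Backdate file1 so the later edit is clearly newer than the archived copy.
touch -t 200001010000 file1
tar -cf myarchive.tar file1

echo "v2" >> file1                        # modifying file1 refreshes its mtime
tar -uf myarchive.tar file1 dir3/file5    # changed member plus a new file

listing=$(tar -tf myarchive.tar)
echo "$listing"
```

With GNU tar, the listing should contain file1 twice, because the newer copy is appended rather than written over the old one, followed by dir3/file5.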
How to delete/remove files from an archive?
Removing a file from an archive requires the DELETE operation mode. Unlike previous modes, DELETE does not have a short form flag. This is a type of safety precaution used for potentially damaging operations, so the full long syntax is required.
Targeting archive MEMBERS for removal requires using relative pathing. You will need to make sure the path supplied to tar does not start with a leading forward slash (/), otherwise the MEMBERS inside the archive will not be found.
The following example shows how to target file1 for deletion from our archive. We then follow up with listing all files in the archive to show that file1 is indeed no longer present.
Operation Mode: DELETE
Long | tar --delete [--file ARCHIVE] [OPTIONS…] MEMBERS… |
Short | There is no short option equivalent. |
Description | Remove one or more MEMBERS… from ARCHIVE permanently. Does not operate on compressed archives. |
Example 3 – Archive Contents Before
    dir1/
    dir1/file3
    dir2/
    dir2/file4
    file1
    file2
    dir3/file5
Example 3 – Delete file1 from myarchive.tar
    bash-4.2$ tar --delete -f myarchive.tar file1
Example 3 – Archive Contents After
    dir1/
    dir1/file3
    dir2/
    dir2/file4
    file2
    dir3/file5
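A self-contained sketch of the DELETE operation follows. It assumes GNU tar, since other tar implementations may not provide --delete, and it uses a hypothetical scratch directory.

```shell
# Hypothetical scratch directory, used only for this demo.
workdir=$(mktemp -d)
cd "$workdir"
echo "a" > file1
echo "b" > file2
tar -cf myarchive.tar file1 file2

# --delete has no short flag and needs an uncompressed archive.
tar --delete -f myarchive.tar file1

remaining=$(tar -tf myarchive.tar)
echo "$remaining"
```

Because the change is permanent, copying the archive first is a cheap safeguard when experimenting.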
How to list files inside a tar archive?
Listing the files contained inside a target archive is another very common task. This is where the --list option, or its -t flag counterpart, comes into play. On its own, without any additional arguments, running the LIST operation will print the full MEMBER list of the archive.
However, you can narrow down this list by supplying full or partial MEMBER paths or globs to existing MEMBERS in the archive. The key item to remember when working with archive MEMBERS is to use relative pathing when refining your selection. So you will almost never start a MEMBER path with a forward slash since any leading forward slashes on a path get removed when added to an archive.
The trick is to remember that when operating inside the tar archive, all pathing is relative.
Operation Mode: LIST
Long | tar --list [--file ARCHIVE] [OPTIONS…] [MEMBERS…] |
Short | tar -tf ARCHIVE [OPTIONS…] [MEMBERS…] |
Description | List all or some MEMBERS… from ARCHIVE. Arguments are optional. |
Example 4 – Listing all files within an existing archive
    bash-4.2$ tar -tf myarchive.tar
    dir1/
    dir1/file3
    dir2/
    dir2/file4
    file2
    dir3/file5
Example 5 – List specific files from an archive
    bash-4.2$ tar -tf myarchive.tar file2 dir3/file5 file1
    file2
    dir3/file5
    tar: file1: Not found in archive
    tar: Exiting with failure status due to previous errors
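Because a missing MEMBER makes tar exit with a failure status, LIST can double as a quick membership check in scripts. The following is a minimal sketch with a hypothetical scratch directory. The variable names found and status are illustrative.

```shell
# Hypothetical scratch directory, used only for this demo.
workdir=$(mktemp -d)
cd "$workdir"
echo "b" > file2
tar -cf myarchive.tar file2

# Ask for one member that exists (file2) and one that does not (file1).
found=$(tar -tf myarchive.tar file2 file1 2>/dev/null)
status=$?
echo "listed: $found (exit status $status)"
```

The members that were found are still printed. Only the exit status signals that something was missing.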
How to find files in tar? How to search for files inside of tar?
Tar has built-in support for file globs, providing standard wildcard characters that can be used to refine the selection of MEMBERS within an archive. All commands that accept MEMBERS as an argument can take advantage of these wildcards, although GNU tar only treats MEMBER arguments as patterns when the --wildcards option is supplied. The wildcard characters in question are the asterisk (*), which matches any string of characters, and the question mark (?), which matches any single character. The following additional LIST examples show how easy it is to use wildcards to find files inside a tar archive.
Example 6 – Listing files from an archive using the question mark ( ? ) wildcard.
    bash-4.2$ tar --wildcards -tf myarchive.tar "dir?/"
    dir1/
    dir1/file3
    dir2/
    dir2/file4
    dir3/file5
Example 7 – Listing files from an archive using the asterisk ( * ) wildcard.
    bash-4.2$ tar --wildcards -tf myarchive.tar "d*/file*"
    dir1/file3
    dir2/file4
    dir3/file5
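The wildcard search can be reproduced end to end with the sketch below. It assumes GNU tar, where the --wildcards option turns on pattern matching for MEMBER arguments, and it uses a hypothetical scratch directory.

```shell
# Hypothetical scratch directory, used only for this demo.
workdir=$(mktemp -d)
cd "$workdir"
mkdir dir1 dir2
echo "x" > dir1/file3
echo "y" > dir2/file4
echo "z" > file2
tar -cf myarchive.tar dir1 dir2 file2

# Quote the pattern so the shell hands it to tar unexpanded.
matches=$(tar --wildcards -tf myarchive.tar 'dir?/file*')
echo "$matches"
```

Quoting matters here: an unquoted pattern would be expanded by the shell against files on disk before tar ever sees it.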
File globs range widely in complexity, and the more you learn about them, the more efficient you can be when using tar in your day-to-day workflow. The tar manual covers file globs in much greater detail than this article does.
How to extract files from an existing tar archive?
Extraction is hands down the most used operation performed on tar archives and tarballs. Syntax-wise, EXTRACT operates exactly the same as the LIST command. It also supports wildcards and globs, so you can target only the individual files or directories you need to extract.
Operation Mode: EXTRACT
Long | tar --extract [--file ARCHIVE] [OPTIONS…] [MEMBERS…] |
Short | tar -xf ARCHIVE [OPTIONS…] [MEMBERS…] |
Description | Extract all or some MEMBERS… from ARCHIVE. MEMBERS are optional. Synonyms: --get |
Example 8 – Extract all files within an existing archive
    bash-4.2$ tar -xvf myarchive.tar
    dir1/
    dir1/file3
    dir2/
    dir2/file4
    file2
    dir3/file5
Example 9 – Extract specific files from an archive
    bash-4.2$ tar -xvf myarchive.tar file2 dir3/file5 file1
    file2
    dir3/file5
    tar: file1: Not found in archive
    tar: Exiting with failure status due to previous errors
⚠ Errors like those in the example indicate that a specified file was not present in the target archive. Tar returns an error status but still extracts the rest of the items it found.
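The partial-extraction behavior can be checked with a short sketch. The scratch directory and the out directory are hypothetical, and the -C option, which is not covered in this article, simply tells tar which directory to extract into.

```shell
# Hypothetical scratch directory, used only for this demo.
workdir=$(mktemp -d)
cd "$workdir"
echo "b" > file2
tar -cf myarchive.tar file2
mkdir out

# Request one real member (file2) and one missing member (file1).
# -C out extracts into the out directory instead of the current one.
tar -xf myarchive.tar -C out file2 file1 2>/dev/null
status=$?

ls out
echo "exit status: $status"
```

In scripts, this means a non-zero exit status from an extraction does not by itself tell you that nothing was written.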
How to decompress a tarball?
There are several types of compressed tar file formats. The following table lists these file types by compression tool, along with the long and short form arguments needed to decompress the tarball and extract all archive members.
Compression | Long Form Extensions | Short Form Extensions |
---|---|---|
gzip | tar -xf myname.tar.gz --gzip | tar -xzf myname.tgz |
 | | tar -xzf myname.taz |
bzip2 | tar -xf myname.tar.bz2 --bzip2 | tar -xjf myname.tbz |
 | tar -xf myname.tar.tz2 --bzip2 | |
xz | tar -xf myname.tar.xz --xz | tar -xJf myname.txz |
lzip | tar -xf myname.tar.lz --lzip | |
lzma | tar -xf myname.tar.lzma --lzma | tar -xf myname.tlz --lzma |
lzop | tar -xf myname.tar.lzo --lzop | |
zstd | tar -xf myname.tar.zst --zstd | tar -xf myname.tzst --zstd |
compress | tar -xf myname.tar.Z --compress | tar -xf myname.taZ --compress |
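As a round-trip sketch of the gzip row, the following creates a tarball and then decompresses and extracts it in one step. It assumes gzip is installed. The scratch directory and the src and out directories are hypothetical, and -C is used only to pick the extraction directory.

```shell
# Hypothetical scratch directory, used only for this demo.
workdir=$(mktemp -d)
cd "$workdir"
mkdir src out
echo "payload" > src/file1

tar -czf myname.tgz src       # create a gzip-compressed tarball
tar -xzf myname.tgz -C out    # decompress and extract in one step

restored=$(cat out/src/file1)
echo "$restored"
```

The other rows of the table follow the same pattern, with the compression flag or option swapped for the matching tool.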
Jason Potter is a Senior Linux Systems Administrator & Technical Writer with more than 20 years experience providing technical support to customers and has a passion for writing competent and thorough technical documentation at all skill levels.