XZ Utils

Lossless data compression software
XZ Utils
Original author(s): Lasse Collin
Developer(s): The Tukaani Project
Written in: C
Operating system: Cross-platform
Type: Data compression
License: Public domain[1] (but see details in Development and adoption)
Website: Archived 2024-03-25 at the Wayback Machine

.xz
Filename extension: .xz
Internet media type: application/x-xz
Magic number: FD 37 7A 58 5A 00
Developed by: Lasse Collin, Igor Pavlov
Initial release: January 14, 2009
Latest release: 1.1.0 (December 11, 2022)
Type of format: Data compression
Open format: Yes
Free format: Yes
Website: Archived 2023-11-23 at the Wayback Machine

    XZ Utils (previously LZMA Utils) is a set of free software command-line lossless data compressors, including the programs lzma and xz, for Unix-like operating systems and, from version 5.0 onwards, Microsoft Windows. For compression/decompression the Lempel–Ziv–Markov chain algorithm (LZMA) is used. XZ Utils started as a Unix port of Igor Pavlov's LZMA-SDK that has been adapted to fit seamlessly into Unix environments and their usual structure and behavior.

    On March 29, 2024, a backdoor was discovered in versions 5.6.0 and 5.6.1 of XZ Utils.[2][3]

    Features

    In most cases, xz achieves higher compression ratios than alternatives such as gzip and bzip2. Decompression is faster than with bzip2 but slower than with gzip. Compression can be much slower than with gzip, and at high compression levels it is also slower than with bzip2, so xz is most useful when a compressed file will be decompressed many times.[4][5]
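The trade-off described above can be observed with a rough sketch that uses Python's standard-library zlib (the DEFLATE algorithm that gzip uses), bz2, and lzma modules. The sample data and the `compare` helper are hypothetical, chosen only for illustration; sizes and timings depend heavily on the input and the machine, so this is not a benchmark.

```python
import bz2
import lzma
import time
import zlib


def compare(data: bytes) -> dict:
    """Return {codec name: (compressed size in bytes, seconds taken)}."""
    codecs = {
        "gzip (zlib)": lambda d: zlib.compress(d, 9),
        "bzip2": lambda d: bz2.compress(d, 9),
        "xz": lambda d: lzma.compress(d),  # default preset, .xz container
    }
    results = {}
    for name, compress in codecs.items():
        start = time.perf_counter()
        out = compress(data)
        results[name] = (len(out), time.perf_counter() - start)
    return results


# Hypothetical sample input: repetitive text. Real files will behave differently.
sample = b"".join(
    b"line %d: the quick brown fox jumps over the lazy dog\n" % i
    for i in range(20000)
)

for name, (size, seconds) in compare(sample).items():
    print(f"{name:12s} {size:8d} bytes  {seconds:.3f} s")
```

Running it on representative files of one's own is the only reliable way to decide which compressor fits a given workload.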

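    XZ Utils consists of two major components:

    • xz, the command-line compressor and decompressor (analogous to gzip)
    • liblzma, a software library with an API similar to zlib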

    Various command shortcuts exist, such as lzma (for xz --format=lzma), unxz (for xz --decompress; analogous to gunzip), and xzcat (for unxz --stdout; analogous to zcat).

    XZ Utils can compress and decompress both the xz and lzma file formats, but since the LZMA format is now legacy,[6] XZ Utils compresses by default to xz.
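As a minimal sketch of the difference between the two containers, the following uses Python's standard-library lzma module, which is built on liblzma. It checks for the .xz magic number FD 37 7A 58 5A 00 and shows that a decoder in auto-detect mode accepts both formats. The payload and variable names are illustrative assumptions.

```python
import lzma

XZ_MAGIC = bytes.fromhex("FD377A585A00")  # magic number of the .xz container

payload = b"hello, xz\n" * 100  # hypothetical sample data

as_xz = lzma.compress(payload)                              # default: FORMAT_XZ
as_lzma = lzma.compress(payload, format=lzma.FORMAT_ALONE)  # legacy .lzma

print(as_xz.startswith(XZ_MAGIC))    # True
print(as_lzma.startswith(XZ_MAGIC))  # False: .lzma has no such magic number

# Decompression defaults to FORMAT_AUTO, which detects either container.
assert lzma.decompress(as_xz) == payload
assert lzma.decompress(as_lzma) == payload
```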

    Usage

    Both the behavior of the software and the properties of the file format have been designed to work similarly to those of the popular Unix compression tools gzip and bzip2.

    Just like gzip and bzip2, xz and lzma can compress only a single file (or data stream) as input. They cannot bundle multiple files into a single archive; to do this, an archiving program such as tar is used first.

    Compressing an archive:

    xz   my_archive.tar    # results in my_archive.tar.xz
    lzma my_archive.tar    # results in my_archive.tar.lzma
    

    Decompressing the archive:

    unxz    my_archive.tar.xz      # results in my_archive.tar
    unlzma  my_archive.tar.lzma    # results in my_archive.tar
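For comparison, here is a rough Python sketch of the same single-file round trip using the standard-library lzma module. The helper names `xz_compress` and `xz_decompress` are hypothetical. Unlike the xz command, which by default removes the input file after success, this sketch leaves the input in place.

```python
import lzma
import os
import shutil
import tempfile


def xz_compress(path: str) -> str:
    """Write path + '.xz' next to the input and return the new file name."""
    dest = path + ".xz"
    with open(path, "rb") as src, lzma.open(dest, "wb") as dst:
        shutil.copyfileobj(src, dst)  # stream in chunks, not all in memory
    return dest


def xz_decompress(path: str) -> str:
    """Write the decompressed data to the name without the '.xz' suffix."""
    if not path.endswith(".xz"):
        raise ValueError("expected a file name ending in .xz")
    dest = path[: -len(".xz")]
    with lzma.open(path, "rb") as src, open(dest, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return dest


# Small demonstration in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, "my_archive.tar")
    with open(original, "wb") as fh:
        fh.write(b"example data\n" * 1000)
    compressed = xz_compress(original)
    print(os.path.basename(compressed))  # my_archive.tar.xz
```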
    

    Version 1.22 or greater of the GNU implementation of tar has transparent support for tarballs compressed with lzma and xz, using the switches --xz or -J for xz compression, and --lzma for LZMA compression.

    Creating an archive and compressing it:

    tar -c --xz   -f my_archive.tar.xz   /some_directory    # results in my_archive.tar.xz
    tar -c --lzma -f my_archive.tar.lzma /some_directory    # results in my_archive.tar.lzma
    

    Decompressing the archive and extracting its contents:

    tar -x --xz   -f my_archive.tar.xz      # results in /some_directory
    tar -x --lzma -f my_archive.tar.lzma    # results in /some_directory
    

    The same two operations using single-letter tar options and the short .txz suffix:

    tar cJf keep.txz keep   # archive then compress the directory ./keep/ into the file ./keep.txz
    tar xJf keep.txz        # decompress then extract the file ./keep.txz creating the directory ./keep/
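The archive-then-compress flow can also be sketched programmatically. The example below uses Python's standard-library tarfile module with its "w:xz" and "r:xz" modes. The function names and the keep directory are illustrative assumptions, not part of XZ Utils.

```python
import os
import tarfile
import tempfile


def make_txz(src_dir: str, dest: str) -> None:
    """Archive src_dir and xz-compress it in one step (like `tar cJf`)."""
    with tarfile.open(dest, "w:xz") as tar:
        # Store paths relative to the directory name, not the full path.
        tar.add(src_dir, arcname=os.path.basename(src_dir))


def extract_txz(archive: str, out_dir: str) -> None:
    """Decompress and extract the archive (like `tar xJf`)."""
    os.makedirs(out_dir, exist_ok=True)
    with tarfile.open(archive, "r:xz") as tar:
        tar.extractall(out_dir)


# Small demonstration in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    keep = os.path.join(tmp, "keep")
    os.mkdir(keep)
    with open(os.path.join(keep, "note.txt"), "w") as fh:
        fh.write("kept\n")

    archive = os.path.join(tmp, "keep.txz")
    make_txz(keep, archive)
    extract_txz(archive, os.path.join(tmp, "out"))
    print(os.listdir(os.path.join(tmp, "out", "keep")))  # ['note.txt']
```

As with any tar extraction, archives from untrusted sources should be inspected before being unpacked.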
    

    xz has supported multi-threaded compression (with the -T flag)[7] since version 5.2.0, released in 2014;[8] multi-threaded decompression has been available since version 5.4.0. Threaded decompression requires multiple compressed blocks within a stream, which the threaded compression interface creates.[7] The number of threads actually used can be lower than the number requested if the file is not big enough for threading with the given settings, or if using more threads would exceed the memory usage limit.[7]

    The xz format

    The xz format improves on lzma by allowing for preprocessing filters. The exact filters used are similar to those used in 7z, as 7z's filters are available in the public domain via the LZMA SDK.
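To illustrate what a preprocessing filter does, the sketch below uses Python's standard-library lzma module, which exposes liblzma's filter chains. It compares plain LZMA2 with a chain that runs a delta filter first. The sample data (slowly increasing 32-bit integers) is a hypothetical input chosen because it suits a delta filter; on other data a filter may not help at all.

```python
import lzma
import struct

# Hypothetical sample: 5000 increasing 32-bit little-endian integers.
samples = struct.pack("<5000I", *range(1000, 6000))

plain_chain = [{"id": lzma.FILTER_LZMA2, "preset": 6}]
delta_chain = [
    # Replace each byte with its difference from the byte 4 positions back,
    # so that regularly spaced values turn into long repetitive runs.
    {"id": lzma.FILTER_DELTA, "dist": 4},
    {"id": lzma.FILTER_LZMA2, "preset": 6},
]

plain = lzma.compress(samples, format=lzma.FORMAT_XZ, filters=plain_chain)
delta = lzma.compress(samples, format=lzma.FORMAT_XZ, filters=delta_chain)

print(len(samples), len(plain), len(delta))

# The filter chain is recorded in the .xz headers, so the decoder needs
# no extra arguments to undo the preprocessing.
assert lzma.decompress(delta) == samples
```

The legacy .lzma container has no place to record such a chain, which is one practical reason the filters are tied to the newer format.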

    Development and adoption

    Development of XZ Utils took place within the Tukaani Project, a small group of developers, led by Mike Kezner, who once maintained a Linux distribution based on Slackware.

    All of the source code for xz and liblzma has been released into the public domain. The XZ Utils source distribution additionally includes some optional scripts and an example program that are subject to various versions of the GPL.[1]

    Specifically, the GPL- and LGPL-licensed scripts and sources distributed with XZ Utils are:

    • An optional implementation of a common libc function, getopt (GNU LGPL v2.1)
    • An m4 script for pthread detection (GNU GPL v3)
    • Some nonessential wrapper scripts (xzgrep, etc.) (GNU GPL v2)
    • The example program scanlzma, which is not integrated with the build system

    The resulting xz and liblzma binaries are public domain, unless the optional LGPL getopt implementation is incorporated.[9]

    Binaries are available for FreeBSD, NetBSD, Linux systems, Microsoft Windows, and FreeDOS. A number of Linux distributions, including Fedora, Slackware, Ubuntu, and Debian, use xz for compressing their software packages. Arch Linux previously used xz to compress packages,[10] but as of December 27, 2019, its packages are compressed with Zstandard.[11] Fedora Linux also switched to compressing its RPM packages with Zstandard, starting with Fedora Linux 31.[12] The GNU FTP archive also uses xz.

    Supply chain attack

    On 29 March 2024, a thread[2] was published on Openwall's oss-security mailing list showing that the code of liblzma was potentially compromised. The thread's author, Andres Freund, identified compressed test files that had been added to the code to set up a backdoor, activated through additions to the configure script in the release tar files. He started his investigation because sshd was using an unusually high amount of CPU time.[13] The issue is tracked under the Common Vulnerabilities and Exposures ID CVE-2024-3094, which was issued by Red Hat following the disclosure of the vulnerability.[14]

    The malicious code is known to be in versions 5.6.0 and 5.6.1. The exploit remains dormant unless a specific third-party patch is used, but under the right circumstances it could enable a malicious actor to break sshd authentication and gain unauthorized remote access to the entire system.[15]

    The malicious mechanism consists of two compressed test files that contain the malicious binary code. These files are available in the git repository, but they remain dormant unless extracted and injected into the program.[16] The code uses the glibc IFUNC mechanism to replace RSA_public_decrypt, an OpenSSL function that OpenSSH calls, with a malicious version. OpenSSH normally does not load liblzma, but a common third-party patch used by several Linux distributions causes it to load libsystemd, which in turn loads liblzma.[16]

    A modified version of build-to-host.m4 was included in the release tar file uploaded to GitHub. It extracts a script that performs the actual injection: the script extracts the malicious code from the "test case" files and injects it into liblzma. The modified m4 file was not present in the git repository; it was only available in tar files released by the maintainer separately from git.[16] The script appears to perform the injection only when the software is being built on an x86-64 Linux system that uses glibc and GCC, and is being built via dpkg or rpm.[16]

    While the backdoor was added by Jia Tan, a maintainer of the xz project, it is unknown whether it was placed intentionally by the maintainer or whether the maintainer was compromised.[17]

    The affected Linux distributions include Debian unstable,[18] Fedora Rawhide,[19] Kali Linux,[20] and openSUSE Tumbleweed.[21] Red Hat Enterprise Linux,[22] SUSE Linux Enterprise,[21] and Amazon Linux[23] are confirmed not to be affected.

    Although Arch Linux issued an advisory telling users to update immediately, it also noted that Arch's OpenSSH package does not include the common third-party patch necessary for the backdoor.[24] An analysis of the "affected" binary on Arch did not find differences at the assembly level between the "fixed" and "affected" versions.[25][better source needed]

    FreeBSD is not affected by this attack, as all supported FreeBSD releases include versions of xz that predate the affected releases and the attack targets Linux's glibc.[26]
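The version numbers named in this section can be turned into a rough first check. The sketch below is only an illustration: the helper names are hypothetical, and the sample strings assume that `xz --version` prints a line of the form "xz (XZ Utils) 5.6.1". A version match is a hint rather than proof in either direction, because distributions may patch, rebuild, or relabel packages, so the distribution's own advisory remains the authoritative source.

```python
import re

# Versions named in this section as containing the malicious code.
AFFECTED = {(5, 6, 0), (5, 6, 1)}


def parse_version(text: str):
    """Return the first x.y.z version found in text as a tuple, or None."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    return tuple(int(part) for part in match.groups()) if match else None


def is_affected(text: str) -> bool:
    """True if the first version number in text is 5.6.0 or 5.6.1."""
    return parse_version(text) in AFFECTED


# Assumed output format of `xz --version`; adjust the input as needed.
print(is_affected("xz (XZ Utils) 5.6.1"))  # True
print(is_affected("xz (XZ Utils) 5.4.6"))  # False
```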

    References

    1. Licensing on tukaani.org: "The most interesting parts of XZ Utils (e.g. liblzma) are in the public domain. You can do whatever you want with the public domain parts. Some parts of XZ Utils (e.g. build system and some utilities) are under different free software licenses such as GNU LGPLv2.1, GNU GPLv2, or GNU GPLv3."
    2. Freund, Andres (2024-03-29). "backdoor in upstream xz/liblzma leading to ssh server compromise". oss-security mailing list.
    3. https://archlinux.org/news/the-xz-package-has-been-backdoored/
    4. Henry-Stocker, Sandra (2017-12-12). "How to squeeze the most out of Linux file compression". Network World. Retrieved 2020-02-09.
    5. "Gzip vs Bzip2 vs XZ Performance Comparison". RootUsers. 2015-09-16. Retrieved 2020-02-09.
    6. LZMA Utils, retrieved 2011-01-25
    7. "Linux Manpages Online - man.cx manual pages".
    8. XZ Utils Release Notes
    9. "In what cases is the output of a GPL program covered by the GPL too?". GNU.org. Retrieved 21 August 2019.
    10. Pierre Schmitz (2010-03-23). "News: Switching to xz compression for new packages".
    11. "Arch Linux - News: Now using Zstandard instead of xz for package compression". www.archlinux.org. Retrieved 2020-01-07.
    12. Mach, Daniel. "Changes/Switch RPMs to zstd compression". Fedora Project Wiki. Retrieved 30 March 2024.
    13. "A backdoor in xz". lwn.net. Retrieved 2024-03-30.
    14. "NVD - CVE-2024-3094". nvd.nist.gov. Retrieved 2024-03-30.
    15. "Urgent security alert for Fedora 41 and Rawhide users". www.redhat.com. Retrieved 2024-03-29.
    16. James, Sam. "xz-utils backdoor situation". Gist.
    17. Goodin, Dan (2024-03-29). "Backdoor found in widely used Linux utility breaks encrypted SSH connections". Ars Technica. Retrieved 2024-03-29.
    18. "CVE-2024-3094". security-tracker.debian.org. Retrieved 2024-03-30.
    19. "Urgent security alert for Fedora 41 and Fedora Rawhide users". www.redhat.com. Retrieved 2024-03-30.
    20. "All about the xz-utils backdoor | Kali Linux Blog". Kali Linux. 2024-03-29. Retrieved 2024-03-30.
    21. "openSUSE addresses supply chain attack against xz compression library". openSUSE News. 2024-03-29. Retrieved 2024-03-30.
    22. "cve-details". access.redhat.com. Retrieved 2024-03-30.
    23. "CVE-2024-3094". Amazon Web Services, Inc. Retrieved 2024-03-30.
    24. "Arch Linux - News: The xz package has been backdoored". archlinux.org. Retrieved 2024-03-30.
    25. "Re: backdoor in upstream xz/liblzma leading to ssh server compromise". openwall.com. Retrieved 2024-03-30.
    26. "Disclosed backdoor in xz releases - FreeBSD not affected". Retrieved 2024-03-30.
