CVE-2026-31431 “Copy Fail”: Analysis, Impact and Mitigation Strategies

What Is the Copy Fail Linux Vulnerability (CVE-2026-31431)?

CVE-2026-31431, publicly known as Copy Fail, is a local privilege escalation (LPE) vulnerability in the Linux kernel, rated High with a CVSS 3.1 score of 7.8. The flaw affects the AF_ALG cryptographic path for AEAD operations and lets a local unprivileged user trigger a controlled 4-byte write into the page cache of readable files, which can lead to root access and container escape on shared infrastructure.

Unlike classic vulnerabilities such as Dirty COW, which depend on race conditions, Copy Fail is a direct and reproducible logic flaw. That characteristic increases operational risk in cloud, multi-tenant, Kubernetes, and CI/CD environments, where different workloads share the same host kernel.

Introduction and context

The vulnerability resides in a logic flaw within the AEAD cryptographic implementation of the Linux kernel, specifically in algif_aead, the component that exposes the kernel’s cryptographic API to user space through AF_ALG sockets.

In most production environments, this interface is not required for normal system operation. However, it is typically enabled by default in modern distributions, which widens the attack surface and makes the vulnerability relevant to most Linux deployments.

Technical analysis of the vulnerability

The problem originates from the interaction of three components that, on their own, are legitimate:

  • The splice() system call, which transfers data by passing direct references to page cache pages without copying them, an optimization widely used in the kernel.
  • An optimization introduced in 2017 in algif_aead that allows decryption operations to be performed in-place, assuming that it is safe to reuse the same memory as source and destination.
  • The authencesn algorithm, the only one in the kernel that uses the destination buffer as temporary working space during the cryptographic operation.

When these three elements coincide in a single operation, the kernel performs a 4-byte write outside the bounds of the allocated buffer, overwriting page cache pages that belong to read-only files.

The most relevant behavior of this corruption is that the modification occurs in shared RAM and is not synchronized to disk. In practice, the file on disk may remain intact while the binary executed from the page cache is modified. For integrity verification tools based solely on the file on disk, the system may appear to be clean.

Impact on Kubernetes, containers, and multi-tenant environments

For Kubernetes clusters and multi-tenant platforms, Copy Fail represents a high risk. The page cache is managed by the host and is shared among the different workloads running on the same kernel.

The exploitation scenario is straightforward. An attacker who achieves code execution in an unprivileged container, whether through a vulnerable application, a compromised dependency, or leaked credentials, can corrupt the page cache copy of a shared SUID binary. By invoking that binary, the attacker can obtain root not only within the workload, but on the vulnerable host itself.

Environments with stronger workload isolation, such as microVMs or sandboxes backed by virtualization, can significantly reduce this attack surface by preventing different workloads from sharing the same host page cache. Traditional container platforms based on runc do not provide that same level of isolation by default.

How to identify whether a Linux system is vulnerable

To verify whether the organization’s infrastructure is vulnerable to Copy Fail, the following validations are recommended:

Kernel version. Run uname -r and validate against the vendor’s advisories. According to NVD and the available public references, the affected branches and their first corrected upstream versions are:

  • 4.14 up to < 5.10.254
  • 5.11 up to < 5.15.204
  • 5.16 up to < 6.1.170
  • 6.2 up to < 6.6.137
  • 6.7 up to < 6.12.85
  • 6.13 up to < 6.18.22
  • 6.19 up to < 6.19.12

Branches prior to 4.14 fall outside the publicly cited regression. In distributions with backports, the final validation must be done against the vendor advisory and not solely by version number.
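
The range check above can be sketched in Python as follows. This is a minimal illustration, not the logic of any official tool: the names AFFECTED_RANGES, parse_kernel_version, and in_affected_range are hypothetical, and the table is copied from the list above. It compares upstream version numbers only, so a distribution kernel with a backported fix can still be reported as in range.

```python
import re

# (first affected version, first fixed upstream version), copied from the list above.
AFFECTED_RANGES = [
    ((4, 14, 0), (5, 10, 254)),
    ((5, 11, 0), (5, 15, 204)),
    ((5, 16, 0), (6, 1, 170)),
    ((6, 2, 0), (6, 6, 137)),
    ((6, 7, 0), (6, 12, 85)),
    ((6, 13, 0), (6, 18, 22)),
    ((6, 19, 0), (6, 19, 12)),
]


def parse_kernel_version(release):
    """Extract (major, minor, patch) from a `uname -r` style string."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch or 0))


def in_affected_range(release):
    """True if the upstream version number falls inside a listed range.

    This ignores vendor backports, so the result is only a first filter.
    """
    version = parse_kernel_version(release)
    return any(low <= version < high for low, high in AFFECTED_RANGES)
```

On a live system the input could come from platform.release(), which returns the same string as uname -r. A True result means "check the vendor advisory", not "confirmed vulnerable".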

Use of the AF_ALG interface. Identify whether legitimate applications on the system are using the affected socket by running lsof | grep AF_ALG or ss -f alg. If no process is using it, it can generally be disabled with low impact, but it is advisable to first confirm that there are no legitimate consumers dependent on this interface.

Verification of the cryptographic module. Validate how algif_aead was compiled with the command grep CONFIG_CRYPTO_USER_API_AEAD /boot/config-$(uname -r). If the result is =y, the component is built into the kernel and tools such as rmmod will not be able to unload it. In that scenario, any boot-time mitigation must be specifically evaluated and validated for the affected platform.
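
The same check can be expressed as a small Python helper that classifies the option from the text of a kernel config file. This is a sketch under assumptions: the function name aead_api_build_mode and its return labels are hypothetical, and it assumes the usual kernel config syntax (OPTION=y, OPTION=m, or a "# OPTION is not set" comment).

```python
def aead_api_build_mode(config_text):
    """Classify CONFIG_CRYPTO_USER_API_AEAD from kernel config text.

    Returns "built-in" (=y), "module" (=m), "disabled" (not set or absent),
    or "unknown" for any other value.
    """
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith("CONFIG_CRYPTO_USER_API_AEAD="):
            value = line.split("=", 1)[1]
            if value == "y":
                return "built-in"
            if value == "m":
                return "module"
            return "unknown"
    # A "# CONFIG_... is not set" comment or a missing line both mean the option is off.
    return "disabled"
```

The input would typically be the contents of /boot/config-<release> for the running kernel. A "built-in" result corresponds to the =y case described above, where unloading the module is not an option.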

At Silent4Labs we have also created a script that can help safely identify this vulnerability on Linux systems, since it makes no changes to the system.

[Figure: output of the script]

It can be found at: https://github.com/Silent4Labs/check-copyfail-cve-2026-31431

Mitigation Strategies for Copy Fail in Linux

The defensive approach requires implementing both immediate and medium-term controls.

1. Applying the Official Linux Kernel Patch

The definitive mitigation is updating the kernel. The main fix published upstream reverts the optimization introduced in 2017 and removes the chaining of page cache pages towards an in-place write path in algif_aead.

The main public reference for the fix is commit a664bf3d603d. In scenarios with manual backports or vendor-specific patch series, validation must be done against the distributor’s advisory or against the complete applied series, not assuming that an isolated change identified by name or message is sufficient.

2. Disabling the algif_aead Module

If the patch cannot be applied immediately, the publicly best-supported temporary mitigation consists of preventing algif_aead from loading, for example with an install algif_aead /bin/false rule in modprobe.d, and unloading the module if it is already active.

This action has implications that must be considered:

  • Security protocols such as SSH, disk encryption (LUKS), and kTLS are not interrupted in the typical case, because they do not depend on this user-space interface.
  • Hardware cryptographic acceleration through this API becomes disabled.
  • Applications that use this interface, such as the afalg engines in OpenSSL, must be configured to fall back to software functions. Otherwise, they may fail after the reboot required for the change to take effect.

For the above reasons, it is recommended to validate the behavior in test environments before applying the change in production.
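
When validating this mitigation, two things are worth confirming: that the block rule is present in the modprobe configuration, and that the module is not currently loaded. The following Python sketch checks both from plain text. The helper names has_block_rule and is_module_loaded are hypothetical, and the code assumes the usual /proc/modules layout in which the module name is the first field on each line.

```python
def has_block_rule(modprobe_conf_text, module="algif_aead"):
    """True if the text contains an `install <module> /bin/false` rule."""
    for line in modprobe_conf_text.splitlines():
        # Drop comments, then compare the first three whitespace-separated fields.
        fields = line.split("#", 1)[0].split()
        if fields[:3] == ["install", module, "/bin/false"]:
            return True
    return False


def is_module_loaded(proc_modules_text, module="algif_aead"):
    """True if the module is listed in /proc/modules-style text."""
    return any(
        line.split()[:1] == [module] for line in proc_modules_text.splitlines()
    )
```

In practice the first function would be fed the contents of the relevant file under /etc/modprobe.d/ and the second the contents of /proc/modules. If the rule is present but the module is still loaded, the mitigation has not yet taken effect.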

As an additional technical alternative, if CONFIG_CRYPTO_USER_API_AEAD=y and the component is built into the kernel, a boot-time workaround such as initcall_blacklist=algif_aead_init may be evaluated. That option should be treated as a contingency mitigation to be validated on a case-by-case basis, not as the primary temporary recommendation.

3. Hardening Kubernetes and CI/CD Environments

For Kubernetes clusters and continuous integration (CI/CD) runners, the most recommended protection in the short term is the strict application of Seccomp policies to block the creation of AF_ALG sockets. This control interrupts the exploitation chain at its first step, without requiring changes to the host kernel.
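
As an illustration of such a policy, the Python sketch below generates a seccomp profile fragment in the JSON format used by Docker and OCI runtimes, denying socket() calls whose first argument is AF_ALG (address family 38 on Linux). The function names are hypothetical, and the allow-by-default profile is an assumption made to keep the example short. In a real deployment the rule would more likely be merged into the runtime's existing default profile. The rule also covers only the socket syscall and does not address the legacy socketcall path on 32-bit x86.

```python
import json

AF_ALG = 38  # Linux address-family number for AF_ALG


def af_alg_deny_rule():
    """Seccomp rule that makes socket(AF_ALG, ...) fail with an errno."""
    return {
        "names": ["socket"],
        "action": "SCMP_ACT_ERRNO",
        # Match only when argument 0 (the address family) equals AF_ALG.
        "args": [{"index": 0, "value": AF_ALG, "op": "SCMP_CMP_EQ"}],
    }


def build_profile():
    """Illustrative profile: allow everything except AF_ALG socket creation."""
    return {"defaultAction": "SCMP_ACT_ALLOW", "syscalls": [af_alg_deny_rule()]}


print(json.dumps(build_profile(), indent=2))
```

The printed JSON can be saved to a file and referenced as a custom seccomp profile, after testing that the workloads in question do not depend on AF_ALG.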

It is advisable to verify whether the default Seccomp profile of the runtime in use, such as Docker, containerd, or CRI-O, or Kubernetes’ RuntimeDefault already covers this restriction. In many cases it does not, so explicit hardening of the profile becomes necessary.
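
One way to check this from inside a container is to attempt to open an AF_ALG socket and observe the result. The Python sketch below does only that: it creates the socket and closes it immediately, without binding to any algorithm or exercising the vulnerable path. The function name probe_af_alg and its three return labels are hypothetical, and the mapping of EPERM or EACCES to "blocked" assumes the profile denies the call with one of those error codes.

```python
import errno
import socket


def probe_af_alg():
    """Try to create (and immediately close) an AF_ALG socket.

    Returns "allowed", "blocked", or "unsupported".
    """
    family = getattr(socket, "AF_ALG", None)
    if family is None:
        return "unsupported"  # not Linux, or Python built without AF_ALG
    try:
        sock = socket.socket(family, socket.SOCK_SEQPACKET, 0)
    except OSError as exc:
        if exc.errno in (errno.EPERM, errno.EACCES):
            return "blocked"  # consistent with a seccomp or LSM denial
        return "unsupported"  # e.g. the kernel does not provide AF_ALG
    sock.close()
    return "allowed"
```

A result of "allowed" inside a workload suggests that the active profile does not restrict AF_ALG and that explicit hardening is needed.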

Conclusions on the Copy Fail Linux Vulnerability

The analysis of Copy Fail and of the page cache modification technique shows a relevant shift in how malicious code can evade perimeter monitoring controls and disk-based file integrity controls.

The data converges on four actionable recommendations for cybersecurity, operations, and platform teams.

  1. Apply the official patch as soon as possible. The main public reference for the fix is commit a664bf3d603d, but in environments with backports or vendor patch series, the effective correction must be validated against the corresponding official advisory.
  2. Harden Seccomp profiles in container environments. Blocking the creation of AF_ALG sockets cuts the exploitation chain at its first step. This measure is applicable immediately without the need to update the host kernel.
  3. Do not rely exclusively on disk-based integrity controls. Tools that validate integrity through checksums against files on disk are blind to this type of page cache corruption.
  4. Reassess the isolation model in multi-tenant workloads. For infrastructure running workloads from multiple tenants with sensitive data, it is recommended to migrate towards environments with stronger isolation, such as microVMs or sandboxes backed by virtualization. The logical isolation of a container is not equivalent to a complete security barrier.

References

  • NVD, CVE-2026-31431: https://nvd.nist.gov/vuln/detail/CVE-2026-31431
  • CERT-EU, Security Advisory 2026-005: https://cert.europa.eu/publications/security-advisories/2026-005/
  • Ubuntu Security, CVE-2026-31431: https://ubuntu.com/security/CVE-2026-31431
  • Ubuntu Blog, available mitigations and fixes: https://ubuntu.com/blog/copy-fail-vulnerability-fixes-available
  • Technical case site copy.fail: https://copy.fail
  • Xint / Theori, main technical analysis: https://xint.io/blog/copy-fail-linux-distributions
  • Public PoC repository: https://github.com/theori-io/copy-fail-CVE-2026-31431
  • Main upstream fix on kernel.org, commit a664bf3d603d: https://git.kernel.org/stable/c/a664bf3d603dc3bdcf9ae47cc21e0daec706d7a5
  • Silent4Labs assessment script: https://github.com/Silent4Labs/check-copyfail-cve-2026-31431

Author

  • Eduardo Salmerón is a Computer Engineering graduate from FES Aragón and has a solid foundation in cybersecurity, beginning with a diploma in Information Security from UNAM and complemented by specialized training in Intrusion Testing and Incident Response. He is certified as an ECIH (Certified Incident Handler). With more than 10 years of experience, he has served as a consultant across the finance, energy, tourism, and retail sectors. He currently works as a Cyber Threat Researcher at Silent4Business.
