runc 1.1.13 and earlier, as well as 1.2.0-rc2 and earlier, can be tricked into creating empty files or directories in arbitrary locations on the host filesystem by sharing a volume between two containers and exploiting a race with os.MkdirAll. Although this can be used to create empty files, existing files will not be truncated.
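The following is a minimal sketch of why a recursive "mkdir -p" style call is dangerous on an attacker-influenced path. It assumes that the race amounts to a path component being swapped for a symlink, which the text above does not spell out. It uses Python's os.makedirs as a stand-in for Go's os.MkdirAll (both follow symlinks in intermediate components), runs entirely inside a temporary directory on a POSIX system, and does not reproduce the timing race itself. The directory names rootfs, outside and vol are hypothetical.

```python
import os
import tempfile


def demo_symlink_escape():
    """Return True if a recursive mkdir under 'rootfs' created directories outside it."""
    with tempfile.TemporaryDirectory() as base:
        rootfs = os.path.join(base, "rootfs")    # stands in for the container root
        outside = os.path.join(base, "outside")  # stands in for a host directory
        os.makedirs(rootfs)
        os.makedirs(outside)

        # Assumption: the attacker controls this component via the shared volume
        # and has replaced it with a symlink pointing outside the root.
        os.symlink(outside, os.path.join(rootfs, "vol"))

        # A naive recursive mkdir resolves the symlink while walking the path.
        os.makedirs(os.path.join(rootfs, "vol", "a", "b"), exist_ok=True)

        # The new directories were created under 'outside', not under 'rootfs'.
        return os.path.isdir(os.path.join(outside, "a", "b"))


escaped = demo_symlink_escape()
```

In this sketch the symlink is in place before the call. In the attack described above, the second container would have to win a race to change the path while runc is creating it.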
An attacker must have the ability to start containers using some kind of custom volume configuration. Containers using user namespaces are still affected, but the set of locations where an attacker can create inodes is significantly reduced. Sufficiently strict LSM policies (SELinux/AppArmor) can also in principle block this attack. We suspect the industry-standard SELinux policy may restrict this attack's scope, but the exact scope of protection hasn't been analysed.
This is exploitable using runc directly as well as through Docker and Kubernetes.
The CVSS score for this vulnerability is CVSS:3.1/AV:L/AC:L/PR:N/UI:R/S:C/C:N/I:L/A:N (Low severity, 3.6).
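As a cross-check of the score above, here is a short sketch of the CVSS v3.1 base score calculation for this vector. The metric weights and the rounding rule are taken from the public CVSS v3.1 specification, not from this advisory. The function only covers the Scope: Changed case used here, and the function names are illustrative.

```python
import math


def roundup(value):
    """Round up to one decimal place, following the CVSS v3.1 Roundup definition."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score_scope_changed(av, ac, pr, ui, c, i, a):
    """CVSS v3.1 base score for a vector with Scope: Changed (S:C)."""
    iss = 1 - (1 - c) * (1 - i) * (1 - a)
    impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    exploitability = 8.22 * av * ac * pr * ui
    if impact <= 0:
        return 0.0
    return roundup(min(1.08 * (impact + exploitability), 10))


# Weights for AV:L / AC:L / PR:N / UI:R / C:N / I:L / A:N
score = base_score_scope_changed(
    av=0.55, ac=0.77, pr=0.85, ui=0.62, c=0.0, i=0.22, a=0.0
)
print(score)  # 3.6
```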
Using user namespaces restricts this attack significantly: the attacker can only create inodes in directories that the remapped root user/group has write access to. Unless the root user is remapped to an actual user on the host (such as with rootless containers that don't use /etc/sub[ug]id), in practice this means that an attacker would only be able to create inodes in world-writable directories.
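To make the reasoning concrete, the sketch below models the classic owner/group/other permission check for creating an entry in a directory. It is a deliberate simplification: it ignores capabilities, supplementary groups, ACLs and LSM policy. The uid/gid value 100000 is a hypothetical remapped root ID that matches no real account on the host.

```python
import stat


def can_create_in(dir_mode, dir_uid, dir_gid, uid, gid):
    """Simplified check: may (uid, gid) create an entry in a directory with this mode?

    Creating an inode in a directory needs both write and search (execute)
    permission on it. Only the classic mode bits are modelled here.
    """
    if uid == dir_uid:
        needed = stat.S_IWUSR | stat.S_IXUSR
    elif gid == dir_gid:
        needed = stat.S_IWGRP | stat.S_IXGRP
    else:
        needed = stat.S_IWOTH | stat.S_IXOTH
    return (dir_mode & needed) == needed


REMAPPED_ROOT = 100000  # hypothetical host ID that container root is mapped to

# A root-owned 0755 directory (typical for system directories): not writable.
system_dir = can_create_in(0o755, 0, 0, REMAPPED_ROOT, REMAPPED_ROOT)

# A world-writable, sticky 1777 directory (typical for /tmp): writable.
tmp_dir = can_create_in(0o1777, 0, 0, REMAPPED_ROOT, REMAPPED_ROOT)
```

Under these assumptions only the world-writable directory remains reachable, which matches the practical limit described above.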
A strict enough SELinux or AppArmor policy could in principle also restrict the scope if a specific label is applied to the runc runtime, though we haven't thoroughly tested how far the standard existing policies block this attack, or which exact policies are needed to restrict it sufficiently.
Fixed in runc v1.1.14 and v1.2.0-rc3.
main patches:
release-1.1 patches:
Thanks to Rodrigo Campos Catelin (@rata) and Alban Crequy (@alban) from Microsoft for discovering and reporting this vulnerability.