In the Linux kernel, the following vulnerability has been resolved:
fork: do not invoke uffd on fork if error occurs
Patch series "fork: do not expose incomplete mm on fork".
During fork we may place the virtual memory address space into an inconsistent state before the fork operation is complete.
In addition, we may encounter an error during the fork operation that indicates that the virtual memory address space is invalidated.
As a result, we should not be exposing it in any way to external machinery that might interact with the mm or VMAs, machinery that is not designed to deal with incomplete state.
We specifically update the fork logic to defer khugepaged and ksm to the end of the operation and only to be invoked if no error arose, and disallow uffd from observing fork events should an error have occurred.
This patch (of 2):
Currently on fork we expose the virtual address space of a process to userland unconditionally if uffd is registered in VMAs, regardless of whether an error arose in the fork.
This is performed in dupuserfaultfdcomplete() which is invoked unconditionally, and performs two duties - invoking registered handlers for the UFFDEVENTFORK event via dupfctx(), and clearing down userfaultfdforkctx objects established in dupuserfaultfd().
This is problematic, because the virtual address space may not yet be correctly initialised if an error arose.
The change in commit d24062914837 ("fork: use _mtdup() to duplicate maple tree in dup_mmap()") makes this more pertinent as we may be in a state where entries in the maple tree are not yet consistent.
We address this by, on fork error, ensuring that we roll back state that we would otherwise expect to clean up through the event being handled by userland and perform the memory freeing duty otherwise performed by dupuserfaultfdcomplete().
We do this by implementing a new function, dupuserfaultfdfail(), which performs the same loop, only decrementing reference counts.
Note that we perform mmgrab() on the parent and child mm's, however userfaultfdctxput() will mmdrop() this once the reference count drops to zero, so we will avoid memory leaks correctly here.