CVE-2024-50066
Description
In the Linux kernel, the following vulnerability has been resolved:

mm/mremap: fix move_normal_pmd/retract_page_tables race

In mremap(), move_page_tables() looks at the type of the PMD entry and the specified address range to figure out by which method the next chunk of page table entries should be moved.

At that point, the mmap_lock is held in write mode, but no rmap locks are held yet. For PMD entries that point to page tables and are fully covered by the source address range, move_pgt_entry(NORMAL_PMD, ...) is called, which first takes rmap locks, then does move_normal_pmd(). move_normal_pmd() takes the necessary page table locks at source and destination, then moves an entire page table from the source to the destination.

The problem is: The rmap locks, which protect against concurrent page table removal by retract_page_tables() in the THP code, are only taken after the PMD entry has been read and it has been decided how to move it. So we can race as follows (with two processes that have mappings of the same tmpfs file that is stored on a tmpfs mount with huge=advise); note that process A accesses page tables through the MM while process B does it through the file rmap:

```
process A                      process B
=========                      =========
mremap
  mremap_to
    move_vma
      move_page_tables
        get_old_pmd
        alloc_new_pmd
                *** PREEMPT ***
                               madvise(MADV_COLLAPSE)
                                 do_madvise
                                   madvise_walk_vmas
                                     madvise_vma_behavior
                                       madvise_collapse
                                         hpage_collapse_scan_file
                                           collapse_file
                                             retract_page_tables
                                               i_mmap_lock_read(mapping)
                                               pmdp_collapse_flush
                                               i_mmap_unlock_read(mapping)
        move_pgt_entry(NORMAL_PMD, ...)
          take_rmap_locks
          move_normal_pmd
          drop_rmap_locks
```

When this happens, move_normal_pmd() can end up creating bogus PMD entries in the line `pmd_populate(mm, new_pmd, pmd_pgtable(pmd))`. The effect depends on arch-specific and machine-specific details; on x86, you can end up with physical page 0 mapped as a page table, which is likely exploitable for user->kernel privilege escalation.

Fix the race by letting process B recheck that the PMD still points to a page table after the rmap locks have been taken. Otherwise, we bail and let the caller fall back to the PTE-level copying path, which will then bail immediately at the pmd_none() check.

Bug reachability: Reaching this bug requires that you can create shmem/file THP mappings - anonymous THP uses different code that doesn't zap stuff under rmap locks. File THP is gated on an experimental config flag (CONFIG_READ_ONLY_THP_FOR_FS), so on normal distro kernels you need shmem THP to hit this bug. As far as I know, getting shmem THP normally requires that you can mount your own tmpfs with the right mount flags, which would require creating your own user+mount namespace; though I don't know if some distros maybe enable shmem THP by default or something like that.

Bug impact: This issue can likely be used for user->kernel privilege escalation when it is reachable.
POC
Reference
- https://www.vicarius.io/vsociety/posts/cve-2024-50066-kernel-detection-vulnerability
- https://www.vicarius.io/vsociety/posts/cve-2024-50066-kernel-mitigation-vulnerability