In PGO continuous mode, we mmap the profile file into shared memory, which
allows multiple processes to update the same counters concurrently.
The -fprofile-update=atomic option forces the counter increments to be atomic,
but the counters are always 64 bits wide (under both -m32 and -m64), so in
32-bit mode the atomic operations become function calls into libatomic.a, and
those functions use locks.
The locks used by the lock-based libatomic.a functions are per-process, so two
processes still race on the same shared memory: each acquires only its own lock.
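The hazard can be illustrated with a small standalone sketch. This is not the
profile runtime's code: the function names are hypothetical, and an anonymous
MAP_SHARED mapping stands in for the mmap'ed profile file. It assumes a POSIX
system with C11 <stdatomic.h>. The total is only guaranteed to be exact when
64-bit atomics are lock-free on the target; when they fall back to per-process
locks, updates from different processes can be lost.

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS on glibc in strict modes */
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 when 64-bit atomics are always lock-free
 * on this target. ATOMIC_LLONG_LOCK_FREE is 2 for "always lock-free";
 * long long is assumed to be the 64-bit type here. */
int counters_are_lock_free(void) {
    return ATOMIC_LLONG_LOCK_FREE == 2;
}

/* Hypothetical helper: forks nprocs children, each of which atomically adds
 * 1 to a counter in shared memory iters times. Returns the final counter
 * value, or UINT64_MAX if the mapping or a fork failed. */
uint64_t bump_shared_counter(int nprocs, uint64_t iters) {
    assert(nprocs > 0);

    /* Anonymous shared mapping standing in for the mmap'ed profile file. */
    _Atomic uint64_t *counter =
        mmap(NULL, sizeof *counter, PROT_READ | PROT_WRITE,
             MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (counter == MAP_FAILED)
        return UINT64_MAX;
    atomic_init(counter, 0);

    int started = 0;
    for (int p = 0; p < nprocs; p++) {
        pid_t pid = fork();
        if (pid < 0)
            break;
        if (pid == 0) {
            /* Child: with lock-free atomics this is one hardware RMW per
             * increment; with a lock-based fallback, the lock lives in
             * this process only and does not exclude the siblings. */
            for (uint64_t i = 0; i < iters; i++)
                atomic_fetch_add_explicit(counter, 1, memory_order_relaxed);
            _exit(0);
        }
        started++;
    }

    /* Reap every child that was actually started. */
    for (int p = 0; p < started; p++)
        wait(NULL);

    uint64_t total = atomic_load(counter);
    munmap((void *)counter, sizeof *counter);
    return started == nprocs ? total : UINT64_MAX;
}
```

Calling bump_shared_counter(2, 100000) from a main() should yield 200000 on a
target where counters_are_lock_free() returns 1. On a target where it returns
0, the program may additionally need to link against libatomic, and the total
is not guaranteed.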