Remove unnecessary explicit memory barriers on ARM #14567
Merged
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Reading the LLVM atomics documentation, the notes for code generation state that weak architectures should generate memory barriers, which implies that all memory orders, except for relaxed, should always generate the necessary memory barriers when asked to.
Compiling and disassembling programs for ARMv6 and ARMv7 (you must specify a CPU or CPU feature to target these, otherwise LLVM defaults to ARMv4 or ARMv5), we can indeed notice that memory barriers are properly emitted by the LLVM code generation. For example the assembly contains
dmb ish
for ARMv7 andmcr 15, 0, r1, cr7, cr10, {5}
for ARMv6.Compiling for older ARMv4 or ARMv5 the atomics call the
__sync_*
symbols fromlibgcc
where memory ordering is a noop (not supported by the micro architecture).This patch thus removes the explicit memory barriers. They duplicate what LLVM backends are already doing (optimized away in some cases, but there may be edge cases where it's not), and this noticeably simplifies our code.