Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove unnecessary explicit memory barriers on ARM #14567

Conversation

ysbaddaden
Copy link
Contributor

@ysbaddaden ysbaddaden commented May 6, 2024

Reading the LLVM atomics documentation, the notes for code generation state that weak architectures should generate memory barriers, which implies that all memory orders, except for relaxed, should always generate the necessary memory barriers when asked to.

Compiling and disassembling programs for ARMv6 and ARMv7 (you must specify a CPU or CPU feature to target these, otherwise LLVM defaults to ARMv4 or ARMv5), we can indeed notice that memory barriers are properly emitted by the LLVM code generation. For example the assembly contains dmb ish for ARMv7 and mcr 15, 0, r1, cr7, cr10, {5} for ARMv6.

Compiling for older ARMv4 or ARMv5 the atomics call the __sync_* symbols from libgcc where memory ordering is a noop (not supported by the micro architecture).

This patch thus removes the explicit memory barriers. They duplicate what LLVM backends are already doing (optimized away in some cases, but there may be edge cases where it's not), and this noticeably simplifies our code.

Reading the LLVM atomics documentation, the notes for code generation
state that weak architectures should generate memory barriers, which
implies that all memory orders, except for relaxed, should always
generate the necessary memory barriers when asked to.

Compiling and disassembling programs for ARMv6 and ARMv7 (you must
specify a CPU or CPU feature to target these), we can indeed notice that
memory barriers are properly emitted by the LLVM code generation. For
example the assembly contains `dmb ish` for ARMv7 and
`mcr 15, 0, r1, cr7, cr10, {5}` for ARMv6.

Compiling for older ARMv4 or ARMv5 the atomics call the `__sync_*`
symbols from `libgcc` where memory ordering is a noop (not supported).

This patch thus removes the explicit memory barriers as they duplicate
what LLVM backends are already doing anyway.
@ysbaddaden
Copy link
Contributor Author

This time it should really be the last time we deal with this topic 😅

The initial error was that we used a lazy set to clear atomics when we should have used explicit memory orders. Only on x86, in very specific scenarios, can we get away with an lazy set.

@straight-shoota straight-shoota added this to the 1.13.0 milestone May 6, 2024
@straight-shoota straight-shoota changed the title Fix: no need for explicit memory barriers on ARM Fix: No need for explicit memory barriers on ARM May 7, 2024
@straight-shoota straight-shoota merged commit ddd2d22 into crystal-lang:master May 7, 2024
60 checks passed
@ysbaddaden ysbaddaden deleted the fix/atomic-memory-order-on-weak-architectures branch May 7, 2024 09:27
@straight-shoota straight-shoota changed the title Fix: No need for explicit memory barriers on ARM Fix remove explicit memory barriers on ARM Jun 14, 2024
@straight-shoota straight-shoota changed the title Fix remove explicit memory barriers on ARM Remove unnecessary explicit memory barriers on ARM Jun 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

2 participants