-
Notifications
You must be signed in to change notification settings - Fork 14.1k
Enable outline-atomics by default on more AArch64 platforms
#144938
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
r? @ChrisDenton rustbot has assigned @ChrisDenton. Use |
|
These commits modify compiler targets. |
|
Whoops r? @ghost |
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
[experiment] enable outline-atomics on more aarch64 platforms try-job: arm-android try-job: dist-android try-job: dist-x86_64-freebsd try-job: dist-aarch64-windows-gnullvm try-job: dist-aarch64-apple try-job: aarch64-msvc-1 try-job: aarch64-msvc-2 try-job: dist-aarch64-msvc
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
[experiment] enable outline-atomics on more aarch64 platforms try-job: arm-android try-job: dist-android try-job: dist-x86_64-freebsd try-job: dist-aarch64-windows-gnullvm try-job: dist-aarch64-apple try-job: aarch64-msvc-1 try-job: aarch64-msvc-2 try-job: dist-aarch64-msvc
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
a25e101 to
e20add7
Compare
This comment was marked as outdated.
This comment was marked as outdated.
This comment has been minimized.
This comment has been minimized.
Enable outline-atomics by default on more aarch64 platforms try-job: aarch64-apple try-job: aarch64-msvc-1 try-job: aarch64-msvc-2 try-job: arm-android try-job: dist-android try-job: dist-aarch64-windows-gnullvm try-job: dist-aarch64-apple try-job: dist-aarch64-msvc try-job: dist-various try-job: dist-x86_64-freebsd
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
|
I think this is good for fuchsia. I'm going to run it by some folks tomorrow morning. |
|
No concerns from other folks on fuchsia. Thanks for working on this! |
| )] | ||
| static RUST_LSE_INIT: extern "C" fn() = { | ||
| extern "C" fn init_lse() { | ||
| use crate::arch; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment is actually for line 37 below. Shouldn't the comment be updated to refer to the renamed file?
-// This is provided by compiler-builtins::aarch64_linux.
+// This is provided by compiler-builtins::aarch64_outline_atomics.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for catching this! Updated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I verified that __rust_enable_lse is run on Apple platforms when having -lse,+outline-atomics.
It isn't actually used though, the code will always fall back to the slower, non LSE implementation (regardless of whether the current CPU actually supports it), since try_lse_op! is gated with .arch_extension lse, which doesn't seem to be enabled on Apple platforms?
And if I just remove it, I get:
error: ADR/ADRP relocations must be GOT relative
|
note: instantiated into assembly here
--> <inline asm>:6:14
|
6 | adrp x16, __ZN17compiler_builtins23aarch64_outline_atomics16HAVE_LSE_ATOMICS17h7f004a93068845c5E; nop; ldrb w16, [x16, :lo12:__ZN17compiler_builtins...
| ^
error: unknown AArch64 fixup kind!
|
note: instantiated into assembly here
--> <inline asm>:6:14
|
6 | adrp x16, __ZN17compiler_builtins23aarch64_outline_atomics16HAVE_LSE_ATOMICS17h7f004a93068845c5E; nop; ldrb w16, [x16, :lo12:__ZN17compiler_builtins...
| ^
That error seems to come from around here in AArch64MachObjectWriter.cpp, which seems to hint that the way the check is implemented isn't supported on Mach-O? Unsure.
|
Stated in the Zulip thread already, but just to be clear, this change is fine for Android as well. |
|
It might make sense to land this for some targets first to avoid waiting on target maintainers replying and create a tracking issue for the rest. |
|
☔ The latest upstream changes (presumably #149351) made this pull request unmergeable. Please resolve the merge conflicts. |
Build outline atomic symbols on all targets that have `outline-atomics` enabled, rather than only on Linux. Since this is no longer OS-specific, also rename the module.
b18a6c6 to
fe78458
Compare
|
This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed. Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers. |
fe78458 to
4401f9c
Compare
Change the gating and link sections to enable this for any platforms that enable `outline-atomics`, rather than only Linux. Additionally, no longer run this if LSE is available, since in this case the outline versions will never be called.
Windows has a similar flag `/forceInterlockedFunctions`, which uses names such as `_InterlockedAdd64_rel`.
Per LLVM commit c5e7e64 ("[AArch64][Clang][Linux] Enable out-of-line atomics by default.") [1], Clang enables these on Android. Thus, do the same in Rust. [1]: llvm/llvm-project@c5e7e649d537067de
Clang has done this by default since LLVM commit 1a963d3 ("[Driver] Make -moutline-atomics default for aarch64-fuchsia targets"), [1], so do the same here. [1]: llvm/llvm-project@1a963d3
Clang has recently started doing this, as of LLVM commit 5d774ec8d183
("[Driver] Enable outline atomics for OpenBSD/aarch64") [1]. Thus, do
the same here.
[1]: llvm/llvm-project@5d774ec
4401f9c to
66c150c
Compare
|
@bors2 try |
Enable `outline-atomics` by default on more AArch64 platforms try-job: aarch64-msvc-* try-job: arm-android try-job: dist-android try-job: dist-aarch64-windows-gnullvm try-job: dist-aarch64-msvc try-job: dist-various-* try-job: test-various
This comment has been minimized.
This comment has been minimized.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
|
@bors2 try |
This comment has been minimized.
This comment has been minimized.
Enable `outline-atomics` by default on more AArch64 platforms try-job: aarch64-msvc-* try-job: arm-android try-job: dist-android try-job: dist-aarch64-llvm-mingw try-job: dist-aarch64-msvc try-job: dist-various-* try-job: test-various
The baseline Armv8.0 ISA doesn't have atomics instructions, but in
practice most hardware is at least Armv8.1-A (2014), which includes
single-instruction atomics as part of the LSE feature. As a performance
optimization for these cases, GCC and LLVM have the
-moutline-atomicsflagto turn atomic operations into calls to symbols like
__aarch64_cas1_acq.These can do runtime feature detection and use the LSE instructions if
available, falling back to more portable load-exclusive/store-exclusive
loops.
Since the recent 3b50253 ("compiler-builtins: plumb LSE support
for aarch64 on linux") our builtins support this LSE optimization, and
since 6936bb9 ("Dynamically enable LSE for aarch64 rust provided
intrinsics"), std will set the flag as part of its startup code. The first
commit in this PR configures this to work on all platforms built with
outline-atomics, not just Linux.Thus, enable
outline-atomicsby default on Android, OpenBSD, Windows,and Fuchsia platforms that don't have LSE in the baseline. The feature is
already enabled on Linux. Platform-specific details are included in each
commit message.
The current implementation can still be accessed by setting
-Ctarget-feature=-outline-atomics. Setting-Ctarget-feature=+lseora relevant CPU will use the single-instruction atomics without the call
overhead. https://rust.godbolt.org/z/dsdrzszoe
Link: https://learn.arm.com/learning-paths/servers-and-cloud-computing/lse/intro/
Original Clang outline-atomics benchmarks: https://reviews.llvm.org/D91157#2435844
try-job: aarch64-msvc-*
try-job: arm-android
try-job: dist-android
try-job: dist-aarch64-llvm-mingw
try-job: dist-aarch64-msvc
try-job: dist-various-*
try-job: test-various