Skip to content

Conversation

@taiki-e
Copy link
Member

@taiki-e taiki-e commented Dec 27, 2025

This is a PR for thumbv6-none-eabi (bere-metal Armv6k in Thumb mode) which proposed to be added by rust-lang/rust#150138.

Armv6k supports atomic instructions, but they are unavailable in Thumb mode unless Thumb-2 instructions available (v6t2).

Using Thumb interworking (can be used via #[instruction_set]) allows us to use these instructions even from Thumb mode without Thumb-2 instructions, but LLVM does not implement that processing (as of LLVM 21), so this PR implements it in compiler-builtins.

The code around __sync builtins is basically copied from arm_linux.rs which uses kernel_user_helpers for atomic implementation.
The atomic implementation is a port of my atomic-maybe-uninit inline assembly code.

This PR has been tested on QEMU 10.2.0 using patched compiler-builtins and core that applied the changes in this PR and rust-lang/rust#150138 and the portable-atomic no-std test suite (can be run with ./tools/no-std.sh thumbv6-none-eabi on that repo) which tests wrappers around core::sync::atomic. (Note that the target-spec used in test sets max-atomic-width to 32 and atomic_cas to true, unlike the current rust-lang/rust#150138.) The original atomic-maybe-uninit implementation has been tested on real Arm hardware.

(Note that Armv6k also supports 64-bit atomic instructions, but they are skipped here. This is because there is no corresponding code in arm_linux.rs (since the kernel requirements increased in 1.64, it may be possible to implement 64-bit atomics there as well. see also taiki-e/portable-atomic#82), the code becomes more complex than for 32-bit and smaller atomics.)

cc @thejpster (target maintainer)

I'll undraft this PR once the target maintainer approved this approach.

};
}

atomic_rmw!(@old __sync_fetch_and_add_1, u8, |a: u8, b: u8| a.wrapping_add(b));
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The macro calls from here to L319 are the same as in arm_linux.rs (the definitions of the called macros are different), so I think it would be preferable to extract them into a separate file.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in af8d4af.

taiki-e added a commit to taiki-e/portable-atomic that referenced this pull request Dec 27, 2025
"mcr p15, #0, {zero}, c7, c10, #5"
};
}
macro_rules! asm_use_dmb {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this macro is just an asm block, but with macro expansion (and an extra {zero} register which I guess is useful somewhere). But it could probably do with a comment to explain all that because it's really non-obvious. Especially as this macro doesn't have anything to do with dmb (other than it's written as a macro that needs to be expanded).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was originally a helper for writing code that required no changes other than using the Data Memory Barrier (DMB instruction or equivalent instructions).

In this PR, it is indeed unclear, so it can be deleted.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed asm_use_dmb! macro and renamed dmb! macro to cp15_barrier! (to more clearly reflect what it does) in a1a5401.

@thejpster
Copy link

This looks OK to me, but it's a bit beyond my Arm experience.

@taiki-e taiki-e marked this pull request as ready for review December 28, 2025 13:00
Copy link
Contributor

@tgross35 tgross35 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good to me, thanks for the patch. This should wait until one of the compiler leads approves the target PR, though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: could you name this arm_thumb_sync_builtins or similar?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I think we may as well group this by functionality like we have for other areas, something like:

src/sync/
    arm_linux.rs
    thumbv6k.rs
    arm_thumb_shared.rs // this file

I'll move the aarch64 builtins there too.

Comment on lines +147 to +151
// FIXME(safety): preconditions review needed
let curval = unsafe { T::load_relaxed(ptr) };
let newval = f(curval);
// FIXME(safety): preconditions review needed
if unsafe { T::cmpxchg(ptr, curval, newval) } == curval {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you mind filling these in? Most of them are probably just related to pointer validity.

dst = in(reg) dst,
// Note: this cast must be a zero-extend since loaded value
// which compared to it is zero-extended.
old = in(reg) old as u32,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

u32::from(old), since this should never truncate

Comment on lines +69 to +79
concat!("ldrex", $suffix, " {out}, [{dst}]"), // atomic { out = *dst; EXCLUSIVE = dst }
"cmp {out}, {old}", // if out == old { Z = 1 } else { Z = 0 }
"bne 3f", // if Z == 0 { jump 'cmp-fail }
cp15_barrier!(), // fence
"2:", // 'retry:
concat!("strex", $suffix, " {r}, {new}, [{dst}]"), // atomic { if EXCLUSIVE == dst { *dst = new; r = 0 } else { r = 1 }; EXCLUSIVE = None }
"cmp {r}, #0", // if r == 0 { Z = 1 } else { Z = 0 }
"beq 3f", // if Z == 1 { jump 'success }
concat!("ldrex", $suffix, " {out}, [{dst}]"), // atomic { out = *dst; EXCLUSIVE = dst }
"cmp {out}, {old}", // if out == old { Z = 1 } else { Z = 0 }
"beq 2b", // if Z == 1 { jump 'retry }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For my own understanding: why is the barrier needed between the first ldrex/strex, but not between subsequent l/s pairs?

Comment on lines +116 to +127
// SAFETY: the caller must uphold the safety contract.
// casts are okay because $ty and $base implement the same layout.
unsafe { <$base as Atomic>::load_relaxed(src.cast::<$base>()) as Self }
}
#[inline]
unsafe fn cmpxchg(dst: *mut Self, old: Self, new: Self) -> Self {
// SAFETY: the caller must uphold the safety contract.
// casts are okay because $ty and $base implement the same layout.
unsafe {
<$base as Atomic>::cmpxchg(dst.cast::<$base>(), old as $base, new as $base)
as Self
}
Copy link
Contributor

@tgross35 tgross35 Jan 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.cast_{un,}signed() is stable now and would make the as casts more clear

Comment on lines +39 to +40
#[rustfmt::skip]
macro_rules! atomic {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think rustfmt does any changes here

@tgross35
Copy link
Contributor

tgross35 commented Jan 4, 2026

This PR has been tested on QEMU 10.2.0 using patched compiler-builtins and core that applied the changes in this PR and rust-lang/rust#150138 and the portable-atomic no-std test suite (can be run with ./tools/no-std.sh thumbv6-none-eabi on that repo) which tests wrappers around core::sync::atomic.

If you get the chance at some point, I would be rather interested in having some of your atomic test suite here. Especially the part you mention since we don't have any no-std tests set up.

(Note that Armv6k also supports 64-bit atomic instructions, but they are skipped here. This is because there is no corresponding code in arm_linux.rs (since the kernel requirements increased in 1.64, it may be possible to implement 64-bit atomics there as well. see also taiki-e/portable-atomic#82), the code becomes more complex than for 32-bit and smaller atomics.)

Assuming this would allow us to set target_has_atomic = "64", I think that would be worth it. Just not pressing, of course.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants