Improve definition of CAMLthread_local for C++ compatibility #147

Draft: MisterDA wants to merge 2 commits into trunk from thread_local

@MisterDA MisterDA commented Sep 13, 2025

Since OCaml 5, with multicore support, the runtime is written in C11 and uses thread-local storage. Only one thread-local variable (a POD) is public in the OCaml headers. It first used GCC's __thread; for portability, we then changed it to use thread_local in C23 and C++11, and _Thread_local in C11 (footnote 1). We hit the ABI incompatibility described in N2850: _Thread_local for better C++ interoperability with C. When a C++ compiler reads our header, it uses the C++ thread_local ABI to access the variable instead of the C11 _Thread_local semantics.

We hit problems on Cygwin, and I'm expecting problems with MSVC. See #14220 and #13541.
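As a rough illustration of the approach that runs into this problem, here is a minimal sketch of a header that picks the standard spelling per language. The names DEMO_THREAD_LOCAL, demo_counter and demo_bump are hypothetical and are not the OCaml runtime's actual definitions.

```c
/* Hypothetical sketch: choose the standard thread-local spelling.
   DEMO_THREAD_LOCAL is an illustrative name, not OCaml's macro. */
#if defined(__cplusplus)
  /* C++11 keyword: the variable is accessed with the C++ thread_local
     ABI, which is where the mismatch with the C definition comes from. */
  #define DEMO_THREAD_LOCAL thread_local
#elif defined(__STDC_VERSION__) && __STDC_VERSION__ >= 202311L
  /* C23: thread_local is a keyword. */
  #define DEMO_THREAD_LOCAL thread_local
#else
  /* C11 and C17: use the keyword directly, no <threads.h> needed. */
  #define DEMO_THREAD_LOCAL _Thread_local
#endif

/* Declaration as it would appear in a public header. */
extern DEMO_THREAD_LOCAL int demo_counter;

/* Definition as it would appear in exactly one C source file. */
DEMO_THREAD_LOCAL int demo_counter = 0;

/* Increment the calling thread's copy and return the new value. */
int demo_bump(void) { return ++demo_counter; }
```

Compiled entirely as C this behaves as expected. The trouble only appears when the declaration is read by a C++ compiler while the definition was compiled as C.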

Looking to fix this problem once and for all, I found N2850, which was submitted to both the C and C++ working groups. There is no definitive fix. The paper suggests either adding _Thread_local to C++, to make it easier to write common headers, or having extern "C" thread_local (in C++) follow C11 semantics for the thread-local ABI instead of C++'s. The paper also suggests moving away from the C23 thread_local spelling, to avoid confusion with C++11 thread_local, and staying with C11 _Thread_local.

I contacted the authors of the paper for advice. Here's the short answer:

The C++ committee was not supportive of adding _Thread_local in C++ and the C committee moved forward with adding thread_local as their preferred spelling despite being informed of the issue.
The C++ committee was interested in a potential solution where thread_local behaves like _Thread_local when used for variables with extern "C" linkage; however, I have some qualms with proposing that.

Lacking a standard way to get C semantics from a C++ compiler, we have to resort to compiler extensions. The compiler extensions (__thread, __declspec(thread)) always seem to be equivalent to _Thread_local. Here are my proposed changes:

  1. Use C11 _Thread_local internally, and for variables protected by CAML_INTERNALS. My preference is to use the standard keyword wherever possible.

For exposed symbols (currently only the caml_state):

  1. If using MSVC, prefer __declspec(thread).

    On Windows thread_local is implemented with __declspec(thread).

    In C++, also use [[msvc::no_tls_guard]] (footnote 2) on the declaration to avoid intermediary functions that check whether the variable has been initialized. Unfortunately, Clang doesn't support this attribute (https://github.com/llvm/llvm-project/issues/57696), so for the *-pc-windows target in C++ we have to fall back to the __thread keyword.

    In Clang, __declspec(thread) is generally equivalent in functionality to the GNU __thread keyword.

  2. Otherwise the compiler is GCC or Clang, or another compiler (nowadays Intel® oneAPI DPC++/C⁠+⁠+ Compiler and IBM Open XL C/C⁠+⁠+ for AIX are both based on LLVM), and supports __thread with GCC's semantics:

    G++ now implements the C⁠+⁠+11 thread_local keyword; this differs from the GNU __thread keyword primarily in that it allows dynamic initialization and destruction semantics. Unfortunately, this support requires a run-time penalty for references to non-function-local thread_local variables defined in a different translation unit even if they don't need dynamic initialization, so users may want to continue to use __thread for TLS variables with static initialization semantics. (GCC 4.8)

    ISO C11 thread-local storage (_Thread_local, similar to GNU C __thread) is now supported. (GCC 4.9)

As an extension, Clang-based compilers also support the C _Thread_local keyword in C⁠+⁠+ (GCC doesn't), but we can use __thread instead. Unfortunately MSVC rejects mixing __declspec(thread) and _Thread_local in declarations and definitions.
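The selection rules above could be sketched roughly as follows. This is only an illustration under assumptions: the SKETCH_* names are hypothetical, the PR's actual CAMLthread_local definition may differ, and putting the attribute in a separate macro at the start of the declaration is my own choice for the sketch.

```c
/* Hypothetical sketch of the proposed selection; all SKETCH_* names are
   illustrative and are not the macros defined by the PR. */
#if defined(_MSC_VER) && !(defined(__clang__) && defined(__cplusplus))
  /* MSVC in C or C++, and Clang targeting Windows in C. */
  #define SKETCH_TLS __declspec(thread)
#else
  /* GCC, Clang and Clang-based compilers, including Clang C++ on
     Windows, where [[msvc::no_tls_guard]] is not available. */
  #define SKETCH_TLS __thread
#endif

/* Test __cplusplus before using the namespaced attribute syntax; the
   nested #if is skipped unevaluated when the outer test fails. */
#if defined(__cplusplus) && defined(_MSC_VER) && defined(__has_cpp_attribute)
  #if __has_cpp_attribute(msvc::no_tls_guard)
    #define SKETCH_TLS_ATTR [[msvc::no_tls_guard]]
  #endif
#endif
#ifndef SKETCH_TLS_ATTR
  #define SKETCH_TLS_ATTR /* expands to nothing elsewhere */
#endif

#ifdef __cplusplus
extern "C" {
#endif
/* Public declaration, as it would appear in a shared header. */
SKETCH_TLS_ATTR extern SKETCH_TLS int sketch_state;
#ifdef __cplusplus
}
#endif

/* Definition, as it would appear in exactly one C source file. It uses
   the same spelling as the declaration, since MSVC rejects mixing. */
SKETCH_TLS int sketch_state = 7;

/* Read the calling thread's copy of the variable. */
int sketch_get(void) { return sketch_state; }
```

Using the same macro for the declaration and the definition keeps the two spellings from diverging, which matters because MSVC rejects mixing __declspec(thread) and _Thread_local.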

To make the case, here's a complete example: https://godbolt.org/z/TKxd1nbMY. It includes the output of GCC (4.9.4 and trunk), MSVC (19.38 introduced C11 atomics), and Clang (trunk, targeting x86_64-pc-windows and x86_64-*-linux), compiling as both C and C++, and shows that the generated assembly is identical. If you want to come up with a different macro dance, you can validate it against this example.

> cat <<EOF > test.h
#ifdef __cplusplus
extern "C"
#endif
__declspec(thread) int x;
EOF
> cat <<EOF >test.c
#include "test.h"
__declspec(thread) int x = 42;
EOF
> cat <<EOF >main.cpp
#include "test.h"

int get_x() { return x; }
void set_x(int i) { x = i; }

int main() {
  (void)get_x();
  set_x(1337);
  return get_x();
}
EOF
> clang --target=x86_64-pc-windows -Wall -c test.c
> clang --target=x86_64-pc-windows -Wall -c main.cpp
> clang --target=x86_64-pc-windows -Wall test.o main.o -o main.exe

Footnotes

  1. _Thread_local is a C11 keyword. The macro thread_local, defined in <threads.h>, expands to _Thread_local. If __STDC_NO_THREADS__ (footnote 3) is defined to 1, the header is not provided. In C23, thread_local becomes a keyword, and the spelling _Thread_local is discouraged.

  2. Support for the attribute can be tested with __has_cpp_attribute (standard in C++20), which is supported by all the MSVC and Clang versions that we're interested in. However, for an attribute with a namespace we first need to check for __cplusplus, as GCC until version 11 didn't support the namespace syntax in C mode.

  3. There was a bug where Microsoft implemented C11 <threads.h>, forgot to ship the header, and did not define the guard macro.

@MisterDA MisterDA changed the title Improve definition of CAMLthread_local for C++ compatibility Improve definition of CAMLthread_local for C&NoBreak;+&NoBreak;+ compatibility Sep 16, 2025
@MisterDA MisterDA changed the title Improve definition of CAMLthread_local for C&NoBreak;+&NoBreak;+ compatibility Improve definition of CAMLthread_local for C⁠+⁠+ compatibility Sep 16, 2025
@MisterDA MisterDA force-pushed the thread_local branch 3 times, most recently from f9f7c20 to 3483c77 Compare September 18, 2025 00:30
@MisterDA MisterDA force-pushed the thread_local branch 5 times, most recently from bd8f4bd to dbf630c Compare September 25, 2025 08:48