Skip to content

Conversation

@tmiw
Copy link
Collaborator

@tmiw tmiw commented Jan 2, 2023

This PR performs the following:

  1. Addresses TBD comment inside codec2_fifo.c by using memcpy to write/read to FIFO.
  2. Enables mutexes in the ctest to prevent macOS failures in test_fifo.

@drowe67
Copy link
Owner

drowe67 commented Jan 8, 2023

Thanks for taking a look at this @tmiw. As per the source code documentation these fifos should be thread safe by design (I've been using this design for 20 years). If the test failed it could be due to a bug being introduced. A mutex should not be required. But who knows, maybe recent compilers have changed this situation 🤔

  1. Does the test pass on the same machine using master?
  2. Is there is any significant performance increase using memcpy?

@tmiw
Copy link
Collaborator Author

tmiw commented Jan 9, 2023

  1. Does the test pass on the same machine using master?

It looks like it passes on master until you change CMakeLists.txt as follows:

diff --git a/CMakeLists.txt b/CMakeLists.txt
index e3fae644..6e7a6af4 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -100,8 +100,8 @@ if((NOT WIN32) AND (NOT MICROCONTROLLER_BUILD))
     set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -fPIC")
 endif()
 
-set(CMAKE_C_FLAGS_DEBUG "-g -O2 -DDUMP")
-set(CMAKE_C_FLAGS_RELEASE "-O3")
+set(CMAKE_C_FLAGS_DEBUG "-g -DDUMP")
+set(CMAKE_C_FLAGS_RELEASE "")
 
 #
 # Setup Windows/MinGW specifics here.

(basically disable optimization entirely)

After making the above change, test_fifo passes maybe 50% of the time on my M1 Mac Mini. (I had disabled optimization before while investigating other ctest failures on that platform to see if they were compiler related.)

  1. Is there is any significant performance increase using memcpy?

Using ctest -R test_fifo and optimization enabled:

On master: 0.04s
This PR, with mutex: 0.12s
This PR, without mutex: N/A, 100% failure rate

Looking more, it seems that codec2_fifo_used() uses both fifo->pin and fifo->pout. I wonder if we're getting into a situation where two threads are modifying both at the same time?

@tmiw
Copy link
Collaborator Author

tmiw commented Jan 10, 2023

I was playing around a bit tonight and noticed that there's a stdatomic.h header file. Tweaking the FIFO code a bit to use atomic loads and stores results in a 100% pass rate on the M1 Mac using memcpy (0.08s using the same execution as above). Still faster than using a mutex but slower than master. 🤔

@tmiw
Copy link
Collaborator Author

tmiw commented Jan 10, 2023

BTW, on my 2019 MacBook Pro (x86_64):

master: 0.15s
This PR (as of 9dbec98): 0.18s

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants