Use memcpy() inside codec2_fifo #375

tmiw · 2023-01-02T01:38:45Z

This PR performs the following:

Addresses TBD comment inside codec2_fifo.c by using memcpy to write/read to FIFO.
Enables mutexes in the ctest to prevent macOS failures in test_fifo.
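
For readers unfamiliar with the change, here is a minimal sketch of a memcpy-based ring buffer write and read. It is not the code in codec2_fifo.c: the struct layout, the index-based pin/pout, the fixed capacity and every `sketch_` name are assumptions made for illustration. The idea is that a transfer that crosses the end of the buffer is split into at most two memcpy calls.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_FIFO_LEN 8 /* hypothetical capacity; one slot is always kept empty */

/* Hypothetical ring buffer of shorts; not the actual struct FIFO layout. */
struct sketch_fifo {
    short buf[SKETCH_FIFO_LEN];
    int pin;  /* next index to write */
    int pout; /* next index to read */
};

/* Number of samples currently stored. */
static int sketch_used(const struct sketch_fifo *f) {
    int used = f->pin - f->pout;
    return used < 0 ? used + SKETCH_FIFO_LEN : used;
}

/* Write n samples; returns 0 on success, -1 if there is not enough room. */
static int sketch_write(struct sketch_fifo *f, const short *data, int n) {
    assert(f != NULL && data != NULL && n >= 0);
    if (n > SKETCH_FIFO_LEN - 1 - sketch_used(f)) return -1;

    int first = SKETCH_FIFO_LEN - f->pin; /* room before the wrap point */
    if (first > n) first = n;
    memcpy(&f->buf[f->pin], data, (size_t)first * sizeof(short));
    memcpy(&f->buf[0], data + first, (size_t)(n - first) * sizeof(short));

    f->pin = (f->pin + n) % SKETCH_FIFO_LEN;
    return 0;
}

/* Read n samples; returns 0 on success, -1 if fewer than n are stored. */
static int sketch_read(struct sketch_fifo *f, short *data, int n) {
    assert(f != NULL && data != NULL && n >= 0);
    if (n > sketch_used(f)) return -1;

    int first = SKETCH_FIFO_LEN - f->pout; /* samples before the wrap point */
    if (first > n) first = n;
    memcpy(data, &f->buf[f->pout], (size_t)first * sizeof(short));
    memcpy(data + first, &f->buf[0], (size_t)(n - first) * sizeof(short));

    f->pout = (f->pout + n) % SKETCH_FIFO_LEN;
    return 0;
}
```

This sketch has no synchronization at all, so it only shows the copy logic, not the threading behaviour discussed in this PR.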


drowe67 · 2023-01-08T20:15:59Z

Thanks for taking a look at this @tmiw. As per the source code documentation these fifos should be thread safe by design (I've been using this design for 20 years). If the test failed it could be due to a bug being introduced. A mutex should not be required. But who knows, maybe recent compilers have changed this situation 🤔

Does the test pass on the same machine using master?
Is there any significant performance increase using memcpy?

tmiw · 2023-01-09T17:58:25Z

Does the test pass on the same machine using master?

It looks like it passes on master until you change CMakeLists.txt as follows:

diff --git a/CMakeLists.txt b/CMakeLists.txt
index e3fae644..6e7a6af4 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -100,8 +100,8 @@ if((NOT WIN32) AND (NOT MICROCONTROLLER_BUILD))
     set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -fPIC")
 endif()
 
-set(CMAKE_C_FLAGS_DEBUG "-g -O2 -DDUMP")
-set(CMAKE_C_FLAGS_RELEASE "-O3")
+set(CMAKE_C_FLAGS_DEBUG "-g -DDUMP")
+set(CMAKE_C_FLAGS_RELEASE "")
 
 #
 # Setup Windows/MinGW specifics here.

(basically disable optimization entirely)

After making the above change, test_fifo passes maybe 50% of the time on my M1 Mac Mini. (I had disabled optimization before while investigating other ctest failures on that platform to see if they were compiler related.)

Is there any significant performance increase using memcpy?

Using ctest -R test_fifo and optimization enabled:

On master: 0.04s
This PR, with mutex: 0.12s
This PR, without mutex: N/A, 100% failure rate

Looking more, it seems that codec2_fifo_used() uses both fifo->pin and fifo->pout. I wonder if we're getting into a situation where two threads are modifying both at the same time?

tmiw · 2023-01-10T09:25:32Z

I was playing around a bit tonight and noticed that there's a stdatomic.h header file. Tweaking the FIFO code a bit to use atomic loads and stores results in a 100% pass rate on the M1 Mac using memcpy (0.08s using the same execution as above). Still faster than using a mutex but slower than master. 🤔
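
As a rough illustration of the atomic load/store idea, here is a minimal single-producer, single-consumer sketch using stdatomic.h. It is not the code from this PR: the struct, the index-based pin/pout, the choice of memory orders and every `atomic_sketch_` name are assumptions. Each operation loads each index once and stores its own index once, after the memcpy has finished.

```c
#include <assert.h>
#include <stdatomic.h>
#include <string.h>

#define ASKETCH_LEN 8 /* hypothetical capacity; one slot is always kept empty */

/* Hypothetical layout; assumes exactly one writer thread and one reader thread. */
struct atomic_sketch_fifo {
    short buf[ASKETCH_LEN];
    atomic_int pin;  /* stored only by the producer */
    atomic_int pout; /* stored only by the consumer */
};

static void atomic_sketch_init(struct atomic_sketch_fifo *f) {
    memset(f->buf, 0, sizeof f->buf);
    atomic_init(&f->pin, 0);
    atomic_init(&f->pout, 0);
}

/* Producer side: returns 0 on success, -1 if there is not enough room. */
static int atomic_sketch_write(struct atomic_sketch_fifo *f, const short *data, int n) {
    assert(f != NULL && data != NULL && n >= 0);
    int pin = atomic_load_explicit(&f->pin, memory_order_relaxed);   /* own index */
    int pout = atomic_load_explicit(&f->pout, memory_order_acquire); /* peer index */
    int used = pin - pout;
    if (used < 0) used += ASKETCH_LEN;
    if (n > ASKETCH_LEN - 1 - used) return -1;

    int first = ASKETCH_LEN - pin; /* room before the wrap point */
    if (first > n) first = n;
    memcpy(&f->buf[pin], data, (size_t)first * sizeof(short));
    memcpy(&f->buf[0], data + first, (size_t)(n - first) * sizeof(short));

    /* Publish the new samples only after they have been copied in. */
    atomic_store_explicit(&f->pin, (pin + n) % ASKETCH_LEN, memory_order_release);
    return 0;
}

/* Consumer side: returns 0 on success, -1 if fewer than n samples are stored. */
static int atomic_sketch_read(struct atomic_sketch_fifo *f, short *data, int n) {
    assert(f != NULL && data != NULL && n >= 0);
    int pout = atomic_load_explicit(&f->pout, memory_order_relaxed); /* own index */
    int pin = atomic_load_explicit(&f->pin, memory_order_acquire);   /* peer index */
    int used = pin - pout;
    if (used < 0) used += ASKETCH_LEN;
    if (n > used) return -1;

    int first = ASKETCH_LEN - pout; /* samples before the wrap point */
    if (first > n) first = n;
    memcpy(data, &f->buf[pout], (size_t)first * sizeof(short));
    memcpy(data + first, &f->buf[0], (size_t)(n - first) * sizeof(short));

    /* Release the slots only after they have been copied out. */
    atomic_store_explicit(&f->pout, (pout + n) % ASKETCH_LEN, memory_order_release);
    return 0;
}
```

The acquire/release pairing is one possible choice; the default sequentially consistent `atomic_load` and `atomic_store` would also work, possibly at some extra cost.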

tmiw · 2023-01-10T09:35:31Z

BTW, on my 2019 MacBook Pro (x86_64):

master: 0.15s
This PR (as of 9dbec98): 0.18s

M1 Mac is fast enough to require USE_MUTEX for test_fifo to consistently pass.

60ad7f8

tmiw mentioned this pull request Jan 2, 2023

Use four samples at a time for estimating corr. #374

Merged

100% pass on M1 Mac (10/10 runs) using stdatomic.h.

6b6fe2f

Fix Linux build error.

9dbec98

We should atomic load once and atomic store once per operation.

59d927e
