Conversation

@JulesLebert JulesLebert commented Apr 14, 2023

The preprocessing function that removes saturation currently only removes the samples where the signal reaches the threshold. This change also allows removing the signal around these saturation points (useful, for example, to remove parts of a recording where the cable was disconnected).

The problem with the modified clip function in pure Python is that it is very slow. I tried a vectorized approach using a convolve function from scipy.signal, but it only improves performance marginally. Therefore, I have included a numba implementation of the function that is ~10x faster on my machine and that will only run if numba is installed.

@alejoe91 alejoe91 added enhancement New feature or request preprocessing Related to preprocessing module labels Apr 14, 2023

@h-mayorquin h-mayorquin left a comment

Thanks for your contribution.

My thinking in order of descending importance:

  1. I think that the ClipRecording segment should not be modified. Right now, it is very straightforward and does its own thing very well. If we really want something like this, we can just create another RecordingSegment. In brief, I feel the price of complexity is worse than the price of a small duplication. Finally, this should have some tests.
  2. Are you sure that you need the loop for the numpy implementation? I think that you can calculate where the indexes from np.where change by more than one (np.diff) and just expand to the left and right accordingly. This would avoid the inner loop, which is the most costly part of your function. Am I missing something?
  3. I think that numba in general will play badly with our way of parallelizing with the ChunkRecordingExecutor, mainly because I am not sure whether it will require recompiling per core, which is usually a higher cost than the operation it implements. Have you tested this?
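The run-based vectorization suggested in point 2 could be sketched roughly as follows. `expand_saturated_mask` is a hypothetical helper, not code from the PR: it uses np.where/np.diff to find runs of saturated samples and only loops over runs, not individual samples.

```python
import numpy as np

def expand_saturated_mask(traces, a_max, frames_before, frames_after):
    # Hypothetical sketch of the np.diff idea: flag saturated samples, find the
    # start/stop of each saturated run, then expand each run left and right.
    saturated = np.any(traces >= a_max, axis=1)   # per-sample saturation flag
    idx = np.flatnonzero(saturated)               # indices of saturated samples
    mask = np.zeros(traces.shape[0], dtype=bool)
    if idx.size == 0:
        return mask
    # Run boundaries: where consecutive saturated indices jump by more than 1.
    breaks = np.flatnonzero(np.diff(idx) > 1)
    starts = idx[np.concatenate(([0], breaks + 1))]
    stops = idx[np.concatenate((breaks, [idx.size - 1]))]
    for s, e in zip(starts, stops):  # loop over runs, not samples
        mask[max(0, s - frames_before): e + frames_after + 1] = True
    return mask
```

The per-sample inner loop disappears; the remaining Python loop runs once per saturation event, which is usually orders of magnitude fewer iterations.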

@alejoe91 alejoe91 requested a review from samuelgarcia April 26, 2023 14:30
@alejoe91 alejoe91 assigned alejoe91 and samuelgarcia and unassigned alejoe91 May 3, 2023

alejoe91 commented Feb 3, 2025

@JulesLebert is this still being worked on?

@samuelgarcia samuelgarcia added this to the 0.104.0 milestone Dec 4, 2025
def _replace_slice_max_numba(traces, a_max, frames_before, frames_after, value_max):
    m, n = traces.shape
    to_clear = np.zeros(m, dtype=np.bool_)
    for j in range(n):

You are looping over channels, then time.
I have the intuition that the reverse would be faster due to coalescing memory access.
No?
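For context on the coalescing point: traces chunks here have shape (num_samples, num_channels) and numpy arrays are typically C-ordered, so the channel axis is the contiguous one; looping over time in the outer loop and channels in the inner loop then reads memory sequentially. The strides make this visible (illustrative only, the shapes are made up):

```python
import numpy as np

# Shape (num_samples, num_channels), C-order: the channel axis is contiguous.
traces = np.zeros((1000, 32), dtype=np.float32)

# Moving to the next sample jumps a full row (32 channels * 4 bytes = 128 bytes);
# moving to the next channel jumps only 4 bytes.
print(traces.strides)  # (128, 4)
```

So an outer loop over channels (`for j in range(n)`) makes the inner time loop stride 128 bytes per step, while the reversed nesting steps 4 bytes at a time, which is friendlier to the cache.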

@samuelgarcia

@JulesLebert
This looks good to me.
I have the intuition that we could have ms_before=None, ms_after=None by default, and if they are both None we keep the previous behavior, which used to be fast and simple.
What do you think?
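The suggested defaults could dispatch along these lines. This is a sketch with made-up names and semantics (not spikeinterface's actual API), assuming the margins are given in ms and converted to frames with the sampling frequency:

```python
import numpy as np

def clip_traces(traces, a_max, ms_before=None, ms_after=None, sampling_frequency=30000.0):
    # Hypothetical sketch of the suggested dispatch.
    if ms_before is None and ms_after is None:
        # Previous behavior: plain, fast clipping at the threshold.
        return np.clip(traces, None, a_max)
    # New behavior: blank a margin around every saturated sample.
    frames_before = int(ms_before * sampling_frequency / 1000.0)
    frames_after = int(ms_after * sampling_frequency / 1000.0)
    out = traces.copy()
    for i in np.flatnonzero(np.any(traces >= a_max, axis=1)):
        lo = max(0, i - frames_before)
        out[lo: i + frames_after + 1] = a_max
    return out
```

With both margins None the function reduces to a single `np.clip` call, so existing users pay no extra cost, and the heavier path only runs when a margin is explicitly requested.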
