@varshneydevansh varshneydevansh commented Aug 23, 2023

The basics

  • I branched from the master
  • My pull request is against master
  • My code follows the deepmind Style Guide

The Details

resolves

Proposed Changes

Similarly to #4, it would be useful to be able to back out of sampling without needing to wrap things in a thread or use an executor. I agree that often, you'd want to sample asynchronously to maximize throughput, but there are cases where the predictability and simplicity are preferable even if it comes at the expense of efficiency (e.g., in research). A timeout argument would simplify the synchronous setting without sacrificing safeguards from indefinite waiting.

Behavior Before Change

The client can indeed get blocked indefinitely if the server crashes or is terminated.

Behavior After Change

With timeout support added to client.sample, we can set a timeout for the flush operation using the timeout_ms argument.


// Buffer time to account for connection overhead
constexpr auto CONNECTION_BUFFER_TIME = std::chrono::milliseconds(200);

Contributor

The deadline must only be set if rate_limiter_timeout is set. I also just realised that it isn't quite this easy, since you can't set the deadline when using the datasets either (which is the recommended way).

Furthermore, there is a whole bunch of tests needed here to verify that all the behaviours are as expected.
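
The first point above (only set a deadline when rate_limiter_timeout is set) could look roughly like the sketch below. It is an illustration only: `ComputeDeadline` is a hypothetical helper, representing an unset timeout as `std::nullopt` is an assumption, and the sketch does not address the datasets case.

```cpp
#include <cassert>
#include <chrono>
#include <optional>

// Buffer time to account for connection overhead (value taken from the diff).
constexpr auto kConnectionBufferTime = std::chrono::milliseconds(200);

// Hypothetical helper, not part of the actual code base.
// Returns the RPC deadline to use, or std::nullopt when no deadline should be
// set because the caller did not provide a rate limiter timeout.
std::optional<std::chrono::system_clock::time_point> ComputeDeadline(
    std::optional<std::chrono::milliseconds> rate_limiter_timeout,
    std::chrono::system_clock::time_point now) {
  // Assumption: "not set" is modelled as an empty optional.
  if (!rate_limiter_timeout.has_value()) return std::nullopt;
  return now + *rate_limiter_timeout + kConnectionBufferTime;
}
```

Passing `now` in explicitly keeps the helper deterministic, which should make the behaviour tests requested above easier to write.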

Author

You are right, I overlooked the if-clause that checks whether rate_limiter_timeout is set.

As for the “datasets” method: does this mean that while using “datasets” we cannot set a deadline for the RPC call, as it might be getting transferred? Am I getting this right?

Contributor

Yes, the normal behaviour is to wait for the connection to be established and only break out if the rate limiter has blocked for the specified amount of time.

Basically I think there might be a need to clearly separate the concepts here and follow a pattern more similar to the writers. I.e. instead of just returning a generator from client.sample we might have to return a special object which has support for getting the next item with an optional deadline.
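
As a rough illustration of that idea, the sketch below shows an object whose "next item" call accepts an optional deadline. `TimedSampler`, `Push` and `Next` are hypothetical names, the in-memory queue stands in for whatever actually feeds samples, and none of this reflects the existing writer or sampler APIs.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <optional>
#include <utility>

// Hypothetical stand-in for the object client.sample could return.
template <typename T>
class TimedSampler {
 public:
  // Called by whatever produces samples (e.g. a background stream reader).
  void Push(T item) {
    {
      std::lock_guard<std::mutex> lock(mu_);
      items_.push_back(std::move(item));
    }
    cv_.notify_one();
  }

  // Returns the next item. Without a timeout this blocks until an item is
  // available; with a timeout it returns std::nullopt if nothing arrives
  // within the given duration.
  std::optional<T> Next(
      std::optional<std::chrono::milliseconds> timeout = std::nullopt) {
    std::unique_lock<std::mutex> lock(mu_);
    auto ready = [this] { return !items_.empty(); };
    if (timeout.has_value()) {
      if (!cv_.wait_for(lock, *timeout, ready)) return std::nullopt;
    } else {
      cv_.wait(lock, ready);
    }
    T item = std::move(items_.front());
    items_.pop_front();
    return item;
  }

 private:
  std::mutex mu_;
  std::condition_variable cv_;
  std::deque<T> items_;
};
```

Keeping the deadline on the per-item call, and not on the underlying connection, is one way to separate "waiting for a sample" from "waiting for the connection".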

Really though, this feature will be quite a lot of work and I can't guarantee that I'll be able to accept it unless it is both very cleanly implemented and guaranteed to work in all cases. How important is this feature for you? Unless it is absolutely critical I would probably recommend you not to spend more time on this.

Author

I am doing this mainly to learn and to improve my programming ability. I have been doing ML/AI work since 2017, but I left open source without even really trying. Now I am picking up what I left and looking for places where I can learn better with some guidance.

So, this is all for my learning. =)

Author

I can do this now. I will set aside some time for it :)

Development

Successfully merging this pull request may close these issues.

Feature request: timeout argument for client.sample

2 participants