-
Notifications
You must be signed in to change notification settings - Fork 113
Some changes to address blocking after a node is lost #2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
…erations. closing channel.socket so as not to orphan the socket. Using an iterator in handleIO instead of a foreach loop which isn't thread safe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this seems like the big one in my testing that clogs up the mechanism when a server node is lost suddenly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please see this change set where the bug was introduced last July
00f2e78
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, that looks like a typo/bug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
Hi, first, thanks much for your effort into fixing bugs here! Looking at your PR here, I identified two things that look indeed wrong:
For your other code, I added some additional comments that need a bit of clarification. Note that we do the actual code review through gerrit, github is just a mirror here. You can either setup a gerrit account and work through the process, or I'll do it for you and attribute you in the commit message, whatever you like more. Cheers, |
|
I don't see where to get setup with gerrit, I suppose I'd defer to you |
|
@sveesible it is described here, but seems legit that I'll do it, its quicker I guess. http://www.couchbase.com/wiki/display/couchbase/Contributing+Changes |
This should not be defaulted to 1 as this value must be 0 for the isActive() call to return true, and thus allow redistributed ops to be put into this nodes operation queue
BinaryOperationFactory was improperly setting a null get callback for the gets callback on clone of get operations, causing bad behavior under load with redistribution of bulkgets
Redistribution of write operations was just popping the operation back onto the original node and then continuing through the clone logic.
closing channel.socket to be consistent with other close logic
Using an iterator in handleIO for Selector.selectedKeys instead of a foreach loop because foreach changes stuff in the list while it loops