-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add blocking progress mode to Python async #116
base: branch-0.40
Are you sure you want to change the base?
Add blocking progress mode to Python async #116
Conversation
37ea715
to
1c1ab87
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the core of this looks fine, but I am reminded that this epoll_wait
issue with asyncio doesn't actually quite work correctly with the code we are using (see discussion here rapidsai/ucx-py#888)
# - All asyncio tasks that isn't waiting on UCX must be executed | ||
# so that the asyncio's next state is epoll wait. | ||
# See <https://github.com/rapidsai/ucx-py/issues/413> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although I think I understand the constraint, I am not sure what it means for the "next state" to be epoll_wait. Surely there can be arbitrary non-ucx tasks?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm gonna be honest and say my understanding here is also a bit fuzzy, and as you noted yourself this "doesn't work" (except when it does), with the original being adapted from https://stackoverflow.com/a/48491563. In my understanding, what epoll_wait
refers to here is the socket state, which is there solely to provide a mechanism to prevent asyncio
from running out of "ready" tasks, so epoll_wait
will ensure that if nothing useful happens in the event loop, asyncio
will still be awaken at some point to allow the loop to reevaluate.
|
||
if self.worker.arm(): | ||
# At this point we know that asyncio's next state is | ||
# epoll wait. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Who does the epoll_wait
call?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As per my #116 (comment), I believe this is related to the sockets.
Yes, I remember that. The purpose of this is not necessarily to be used long-term, but rather to have a fallback to original UCX-Py behavior. This will allow us testing as close as possible with what UCX-Py did in the past, and we may later deprecate/remove this if we are confident we have a better option (e.g., |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanations!
52e4d1d
to
91ab7bf
Compare
… python-async-blocking-mode
… python-async-blocking-mode
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had some small docstring comments but I still think this looks good, is there anything else to do here?
The blocking progress mode ensure the worker is progress whenever the UCX | ||
worker reports an event on its epoll file descriptor. In certain | ||
circumstances the epoll file descriptor may not |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Grammar suggestion:
Blocking progress mode ensures that the worker is progressed whenever the UCX worker reports and event on its epoll file descriptor.
*nit8: The second sentence appears to be incomplete.
Implements the blocking progress mode (UCX-Py default), which was still not implemented in UCXX.