Apply custom function with Dataframes as arguments #8419
Unanswered
christosmak asked this question in Q&A
Replies: 1 comment
It might be possible to get good speed-up with |
Hello everyone,
I have successfully parallelized my code with Dask, but I think there is still room to optimize its performance. I am quite new to Dask and to Python in general, so I am struggling to work out from the Dask documentation how to address the issue I am having.
To begin with, here are some lines of code:
My main issue is that Dask takes too long to apply the function to temp, even on a machine with 24 processors. The same code in R (using parLapply from the built-in parallel package) takes one third of the time. I suspect that I am distributing work to the Client incorrectly. I have tried other Dask features such as client.submit and client.scatter, but could not get either of them to work.
Could there be a more efficient way to do this using submit or scatter?
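For reference, the submit/scatter pattern being asked about is usually written along the lines below. This is only a minimal sketch, not the code from this post: the function `summarise_group`, the column names `group` and `value`, and the tiny example DataFrame are all hypothetical stand-ins. It assumes the custom function takes one small argument plus a large DataFrame that is the same for every call, so the DataFrame is scattered to the workers once and each task receives a future pointing at it.

```python
import pandas as pd
from dask.distributed import Client


def summarise_group(group_id, frame):
    # Hypothetical stand-in for the custom function: it receives one small
    # argument (group_id) and the full DataFrame, and returns a small result.
    subset = frame[frame["group"] == group_id]
    return group_id, float(subset["value"].sum())


def run_with_scatter(frame, group_ids):
    # processes=False runs the workers as threads inside this process so the
    # sketch runs anywhere; a real 24-core setup would likely use processes.
    # dashboard_address=None just disables the diagnostics dashboard.
    with Client(processes=False, dashboard_address=None) as client:
        # Send the DataFrame to the workers once, instead of serializing it
        # again with every submitted task.
        frame_future = client.scatter(frame, broadcast=True)
        # One task per group; the future is resolved to the DataFrame on the worker.
        futures = [
            client.submit(summarise_group, group_id, frame_future)
            for group_id in group_ids
        ]
        # Collect the (group_id, total) pairs back on the client.
        return dict(client.gather(futures))


# Hypothetical example data
example = pd.DataFrame(
    {"group": [0, 0, 1, 1, 2], "value": [1.0, 2.0, 3.0, 4.0, 5.0]}
)
totals = run_with_scatter(example, [0, 1, 2])
```

Whether this is faster than the current approach depends on how large the DataFrame is and how long each call runs; if each call is very short, batching several items into one task may matter more than scattering.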
Thank you in advance.