-
-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Detect cycle during projectivize #10877
Detect cycle during projectivize #10877
Conversation
Discussed with @kadarakos : I'll make some changes, so putting it in draft temporarily. |
Communicate presence of cycles through an output argument.
The has_cycle pointer is too easy to misuse. Ideally, we would have a sum type like Rust's `Result` here, but C is not there yet.
We are now explicitly checking for cycles, so the algorithm must always terminate. Either we encounter the head, we find a root, or a cycle.
|
||
|
||
cdef bool _is_nonproj_arc(int tokenid, const vector[int]& heads) nogil: | ||
cdef pair[bool, bool] _is_nonproj_arc(int tokenid, const vector[int]& heads) nogil: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we could use C-enumerations (or even C 11 enum class
for additional type-safety, if Cython supports it) in place of pair[bool, bool]
? That would be more readable/explicit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would be possible from the pair[bool, bool]
functions, but not for pair[int, bool]
(unless we hack the semantics like before). I am not sure if I like using different types for different related functions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was saying to @kadarakos, I wish we had sum types 😭.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After discussion with @shadeMe and @kadarakos, redid it with C exceptions. Now we have to decide which of the TMTOWTDI's we want.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was telling @danieldk that I think its nice to have the exception, because even though it might be an overkill here, we could use it to refactor maybe some of the clunky errors with raise from cython
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be nice in some places, but it also means that we have to be more mindful about except
signatures. I found out today if you call a noexcept
function (can throw a C exception) in a cdef nogil
function, the exception is swallowed? So every place we call a noexcept
function, we have to to take care that we have an except
specification. Also, the exception is not propagated as a C exception, but as a Python exception from that point, so you need something like except *
or except -1
(depending on your return value).
There are probably places in our code base where we are calling functions that may throw (except
) in cdef nogil
functions, where we do not take proper care of propagating the exception.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am still not sure if this is a net improvement over returning errors as values and maybe having a template class for errors. At least, something like
cdef Result[int, string] myfunction(...) nogil
is harder to mis than an except
that is only visible in some pxd
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm. What do you think of having always tuple[Any, bool, bool, bool, ....]
where the Any
is the actual value and the bool
s refer to various errors that can be encountered?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That has very weak semantics. The benefit of the Result
approach is that the error could be anything strongly-typed (like an enum class or string).
So what's left now is to add some cycle-detection test cases to |
Yeah and |
Description
Attempting to detect cycles during
projectivize
step inget_aligned_parse
. Fixes #10859.Types of change
Small bugfix
Checklist