Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

inconcistencies around end of stream reading #1017

Open
inolen opened this issue Oct 3, 2024 · 6 comments
Open

inconcistencies around end of stream reading #1017

inolen opened this issue Oct 3, 2024 · 6 comments

Comments

@inolen
Copy link
Contributor

inolen commented Oct 3, 2024

Hey @PetteriAimonen, wanted to bring this issue up that I hit while working on #1008.

Right now, the main decode loop looks like -

while (stream->bytes_left) {
  if (!decode_tag(&eof)) {
    if (eof) break;
    else return false;
  }
  
  ... decode field ...
}

... handle misc errors, e.g. required fields ...

There's a couple of issues I noticed here -

  1. This doesn't behave consistent when callbacks are used. Streams not using callbacks will break out of this when bytes_left is 0 before attempting to call decode_tag and setting the "end-of-stream" error, meanwhile with callbacks the callback will be called, which will in turn set bytes_left to 0 / return false and ultimately set the "io error" error in pb_readbyte before the loop breaks. This means things like required errors won't be set correctly when callbacks are used as there is already an error on the stream set.

  2. Callbacks setting bytes_left to 0 feels weird, since pb_decode.c otherwise manages that. Perhaps adopting the read() behavior and returning an integer of bytes read / 0 (eof) / -1 (error) would make handling this state easier.

  3. Using stream->bytes_left as the loop condition feels flawed with callbacks who set the stream's size to some non-exact size. It feels like decode_tag could just return false indicating either an eof or error, and decode_inner could return stream->errmsg == NULL so you end up with -

while (decode_tag()) {
  ... decode field ...
 }
 
 return stream->errmsg == NULL;
@PetteriAimonen
Copy link
Member

I agree that this should probably be reviewed. I think the current design is somewhat a result of #278.

@inolen
Copy link
Contributor Author

inolen commented Oct 4, 2024

@PetteriAimonen what are the restrictions around modifying the callback interface (e.g. to return n / 0 / -1) instead of a bool?

@PetteriAimonen
Copy link
Member

It affects user code, so it should only be done after good consideration. On the other hand, 1.0.0 would be the correct time to do the change if it is worthwhile.

Using signed return value also introduces annoying edge case if the data size exceeds (or claims to exceed, such as in malicious bytes payload) 2 GB.

@inolen
Copy link
Contributor Author

inolen commented Oct 5, 2024

AFAIK, a ssize_t can always hold a size_t, right? The read() system call returns an ssize_t and takes an size_t for count. So as long as we use both, there shouldn't be any edge cases to have to manually handle.

Edit: that can't be right, sizeof(ssize_t) and sizeof(size_t) are the same on my machine. It would need to be handled, however, wouldn't the malicious code ultimately just cause an error to be returned when the return overflows?

Edit 2: Just realized ssize_t isn't standard, but we already define our own pb_size_t / pb_ssize_t that could be used by the callbacks.

@PetteriAimonen
Copy link
Member

To me it feels like the issues might be solvable without such a large change. I strongly prefer API stability when possible.

But I have to admit I haven't really looked into this yet.

@inolen
Copy link
Contributor Author

inolen commented Oct 6, 2024

Totally understood re: stable API.

The fundamental problem here is that for each read, there are 3 possible results -

  1. success
  2. eof
  3. error

With the bool, we can return only one. If a correct bytes_left was set for every stream, the bool would be sufficient as it could be used to determine eof leaving the bool to determine success / error. This is normally the case for buffer streams, but not for callbacks (e.g. in basic_stream/decode_stream it sets SIZE_MAX for the bytes_left).

Since I think the calls to read are always a specific size (vs some max buffer size with the intent of draining the stream) using an enum for a return value (OK, EOF, ERR) would also be sufficient it seems.

Edit: and the way the callbacks work around this to some degree is as stated in the original issue. When they return false, they also set bytes_left to zero on eof. Then further up the callstack at https://github.com/nanopb/nanopb/blob/master/pb_decode.c#L178C17-L178C18 it check if bytes_left is 0. However, at this point an error has already been set and any subsequent actual errors aren't reported properly. This could be worked around by modifying https://github.com/nanopb/nanopb/blob/master/pb_decode.c#L108 to only call PB_RETURN_ERROR if bytes_left != 0, and just returning false if bytes_left == 0. I think that would fix the issue, but the behavior isn't very straight forward.

inolen added a commit to inolen/nanopb that referenced this issue Nov 15, 2024
This is part of the upcoming fast decode commit. The fast path didn't compare
against bytes_left in the main loop, which caused a number of tests to fail
as unexpected end-of-stream errors were raised on the stream.

This change hoists the check for bytes_left from pb_decode_varint32_eof,
up into pb_decode_tag, such that eof is set before an error is raised
when called with bytes_left == 0. This simplifies the main loop from -

while (stream->bytes_left)
    if (!pb_decode_tag())
        ...

to -

while (pb_decode_tag())
    ...

This is also important because bytes_left isn't guaranteed to be valid for callback
streams. This created an edge case where callback streams would set an "io error"
when eof was hit, regardless of if decoding was successful or not. This error could
then mask actual errors generated after the main loop (e.g. missing required fields).

A workaround and accompanying regression test was added for this, and is being
further tracked in issue nanopb#1017.
inolen added a commit to inolen/nanopb that referenced this issue Nov 15, 2024
This is part of the upcoming fast decode commit. The fast path didn't compare
against bytes_left in the main loop, which caused a number of tests to fail
as unexpected end-of-stream errors were raised on the stream.

This change hoists the check for bytes_left from pb_decode_varint32_eof,
up into pb_decode_tag, such that eof is set before an error is raised
when called with bytes_left == 0. This simplifies the main loop from -

while (stream->bytes_left)
    if (!pb_decode_tag())
        ...

to -

while (pb_decode_tag())
    ...

This is also important because bytes_left isn't guaranteed to be valid for callback
streams. This created an edge case where callback streams would set an "io error"
when eof was hit, regardless of if decoding was successful or not. This error could
then mask actual errors generated after the main loop (e.g. missing required fields).

A workaround and accompanying regression test was added for this, and is being
further tracked in issue nanopb#1017.
PetteriAimonen pushed a commit that referenced this issue Nov 19, 2024
This is part of the upcoming fast decode commit. The fast path didn't compare
against bytes_left in the main loop, which caused a number of tests to fail
as unexpected end-of-stream errors were raised on the stream.

This change hoists the check for bytes_left from pb_decode_varint32_eof,
up into pb_decode_tag, such that eof is set before an error is raised
when called with bytes_left == 0. This simplifies the main loop from -

while (stream->bytes_left)
    if (!pb_decode_tag())
        ...

to -

while (pb_decode_tag())
    ...

This is also important because bytes_left isn't guaranteed to be valid for callback
streams. This created an edge case where callback streams would set an "io error"
when eof was hit, regardless of if decoding was successful or not. This error could
then mask actual errors generated after the main loop (e.g. missing required fields).

A workaround and accompanying regression test was added for this, and is being
further tracked in issue #1017.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants