inconcistencies around end of stream reading #1017

inolen · 2024-10-03T21:55:09Z

Hey @PetteriAimonen, wanted to bring this issue up that I hit while working on #1008.

Right now, the main decode loop looks like -

while (stream->bytes_left) {
  if (!decode_tag(&eof)) {
    if (eof) break;
    else return false;
  }
  
  ... decode field ...
}

... handle misc errors, e.g. required fields ...

There's a couple of issues I noticed here -

This doesn't behave consistent when callbacks are used. Streams not using callbacks will break out of this when bytes_left is 0 before attempting to call decode_tag and setting the "end-of-stream" error, meanwhile with callbacks the callback will be called, which will in turn set bytes_left to 0 / return false and ultimately set the "io error" error in pb_readbyte before the loop breaks. This means things like required errors won't be set correctly when callbacks are used as there is already an error on the stream set.
Callbacks setting bytes_left to 0 feels weird, since pb_decode.c otherwise manages that. Perhaps adopting the read() behavior and returning an integer of bytes read / 0 (eof) / -1 (error) would make handling this state easier.
Using stream->bytes_left as the loop condition feels flawed with callbacks who set the stream's size to some non-exact size. It feels like decode_tag could just return false indicating either an eof or error, and decode_inner could return stream->errmsg == NULL so you end up with -

while (decode_tag()) {
  ... decode field ...
 }
 
 return stream->errmsg == NULL;

The text was updated successfully, but these errors were encountered:

PetteriAimonen · 2024-10-04T04:32:35Z

I agree that this should probably be reviewed. I think the current design is somewhat a result of #278.

inolen · 2024-10-04T05:49:13Z

@PetteriAimonen what are the restrictions around modifying the callback interface (e.g. to return n / 0 / -1) instead of a bool?

PetteriAimonen · 2024-10-04T07:24:31Z

It affects user code, so it should only be done after good consideration. On the other hand, 1.0.0 would be the correct time to do the change if it is worthwhile.

Using signed return value also introduces annoying edge case if the data size exceeds (or claims to exceed, such as in malicious bytes payload) 2 GB.

inolen · 2024-10-05T19:30:03Z

AFAIK, a ssize_t can always hold a size_t, right? The read() system call returns an ssize_t and takes an size_t for count. So as long as we use both, there shouldn't be any edge cases to have to manually handle.

Edit: that can't be right, sizeof(ssize_t) and sizeof(size_t) are the same on my machine. It would need to be handled, however, wouldn't the malicious code ultimately just cause an error to be returned when the return overflows?

Edit 2: Just realized ssize_t isn't standard, but we already define our own pb_size_t / pb_ssize_t that could be used by the callbacks.

PetteriAimonen · 2024-10-06T06:26:55Z

To me it feels like the issues might be solvable without such a large change. I strongly prefer API stability when possible.

But I have to admit I haven't really looked into this yet.

inolen · 2024-10-06T22:03:29Z

Totally understood re: stable API.

The fundamental problem here is that for each read, there are 3 possible results -

success
eof
error

With the bool, we can return only one. If a correct bytes_left was set for every stream, the bool would be sufficient as it could be used to determine eof leaving the bool to determine success / error. This is normally the case for buffer streams, but not for callbacks (e.g. in basic_stream/decode_stream it sets SIZE_MAX for the bytes_left).

Since I think the calls to read are always a specific size (vs some max buffer size with the intent of draining the stream) using an enum for a return value (OK, EOF, ERR) would also be sufficient it seems.

Edit: and the way the callbacks work around this to some degree is as stated in the original issue. When they return false, they also set bytes_left to zero on eof. Then further up the callstack at https://github.com/nanopb/nanopb/blob/master/pb_decode.c#L178C17-L178C18 it check if bytes_left is 0. However, at this point an error has already been set and any subsequent actual errors aren't reported properly. This could be worked around by modifying https://github.com/nanopb/nanopb/blob/master/pb_decode.c#L108 to only call PB_RETURN_ERROR if bytes_left != 0, and just returning false if bytes_left == 0. I think that would fix the issue, but the behavior isn't very straight forward.

This is part of the upcoming fast decode commit. The fast path didn't compare against bytes_left in the main loop, which caused a number of tests to fail as unexpected end-of-stream errors were raised on the stream. This change hoists the check for bytes_left from pb_decode_varint32_eof, up into pb_decode_tag, such that eof is set before an error is raised when called with bytes_left == 0. This simplifies the main loop from - while (stream->bytes_left) if (!pb_decode_tag()) ... to - while (pb_decode_tag()) ... This is also important because bytes_left isn't guaranteed to be valid for callback streams. This created an edge case where callback streams would set an "io error" when eof was hit, regardless of if decoding was successful or not. This error could then mask actual errors generated after the main loop (e.g. missing required fields). A workaround and accompanying regression test was added for this, and is being further tracked in issue nanopb#1017.

This is part of the upcoming fast decode commit. The fast path didn't compare against bytes_left in the main loop, which caused a number of tests to fail as unexpected end-of-stream errors were raised on the stream. This change hoists the check for bytes_left from pb_decode_varint32_eof, up into pb_decode_tag, such that eof is set before an error is raised when called with bytes_left == 0. This simplifies the main loop from - while (stream->bytes_left) if (!pb_decode_tag()) ... to - while (pb_decode_tag()) ... This is also important because bytes_left isn't guaranteed to be valid for callback streams. This created an edge case where callback streams would set an "io error" when eof was hit, regardless of if decoding was successful or not. This error could then mask actual errors generated after the main loop (e.g. missing required fields). A workaround and accompanying regression test was added for this, and is being further tracked in issue #1017.

PetteriAimonen added Component-Decoder Type-Review labels Oct 4, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

inconcistencies around end of stream reading #1017

inconcistencies around end of stream reading #1017

inolen commented Oct 3, 2024 •

edited

Loading

PetteriAimonen commented Oct 4, 2024

inolen commented Oct 4, 2024

PetteriAimonen commented Oct 4, 2024

inolen commented Oct 5, 2024 •

edited

Loading

PetteriAimonen commented Oct 6, 2024

inolen commented Oct 6, 2024 •

edited

Loading

inconcistencies around end of stream reading #1017

inconcistencies around end of stream reading #1017

Comments

inolen commented Oct 3, 2024 • edited Loading

PetteriAimonen commented Oct 4, 2024

inolen commented Oct 4, 2024

PetteriAimonen commented Oct 4, 2024

inolen commented Oct 5, 2024 • edited Loading

PetteriAimonen commented Oct 6, 2024

inolen commented Oct 6, 2024 • edited Loading

inolen commented Oct 3, 2024 •

edited

Loading

inolen commented Oct 5, 2024 •

edited

Loading

inolen commented Oct 6, 2024 •

edited

Loading