Separate parsing of 0-terminated messages into separate functions. #278

vlovich · 2017-09-02T05:25:21Z

It's not actually documented anywhere that a 0 tag in the stream will terminate the protobuf. This isn't consistent with any other protobuf implementation as all of them simply reject such input.

PetteriAimonen · 2017-09-02T05:26:53Z

True enough, though I wish it also worked elsewhere. But in terms of backwards compatibility, I see no harm of it and no point removing it now.

vlovich · 2017-09-02T06:23:57Z

I don't think there's a backwards incompatibility issue. I think it's a simple fix:

if (temp == 0)
{
   *eof = stream->bytes_left == 0; /* Special feature: allow 0-terminated messages. */
    return false;
}

Rejecting & accepting all the same input as reference implementations I think lowers the surprise that any corruption in a datastream silently causes things to succeed.

PetteriAimonen · 2017-09-02T08:28:14Z

What about people who use that feature, such as some of my own code? It's quite handy when both sides of communication use nanopb.

vlovich · 2017-09-02T18:52:58Z

I'm not sure I understand. In your code you create istream buffers larger than the size of the actual message & use implicit null termination to frame back-to-back messages? I'm also not sure what this has to do with both sides being nanopb. The encoder never produces these nulls. So these nulls would have to get injected by framing code external to nanopb no?

PetteriAimonen · 2017-09-02T19:37:33Z

For example when using serial or network stream directly (using pb_istream_t/pb_ostream_t callback), one can just do pb_write(stream, "\0", 1); after the message to delimit them.

I agree it is kind of a dirty trick.

vlovich · 2017-09-02T20:30:01Z

How would you know when to attempt decoding on the read side? Just keep accumulating until a decode succeeds?

Anyway, I think the extension could be preserved either as a runtime property stored in pb_istream_t, as a compile-time flag, or a mix where it's runtime configurable & defaults to a compile-time constant value on initialization that's on if unspecified.

PetteriAimonen · 2017-09-02T20:39:52Z

No, not accumulating anywhere. By using a pb_istream_t callback that blocks on the read until the bytes are available. See for example here https://github.com/nanopb/nanopb/blob/master/examples/network_server/client.c#L80

What problem is this actually causing for you? I agree that it can cause some corrupted protobuf messages to be accepted by nanopb which could be rejected by protoc, but also protoc will accept plenty many corrupted protobuf messages. So what is the point of detecting a few more?

PetteriAimonen · 2017-09-08T05:56:27Z

Hmm, in light of https://stackoverflow.com/questions/46031508/using-callbacks-for-nested-and-repeated-fields-in-a-protobuf-using-nanopb-in-c/ and https://stackoverflow.com/questions/26866911/can-proto-files-fields-start-at-zero/ , I'm beginning to agree that this does complicate debugging in some cases. Accidental null bytes are pretty common, and this feature is the only thing that causes the parser to stop early before decoding it all.

I think the best course of action is to separate this functionality into pb_decode_nullterminated() and pb_encode_nullterminated() in nanopb-0.4.0. This is then similar API-wise as pb_en/decode_delimited().

Sorry for my initial rebuff, it took me some time to realize the downsides of this feature :)

Nanopb has traditionally supported messages to be terminated with a zero tag. However, this is not really standard protobuf behaviour, so it makes sense to separate it into a different function. Because it is a breaking change, it will happen in 0.4.0 release. But I add the functions here early so that new code can start using them now. Also changed the network_server example to use en/decode_delimited(), which is the more common protobuf method of simple message framing.

This should simplify specifying combinations, so that we don't need every variant of pb_decode_noinit_delimited() etc. Also disables zero-terminated decoding in pb_decode (issue #278).

badrbouslikhin · 2019-04-08T09:12:33Z

When sending the following buffer 10 00 10 01 10 02 10 03 10 04 (which correspond to the repeated uint32 array [0, 1, 2, 3, 4]) nanopb stops parsing after the first 00.
When sending the following buffer 10 01 10 02 10 03 10 04 (which correspond to a repeated uint32 array [1, 2, 3, 4]) nanopb parses the entire array.

Is this issue related to the described behaviour?

PetteriAimonen · 2019-04-08T11:30:26Z

@MrZeroo00 No, it doesn't seem related. This issue only affects behaviour for invalid messages, while your message appears correctly formatted. Are you sure you are passing a length of 10 bytes to pb_istream_from_buffer() function?

badrbouslikhin · 2019-04-08T12:50:36Z

@PetteriAimonen I'm passing a buffer that's larger than 10 bytes to pb_istream_from_buffer(). It doesn't seem to be a size issue, I tried with smaller and larger buffers, the behaviour is as described above.

Here's the C code:

#include <ESP8266WiFi.h>
#include <ESP8266HTTPClient.h>

#include "pb_common.h"
#include "pb.h"
#include "pb_decode.h"

#include "test.pb.h"

const char* ssid = "XXXX";
const char* password = "XXXX";

String IP = "http://192.168.130.106/";

size_t message_length = 1024;

bool print_int32(pb_istream_t *stream, const pb_field_t *field, void **arg)
{
    uint64_t value;
    if (!pb_decode_varint(stream, &value))
        return false;
    
    unsigned long long1 = (unsigned long)((value & 0xFFFF0000) >> 16 );
    unsigned long long2 = (unsigned long)((value & 0x0000FFFF));

    String dec = String(long1, DEC)   String(long2, DEC);
    Serial.print(dec); 
    return true;
}

void setup () {
  Serial.begin(115200);
  Serial.printf("Connecting to %s ", ssid);
  WiFi.begin(ssid, password);
  while (WiFi.status() != WL_CONNECTED)
  {
    delay(500);
    Serial.print(".");
  }
  Serial.println();

  String payload;

  //Check WiFi connection status
  if (WiFi.status() == WL_CONNECTED) { 
    //Declare an object of class HTTPClient
    HTTPClient http;  

    //Specify request destination
    http.begin(IP);  
    //Send the request
    int httpCode = http.GET();

    //Check the returning code
    if (httpCode > 0) { 
      //Get the request response payload
      payload = http.getString();
      //Print the response payload
      Serial.println(payload);
    }
    //Close connection
    http.end();
  }

  // Decoding TestArrays structure
  TestArrays message = TestArrays_init_zero;
  const unsigned char *payloadChar = (const unsigned char *)payload.c_str();
  pb_istream_t stream = pb_istream_from_buffer(payloadChar, message_length);
  // Now we are ready to decode the message.
  // bool status = pb_decode(&stream, TestArrays_fields, &message.data);
  message.data.funcs.decode = &print_int32;
  if (!pb_decode(&stream, TestArrays_fields, &message)) {
    Serial.println("Decoding failed");
    Serial.print(PB_GET_ERROR(&stream));
  }
}

void loop() {
}

And here's the proto file:

syntax = "proto2";
 
message TestMessage {
  required int32 number = 1;
}

message TestArrays {
  repeated uint32 data = 2;
}

Should we create a new issue?

PetteriAimonen · 2019-04-08T15:41:41Z

@MrZeroo00 Arduino String object cannot handle binary data, it will cut it short at the first 0x00 byte. Also, you should pass the real message length, not 1024 to pb_istream_from_buffer().

I don't use that platform, but looking at the headers something like this is probably closer to correct:

 char buffer[32];
 int size = http.getSize();
 http.getStream().readBytes(buffer, size);
 pb_istream_t stream = pb_istream_from_buffer(buffer, size);
 if (pb_decode(&stream, TestArrays_fields, &message))

If you still have problems, print out the actual data (as hex) that you are passing to pb_istream_from_buffer() and also the length of it.

PetteriAimonen added BreakingChange Component-Decoder Component-Encoder Priority-Low Type-Enhancement labels Sep 8, 2017

PetteriAimonen changed the title ~~pb_decode 0-terminated "special feature" seems like a mis-feature~~ Separate parsing of 0-terminated messages into separate functions. Sep 8, 2017

PetteriAimonen mentioned this issue Sep 29, 2017

Bug or Inconsistency with protobuf while encoding "uint32" datatype #288

Closed

eleree mentioned this issue Jul 21, 2018

error in changlog.txt? #329

Closed

PetteriAimonen added the FixedInGit label Jan 25, 2019

PetteriAimonen closed this as completed Dec 20, 2019

PetteriAimonen mentioned this issue Oct 4, 2024

inconcistencies around end of stream reading #1017

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Separate parsing of 0-terminated messages into separate functions. #278

Separate parsing of 0-terminated messages into separate functions. #278

vlovich commented Sep 2, 2017 •

edited

Loading

PetteriAimonen commented Sep 2, 2017

vlovich commented Sep 2, 2017 •

edited

Loading

PetteriAimonen commented Sep 2, 2017

vlovich commented Sep 2, 2017

PetteriAimonen commented Sep 2, 2017

vlovich commented Sep 2, 2017

PetteriAimonen commented Sep 2, 2017

PetteriAimonen commented Sep 8, 2017

badrbouslikhin commented Apr 8, 2019

PetteriAimonen commented Apr 8, 2019

badrbouslikhin commented Apr 8, 2019 •

edited

Loading

PetteriAimonen commented Apr 8, 2019

Separate parsing of 0-terminated messages into separate functions. #278

Separate parsing of 0-terminated messages into separate functions. #278

Comments

vlovich commented Sep 2, 2017 • edited Loading

PetteriAimonen commented Sep 2, 2017

vlovich commented Sep 2, 2017 • edited Loading

PetteriAimonen commented Sep 2, 2017

vlovich commented Sep 2, 2017

PetteriAimonen commented Sep 2, 2017

vlovich commented Sep 2, 2017

PetteriAimonen commented Sep 2, 2017

PetteriAimonen commented Sep 8, 2017

badrbouslikhin commented Apr 8, 2019

PetteriAimonen commented Apr 8, 2019

badrbouslikhin commented Apr 8, 2019 • edited Loading

PetteriAimonen commented Apr 8, 2019

vlovich commented Sep 2, 2017 •

edited

Loading

vlovich commented Sep 2, 2017 •

edited

Loading

badrbouslikhin commented Apr 8, 2019 •

edited

Loading