Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Separate parsing of 0-terminated messages into separate functions. #278

Closed
vlovich opened this issue Sep 2, 2017 · 12 comments
Closed

Separate parsing of 0-terminated messages into separate functions. #278

vlovich opened this issue Sep 2, 2017 · 12 comments

Comments

@vlovich
Copy link

vlovich commented Sep 2, 2017

It's not actually documented anywhere that a 0 tag in the stream will terminate the protobuf. This isn't consistent with any other protobuf implementation as all of them simply reject such input.

@PetteriAimonen
Copy link
Member

True enough, though I wish it also worked elsewhere. But in terms of backwards compatibility, I see no harm of it and no point removing it now.

@vlovich
Copy link
Author

vlovich commented Sep 2, 2017

I don't think there's a backwards incompatibility issue. I think it's a simple fix:

if (temp == 0)
{
   *eof = stream->bytes_left == 0; /* Special feature: allow 0-terminated messages. */
    return false;
}

Rejecting & accepting all the same input as reference implementations I think lowers the surprise that any corruption in a datastream silently causes things to succeed.

@PetteriAimonen
Copy link
Member

What about people who use that feature, such as some of my own code? It's quite handy when both sides of communication use nanopb.

@vlovich
Copy link
Author

vlovich commented Sep 2, 2017

I'm not sure I understand. In your code you create istream buffers larger than the size of the actual message & use implicit null termination to frame back-to-back messages? I'm also not sure what this has to do with both sides being nanopb. The encoder never produces these nulls. So these nulls would have to get injected by framing code external to nanopb no?

@PetteriAimonen
Copy link
Member

For example when using serial or network stream directly (using pb_istream_t/pb_ostream_t callback), one can just do pb_write(stream, "\0", 1); after the message to delimit them.

I agree it is kind of a dirty trick.

@vlovich
Copy link
Author

vlovich commented Sep 2, 2017

How would you know when to attempt decoding on the read side? Just keep accumulating until a decode succeeds?

Anyway, I think the extension could be preserved either as a runtime property stored in pb_istream_t, as a compile-time flag, or a mix where it's runtime configurable & defaults to a compile-time constant value on initialization that's on if unspecified.

@PetteriAimonen
Copy link
Member

No, not accumulating anywhere. By using a pb_istream_t callback that blocks on the read until the bytes are available. See for example here https://github.com/nanopb/nanopb/blob/master/examples/network_server/client.c#L80

What problem is this actually causing for you? I agree that it can cause some corrupted protobuf messages to be accepted by nanopb which could be rejected by protoc, but also protoc will accept plenty many corrupted protobuf messages. So what is the point of detecting a few more?

@PetteriAimonen
Copy link
Member

Hmm, in light of https://stackoverflow.com/questions/46031508/using-callbacks-for-nested-and-repeated-fields-in-a-protobuf-using-nanopb-in-c/ and https://stackoverflow.com/questions/26866911/can-proto-files-fields-start-at-zero/ , I'm beginning to agree that this does complicate debugging in some cases. Accidental null bytes are pretty common, and this feature is the only thing that causes the parser to stop early before decoding it all.

I think the best course of action is to separate this functionality into pb_decode_nullterminated() and pb_encode_nullterminated() in nanopb-0.4.0. This is then similar API-wise as pb_en/decode_delimited().

Sorry for my initial rebuff, it took me some time to realize the downsides of this feature :)

@PetteriAimonen PetteriAimonen changed the title pb_decode 0-terminated "special feature" seems like a mis-feature Separate parsing of 0-terminated messages into separate functions. Sep 8, 2017
PetteriAimonen added a commit that referenced this issue Sep 16, 2017
Nanopb has traditionally supported messages to be terminated with a zero tag.
However, this is not really standard protobuf behaviour, so it makes sense
to separate it into a different function. Because it is a breaking change,
it will happen in 0.4.0 release. But I add the functions here early so that
new code can start using them now.

Also changed the network_server example to use en/decode_delimited(), which
is the more common protobuf method of simple message framing.
PetteriAimonen added a commit that referenced this issue Jan 25, 2019
This should simplify specifying combinations, so that we don't
need every variant of pb_decode_noinit_delimited() etc.

Also disables zero-terminated decoding in pb_decode (issue #278).
@badrbouslikhin
Copy link

When sending the following buffer 10 00 10 01 10 02 10 03 10 04 (which correspond to the repeated uint32 array [0, 1, 2, 3, 4]) nanopb stops parsing after the first 00.
When sending the following buffer 10 01 10 02 10 03 10 04 (which correspond to a repeated uint32 array [1, 2, 3, 4]) nanopb parses the entire array.

Is this issue related to the described behaviour?

@PetteriAimonen
Copy link
Member

@MrZeroo00 No, it doesn't seem related. This issue only affects behaviour for invalid messages, while your message appears correctly formatted. Are you sure you are passing a length of 10 bytes to pb_istream_from_buffer() function?

@badrbouslikhin
Copy link

badrbouslikhin commented Apr 8, 2019

@PetteriAimonen I'm passing a buffer that's larger than 10 bytes to pb_istream_from_buffer(). It doesn't seem to be a size issue, I tried with smaller and larger buffers, the behaviour is as described above.

Here's the C code:

#include <ESP8266WiFi.h>
#include <ESP8266HTTPClient.h>

#include "pb_common.h"
#include "pb.h"
#include "pb_decode.h"

#include "test.pb.h"

const char* ssid = "XXXX";
const char* password = "XXXX";

String IP = "http://192.168.130.106/";

size_t message_length = 1024;

bool print_int32(pb_istream_t *stream, const pb_field_t *field, void **arg)
{
    uint64_t value;
    if (!pb_decode_varint(stream, &value))
        return false;
    
    unsigned long long1 = (unsigned long)((value & 0xFFFF0000) >> 16 );
    unsigned long long2 = (unsigned long)((value & 0x0000FFFF));

    String dec = String(long1, DEC)   String(long2, DEC);
    Serial.print(dec); 
    return true;
}

void setup () {
  Serial.begin(115200);
  Serial.printf("Connecting to %s ", ssid);
  WiFi.begin(ssid, password);
  while (WiFi.status() != WL_CONNECTED)
  {
    delay(500);
    Serial.print(".");
  }
  Serial.println();

  String payload;

  //Check WiFi connection status
  if (WiFi.status() == WL_CONNECTED) { 
    //Declare an object of class HTTPClient
    HTTPClient http;  

    //Specify request destination
    http.begin(IP);  
    //Send the request
    int httpCode = http.GET();

    //Check the returning code
    if (httpCode > 0) { 
      //Get the request response payload
      payload = http.getString();
      //Print the response payload
      Serial.println(payload);
    }
    //Close connection
    http.end();
  }

  // Decoding TestArrays structure
  TestArrays message = TestArrays_init_zero;
  const unsigned char *payloadChar = (const unsigned char *)payload.c_str();
  pb_istream_t stream = pb_istream_from_buffer(payloadChar, message_length);
  // Now we are ready to decode the message.
  // bool status = pb_decode(&stream, TestArrays_fields, &message.data);
  message.data.funcs.decode = &print_int32;
  if (!pb_decode(&stream, TestArrays_fields, &message)) {
    Serial.println("Decoding failed");
    Serial.print(PB_GET_ERROR(&stream));
  }
}

void loop() {
}

And here's the proto file:

syntax = "proto2";
 
message TestMessage {
  required int32 number = 1;
}

message TestArrays {
  repeated uint32 data = 2;
}

Should we create a new issue?

@PetteriAimonen
Copy link
Member

@MrZeroo00 Arduino String object cannot handle binary data, it will cut it short at the first 0x00 byte. Also, you should pass the real message length, not 1024 to pb_istream_from_buffer().

I don't use that platform, but looking at the headers something like this is probably closer to correct:

 char buffer[32];
 int size = http.getSize();
 http.getStream().readBytes(buffer, size);
 pb_istream_t stream = pb_istream_from_buffer(buffer, size);
 if (pb_decode(&stream, TestArrays_fields, &message))

If you still have problems, print out the actual data (as hex) that you are passing to pb_istream_from_buffer() and also the length of it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants
@vlovich @PetteriAimonen @badrbouslikhin and others