Support Bit Packed Bools #10

MiddleMan5 · 2024-06-13T00:30:00Z

Hey, I've been using this library and it's great!

One limitation I ran into using it is that the binary format I'm working with encodes bools as single bits instead of an entire byte. Maybe offer a mode to extract individual bits out as bools?

ghostiam · 2024-06-13T13:12:44Z

Hi, thanks for your interest in the project!

I thought about this, but there is a problem: what should happen to the remaining bits? Should they just be ignored?

For now, as a workaround, you can define your own flags type with bit constants.
You can also add methods to that type to get the bool values more conveniently.

package main

import (
	"log"

	"github.com/ghostiam/binstruct"
)

type MyDataFlags uint8

const (
	MyDataFlag1 MyDataFlags = 1 << iota
	MyDataFlag2
	MyDataFlag3
	MyDataFlag4
	MyDataFlag5
	MyDataFlag6
	MyDataFlag7
	MyDataFlag8
)

func (f MyDataFlags) HasFlag1() bool {
	return f&MyDataFlag1 != 0
}

func (f MyDataFlags) HasFlag2() bool {
	return f&MyDataFlag2 != 0
}

// etc...

type MyData struct {
	Flags MyDataFlags
}

func main() {
	data := []byte{0b01010101}
	var actual MyData
	err := binstruct.UnmarshalBE(data, &actual)
	if err != nil {
		log.Fatal(err)
	}

	println(actual.Flags & MyDataFlag1) // 1
	println(actual.Flags & MyDataFlag2) // 0

	// or

	println(actual.Flags.HasFlag1()) // true
	println(actual.Flags.HasFlag2()) // false
}

MiddleMan5 · 2024-06-17T19:15:43Z

Thanks for the example code! Yeah, that's basically the approach we're taking now. The downside is that we have to track a "bit offset" manually, outside the library, which is kind of a pain. Bools packed as single bits are pretty common in the binary formats I've worked with, so I think this is a valid use case.

One possible implementation would track the total bit offset internally instead of the byte offset. As you pop byte-aligned chunks off the stream to decode, the offset would be incremented by the number of bytes read * 8. When decoding, you could check whether the current offset is a multiple of 8 bits and, if so, keep the current logic.

Decoding bit-packed bool fields would be a little different: popping off a single bit leaves the bit offset at a non-multiple of 8. Any subsequent operation would need to read the correct number of bits from the stream and re-align them to the data type.

I think offering a "read n bits" function would also be helpful to allow the user to choose to drop the remaining bits.

Underflow logic would stay the same whether the read is aligned or unaligned: if you try to read a byte and only 7 bits remain, a regular underflow error occurs (either the user or the input is wrong).
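
A minimal sketch of the bit-offset idea described above, for illustration only. The `bitReader` type, its methods, and `errUnderflow` are hypothetical names and not part of binstruct; the sketch also assumes bits are consumed most significant bit first, which the thread has not settled.

```go
package main

import (
	"errors"
	"fmt"
)

// errUnderflow is a hypothetical error returned when too few bits remain.
var errUnderflow = errors.New("bitreader: not enough bits remaining")

// bitReader is a hypothetical reader that tracks its position in bits
// rather than bytes. It is not binstruct's reader.
type bitReader struct {
	data   []byte
	bitOff int // total number of bits consumed so far
}

// ReadBits reads n bits (0 <= n <= 64), most significant bit first
// (an assumption of this sketch), and advances the bit offset by n.
func (r *bitReader) ReadBits(n int) (uint64, error) {
	if n < 0 || n > 64 {
		return 0, errors.New("bitreader: n must be in [0, 64]")
	}
	// Same underflow rule for aligned and unaligned reads.
	if n > len(r.data)*8-r.bitOff {
		return 0, errUnderflow
	}
	var v uint64
	if r.bitOff%8 == 0 && n%8 == 0 {
		// Aligned fast path: the offset is a multiple of 8, so whole
		// bytes can be consumed exactly as a byte-oriented reader would.
		start := r.bitOff / 8
		for _, b := range r.data[start : start+n/8] {
			v = v<<8 | uint64(b)
		}
		r.bitOff += n
		return v, nil
	}
	// Unaligned path: pull one bit at a time and re-align into v.
	for i := 0; i < n; i++ {
		b := r.data[r.bitOff/8]
		bit := (b >> (7 - uint(r.bitOff%8))) & 1
		v = v<<1 | uint64(bit)
		r.bitOff++
	}
	return v, nil
}

// ReadBool reads a single bit as a bool, leaving the reader unaligned.
func (r *bitReader) ReadBool() (bool, error) {
	v, err := r.ReadBits(1)
	return v == 1, err
}

func main() {
	r := &bitReader{data: []byte{0b1000_0001, 0b1000_0000}}
	flag, _ := r.ReadBool()  // first bit of byte 0
	val, _ := r.ReadBits(8)  // a full byte read at bit offset 1
	_, err := r.ReadBits(8)  // only 7 bits remain
	fmt.Println(flag, val, err)
}
```

Calling `ReadBits` with the number of bits left in the current byte and ignoring the result is one way a caller could drop the remaining bits.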

I might have time to open an example PR if you're interested. What do you think?

ghostiam · 2024-06-18T07:17:48Z

Thanks for the detailed description!
Yes, I would be glad to see an example. Is there an open or popular protocol or data format that uses bit offsets?

I think we can add this, but we need an explicit way to switch to unaligned mode (maybe a tag meaning "read bits and remain unaligned", like bits:3,unalign) and to return to aligned mode, which would discard the unread bits.
Ideally we'd add this as an option to NewDecoder/Unmarshal, but the library doesn't support options at the moment :( (this improvement is planned).

> One possible implementation would involve tracking the total bit offset instead of byte offset internally.

I think I would still keep the byte offset, but add extra fields to control the bit shift. As long as the bit shift is 0, we can use the same logic as we have now. But I'll think about it some more.

MiddleMan5 · 2024-06-19T00:26:36Z

I like the idea of explicitly switching between unaligned and aligned access modes and discarding bits on the transition, yeah!

The codebase I've been working with most recently that provides generic binary encoding/decoding with support for non-byte-aligned codecs is OpenC3. It is written in Python/Ruby and is by no means easy to read, but I'll include it here for reference. Specifically, see the packet item accessors that handle converting between input fields and binary representations:
https://github.com/OpenC3/cosmos/blob/a3f4b9a3ccb9097fd9d1a73645886ad60c83f754/openc3/python/openc3/accessors/binary_accessor.py#L157

Unfortunately the protocols I work with are proprietary, but off the top of my head, the public ones I know of that allow non-byte-aligned packing are:

Cap'n Proto - https://github.com/capnproto/capnproto
Thrift - https://github.com/apache/thrift

ghostiam · 2024-06-20T20:51:09Z

I've been thinking about what is needed to introduce a bit offset:

BE/LE tags should behave like MSB/LSB when working with bits (or should we add explicit aliases?);
Signed numbers need to be supported when converting from bits;
What should happen with offset? It is in bytes. Reset to aligned mode?
I think it's worth reading several bytes at once into a buffer such as a uint64 (reading only 7 bytes, so that 1 byte is left for the shifted bits), which makes it easier to shift bits across several bytes at a time.
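
The uint64 buffer idea and the signed conversion can be sketched roughly as follows. `extractBits` and `signExtend` are hypothetical helper names, not binstruct functions, and the sketch assumes MSB-first bit order only; LSB-first handling is left out.

```go
package main

import (
	"errors"
	"fmt"
)

// extractBits returns n bits (1..56) starting at bit offset bitOff in
// data, MSB first (an assumption). It loads up to 8 bytes into a uint64
// window, so a 56-bit value shifted by up to 7 bits still fits.
func extractBits(data []byte, bitOff, n int) (uint64, error) {
	if n < 1 || n > 56 {
		return 0, errors.New("extractBits: n must be in [1, 56]")
	}
	if bitOff < 0 || bitOff+n > len(data)*8 {
		return 0, errors.New("extractBits: not enough bits")
	}
	var window uint64
	start := bitOff / 8
	for i := 0; i < 8; i++ {
		window <<= 8
		if start+i < len(data) {
			window |= uint64(data[start+i]) // missing bytes stay zero
		}
	}
	shift := uint(bitOff % 8)
	// Drop the already-consumed leading bits, then right-align the n
	// wanted bits.
	return (window << shift) >> (64 - uint(n)), nil
}

// signExtend interprets the low n bits of v (1 <= n <= 64) as a
// two's-complement signed number.
func signExtend(v uint64, n int) int64 {
	s := 64 - uint(n)
	return int64(v<<s) >> s // arithmetic shift copies the sign bit
}

func main() {
	data := []byte{0b0001_0110, 0b1000_0000}

	v, _ := extractBits(data, 3, 4) // bits 1011 inside byte 0
	fmt.Println(v, signExtend(v, 4))

	w, _ := extractBits(data, 6, 3) // bits 101 spanning bytes 0 and 1
	fmt.Println(w)
}
```

Here the 4-bit field `1011` reads as 11 unsigned and as -5 once sign-extended, which is the distinction the signed-numbers item above is about.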

I will add more as ideas and questions arise.

ghostiam added the "enhancement" label (New feature or request) on Jun 18, 2024
