Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Syntax for manual padding/alignment within packed struct #676

Closed
skyfex opened this issue Jan 8, 2018 · 15 comments
Closed

Syntax for manual padding/alignment within packed struct #676

skyfex opened this issue Jan 8, 2018 · 15 comments
Labels
proposal This issue suggests modifications. If it also has the "accepted" label then it is planned.
Milestone

Comments

@skyfex
Copy link

skyfex commented Jan 8, 2018

When packed structs describe memory-mapped registers or binary files/protocols, it is often necessary to pad the struct to correctly position the fields within the struct. In C this has normally been solved by creating special padding fields. An example is:

const GpioConf = struct {
  direction: u1,
  _reserved1: u3,
  pull: u1
  _reserved2: u3,
 ....
}

The downside is that you have to name something you're not interested in naming, and specify a type that has no meaning other than its bit-width. You'll have to manually maintain some kind of numbering scheme for the reserved/padding fields.

In addition, since Zig requires you to specify every field when instantiating a struct, you end up having to fill in the padding fields as well:

pinCnf = GpioConf { .direction = 1, ._reserved1 = 0, .pull = 1 , .... };

It would be very useful to have a special syntax for specifying padding. An example might be:

const GpioConf = struct {
  direction: u1,
  padding(3),
  pull: u1,
  padding(3),
  ...
};

Or maybe it should be "@padding(3)". You could also use a type name (u3) instead of a number to specify bit size. But then there's more than one way to specify padding ("u3", "i3", etc.)

Another solution would be that Zig to just treat fields named with a special prefix differently. E.g., when prefixed with "_", field names don't collide, refering to them is illegal, and you don't have to specify them in instantiations. Less special syntax, and you could choose a descriptive field name ("_reserved", "_padding", "_deprecated", etc.) but you'd have to specify a type.

A good idea might be to consider how this syntax could be extended in the future. If you use the function-call like syntax, can you call any compile-time function? Can a compile-time function return a field, or list of fields to be inserted into a struct?

This issue was spawned by a discussion in #488

@andrewrk andrewrk added the proposal This issue suggests modifications. If it also has the "accepted" label then it is planned. label Jan 8, 2018
@andrewrk
Copy link
Member

andrewrk commented Jan 8, 2018

Once concern I have is, if the field is reserved, doesn't that sometimes mean that it must be set to 0?

So sometimes the most correct thing would be:

GpioConf { .direction = 1, ._reserved1 = 0, .pull = 1 , .... };

and other times it is not specified to have to be 0, so this is the most correct thing:

GpioConf { .direction = 1, ._reserved1 = undefined, .pull = 1 , .... };

Another thought I had is that we already have syntax to ignore return values from functions like this:

_ = foo();

If we decided to have some special handling of padding, we could expand this concept to fields:

const GpioConf = struct {
  direction: u1,
  _: u3,
  pull: u1
  _: u3,
 ....
};

@andrewrk andrewrk added this to the 0.3.0 milestone Jan 8, 2018
@skyfex
Copy link
Author

skyfex commented Jan 8, 2018

Yes, in most cases the value of reserved fields should be 0. That's a good point.

There might be usecases where you want a reserved or static field that should be non-0 when instantiated. That speaks in favor of having named padded/reserved fields. It would combine naturally with a syntax for default values for fields:

const GpioConf = struct {
  direction: u1,
  _deprecated: u3 = 2,
  pull: u1,
  _: u3 = 0
};

This could have other uses. If the struct describes a file structure that has a magic sequence in its header, you could specify this in the struct:

const Jpeg = struct {
   _soi: u16 = 0xFFD8
  ....
};

@PavelVozenilek
Copy link

Sometimes it is needed to signal there's no padding, contrary to the rules. The _ : un syntax does not support this. Unused parts of a structure also should not have initial value.

@thejoshwolfe
Copy link
Contributor

Sometimes it is needed to signal there's no padding

You could say _: void, although do you really need to do that? You could also just leave a comment.

@skyfex
Copy link
Author

skyfex commented Jan 10, 2018

I just realized I should have used "packed struct" in the examples.

@PavelVozenilek Can you mention a usecase for signaling no padding? Do you mean that you want to be able to use a non-packed struct, but to have exceptions to the alignment that the compiler does?

@PavelVozenilek
Copy link

PavelVozenilek commented Jan 11, 2018

@skyfex: certain structures use non-standard (not naturally aligned) padding (some image formats, couple of Win32 structs). Signalling padding(0) would tell the compile it not to be smart and not to insert anything there.

The problem with padding is that it is invisible. Performance obsessive people (and those on architectures where unaligned access throws - old Alpha was one) spend lot of time checking it again and again. If padding is explicit (and verified by the compiler) then all this busywork goes away.

@andrewrk
Copy link
Member

Performance obsessive people can use packed structs, which have no padding, and have fields in the order listed. Others can use normal structs, which have correct alignment for all fields and choose an optimal field order for the programmer. There's no need for padding(0).

@tgschultz
Copy link
Contributor

I don't know the reason default values aren't in Zig currently, but if I had to guess I'd say it was because it makes it unclear to the reader what is happening when the struct is instantiated? If so, here's a thought on how to deal with that:

const Thing = packed struct {
    a: u8 = 0x00,
    _: u8 = 0,
    b: u12 = 0xF5C,
    _: u4 = 0,
};

...

var the: Thing = defaults;
var other: Thing = defaults{.b = 0xC55};

Here Thing is defined with unnamed fields used as padding that is set to 0 by default, and named fields "a" and "b" set to 0 and 0xF5C by default respectively. At the instancing site the programmer can either initialize all fields (except the padding) as normal, or use the "defaults" keyword to initialize all values to the declared default as in "the". Optionally, they can override the default initialization value on select fields as seen with "other".

const Thing = packed struct {
    a: u8 = 0x00,
    _: u8,
    b: u12,
    _: u4,
};

...

var the: Thing = defaults;  //produces an error since named fields are declared which have no defaults
var other: Thing = defaults{.b = 0xC55};  //allowed, unnamed fields are simply left uninitialized

If the struct doesn't declare a default value for a field, and it is not provided during instancing, it would produce a compile time error.

@andrewrk
Copy link
Member

This introduces 2 new syntax primitives, which I don't think is necessary.

If we were going to have default values in structs, it would look like this:

const Thing = packed struct {
    a: u8 = 0x00,
    _: u8 = 0,
    b: u12 = 0xF5C,
    _: u4 = 0,
};

...
var the = Thing {}; // same syntax; fields with default values may be omitted
var other = Thing {.b = 0xC55}; // same thing

This looks pretty reasonable to me, and maybe we'll do it. The only reason not to would be to encourage all the initialization of a struct to happen in the same place, or maybe just to keep the language smaller. But it doesn't increase the language size by much, and the initialization in the same place idiom is dubious. Default values with fields would also be harmonious with #68 (hot code swapping).

@tgschultz
Copy link
Contributor

My reasoning for the "defaults" keyword was to make it obvious at the instantiation site that there was hidden initialization happening. I'm not sure that would ever be a big deal, but I can imagine a case where some gargantuan struct is initialized with all defaults except a few fields and the reader may not realize that a heavier instantiation is taking place.

@andrewrk
Copy link
Member

This is solved by accepting #485.

But see also #1512.

Let's avoid doing the _ thing for now and see how painful reserved1 reserved2 etc is.

@skyfex
Copy link
Author

skyfex commented Nov 21, 2018

I wouldn't say it solves this. But it provides a very adequate work-around.

A real solution for padding would be future proof (i.e. language explicitly knows that a field is intended to be padding), explicitly documented, and provides only one way to do it. But it's more of a nice-to-have now than absolutely necessary.

A possible solution based on the current language features is to simply add a padding type to the standard library. So instead of using uX as the de-facto type for padding, it would be padByX or something. Could be written up as a new proposal though.

@andrewrk
Copy link
Member

#3133 would implement this. Example would be:

const GpioConf = struct {
  direction: u1,
  pull: u1 align(0:4:1), // 0 byte aligned, 4 bit offset into a 1 byte host integer
  next_byte: u8 align(1),
};

Although I still think the explicit padding is a bit easier to read:

const GpioConf = struct {
  direction: u1,
  reserved1: u3 = 0,
  pull: u1,
  reserved2: u3 = 0,

  next_byte: u8, // zig infers align(1)
};

Also the padding in the first example would not get set to 0, it would get set to undefined. So I think the second example (which is possible in status quo zig today) is the better solution, for this use case.

@skyfex
Copy link
Author

skyfex commented Aug 28, 2019

Another thing to keep in mind when it comes to padding, is compile-time reflection.

It's quite possible that you'd want to be able to write compile time functions that ignore padding. If you use a convention for named fields, you couldn't write a function that would work universally. Not sure if anonymous fields is good for reflection either though. So this perspective maybe speaks to having some kind of special way of tagging fields as padding. Alignment would work well, but then you don't get to set the value of the padding as you say.

@andrewrk
Copy link
Member

For this case, it's not really padding though, is it? A reserved field is a different concept than padding, because it matters what the field will be in a future ABI. So having the reserved fields set to 0 really is part of the semantics. On the other hand, with padding, it's just for field alignment purposes, and will be ignored by compile time functions doing reflection. So I think that #3133 is headed in a good direction.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
proposal This issue suggests modifications. If it also has the "accepted" label then it is planned.
Projects
None yet
Development

No branches or pull requests

5 participants