-
Notifications
You must be signed in to change notification settings - Fork 5.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Immutable dynamic arrays #12587
Comments
love the code sample |
Since it's really not dynamic array, better syntax would be |
Thanks for the issue - this has been on our longer term agenda ever since we introduced immutables :-) (I'm actually surprised that it wasn't requested much earlier ;-)). One thing to be aware of is that this will not be as cheap as value-type immutables, because we won't be able to inline the values directly - in most cases I'd expect the values to end up being In retrospect, I'd much rather have introduced what we now have as |
I haven't given this much thought, so I can't quite comment on complexity of such an implementation. In all scenarios where I'd use this however, a maximum length (which also means there's some wasted bytecode on the unused slots) is good enough.
Yes, while the bytecode size for immutable variables in my example would be the same, the actual code that accesses them would not (both size and gas would differ). I imagine instead of a push we'd just have codecopy mload (with no need to expand allocated memory), which would use very little gas. (except for .length, which would be a push) |
Just would like to bring up a particular use case of mapping to an immutable array like in |
Hm, I'm not quite sure how you'd do mappings as they are currently extremely storage based (since they key is just a 256 bit hash, there's no notion of the 'capacity' of the map). But you'd likely be able to implement such a thing yourself using multiple immutable arrays (a few with the actual data, and one that tells you which mapping to use), though it'd require some boilerplate similar to the one in my code sample, even if this feature was implemented. I'd limit discussion here to the feature discussed in the opening post. |
I think it's similar to
Is this an immutable struct? |
Hi @ekpyron! I keep running into scenarios where such a feature would be extremely useful :) Is there an estimated timeline for this feature? Is there any work the community can do to push it forward (e.g. come up with a more detailed spec, begin implementation work, etc.)? |
This fails miserably, but might work if/when this lands: ethereum/solidity#12587 As is, adds ~5KB to contract size on the factory even moving `getAssetInfo` to the harness.
@nventuro Unfortunately, there hasn't been progress on this and we don't have an estimated timeline for it. For now I moved it up in our design backlog, which will hopefully lead to us discussing it in one of our calls next week. The main problem is that the implementation of this won't be trivial and I guess probably either @chriseth or I would have to do it and I'm not sure when we'll find the time. Conceptually, the main question I think is whether we want to enforce an "assign once and do not read" logic for dynamic immutable types like we do for immutables of value type. |
This fails miserably, but might work if/when this lands: ethereum/solidity#12587 As is, adds ~5KB to contract size on the factory even moving `getAssetInfo` to the harness.
We just discussed this in our design call and decided to make it a goal for Q2 2022. Current rough sketch of a plan is:
However, we will probably have some iterations on this plan as we start implementing this. |
The |
A |
Here's an extreme example: 8 sets of 30 immutable values each that could be implemented as code/immutable arrays using this feature. |
How would the immutable values be stored in the code? Having dynamically sized data in the middle of the bytecode would make symbolic analysis of such bytecode even more annoying (perhaps even impossible?) than it already is. Here are some things that are already difficult to deal with when symbolically executing (given the existance of dynamic constructor args and static immutables):
These can currently be worked around as long as dynamic data exists at the end of the bytecode, in this case we can produce an index beyond which all bytecode is known to be a symbolic value, and error out if we attempt to execute past this index (symbolic execution of symbolic opcodes is not a tractable problem). I'm not sure how I would handle this if code could contain arbitrary pieces of symbolic data with an unknown length at any location in the code. side note: EIP-3540 cannot come soon enough... |
I'd say that the immutable would definitely be stored after the opcodes. |
Hm... thinking about actually implementing this, it may be a problem to parse a |
In that case could the keyword be |
That's a bit weird for a full proper data location...
But I may be able to work around the parsing problem, if we restrict to arrays only... so that would mean that the first version would only allow the new |
If we had a proper reference syntax we wouldn't need this new |
Can you elaborate on that? Reference syntax and data locations are pretty orthogonal. |
The only reason we need |
I honestly don't understand what you're saying :-). The |
Would love to see |
Any chance to have it implemented? |
Yes. It's on our roadmap, planned for Q3: #13723. |
By the way: we may want to reconsider the name after all... it's not unlikely that one of the upcoming EVM upgrades will move from Not that it's the most important holdup here. But open for other suggestions. |
@ekpyron I would propose to use naming |
Will we be able to use this keyword for constructor arguments? |
I don't think it makes sense for arguments of constructors or external functions. What would that even mean? |
Currently constructor arguments can be memory-only. I thought it could be |
Technically it could, but I don't see any benefit in that. I.e. this would mean that we reserve space for them in the memory area where the constructor is building the runtime code and copy them there from calldata. This seems pointless because the runtime code would not be able to access them (the memory is there but you cannot access constructor's parameter outside of the constructor). You could achieve the same thing explicitly by declaring an immutable and copying the constructor parameter there yourself.
Interesting. I actually did not know about that that limitation and I don't see why it has to be there. In #9514 it looks like the thing being fixed was parameters without location specified at all and there's no explanation why calldata was excluded. It was done in 0.7.0, which was the time when calldata parameters were already allowed anywhere. Could be that it was just considered a corner case not worth handling - in the constructor we have the whole contract bytecode in calldata so the code finding the parameters has to account for that. @chriseth do you remember the context for that change? |
@cameel as far as I can see in constructor |
During contract creation you have a copy of the init code (including constructor arguments) in calldata, so So yes, generally both allowing |
@ekpyron, strange, I see contract A {
event Log(uint256);
constructor(uint256 a, bytes memory b) {
emit Log(msg.data.length);
}
} => {
"from": "0xf8e81D47203A863245E36C48e151709F0C19fBe8",
"topic": "0x909c57d5c6ac08245cf2a6de3900e2b868513fa59099b92b27d8db823d92df9c",
"event": "Log",
"args": {
"0": "0"
}
} |
@k06a Haha, that's fascinating, you're right! I didn't actually check the spec on this, but evmone, the execution engine we use for testing, does apparently implement this with incorrect behaviour, I wouldn't have thought that possible. I'll file an issue with them - they should pass the consensus tests, so I'm a bit puzzled about how this can be. In any case, at least |
@ekpyron what do you think about idea to introduce new EIP for |
Yes, that would definitely be helpful - but it's hard these days to get an EIP accepted and live. It'd be quite nice if all of that actually went live at some point - it would make all contracts noticeably cheaper and have quite nice analysis properties - but the EIP process is complicated, straining and unpredictable, I wouldn't wager to bet on the chances nor the ETA at this point. |
I see let tmp := mload(0) // Store the value of memory[0]
codecopy(0, offset, 32) // Copy bytecode 32 bytes to memory[0]
let result := mload(0) // Reading 32 bytes to stack
mstore(0, tmp) // Restore the value of memory[0] |
Yes - since we reserve scratch space, we could in practice even implement it as a plain
Still quite a bit away from the presumed 3 gas a |
@ekpyron if you would try to simulate this instruction in Yul/Assembly, you would definitely need to backup and restore |
We reserve the memory between 0 and 0x3f for temporary calculations, like for hashing (the keccak opcode is also only defined for memory, and storage slot calculation often involve hashing of single words). |
I understand, but no one wants the following code to break unexpectedly: mstore(0, typehash)
mstore(0x20, codeload(offset))
let hash := keccak256(0, 0x40) |
Sure - which is why I wouldn't define a simulated |
BTW we could define it as a function when Solidity will support global assembly blocks: function codeload(offset) -> value {
let tmp := mload(0)
codecopy(0, offset, 0x20)
value := mload(0)
mstore(0, tmp)
} |
Abstract
Add native support for dynamic
immutable
arrays, with a length defined at construction time (up to a maximum) and automatic bounds checks.Motivation
immutable
is a fantastic feature, allowing for highly efficient configuration access without forcing developers to use hardcoded constants, making testing and deployment much easier and enjoyable. However, there is zero support for any kind of arrays.What I describe can be manually implemented today in solc >0.7, but it leads to a very large amount of error-prone boilerplate. No alternative exists for gas-efficient access to these values. You can look at production code using this pattern here. Using the proposed features in such a contract would reduce sour code size to less than a third of the original length, while also increasing auditability and maintainability.
Specification
'Dynamic' may be a confusing term. What I mean is that the contents and length would be dynamically defined at construction time. Read access with bounds check using the
[]
operator and.length
would both be supported. This feature set is essentially the same as that of memory arrays, except memory arrays can have their contents mutated.Truly dynamic immutable arrays would be quite an undertaking as the length of the runtime code would be dependent on construction arguments, but I'd argue that going that far is unnecessary. In all applications I've seen, there is a compile-time known maximum array length.
The proposed feature could therefore work as follows:
uint256[] private immutable(50) foo
)foo.length
andfoo[i]
can be used to write (not read) to the array. For simplicity, we could allow for each value to be written multiple times and perform no bounds check (other than.length <= N
andi < N
).foo.length
returns the array's length andfoo[i]
performs bounds check withfoo.length
.Backwards Compatibility
This is a new feature using brand new (as of today invalid) syntax, so there should be no backwards compatibility implications.
Extra
Code samples
As mentioned above, this can be implemented today using standard Solidity. See for example how one might declare and use an up to 10 tokens array:
This is of course very error-prone and hard to maintain, but it is the best way to get the desired behavior. With the suggested feature, we get the exact same bytecode size and performance using the following code:
The text was updated successfully, but these errors were encountered: