Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

accounts/abi: add another unpack interface #15770

Merged
merged 5 commits into from
Feb 21, 2018

Conversation

holiman
Copy link
Contributor

@holiman holiman commented Dec 28, 2017

This PR adds the ability to unpack ABI-encoded data without supplying a data structure to place the data into.

It allows the caller to do method.Inputs.UnpackValues(data) , and receive a []interface{} containing the data in whatever type is used by the abi. This enables the usecase that the actual abi-spec is not known in advance, such as when the signer wants to decode arbitrary data.

The old API-methods are left intact; when a programmer wants to interact with a specific (known) ABI, it's probably (?) useful to be able to handles custom structs instead of casting suff on-the-fly.

This PR uses the same set of testcases that was already in use for Unpack, and use them to test the new method(s). The PR does (should) not interfere with existing code/features, and should be fairly safe to merge.

@holiman holiman requested a review from gballet December 28, 2017 10:27
@holiman
Copy link
Contributor Author

holiman commented Dec 30, 2017

I've mow modified this PR, so that the more complex unpack-into-struct uses the unpackValues under the hood, and does not duplicate the decoding part. This makes a cleaner separation between the two layers, and decreases the overall complexity.

@holiman
Copy link
Contributor Author

holiman commented Dec 30, 2017

(which also means that I could remove the unpack_testV2 file)

Copy link
Member

@gballet gballet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a minor comment for using an iterator pattern instead of building an array on the fly, otherwise looks good to me.

@@ -67,59 67,55 @@ func (arguments Arguments) LengthNonIndexed() int {
return out
}

func (arguments Arguments) NonIndexed() Arguments {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wouldn't it make sense to rewrite it as an iterator, since it's mostly called with range and the only time that you don't, you only need the first value? That would avoid parsing the entire array if you don't need to.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@holiman Please add a method doc.

@gballet Usually a method will have only a handful of arguments. Creating and maintaining an iterator would be a much larger overhead than cycling through 5 elements of a slice.

Copy link
Member

@karalabe karalabe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general I like this PR, since it separates the ABI decoding a bit and adds an intermediate layer in between that we can unit test.

A question however is whether we can attack this intermediate layer? When marshaling into explicit Go types, we immediately discard binary data that doesn't match what we expect. Here however we accept any meaningful data and only validate it later.

I.e. I think it would be very very wise to run this code through gofuzz for starters, and also to explore whether it could go OOM with an appropriately constructed input.

@@ -137,11 137,11 @@ func (abi *ABI) UnmarshalJSON(data []byte) error {

// MethodById looks up a method by the 4-byte id
// returns nil if none found
func (abi *ABI) MethodById(sigdata []byte) *Method {
func (abi *ABI) MethodById(sigdata []byte) (*Method, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could use [4]byte here to compile time ensure the size is ok?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could, but then the caller needs to convert his slice of bytes into an array instead, with some for-loop. Is that really nicer?

}
}
return nil
return nil, fmt.Errorf("ABI spec does not contain method signature in data: 0x%x", sigdata[:4])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally Go uses shorter error messages since they tend to get appended to one another. If possible, try to make it a bit more concise. E.g. fmt.Errorf("no method with id %#x", sigdata[:4])

Note 0x%x is the same at %#x for formatting.

b := fmt.Sprintf("%v", abi.MethodById(m.Id()))
m2, err := abi.MethodById(m.Id())
if err != nil {
t.Fatal(err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And this is where you append errors :)

t.Fatalf("Failed to look up ABI method: %v", err)

@@ -67,59 67,55 @@ func (arguments Arguments) LengthNonIndexed() int {
return out
}

func (arguments Arguments) NonIndexed() Arguments {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@holiman Please add a method doc.

@gballet Usually a method will have only a handful of arguments. Creating and maintaining an iterator would be a much larger overhead than cycling through 5 elements of a slice.

if err != nil {
return err
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick, but why so many empty lines? They are useful for explicitly separating logical blocks, but here the } line in itself is already a nice visual separator so adding 1 empty line seems a tad wasteful.

retval := make([]interface{}, 0, arguments.LengthNonIndexed())

virtualArgs := 0

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why an empty line?

marshalledValue, err := toGoType((index virtualArgs)*32, arg.Type, data)

if arg.Type.T == ArrayTy {
//If we have a static array, like [3]uint256, these are coded as
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Space between // and ``If`

return retval, nil
}

// UnpackValues performs the operation Go format -> Hexdata
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UnpackValues -> PackValues


// UnpackValues performs the operation Go format -> Hexdata
// It is the semantic opposite of UnpackValues
func (arguments Arguments) PackValues(args []interface{}) ([]byte, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this method, if we can just call Pack(args...) instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need it, I just added to have coder<->decoder pipe with identical input<->output format. LMK if I should remove it.

@@ -0,0 1,335 @@
// Copyright 2015 The go-ethereum Authors
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's with the v2 stuff?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to remove this file. Since this is a copy of the existing one, but uses the 'raw' unpacker, this is now not needed since they complex parsing happens on top of the 'raw' parsing.

@holiman
Copy link
Contributor Author

holiman commented Jan 13, 2018

A question however is whether we can attack this intermediate layer? When marshaling into explicit Go types, we immediately discard binary data that doesn't match what we expect. Here however we accept any meaningful data and only validate it later.

Good point. I'll experiment a bit with that.

@holiman
Copy link
Contributor Author

holiman commented Jan 13, 2018

It does handle very large arrays, and detects out-of-bound size: Error: abi: cannot marshal in to go type: length insufficient 128 require 9223372036837998658. However, if the size is even larger, it becomes negative (crash), or goes above 64 byte, then it just ignores that.

@holiman
Copy link
Contributor Author

holiman commented Jan 13, 2018

The original code kind of intentionally ignored all offset and length data above 64 bits. It was vulnerable to negative input, though. I've now made the following changes:

  • Check the full 32 bytes of offset and size. Error if bad.
  • Check that length can never be negative (which is no longer an issue, though, since it's converted from a bigint, but it's on a separate layer so it's good for robustness if we decide to change back).
  • Testcases for 64bit-negative, 64-bit very large, above 64-bit larrge both offset and size.

@holiman
Copy link
Contributor Author

holiman commented Jan 13, 2018

Here however we accept any meaningful data and only validate it later.

It's not quite the case, thought. The toGoType is pretty defensive, and checks that it does not reference outside the size of the input data, nor allocate slices larger than the actual input data size.

@holiman
Copy link
Contributor Author

holiman commented Jan 13, 2018

Rebased on master

@holiman
Copy link
Contributor Author

holiman commented Jan 13, 2018

Check the full 32 bytes of offset and size. Error if bad.

Note: I don't really care if we do that, or if we keep it as it was previously. It almost looks like the previous code was written like that for a purpose (?)

@holiman
Copy link
Contributor Author

holiman commented Jan 23, 2018

Ping @karalabe

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants