Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

type conversion checker for array should be stronger #4904

Merged
merged 1 commit into from
Sep 17, 2018
Merged

type conversion checker for array should be stronger #4904

merged 1 commit into from
Sep 17, 2018

Conversation

liangdzou
Copy link
Contributor

@liangdzou liangdzou commented Sep 5, 2018

Your checklist for this pull request

Please review the guidelines for contributing to this repository.

Please also note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.

Checklist

  • Code compiles correctly
  • All tests are passing
  • New tests have been created which fail without the change (if possible)
  • README / documentation was extended, if necessary
  • Changelog entry (if change is visible to the user)
  • Used meaningful commit messages

Description

This fix is for the issue #4901.
The type conversion checker for array should be stronger.

@@ -1560,7 1560,9 @@ bool ArrayType::isImplicitlyConvertibleTo(const Type& _convertTo) const
if (convertTo.isByteArray() != isByteArray() || convertTo.isString() != isString())
return false;
// memory/calldata to storage can be converted, but only to a direct storage reference
if (convertTo.location() == DataLocation::Storage && location() != DataLocation::Storage && convertTo.isPointer())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, this is weird - shoudn't convertTo be a pointer in the example case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added some print in the function. It seems it is not a pointer.
Actually, the assert in the codegen part does not require it to be a pointer because it is conjunction in the assertion.
I will do more investigation about why it is not a pointer.

810       solAssert(                                                                                                                                                                                                                         811         (targetType.isPointer() || (typeOnStack.isByteArray() && targetType.isByteArray())) &&                                                                                                                                           812         typeOnStack.location() == DataLocation::Storage,                                                                                                                                                                                 813         "Invalid conversion to storage type."                                                                                                                                                                                            814       );

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is converted in the function "ReferenceResolver::endVisit" by the following code:

401   TypePointer type = _variable.typeName()->annotation().type;
402   if (auto ref = dynamic_cast<ReferenceType const*>(type.get()))
403   {
404     bool isPointer = !_variable.isStateVariable();
405     type = ref->copyForLocation(typeLoc, isPointer);
406   }

gdb proof:

(gdb) l
396                             break;
397                     case Location::Unspecified:
398                             solAssert(!_variable.hasReferenceOrMappingType(), "Data location not properly set.");
399                     }
400
401             TypePointer type = _variable.typeName()->annotation().type;
402             if (auto ref = dynamic_cast<ReferenceType const*>(type.get()))
403             {
404                     bool isPointer = !_variable.isStateVariable();
405                     type = ref->copyForLocation(typeLoc, isPointer);
(gdb) p _variable.typeName()->annotation().type->toString()
$15 = "int256[10] storage pointer"
(gdb) p _variable.annotation().type->toString()
$16 = "int256[10] storage ref"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the way, what is the difference of pointer and reference in Solidity. Solidity does not support pointer arithmatic and it seems the pointer or reference is additional info to a type, e.g. the main types in this example are both int[] or int[10].

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The type is basically a ArrayType with m_isPointer set to true, i.e. "int256[10] storage pointer".
Since ArrayType is always a ReferenceType, shown by code class ReferenceType: public Type, it always passes the dynamic_cast to reference type test. I think finally all array type will always be a reference type. All array pointer will be replaced to array reference in current implimentation.

@liangdzou
Copy link
Contributor Author

@chriseth Sorry, I should try to pass the isoltest first. :-)

@liangdzou
Copy link
Contributor Author

I think the solution should be losing the restrict on the solcAssert in codegen because when the target is a reference it is possible for a convert from memory to storage. The convert from memory to storage reference will copy the content of the memory.

Copy link
Contributor Author

@liangdzou liangdzou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reverted the change.
fix by losing the solcAssert in codegen.

@@ -1560,7 1560,9 @@ bool ArrayType::isImplicitlyConvertibleTo(const Type& _convertTo) const
if (convertTo.isByteArray() != isByteArray() || convertTo.isString() != isString())
return false;
// memory/calldata to storage can be converted, but only to a direct storage reference
if (convertTo.location() == DataLocation::Storage && location() != DataLocation::Storage && convertTo.isPointer())
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added some print in the function. It seems it is not a pointer.
Actually, the assert in the codegen part does not require it to be a pointer because it is conjunction in the assertion.
I will do more investigation about why it is not a pointer.

810       solAssert(                                                                                                                                                                                                                         811         (targetType.isPointer() || (typeOnStack.isByteArray() && targetType.isByteArray())) &&                                                                                                                                           812         typeOnStack.location() == DataLocation::Storage,                                                                                                                                                                                 813         "Invalid conversion to storage type."                                                                                                                                                                                            814       );

@@ -1560,7 1560,9 @@ bool ArrayType::isImplicitlyConvertibleTo(const Type& _convertTo) const
if (convertTo.isByteArray() != isByteArray() || convertTo.isString() != isString())
return false;
// memory/calldata to storage can be converted, but only to a direct storage reference
if (convertTo.location() == DataLocation::Storage && location() != DataLocation::Storage && convertTo.isPointer())
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is converted in the function "ReferenceResolver::endVisit" by the following code:

401   TypePointer type = _variable.typeName()->annotation().type;
402   if (auto ref = dynamic_cast<ReferenceType const*>(type.get()))
403   {
404     bool isPointer = !_variable.isStateVariable();
405     type = ref->copyForLocation(typeLoc, isPointer);
406   }

gdb proof:

(gdb) l
396                             break;
397                     case Location::Unspecified:
398                             solAssert(!_variable.hasReferenceOrMappingType(), "Data location not properly set.");
399                     }
400
401             TypePointer type = _variable.typeName()->annotation().type;
402             if (auto ref = dynamic_cast<ReferenceType const*>(type.get()))
403             {
404                     bool isPointer = !_variable.isStateVariable();
405                     type = ref->copyForLocation(typeLoc, isPointer);
(gdb) p _variable.typeName()->annotation().type->toString()
$15 = "int256[10] storage pointer"
(gdb) p _variable.annotation().type->toString()
$16 = "int256[10] storage ref"

@@ -1560,7 1560,9 @@ bool ArrayType::isImplicitlyConvertibleTo(const Type& _convertTo) const
if (convertTo.isByteArray() != isByteArray() || convertTo.isString() != isString())
return false;
// memory/calldata to storage can be converted, but only to a direct storage reference
if (convertTo.location() == DataLocation::Storage && location() != DataLocation::Storage && convertTo.isPointer())
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the way, what is the difference of pointer and reference in Solidity. Solidity does not support pointer arithmatic and it seems the pointer or reference is additional info to a type, e.g. the main types in this example are both int[] or int[10].

@@ -1560,7 1560,9 @@ bool ArrayType::isImplicitlyConvertibleTo(const Type& _convertTo) const
if (convertTo.isByteArray() != isByteArray() || convertTo.isString() != isString())
return false;
// memory/calldata to storage can be converted, but only to a direct storage reference
if (convertTo.location() == DataLocation::Storage && location() != DataLocation::Storage && convertTo.isPointer())
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The type is basically a ArrayType with m_isPointer set to true, i.e. "int256[10] storage pointer".
Since ArrayType is always a ReferenceType, shown by code class ReferenceType: public Type, it always passes the dynamic_cast to reference type test. I think finally all array type will always be a reference type. All array pointer will be replaced to array reference in current implimentation.

@codecov
Copy link

codecov bot commented Sep 6, 2018

Codecov Report

Merging #4904 into develop will increase coverage by 59.36%.
The diff coverage is 66.66%.

Impacted file tree graph

@@             Coverage Diff              @@
##           develop    #4904        /-   ##
============================================
  Coverage    28.51%   87.88%    59.36%     
============================================
  Files          314      314               
  Lines        31496    31510        14     
  Branches      3730     3721        -9     
============================================
  Hits          8981    27691     18710     
  Misses       21835     2557    -19278     
- Partials       680     1262       582
Flag Coverage Δ
#all 87.88% <66.66%> (?)
#syntax 28.48% <66.66%> (-0.04%) ⬇️

@liangdzou
Copy link
Contributor Author

Finally, I found that explicit type conversion for reference type will always try to convert the type to a reference type. Solidity will modify the is_pointer and location part if it is a explicit type conversion.

However, the explicit type conversion for an array will call the implicit type conversion. But, the check for an implicit conversion to a storage reference is very weak because the function believes that this conversion will cause data copy. But, explicit type conversion will not cause any data copy.

So, this bug is fixed by distinguishing the explicit type conversion with implicit type conversion.

@chriseth
Copy link
Contributor

chriseth commented Sep 6, 2018

Ah great, that sounds much better! I'm wondering whether we should rather fix that explicit conversion converts to reference, because that does not sound right.

I actually wonder where we have a valid conversion to storage pointer anyway. Did you find an instance of that? Perhaps contract c { string x; function f() { bytes(x); } }

@chriseth
Copy link
Contributor

chriseth commented Sep 6, 2018

Hm, so perhaps an earlier version of yours did go in the right direction: conversion between string and bytes is a very special case, because the base types are different, but we can convert without actually copying anything.

@liangdzou
Copy link
Contributor Author

Yeah, two types can convert if and only if the layout and the size are the same.

@liangdzou
Copy link
Contributor Author

Except that we want to have some constraint on the memory.
Anyway, the convert from into[10] storage to int[] storage shouldn't be allowed without copying since the layout is totally different.

@chriseth
Copy link
Contributor

chriseth commented Sep 6, 2018

Could you add an assert or check that an explicit conversion to pointer is only allowed for a byte/string conversion?

@liangdzou
Copy link
Contributor Author

FYI, the following code has no problem since it is a conversion to "int[] storage ref" that caused a copy.

➜  liang@matrix ~/projects/zzz cat t1.sol 
pragma solidity ^0.4.15;
contract Hello {
  int[10] x;
  int[] y;
  function f() public {
    y = x;
  }
}
➜  liang@matrix ~/projects/zzz solc t1.sol 
Warning: This is a pre-release compiler version, please do not use it in production.
➜  liang@matrix ~/projects/zzz 

I am not quite sure if I understand you well on adding the assertion. Current version only allow the basetype to be the same which is more restrict than the assertion you mentioned. The assertion seems useless.

@liangdzou
Copy link
Contributor Author

The methodology for this fix is that the implicit conversion to reference causes copy because an expression that takes a reference as parameter requires that. However, the explicit type conversion will never cause a copy even if it is converting to a reference type. Actually, Solidity tries to avoid implicitly converting a reference to a pointer, for some reason I do not know, by changing the target pointer type to reference type automatically.

So, this fix distinguishes the explicit type conversions from implicit ones and regard them as a non-copy conversion.

@liangdzou
Copy link
Contributor Author

I think you want to only allow the explicit conversion to a pointer for target types of bytes or string.
I added to following to lines of code. The test passed locally. Seems reasonable constraint. :-)

                if (isExplicit && convertTo.isPointer() && !convertTo.isByteArray() && !convertTo.isString())
                        return false;

@liangdzou
Copy link
Contributor Author

@chriseth Hi, do you have any further comments on this PR. Thanks :-)

@liangdzou
Copy link
Contributor Author

@chriseth the type check for bytes and string is moved to type checker part. I think this is the place it belongs to, and the requirement added by you also fixes the issue #4948.

@liangdzou liangdzou closed this Sep 12, 2018
@liangdzou liangdzou reopened this Sep 12, 2018
@ekpyron
Copy link
Member

ekpyron commented Sep 12, 2018

The issue I see with this PR is that in some cases now implicit conversions are less restrictive than explicit conversions. In general explicit conversions should always be less restrictive than implicit ones.

Do you think it might make sense to not let an explicit conversion change from a storage reference to a storage pointer instead and thus allowing the following, instead of disallowing it?

contract C {
        uint[] a;
        uint[] b;
        function f() public view {
            uint[] storage c = a;
            uint[] storage d = b;
            d = uint[](c);
        }
}

I also think the following should be valid:

contract C {
        uint[] a;
        uint[] b;
        function f() public view {
            uint[] storage c = a;
            uint[] memory d = b;
            d = uint[](c);
        }
}

Since in both cases the same is valid as an implicit instead of an explicit conversion. @chriseth, what do you think?

@liangdzou
Copy link
Contributor Author

In Solidity, we will regard a conversion to reference type will need to copy. Essentially, this means Solidity always regard the explicit conversion for an array will need a copy. (I thought the implicit conversion will never copy anything. I think this should be wrong. So, this PR needs to be updated accordingly.)

I guess this might be the reason why array pointer type will always be changed to a reference type because the result of an explicit conversion is an RValue (I do not have a concrete proof yet).

This might be the difference between implicit conversion and explicit conversion, e.g. implicit conversion will never happen for the x in "x=exp" but may happen for explicit conversion.

@liangdzou
Copy link
Contributor Author

Could you add an assert or check that an explicit conversion to pointer is only allowed for a byte/string conversion?

@chriseth @ekpyron I have updated this PR to only add the restrictions. Moreover, 6 type conversion test cases are added.

@chriseth
Copy link
Contributor

The following only applies to reference types: Implicit conversion can copy. For example if you do x = y; and x and y are storage variables of different array types. Explicit conversion might copy, but since an explicitly converted value can always "stand on its own" (i.e. x = uint[](y); and (uint[](y))[2]) are both valid), an explicit conversion can only copy if the conversion converts to memory. Since explicit conversions do not change the data location, this can actually never happen from storage to memory.

@chriseth
Copy link
Contributor

We REALLY have to re-architect the copy and conversion rules... There was a proposal to require things like x = copyof(y), where copyof(y) is something like std::move(y) which only changes the type but does not generate any code.

@chriseth
Copy link
Contributor

I much prefer this version. Note that there is also an exception in CompilerUtils::convertType about byte arrays, so this matches well.

@liangdzou
Copy link
Contributor Author

@chriseth @ekpyron
Do you think we should not change array pointer to array ref in copyForLocationIfReference?
Then, the if-statement can be changed to assert. :-)

@liangdzou
Copy link
Contributor Author

From the name of the function, I can understand that copyForLocationIfReference will copy the location, but I am not quite sure why should we set the _isPointer.

@liangdzou
Copy link
Contributor Author

@chriseth @ekpyron
Simply add an assertion will only trigger the assert fail earlier which will not fix the problem.

@liangdzou
Copy link
Contributor Author

@chriseth @ekpyron
Shall I fix this by not allowing modifying the _isPointer and change the if-statement to assertion?

@ekpyron
Copy link
Member

ekpyron commented Sep 13, 2018

If we want to change the behaviour of copyForLocationIfReference to not modify m_isPointer, then we have to consider all its use cases. An example would be ReferencesResolver::endVisit(Mapping const& _typeName). If I see it correctly, here calling copyForLocationIfReference for the value type relies on it setting m_isPointer to false. Not sure whether that's the only case.

@liangdzou
Copy link
Contributor Author

liangdzou commented Sep 13, 2018

I mean only this location not set it by adding an argument with default value false.

@ekpyron
Copy link
Member

ekpyron commented Sep 13, 2018

That might make sense, yes.

@liangdzou
Copy link
Contributor Author

Shall I try this way or wait for more discussion?

@ekpyron
Copy link
Member

ekpyron commented Sep 13, 2018

Instead of modifying copyForLocationIfReference, we could use ReferenceType::copyForLocation directly in TypeChecker.cpp#L1736, if resultType is a reference type. (I take it that's the place you're talking about?). I think it's worth a try, but you can also wait for @chriseth's response to be sure.

@liangdzou
Copy link
Contributor Author

@ekpyron
yeah, sounds good. I will have a try first. Not a very large change.

@ekpyron
Copy link
Member

ekpyron commented Sep 13, 2018

For the external tests to pass you need to rebase the PR (they depend on #4926 being merged now).

@ekpyron
Copy link
Member

ekpyron commented Sep 13, 2018

I think the latest change is looking good, though!

@liangdzou
Copy link
Contributor Author

rebased. Thx :-)

auto resultArrayType = dynamic_cast<ArrayType const*>(resultType.get());
solAssert(
!argArrayType || argArrayType->location() != DataLocation::Storage ||
((resultArrayType->isPointer() || (argArrayType->isByteArray() && resultArrayType->isByteArray())) &&
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think resultArrayType could be nullptr here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assertion should probably only happen if the type are explicitly convertible (note that m_errorReporter.typeErrror does not throw).

For example, the following will crash this way:

contract C {
	uint[] a;
	function f() public pure {
		uint[] storage b = a;
		uint c = uint(uint[](b));
	}
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this should be fixed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed and added one test case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think resultArrayType could be nullptr here.

In the function convertType, it shows that when argument type is array the target type is array. So, I think it should be ok to also assert it here. :-)

 646   switch (stackTypeCategory)                                                    
 647   {                                                                                                                                                                                                       
...
 816   case Type::Category::Array:                                                   
 817   {                                                                             
 818     solAssert(targetTypeCategory == stackTypeCategory, "");                                                                                                                                               

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean in CompilerUtils.cpp? Yes, that part isn't run, if there are errors in the type checker - so we have to gracefully create a error here in the TypeChecker, so that we can later assert in convertType :-). But yes, the latest change will take care of that!

contract C {
int[] x;
function f() public {
int[] storage a = x;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indentation is off.

@chriseth
Copy link
Contributor

As I said earlier, I think the whole concept of reference, pointer and convertability has to be reworked, but I think this PR is a very good solution for now.

type conversion. Also, add some test cases for #4901 and #4948.
Copy link
Member

@ekpyron ekpyron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took the liberty to move the type checker assertion and to "rephrase" it slightly, so that it is easier to read in the future, and to add another assertion for resultArrayType, just to be sure (I hope you don't mind @liangdzou).

With this slight change I'm fine with merging this now.

@liangdzou
Copy link
Contributor Author

liangdzou commented Sep 14, 2018

@ekpyron I am fine with your change. Very much easy to read now. 👍
By the way, how can you do that? I mean modifying without generating a new commit.
Did you rebase by squash?

@ekpyron
Copy link
Member

ekpyron commented Sep 14, 2018

@liangdzou In this case it was only the latest commit that I modified, so I just used git commit --amend (which modifies the last commit using the staged changes) and then git push -f (to overwrite the old commit in the remote repo). In case I want to modify a commit that's not the latest one, but further down the history, I usually create a tmp commit and then rebase and squash it into the commit I wanted to modify, but there may be more elegant ways as well :-).

@liangdzou
Copy link
Contributor Author

liangdzou commented Sep 15, 2018

@ekpyron Thanks 👍

@chriseth chriseth merged commit d46d3fe into ethereum:develop Sep 17, 2018
@liangdzou liangdzou deleted the type_conversion branch September 20, 2018 15:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants