proposal: Go 2: capability annotations #24956

Closed
bcmills opened this issue Apr 19, 2018 · 25 comments
Labels
FrozenDueToAge, LanguageChange (Suggested changes to the Go language), NeedsInvestigation (Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.), Proposal, v2 (An incompatible library change)
@bcmills
Contributor

bcmills commented Apr 19, 2018

Background

This is a proposal about constraints on types and variables. It addresses the use-cases of “read-only” views and atomic types, but also generalizes over channel constraints in the process.

This proposal is intended to generalize and subsume many others, including (in no particular order) #22876, #20443, #21953, #24889, #21131, #23415, #23161, #22189, and to some extent #21577.

Proposal

Fields and methods on a struct type can be restricted to specific capabilities. To callers without those capabilities, those fields and methods are treated as if they were unexported and defined in some other package: they can be observed using the reflect package, but cannot be set, called, or used to satisfy interfaces.

A view is a distinct type that restricts access to the capabilities of its underlying type.

Grammar

Capability restrictions follow the grammar:

Capability = "#" identifier { "," "#" identifier }
View = "#" ( identifier | "{" [ identifier { "," identifier } ] "}" )
VarView = "#&" ( identifier | "{" [ identifier { "," identifier } ] "}" )

A Capability precedes a method or field name in a FieldDecl or Receiver. A method or field without an associated Capability can be accessed without any capability.

A View follows a TypeName in a ParameterDecl, FieldDecl, ConstDecl, VarDecl, or Conversion. A TypeName without an associated View includes all of its capabilities. An empty View (written as #{}) can only access methods and fields that are not associated with a Capability.

A VarView follows the IdentifierList in a FieldDecl or VarDecl. It restricts the capabilities of references to the declared fields or variables themselves, independent of type. Those capabilities are also applied to the pointer produced by an (explicit or implicit) address operation on the variable or field.

A package may define aliases for the views of the types it defines:

type T struct { … }

func (*T) #Reader Read(p []byte) (int, error) { … }
func (*T) #Seeker Seek(offset int64, whence int) (int64, error) { … }
func (*T) #Writer Write(p []byte) (int, error) { … }

type *T#ReadSeeker = *T#{Reader,Seeker}

Built-in capabilities

Channel types have the built-in capabilities Sender and Receiver. <-chan T is a shorthand for (chan T)#Receiver, and chan<- T is a shorthand for (chan T)#Sender. The send and close operations are restricted to the Sender capability, and the receive operation is restricted to the Receiver capability.
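For reference, the channel shorthands already behave this way in Go today. A minimal sketch follows; the function names produce and consume are illustrative and do not come from the proposal.

```go
package main

import "fmt"

// produce needs only the Sender capability: it may send and close.
func produce(out chan<- int, n int) {
	for i := 0; i < n; i++ {
		out <- i
	}
	close(out)
}

// consume needs only the Receiver capability: it may receive and range.
func consume(in <-chan int) int {
	sum := 0
	for v := range in {
		sum += v
	}
	return sum
}

func main() {
	ch := make(chan int, 4)  // bidirectional: both capabilities
	produce(ch, 4)           // implicitly narrowed to chan<- int
	fmt.Println(consume(ch)) // implicitly narrowed to <-chan int; prints 6
}
```

Inside consume, a statement such as `in <- 1` or `close(in)` is a compile-time error, which is the behavior the proposal generalizes to other types.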

Slices, maps, and pointers have the built-in capabilities Getter and Setter. (QUESTION: should we define channel-like shorthands for these capabilities?) The Setter capability allows assignment through an index expression (for slices, maps, and pointers to arrays) or an indirection (for pointers), including implicit indirections in selectors. The Getter capability allows reading through an index expression, indirection, or range loop.

Pointers to numeric, boolean, and pointer types have the built-in capability Atomic. The Atomic capability allows assignment and reading through the functions in the atomic package, independent of the Setter and Getter capabilities.
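To make the motivation concrete, here is a sketch in present-day Go of the discipline that the Atomic capability is meant to enforce. The helper countTo is hypothetical; only the sync/atomic calls are real API.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// countTo increments *addr n times from each of `workers` goroutines,
// touching the value only through sync/atomic.
func countTo(addr *int32, workers, n int) {
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < n; i++ {
				atomic.AddInt32(addr, 1)
			}
		}()
	}
	wg.Wait()
}

func main() {
	var hits int32
	countTo(&hits, 4, 1000)
	fmt.Println(atomic.LoadInt32(&hits)) // prints 4000
}
```

Today the compiler would also accept a plain `*addr++` inside countTo, which would be a data race. Under the proposal, declaring the parameter as *int32#Atomic (without Getter or Setter) is intended to reject that statement at compile time.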

The Getter, Setter, and Atomic capabilities can also apply to variable and field declarations (as a VarView). The Getter capability allows the variable to be read, the Setter capability allows it to be written, and the Atomic capability allows it to be read and written via an atomic pointer. (A variable with only the Getter capability cannot be reassigned after declaration. A variable with only the Setter capability is mostly useless.)

The built-in len and cap functions do not require any capability on their arguments. The built-in append function requires the Setter capability on the destination and the Getter capability on the source.

Assignability

A view of a type is assignable to any view of the same underlying type with a subset of the same capabilities.

A function of type F1 is assignable to a function type F2 if:

  • the parameters and results of F1 and F2 have the same underlying types, and
  • the capabilities of the parameters of F1 are a subset of the capabilities of the parameters of F2, and
  • the capabilities of the results of F1 are a superset of the capabilities of the results of F2.

A method of type M1 satisfies an interface method of type M2 if the corresponding function type of M1 is assignable to the corresponding function type of M2.
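Go today requires function types to be identical for assignment, so the relaxation described above has to be written out by hand. The sketch below uses channel direction as the existing capability; drain and widen are illustrative names.

```go
package main

import "fmt"

// drain needs only the Receiver capability on its parameter.
func drain(in <-chan int) int {
	n := 0
	for range in {
		n++
	}
	return n
}

// widen adapts a function that accepts a receive-only channel to the
// function type that accepts a bidirectional channel. Under the proposed
// rule (parameters of F1 need a subset of the capabilities of F2's
// parameters) this adapter would be unnecessary.
func widen(f func(<-chan int) int) func(chan int) int {
	return func(ch chan int) int { return f(ch) }
}

func main() {
	// var g func(chan int) int = drain // compile error in Go today
	g := widen(drain)
	ch := make(chan int, 2)
	ch <- 1
	ch <- 2
	close(ch)
	fmt.Println(g(ch)) // prints 2
}
```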

This implies that all views of the same type share the same concrete representation.

Capabilities of elements of map, slice, and pointer types must match exactly. For example, []T is not assignable to []T#V: otherwise, one could write in a T#V and read it out as a T. (We do not want to repeat Java's covariance mistake.) We could consider relaxing that restriction based on whether the Getter and/or Setter capability is present, but I see no strong reason to do so in this proposal.
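Go already applies the same exact-match rule to channel directions inside slices, which gives a runnable analogue. In this sketch (receiveOnly is an illustrative name) the conversion has to be done element by element into a fresh slice.

```go
package main

import "fmt"

// receiveOnly narrows each element individually. Go rejects assigning
// []chan int to []<-chan int directly, because the element types differ.
func receiveOnly(chs []chan int) []<-chan int {
	out := make([]<-chan int, len(chs))
	for i, ch := range chs {
		out[i] = ch // narrowing a single element is allowed
	}
	return out
}

func main() {
	chs := []chan int{make(chan int, 1), make(chan int, 1)}
	chs[0] <- 10
	chs[1] <- 20
	// var bad []<-chan int = chs // compile error: element types differ
	ro := receiveOnly(chs)
	fmt.Println(<-ro[0] + <-ro[1]) // prints 30
}
```

Because the result is a copy, writes into it cannot be observed through the original slice, which is exactly the aliasing hazard that the exact-match rule avoids.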

Examples

package atomic

…
func AddInt32(addr *int32#Atomic, delta int32) (new int32)
func LoadInt32(addr *int32#Atomic) (new int32)
func StoreInt32(addr *int32#Atomic, val int32)
func SwapInt32(addr *int32#Atomic, new int32) (old int32)

package bytes

type Buffer struct { … }

func NewBuffer([]byte) *Buffer

func (*Buffer) #Owner Bytes() []byte
func (*Buffer) Cap() int
func (*Buffer) #Writer Grow(n int)
func (*Buffer) Len() int
func (*Buffer) #Reader Next(n int) []byte
func (*Buffer) #Reader Read(p []byte) (int, error)
func (*Buffer) #Reader ReadByte() (byte, error)
func (*Buffer) #Writer ReadFrom(r io.Reader) (int64, error)
…
func (*Buffer) #Owner Reset()
func (*Buffer) #Owner Truncate(n int)

package reflect

type StructField struct {
	…
	Index []int#Getter
}

package http

type Server struct {
	…
	disableKeepAlives, inShutdown #&Atomic int32
}

type Request struct {
	…
	#Client GetBody func() (io.ReadCloser, error)
	…
	#Server RemoteAddr string
	#Server RequestURI string
	#Server TLS *tls.ConnectionState
	…
	#Deprecated Cancel <-chan struct{}
}

Reflection

The capability set of a method, field, or type can be observed and manipulated through new methods on the corresponding type in the reflect package:

package reflect

// ViewOf returns a view of t qualified with the given set of capabilities.
// ViewOf panics if t lacks any capability in the set.
func ViewOf(t Type, capabilities []string) Type

type Type interface {
	…

	// View returns the underlying type that this type views.
	// If this type is not a view, View returns (nil, false).
	View() (Type, bool)

	// NumCapability returns the number of capabilities the underlying type is qualified with.
	// It panics if the type is not a view.
	NumCapability() int

	// Capability(i) returns the ith capability the underlying type is qualified with.
	Capability(i int) string
}

type Method struct {
	…
	Capability string
}

type StructField struct {
	…
	Capability string
	View []string
}

Compatibility

I believe that this proposal is compatible with the Go 1 language specification. However, it would not provide much value without corresponding changes in the standard library.

Commentary

On its own, this proposal may not be adequate. As a solution for read-only slices and maps, its value without generics (https://golang.org/issue/15292) is limited: without them, it is not possible to write general-purpose functions that work for both read-only and read-write slices, such as those in the bytes and strings packages.

@gopherbot gopherbot added this to the Proposal milestone Apr 19, 2018
@bcmills bcmills added LanguageChange Suggested changes to the Go language v2 An incompatible library change labels Apr 19, 2018
@bcmills bcmills modified the milestones: Proposal, Go2 Apr 19, 2018
@ianlancetaylor
Contributor

How does it work if I assign a variable with a set of views to an empty interface value? Do I have to have exactly the correct set of views in order to type assert the empty interface value back to the original type? Do the views have to be listed in the same order?

@bcmills
Contributor Author

bcmills commented Apr 20, 2018

Good questions. I would say that you can type-assert to a view with any subset of the concrete capabilities, with the usual caveat that element types for maps, slices and pointers must be exact. As with interfaces, the first match in a switch wins.

The capabilities in a view are an unordered set, so you could enumerate them in any order.

@bcmills
Contributor Author

bcmills commented Apr 20, 2018

Another interesting question: what would == mean for two interface{} values with different views of the same object? Arguably it should be consistent with switch and map. Map keys should be unequal because they may have different method sets, which suggests that switch should require an exact match on the capabilities.
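Channel directions give a concrete precedent in Go today: two interface values that hold the same channel under different directional types are unequal, because their dynamic types differ. A sketch, with sameKey as an illustrative helper:

```go
package main

import "fmt"

// sameKey reports whether a and b compare equal as interface values,
// which is also the test used for map keys of interface type.
func sameKey(a, b interface{}) bool { return a == b }

func main() {
	ch := make(chan int)
	var full interface{} = ch
	var recv interface{} = (<-chan int)(ch)

	fmt.Println(sameKey(full, full)) // prints true
	fmt.Println(sameKey(full, recv)) // prints false: dynamic types differ

	m := map[interface{}]string{full: "full", recv: "recv"}
	fmt.Println(len(m)) // prints 2
}
```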

@ianlancetaylor
Contributor

One of the things I would like to see from any system like this is support for some form of thread safety analysis (https://clang.llvm.org/docs/ThreadSafetyAnalysis.html). Is there a way to use this syntax to say "this field requires this mutex to be held?" Unfortunately it seems kind of out of scope.

@bcmills
Contributor Author

bcmills commented Apr 20, 2018

This proposal might be possible to extend to locking invariants, but it would make the specification much more complex. As I see it, one of the advantages of this proposal is that it enforces the capabilities within the Go type system (rather than in an external tool), so extending it to thread-safety analysis would mean that we encode that analysis in the Go spec.

One way to do that might be to add an Escape capability to reference-like types and define a new (safe) Mutex type in terms of that, but then we would have to specify what Escape actually means. With that approach (and assuming generics), the Mutex API might look like:

package sync

type [T] Mutex struct {
	WithLock(func(*T#{Getter,Setter}))
}

whereas atomic would require the Escape capability:

package atomic

func [Ptr] SwapPointer(addr *(Ptr#Escape)#Atomic, new Ptr#Escape) (old Ptr#Escape)

Ptr#Escape applied to an existing view type Ptr would mean “Ptr augmented with Escape”. So, for example, concrete instantiations of SwapPointer might look like:

func SwapPointer[*T] (addr **T#Atomic, new *T) (old *T)

func SwapPointer[*T#Getter] (addr *(*T#{Getter,Escape})#Atomic, new *T#{Getter,Escape}) (old *T#{Getter,Escape})
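For comparison, the WithLock shape sketched above can be approximated with the type parameters that Go gained later. This is only a sketch under that assumption: Guarded is a hypothetical name, and nothing here expresses the Escape restriction, so the callback could still retain the pointer.

```go
package main

import (
	"fmt"
	"sync"
)

// Guarded is a hypothetical stand-in for the generic Mutex in the comment.
type Guarded[T any] struct {
	mu  sync.Mutex
	val T
}

// WithLock runs f with exclusive access to the guarded value.
// Today's type system cannot stop f from leaking p past the call.
func (g *Guarded[T]) WithLock(f func(p *T)) {
	g.mu.Lock()
	defer g.mu.Unlock()
	f(&g.val)
}

func main() {
	var counter Guarded[int]
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			counter.WithLock(func(p *int) { *p++ })
		}()
	}
	wg.Wait()
	counter.WithLock(func(p *int) { fmt.Println(*p) }) // prints 100
}
```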

@bcmills
Contributor Author

bcmills commented Apr 20, 2018

(I've realized in writing more examples that capabilities should associate loosely rather than tightly: too many parentheses. Updating examples.)

@odeke-em
Member

/cc @jaekwon

@bcmills
Contributor Author

bcmills commented Apr 24, 2018

One thing I dislike about this proposal is that it only provides upper bounds on capabilities, not lower bounds. For example, I cannot express “must be non-nil” as a capability under this formulation, even though it clearly relates to the Setter and Getter capabilities.

[Edit: can too. See below.]

@komuw
Contributor

komuw commented Apr 25, 2018

/cc @wora

@bcmills
Contributor Author

bcmills commented Apr 26, 2018

On further reflection, I think this proposal can express “must be non-nil” constraints: we just have to express them as a positive capability (“can be nil”) rather than a negative constraint (“must be non-nil”).

That suggests another built-in capability, as follows:


Types with a nil zero-value are called nillable. All nillable types have the built-in capability Nil, which allows comparisons with the predeclared identifier nil. nil can be assigned to a variable only if it has both the Setter and Nil capabilities.

When a variable of a nillable type without the Nil capability is initialized (explicitly or implicitly), including as a function argument, struct field, or array or slice element, the compiler must be able to verify statically that the value is not nil. (TODO: propose a precise algorithm.)

  • Slice elements with indices ≥ len are treated as uninitialized. If a slice with elements of nillable type that lack the Nil capability is resliced past its length, a run-time panic occurs, analogous to reslicing any slice past its capacity. (See Proposal: use a value other than nil for uninitialized interfaces #21538 (comment) for further discussion.)
  • The result of a channel receive or map index expression always has the Nil capability if the element type is nillable, even if the element type itself lacks the Nil capability.
  • A type-assertion to a nillable type without the Nil capability fails if the underlying value is nil. (That is, the type does not match.)
  • Each named result parameter without the Nil capability must be non-nil as of the first read or write of the parameter on each path through the function, including implicit writes in return statements.

The operator sequence &* panics if its pointer argument is nil, so it can be used as a handy way to assert non-nilness to the compiler. If we adopt this proposal, we should consider defining &* as an explicit “panic if nil” operator for all nillable types.
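In present-day Go the same "panic if nil" assertion can be spelled as an explicit helper. The sketch below assumes type parameters; MustNonNil and panics are hypothetical names, and the compiler learns nothing from the call, unlike the proposed use of &*.

```go
package main

import "fmt"

// MustNonNil returns p unchanged, or panics if p is nil.
func MustNonNil[T any](p *T) *T {
	if p == nil {
		panic("MustNonNil: nil pointer")
	}
	return p
}

// panics reports whether calling f panics.
func panics(f func()) (did bool) {
	defer func() { did = recover() != nil }()
	f()
	return false
}

func main() {
	x := 7
	fmt.Println(*MustNonNil(&x))                          // prints 7
	fmt.Println(panics(func() { MustNonNil[int](nil) }))  // prints true
}
```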

@bcmills
Contributor Author

bcmills commented Apr 26, 2018

I believe this proposal also subsumes #23764. If a struct type has a field without the Getter property, that struct cannot be copied: we would have to read the field in order to copy it.
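For context, the usual way to discourage copying today is a marker field that static analysis recognizes rather than a language rule. The sketch below follows that convention as I understand it (a zero-size type with pointer-receiver Lock and Unlock methods, which go vet's copylocks check looks for); Session and NewSession are hypothetical names.

```go
package main

import "fmt"

// noCopy mirrors the unexported marker type used inside the standard library.
type noCopy struct{}

func (*noCopy) Lock()   {}
func (*noCopy) Unlock() {}

// Session is a hypothetical type that should never be copied.
type Session struct {
	_  noCopy
	id int
}

// NewSession returns a pointer so that callers share one instance.
func NewSession(id int) *Session { return &Session{id: id} }

// ID returns the session identifier.
func (s *Session) ID() int { return s.id }

func main() {
	s := NewSession(1)
	fmt.Println(s.ID()) // prints 1
	// c := *s // still compiles; only a vet-style check is expected to flag it
}
```

The difference from the proposal is that the copy remains legal Go; a field without the Getter capability would turn it into a compile-time error.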

@bcmills
Contributor Author

bcmills commented Apr 26, 2018

I do think the syntax for #Getter and #Setter is a bit verbose. If we like the general idea, I can try to find a cleaner shorthand for the built-in constraints.

@jba
Contributor

jba commented Apr 30, 2018

can only access methods and fields that are not associated with a Restriction.

  1. What's a Restriction? Do you mean Capability?

  2. Should there be an automatic conversion from string to ([]byte)#Getter?

  3. I tried to write the signature for a generalized strings.Trim. The best I could come up with is

func [T#Getter] Trim(s ([]T)#Getter, xs ([]T)#Getter) ([]T)#Getter 

But this is too restrictive; it says the return type has only Getter, but I want it to have all the Capabilities of the first argument. I can't write the return value as simply []T, because that would mean it has no capabilities attached. In your comment on locking invariants you say "V#C applied to an existing view type V would mean 'V augmented with C'." Does that apply here? If so, how do I indicate that I want to augment the first argument's capabilities, and not the second's?

I think you'll need generic capability parameters:

func [T#Getter, C] Trim(s ([]T)#{C, Getter}, xs ([]T)#Getter) ([]T)#{C, Getter}

  4. Here is a generic Clone for pointers:

func [T] Clone(x *T) *T {
    y := *x
    return &y
}

How can I express that neither the argument nor the return value have Nil? You mention &* as a way to express non-nilness, but how can I use that in a signature?

Maybe T#-C means "T without C"?

func [T] Clone(x (*T)#-Nil) (*T)#-Nil {
    y := *x
    return &y
}

  5. I'm writing a cache. I want to make sure callers of my Get method get deeply read-only views of my values, so users can't corrupt them. How can I express this? Get(Key) Value#Getter doesn't work, because the values may contain fields that are gettable, but can themselves be modified.
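On the Trim question in the list above: the idea of a result that carries over everything about the first argument, but nothing from the second, has the same shape as a type parameter over the slice type itself. The sketch below assumes Go's later type parameters, which track named types rather than capabilities; Path is a hypothetical type.

```go
package main

import "fmt"

// Trim removes leading and trailing elements of s that appear in cutset.
// The result has type S, so whatever the caller's slice type is flows
// through to the result; cutset does not influence it.
func Trim[S ~[]T, T comparable](s S, cutset []T) S {
	in := func(x T) bool {
		for _, c := range cutset {
			if c == x {
				return true
			}
		}
		return false
	}
	for len(s) > 0 && in(s[0]) {
		s = s[1:]
	}
	for len(s) > 0 && in(s[len(s)-1]) {
		s = s[:len(s)-1]
	}
	return s
}

// Path is a named slice type; Trim preserves it in the result.
type Path []string

func main() {
	p := Path{"", "usr", "lib", ""}
	var q Path = Trim(p, []string{""})
	fmt.Println(len(q), q[0], q[1]) // prints 2 usr lib
}
```

A capability parameter C as suggested above would play the role that S plays here.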

@bcmills
Contributor Author

bcmills commented Apr 30, 2018

What's a Restriction? Do you mean Capability?

Yep, leftover from an earlier draft. Fixed.

@bcmills
Contributor Author

bcmills commented Apr 30, 2018

Should there be an automatic conversion from string to ([]byte)#Getter?

That's not obvious to me one way or the other, especially given the need for capability parametricity. (That is: it may be that the need for parametricity implies that pretty much all string-or-[]byte functions are already generic to begin with.)

@ianlancetaylor
Contributor

Are there any other languages with a facility similar to this?

Typically in programming languages the term "capability" refers to the right to perform some operation. These attributes seem almost like the reverse of that: rather than granting certain rights, they remove them. "Capability" may not be the best name here.

The capability syntax is very general but it seems that specific capabilities have specific meanings implemented by the compiler. Are there meanings for capabilities other than those implemented by the compiler?

This seems like an experimental language feature with potentially far reaching consequences, not an established, proven mechanism. As a general rule, Go has avoided adding features that are not well understood.

@ianlancetaylor ianlancetaylor added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label May 1, 2018
@jba
Contributor

jba commented May 1, 2018

Are there any other languages with a facility similar to this?

Some of it is like Hermes typestates. Unfortunately, the Hermes book doesn't seem to be available online. From my copy, you could define your own typestates that the compiler would semi-automatically propagate for you, and there were some built-in ones with special meanings.

"Capability" may not be the best name here.

It actually resonates with me: a capability is a reference to a value along with permissions on how you can use it. So actually what Bryan calls a View is a capability (or the type of a capability): *Buffer#Reader is a pointer to a buffer with the ability to read from it. It's just that it defaults open: the absence of #Foo on a type means you can do anything.

@jaekwon

jaekwon commented May 6, 2018

Can you provide an example of subsuming #23161? Trying to understand.

  1. I do agree that it would be nice to be able to restrict what methods you can call on an interface.
  2. I agree with @jba and @ianlancetaylor that the reserved keyword should not be called a "Capability". To me, capability == (dis)ability, and "Object Capabilities" is the study/practice/engineering of capability/access specification/control in the context of a given programming language. But I'm still learning about this so...
  3. This gives me a related idea...

A Golang "Interface" is something that can be converted at runtime to a concrete type. What if we just declare a new type called a "Restriction", which is much like an Interface, except it cannot be converted back to an interface or any other type (besides a more restrictive type)?

type FooInterface interface {
    A()
    B()
}
type BarRestriction restriction {
    B()
}
type fooStruct struct {}
func (_ fooStruct) A() {}
func (_ fooStruct) B() {}
func (_ fooStruct) C() {}

func main() {
    f := fooStruct{}
    var fi FooInterface = f
    fi.(fooStruct).C() // FooInterface doesn't include C but we can convert.
    var br BarRestriction = f // OK
    var br BarRestriction = fi // OK
    var br BarRestriction = &fi // Pointers to interfaces don't implement restrictions.
    var fi2 FooInterface = br // NOT OK
    br.A() // NOT OK
    br.B() // OK
    br.C() // NOT OK
    br.(FooInterface).A() // NOT OK
    br.(fooStruct).A() // NOT OK
    br.(fooStruct).A() // NOT OK
}

This just saves us from declaring another structure to hold an unexposed reference to a reference object, and copying the method signatures that we want to expose from the referenced object.

type Foo struct {
    bar Bar
}
func (a Foo) A() { a.bar.A() } // THIS IS...
func (a Foo) B() { a.bar.B() } // ...TEDIOUS...
type Bar struct { ... }
func (b Bar) A() {}
func (b Bar) B() {}
func (b Bar) C() {} // HIDE ME!
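A runnable version of the forwarding pattern above shows what it buys: once the inner value sits behind an unexported field, a type assertion cannot recover the hidden method. This sketch adds a log field and a hasC helper purely for illustration, and uses pointer receivers so the calls are observable.

```go
package main

import "fmt"

// Bar is the full object; C is the method we want to hide.
type Bar struct{ log []string }

func (b *Bar) A() { b.log = append(b.log, "A") }
func (b *Bar) B() { b.log = append(b.log, "B") }
func (b *Bar) C() { b.log = append(b.log, "C") }

// Foo forwards only A and B. Code in other packages cannot reach bar
// because the field is unexported.
type Foo struct{ bar *Bar }

func (f Foo) A() { f.bar.A() }
func (f Foo) B() { f.bar.B() }

// hasC reports whether v exposes a C method via type assertion.
func hasC(v interface{}) bool {
	_, ok := v.(interface{ C() })
	return ok
}

func main() {
	b := &Bar{}
	f := Foo{bar: b}
	f.A()
	f.B()
	fmt.Println(b.log)            // prints [A B]
	fmt.Println(hasC(b), hasC(f)) // prints true false
}
```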

@bcmills
Contributor Author

bcmills commented May 10, 2018

Are there any other languages with a facility similar to this?

HP's Emily added capability verification to OCaml, but as a subtraction from the language rather than an addition. OCaml already has at least two other features that provide a similar sort of capability restriction: row polymorphism and object subtyping. With row polymorphism, capabilities are encoded as record fields (in Go, struct fields), and functions can accept input records with arbitrary additional fields (which the function cannot access). With object subtyping, capabilities are encoded as methods on an object type, and any object with at least those methods can be coerced to (but not from!) that type.

Mezzo represents static read and write permissions as types (as in this proposal), but also adds a consume keyword that allows a function to revoke the caller's permissions after the call, allowing for constraints such as “called only once”. However, it isn't obvious to me whether it allows for user-defined permissions: the Mezzo literature focuses on controlling aliasing and mutability.

Microsoft's Koka has user-defined effect types. Its effects track capabilities at the level of “heaps” rather than variables. (Heap- or region-level effects are typical of “effect systems”, which are mainly oriented toward adding imperative features to functional languages.)

Wyvern (described in this paper) checks capabilities statically, but at the level of modules rather than variables.

Wikipedia has a list of other languages that implement object capabilities. Admittedly, that list is mostly research languages, and many of those languages use a dynamic rather than a static form of these capabilities. (For example, the E programming language, from which many of the others derive, uses pattern matching and Smalltalk-style dynamic dispatch to construct facets with restricted capabilities. You can think of E facets as the dynamic equivalent of the static “views” in this proposal.)

Several other languages have built-in keywords that resemble the various built-in capabilities in this proposal, but do not allow for user-defined capabilities. (For example, consider mut in Rust or iso in Pony.)


The Nil capability described above does closely resemble Hermes typestates (thanks, @jba!).

For those with ACM or IEEE library access, there are a few shorter Hermes papers available:
https://dl.acm.org/citation.cfm?id=142148
https://ieeexplore.ieee.org/document/138054/

@bcmills
Contributor Author

bcmills commented May 10, 2018

These attributes seem almost like the reverse of that: rather than granting certain rights, they remove them. "Capability" may not be the best name here.

There are two related concepts here: “capabilities” and “views”. A “capability” grants a (positive) permission for some method, field, or variable; a “view” restricts a variable (generally a function argument) to a particular subset of its capabilities.

(I'm not attached to the naming, but I couldn't think of anything more appropriate. Please do suggest alternatives!)

@bcmills
Contributor Author

bcmills commented May 10, 2018

Are there meanings for capabilities other than those implemented by the compiler?

Yes: since methods and fields can be restricted to capabilities, they can have any meaning that can be expressed as a method set. That is both a strength and a weakness of this proposal: a strength in that it allows for user-defined behavior, but a weakness in that it partially overlaps with interfaces. (Views are not interfaces, however: methods removed by a view cannot be restored by a type assertion.)

For example, see the #Client and #Server capabilities in http.Request in the proposal.

@bcmills
Contributor Author

bcmills commented May 10, 2018

@jaekwon The #&Getter view (a VarView) functions similarly to the const keyword from #23161:

// Read-only reference to an unrestricted slice.
var myArray #&Getter []byte = make([]byte, 10)

// Read-only struct.
var myStruct #&Getter = MyStruct{...}

// Read-only pointer to a mutable struct.
var myStructP #&Getter = &MyStruct{...}

// Read-only interface variable.
var myStruct #&Getter MyInterface = MyStruct{...}
var myStructP #&Getter MyInterface = &MyStruct{...}

// Read-only function variable
var myFunc #&Getter = func(){...}

@bcmills
Contributor Author

bcmills commented May 10, 2018

(@jba, just so you know, I'm not ignoring your questions about Trim and Clone: they're subtle and interesting points that require more thought.)

@bcmills
Contributor Author

bcmills commented May 12, 2018

On further consideration, I've decided to withdraw this proposal.

It was motivated in part by Ian's question in #22876 (comment):

Why put one promise into the language but not the other?

However, in order to make the result general I've had to sacrifice both conciseness and uniformity: each built-in capability and shorthand syntax would make the language less orthogonal, and without them the proposal is both too weak and too verbose: it addresses a broad set of problems but none cleanly. That's the opposite of the Go philosophy.

I do think it's an interesting framework for thinking about other constraints we may want to add, but it's not a good fit itself.

@bcmills bcmills closed this as completed May 12, 2018
@zhongguo168a

zhongguo168a commented May 6, 2019

type T struct { … }

#Reader           
func (*T)  Read(p []byte) (int, error) { … }
#Seeker          #Atomic
func (*T) Seek(offset int64, whence int) (int64, error) { … }
#            #            #Atomic
func (*T) Write(p []byte) (int, error) { … }

#ReadSeeker #{Reader,Seeker}
type *T     =   *T


8 participants