Natural value equality #3213
Comments
This is legal method/function syntax :-) |
@CyrusNajmabadi good point, though semantics might save us (it doesn't return a value). 😄 |
If |
To be honest, even though I currently do not have a better suggestion, I personally dislike the value keyword in front of the class. It seems strange to me, since in C++/CLI this suggests a value type class, although people might understand very quickly or painfully that this is not the case. Aren't there also discussions about stack type classes? If so, I would not spend the value keyword here and would keep it for that case. |
Note: I'd prefer the discussion not devolve just into a syntax debate. But that said, I think 'naked value' will be confusing to people since the ecosystem discusses 'value types' all the time to refer to 'structs', which could now get very ambiguous. 'value class' is not ambiguous, and (to me at least) exactly matches how people would describe it, i.e. "it's a value class". Big fan of the rest though! |
I made a few additions to clarify
|
Is there any reason not to have |
I like this approach. I think it solves the majority of use cases for which people want records without actually requiring records. I think opting members in/out is also worthwhile; I agree an attribute would be the cleanest way to do this. The LDT should make a decision about the IEquatable/== debate already and remove the ambiguity in these proposals- that's been an outstanding question for years that always muddies these debates. |
It would be very useful to have a way to autogenerate equality for structs so that the implementation is more efficient than the current reflection based one. |
value class A
{
public float X;
public float Y;
}
class B : A
{
public float Z;
[Key(false)] public float W;
}
class C : A
{
public float Z;
[Key(false)] public float K;
}
new B(0,0,0,0) == new C(0,0,0,0); // true or false? |
Will it be a breaking change if equality on structs is generated at compile time? |
@orthoxerox |
They do not have automatic value struct Point(public int x, public int y); and know that I'm getting |
If you start thinking about allowing the type of equality to be customised per-field, then a fairly natural way forward is something like: class Foo
{
public string Name
{
get;
init;
equals => StringComparer.OrdinalIgnoreCase.Equals(value, other);
hashcode => StringComparer.OrdinalIgnoreCase.GetHashCode(value);
}
} Or perhaps the cleaner but slightly more restrictive: class Foo
{
public string Name
{
get;
init;
equality => StringComparer.OrdinalIgnoreCase;
}
} (which would tie into the other proposals around property getters/setters like #133. Substitute Once you have this, then the following feels like a natural way to opt a property into value equality using class Foo
{
public string Name { get; init; equals; }
} (or This also gives you a route forwards to allowing the type to implement comparisons (and all of the pesky |
Interviewing people in 2023 should be fun: "What is the difference between value types, reference types and value classes?", "What is the difference between type classes, classes and types?". |
@MadsTorgersen int n = 0; // value
int? n = null; // nullable value
struct Point { … }
Point p = new Point(); // struct
Point? p = null; // nullable struct Can you explain what benefits value classes bring over nullable structs? |
@0x000000EF That's probably out of scope for this issue. Hop onto Gitter -- you'll get some good opinions there. |
Just playing devil's advocate here, but has there been any call for structural equality on types that are not records (or DUs)? I know that records would end up being built off this feature, but I can't think of any time before when I've wanted structural equality in a type that is not also a record. It's not a strongly-held opinion though, so I'd be happy to hear that even the Roslyn codebase would make use of this. Also @MadsTorgersen, in your explanation of why fields might need to opt out of structural equality you give "transient or cached information" as the example: but surely these sorts of fields will almost always be private? Isn't it enough to say that only public fields and properties participate in equality? That also solves the issue around primary constructors and whether or not the value is captured into a field. As for the syntax, I also think Rust: #[derive(Eq)]
struct Point {
x: i32,
y: i32,
} Haskell data Point = Point { x:: Int
, y:: Int
} deriving (Eq) The advantage of this approach is how easily it can be applied to more functionality in the future, including pretty much everything in records. In C# the only way I can think of this being done is with attributes and strings, since C# can't really express the concepts of a type with structural equality or a type that can clone itself: [Derive("Equatable", "Cloneable", "Deconstruct")]
public class Point
{
public int X { get; set; }
public int Y { get; set; }
} |
I'm a bit unsure about structural equality and normal equality. In the end IStructuralEquatable exists, but I don't know how often it is used. If we could turn back time, both would/should be implemented: IStructuralEquatable for just comparing values/references, and Equals for custom implementations, or the other way around? For records I have no chance to override the equality behavior, but for structs I do, and likewise for classes. So I can create weird behaviors where it's maybe not quite clear what's happening inside, since a struct falls back to predefined structural equality, but when Equals is overridden it leads to something different. But honestly I'm not sure how to trigger one (IEquatable or Equals) or the other (IStructuralComparable): == or Equals(object) or Equals(other, comparer). What happens to ReferenceEquals? Although I like the idea of a common behavior, with all of the negative impacts I would rather not make records, structs and classes equivalent in their equality behavior. I would simply use the IEquatable or IStructuralEquatable interfaces and provide an easy way (e.g. StructuralEquality.Equals(object1, object2)) that does the work, so that when I have to override one or both of these two methods I don't have to do the typing myself. Nevertheless I'm still uncertain what a different behavior in equality means. |
To be clear, in this proposal the following record class Point(int X, int Y)
{
private double? cachedLength;
public double Length => cachedLength ??= Math.Sqrt(X * X + Y * Y);
} the generated implementation of equality would include Options for dealing with this were discussed in the LDM today.
The alternative to this approach is to have equality defined by the set of primary members (identified in the type declaration's header). |
You missed two options in that discussion:
Since there is no copy-by-value for class-based value types, the "value" of objects needs to be immutable to avoid issues such as using them as keys in a dictionary. We do not (yet) have support for immutable fields in C#, so read-only seems the next best thing. |
IMO, by default I'd only include the primary members, and allow But most importantly there should be an escape hatch if the developer wants to define their own equality. |
@DavidArno Those options were not on the table. The point of this proposal is that it treats mutable and immutable the same, and the same as their default treatment in a |
I think the analogy with the default equals in a struct breaks down. When you use a struct as a key in a dictionary, there is not much risk of it mutating. If a caller gets a key out of the dictionary and calls a mutating method, only the copy is mutated, not the copy used as a key in the dictionary. On the other hand, if we take the same approach (including all fields, even private mutable data) for classes, it is much more dangerous, as the caller can easily mutate the same data that is held in the dictionary thus invalidating the dictionary’s invariants. @HaloFour I agree. As an escape hatch, the current plan is to permit the programmer to write the Equals and GetHashCode methods. |
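A minimal sketch of the hazard described in the comment above, assuming a hand-written field-based Equals/GetHashCode on a mutable class. `MutableKey`, `StructKey`, and `DictionaryDemo` are hypothetical names used only for illustration; nothing here is the proposed generated code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical mutable class whose equality is based on its field.
public sealed class MutableKey
{
    public int Id;
    public MutableKey(int id) { Id = id; }
    public override bool Equals(object obj) => obj is MutableKey k && Id == k.Id;
    public override int GetHashCode() => Id;
}

// Struct with the same shape; it relies on the default struct equality.
public struct StructKey
{
    public int Id;
    public StructKey(int id) { Id = id; }
}

public static class DictionaryDemo
{
    public static bool ClassKeyStillFound()
    {
        var key = new MutableKey(1);
        var map = new Dictionary<MutableKey, string> { [key] = "one" };
        key.Id = 2; // mutates the very object the dictionary holds as its key
        return map.ContainsKey(key) || map.ContainsKey(new MutableKey(1));
    }

    public static bool StructKeyStillFound()
    {
        var key = new StructKey(1);
        var map = new Dictionary<StructKey, string> { [key] = "one" };
        key.Id = 2; // mutates only the caller's copy, not the stored key
        return map.ContainsKey(new StructKey(1));
    }

    public static void Main()
    {
        Console.WriteLine(ClassKeyStillFound());  // False
        Console.WriteLine(StructKeyStillFound()); // True
    }
}
```

With the class, the entry becomes unreachable under both the old and the new value: the stored hash code no longer matches the mutated key, and the stored key no longer equals a fresh `MutableKey(1)`.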
Maybe this feature could apply equality automatically only to public members, and use an attribute to opt in private members or opt out public members. |
As you admit yourself, this central tenet is flawed, because with structs, "only the copy is mutated". This is not the case with classes that use value-based equality. The team has a simple solution to this: value-based equality in records will only use read-only fields. This solves all of these problems in one go. Or the team can dig themselves into a hole with complex rules, escape hatches etc. And this approach isn't restrictive as the developer is still free to override the default equality code with their own if they have special-case requirements of value-equality with non read-only data. |
This would still run into problems. With the following, we have mutable primary members and so the same problems of mutating keys would exist: record class Point(public int X { get; set; }, public int Y { get; set; });
That's probably the compromise here. Only read-only (primary member?) fields and those marked with |
I just wanted to leave my 2cents: Also - if the attributes were left on the class, that would also give some really neat things one could do with reflection too... |
I think the equality should be over all fields (private/public, mutable/immutable). I think the primary argument for only using readonly fields was the possibility of using the type as a key in a dictionary. Having a mutable object as a key (that overrides Equals) is always a problem for a dictionary. I don't think this feature should try to solve that. Using all fields seems simpler. I also would prefer an attribute over a keyword, as long as it is discoverable and not in a deep namespace like 'System.Runtime.CompilerServices'. What should be achieved feels like telling a source code generator to generate the Equals code. Attributes also have the benefit of parameters. This could be used to control which fields are used for equality. Now a user discovers which equality is used while using the feature. public sealed class FieldEqualityAttribute(EqualityType type = EqualityType.AllFields){}
public enum EqualityType
{
None = 0,
AllFields = ~0,
ReadonlyFields = 1,
PublicFields = 2,
PrivateFields = 4,
} Generally I love this proposal. One of the first things I do when writing a class is generating hash, Equals, == and IEquatable. And when I change something I delete the generated code and generate it anew. I also don't understand why == does not call Equals by default, but I would like this feature to override that as well. Maybe configurable via attribute ;) |
I would very much appreciate it if default behaviour was exactly as specified by Mads Torgersen above, but there was an ability to add these two memberwise-equality-related attributes to any field or property of the class: // excludes the property or field from Equals and GetHashCode // specifies the IEqualityComparer to be used for both equality and hash coding for this particular property or field. The most common occurrence of this need for myself is the presence of collections (often immutable collections) in a class that requires memberwise equality checking and hashing including comparison and hashing of each item in the collection. I would like to write a custom IEqualityComparer such as this one: // The user has created a custom comparer that he/she wants used for comparing a certain field
public class EnumerableComparer<TItem> : IEqualityComparer<IEnumerable<TItem>> {
public bool Equals(IEnumerable<TItem> x, IEnumerable<TItem> y) {
if (ReferenceEquals(x, y)) return true;
if (x is null || y is null) return false;
// dispose both enumerators when the comparison finishes
using var enumX = x.GetEnumerator();
using var enumY = y.GetEnumerator();
while (enumX.MoveNext()) {
if (!enumY.MoveNext()) return false;
if (!EqualityComparer<TItem>.Default.Equals(enumX.Current, enumY.Current)) return false;
}
return !enumY.MoveNext();
}
public int GetHashCode(IEnumerable<TItem> obj) {
if (null == obj) return 0;
var hash = obj.GetType().GetHashCode();
unchecked {
foreach (var item in obj) {
// excuse me for the lazy hash example :)
hash = hash * 123 + EqualityComparer<TItem>.Default.GetHashCode(item);
}
}
return hash;
}
} and then use it like this: [MemberwiseEquatable]
public sealed partial class SomeObject {
public string SomeStandardProperty { get; init; }
[EqualityComparer(typeof(EnumerableComparer<int>))]
public ImmutableList<int> Values { get; init; }
} PS: "Memberwise Equality" is pretty wordy, but it says what we're doing here, so something in that vein would secure my vote on the matter of naming the feature. Thank you all for the tremendous work done here. |
The biggest reason why I support auto-implemented equality for the structs is that struct equality is provided by base classes such as ValueType and Enum. Implementations provided by those classes cause enums to be boxed. So, even EqualityComparer.Default.Equals might cause both the |
@wesnerm There's a special case for enums which avoids boxing, but yes, structs will get boxed unless they implement |
Closing as the C# 9 records feature is now tracked by #39 |
Natural value equality in C#
As part of the focus on records, we've recently decided to add support for value equality on classes as a separate feature in C#. While we have a pretty good understanding of the mechanisms by which to implement it (including how to handle inheritance), we are less settled on how to enable it syntactically, and how it connects between record and non-record classes.
This is an approach to that. The key realization is:
C# already has value equality on structs: it calls Object.Equals pairwise on the fields of the two struct values.
The proposal is that value equality is always opted in at the type declaration level, with no explicit opt-in at the member level. If you choose value equality on a class declaration, you simply get the same field-based equality that C# already supports on structs.
The idea is to be in line with the other recent value equality proposals in terms of what gets generated (including how inheritance is handled). They also recursively call Object.Equals on relevant members, just like equality on structs today. The only thing different here is how members are selected to participate in equality.
I call it "natural value equality" because it's based on the built-in notion of the state - or value - of an object: the aggregate contents of its fields.
There are several benefits to this approach:
The proposal
For a given struct declaration today:
Value-based Equals and GetHashCode are provided based on the values of the fields of the struct - in this case, the underlying fields of the auto-properties X and Y.
The proposal is that you can mark a class declaration for value equality and get the same behavior:
The use of a value modifier is just a strawman - see a discussion of alternatives later. The important point is that the members are unmarked for participation in value equality - all fields are included by default.
Natural value equality is incredibly simple to explain, and causes great regularity in the language. All value equality in C# is the same! It puts no restrictions on which classes can have value equality, and makes no additional requirements on the ones that choose to do so.
In the vast majority of cases, the "value" of an object corresponds exactly to its actual state. That is even more true the more "data-like" the class is. The rare situations where this is not the case should be the ones that bear any additional cost, whether in the form of manually implementing equality, or via some member-wise opt-in/out mechanism, discussed further below.
(A note on implementation: Today's default value equality on structs relies on a very inefficient reflection-based implementation provided by the runtime. Independently of our other choices we should compiler-generate an implementation instead.)
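As a rough sketch of the behavior described in this section: the struct below gets field-based equality from the runtime today, and the class shows a hand-written approximation of what a compiler-generated implementation for a value class might look like. `PointStruct`, `PointClass`, and `EqualityDemo` are illustrative names, and the proposed `value` modifier itself is not used because it is only a strawman. The class is sealed to sidestep the inheritance handling, which this sketch does not attempt.

```csharp
using System;

// Today: a struct inherits field-based Equals/GetHashCode from System.ValueType.
public struct PointStruct
{
    public int X { get; }
    public int Y { get; }
    public PointStruct(int x, int y) { X = x; Y = y; }
}

// Assumption: a hand-written stand-in for what marking a class for
// value equality might generate, comparing the backing fields pairwise.
public sealed class PointClass
{
    public int X { get; }
    public int Y { get; }
    public PointClass(int x, int y) { X = x; Y = y; }

    public override bool Equals(object obj) =>
        obj is PointClass other && X == other.X && Y == other.Y;

    public override int GetHashCode() => HashCode.Combine(X, Y);
}

public static class EqualityDemo
{
    public static void Main()
    {
        Console.WriteLine(new PointStruct(1, 2).Equals(new PointStruct(1, 2))); // True
        Console.WriteLine(new PointClass(1, 2).Equals(new PointClass(1, 2)));   // True
        Console.WriteLine(new PointClass(1, 2).Equals(new PointClass(1, 3)));   // False
    }
}
```

Without the overrides, the two `PointClass` instances would compare by reference and the second line would print False.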
Interaction with other record-related proposals
Natural value equality coexists peacefully with our other record-related proposals, and doesn't interact much with them - a clear advantage in and of itself.
Value equality and primary constructors
Primary constructors as currently proposed can cause constructor parameters to be "captured" into fields by the class and become part of its state. This happens when the parameter is used in member bodies, beyond initialization.
In structs, such fields would naturally become part of the equality calculation, and this proposal would make the same true for value classes:
It's maybe a bit subtle that whether a constructor parameter is part of equality depends on whether it's captured. I don't think that's specific to equality, though, but an aspect of the larger subtlety of the capture happening at all. Whether to embrace this subtlety is one of the trade-offs we'll be making as we decide whether to include primary constructors in C#, and whether to have capture be part of that feature.
Value equality and records
It is our current plan that record declarations automatically imply value equality, but even if we should change our mind on that, we would want value equality to be available on records. Let's see how the proposed natural value equality interacts with records.
Records as currently envisioned provide two types of benefits:
Other proposals give these specific abbreviated member forms special meaning with respect to value equality. With natural value equality, though, they are truly just abbreviations: The value equality works the same way regardless of whether they are abbreviated or fully declared.
Given some orthogonal and inessential assumptions about how record members translate, the above declarations are equivalent to these expanded ones:
In both the abbreviated and expanded forms, the members are auto-properties with generated backing fields, and those backing fields are what value equality builds on. Thus, equality does not rely on the shape of a record, but only its data. The abbreviated member forms have no impact on equality.
Further design considerations
A few open issues were pointed out above.
Exact syntax
I used value class as a strawman above. I think value sends the right signal for natural value equality, as it is imbuing the class with "value semantics" that are the same as what value types have. However, there are also downsides to calling these "value classes" when "value types" is already a term that applies only to structs.
We've also looked at key as a modifier, though it works better as a modifier on members than on classes.
One consequence of the proposal is that the modifier will never be used on structs - they already have natural value equality. Will it ever be used on other kinds of type declarations? Natural value equality doesn't make much sense on interfaces, which don't have fields. And delegates already have their own equality.
So the modifier is only ever going to modify a class declaration. That means it's an option to make class optional (default), or even to just use a completely different keyword than class! Again using value as a strawman:
Then again, removing the class keyword from the declaration may be more confusing than helpful, even if it is a bit shorter.
Opt-out and opt-in
While I am convinced that using every field for value equality is the right default for the vast majority of scenarios, there probably remain cases where that needs to be tweaked. The ultimate fallback of course is to require value equality to be implemented manually in those scenarios, but that seems harsh.
The most common example I hear is where a field represents transient or cached information, whose inclusion in equality would be redundant or wrong. We could provide an opt-out keyword, e.g. transient:
The scenarios for opt-in are likely even more rare, but they do exist: Maybe some of the object's logical state is e.g. stored offline in a table and looked up on demand, or it is to be lazily extracted from a string of serialized wire data. Honestly, these may be rare enough that it's ok to fall off the cliff. But if we do want to provide an opt-in for off-line data, we could allow it by annotating a computed property for opt-in somehow:
We should consider whether these scenarios are rare enough that attributes are better than modifiers, even as much as we dislike attributes having semantic consequences. That may be especially true if we want to provide for both opt-in and opt-out, since an attribute could have a true/false parameter:
It's worth noting that an opt-in/out feature can be added later if we find that it is necessary. Unlike many other approaches, the opt-in/out syntax is not an essential part of natural value equality.
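To make the opt-out scenario concrete, here is a sketch of the manual fallback: a class with a cached field whose hand-written equality deliberately skips it. This is roughly what an opt-out marker such as the strawman transient would need to produce. `Vector` and `TransientDemo` are hypothetical names, and the code is an assumption about the intent, not the proposed generated output.

```csharp
using System;

public sealed class Vector
{
    public double X { get; }
    public double Y { get; }

    // Cached, derived state: the field an opt-out would exclude from equality.
    private double? cachedLength;

    public Vector(double x, double y) { X = x; Y = y; }

    public double Length => cachedLength ??= Math.Sqrt(X * X + Y * Y);

    // Equality and hashing use X and Y only, never cachedLength.
    public override bool Equals(object obj) =>
        obj is Vector other && X.Equals(other.X) && Y.Equals(other.Y);

    public override int GetHashCode() => HashCode.Combine(X, Y);
}

public static class TransientDemo
{
    public static bool EqualAfterCaching()
    {
        var a = new Vector(3, 4);
        var b = new Vector(3, 4);
        _ = a.Length; // fills the cache on a only; b's cache stays null
        return a.Equals(b);
    }

    public static void Main()
    {
        Console.WriteLine(EqualAfterCaching()); // True
    }
}
```

If cachedLength took part in equality, `a` and `b` would stop being equal as soon as one of them had its Length read, even though their logical values are identical.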
Other aspects of equality
There are at least two other ways for a type to "implement equality":
- IEquatable<T>
- == and != operators
It's an open design question whether automatic value equality (of any of the proposals) should do either or both. If we decide to do so, I think we should also retrofit that on structs. I don't believe it would be (much of) a breaking change, since we'd only do it if the struct declaration didn't manually do so. For IEquatable<T> there may be a bit of an overload resolution problem where overloads accepting IEquatable<T> would suddenly become candidates on struct input, which can lead to breaks. So there's something to consider there.
Conclusion
The existence of a separable notion of value equality in existing C# imposes a burden of proof for us to add a different notion of value equality. As it happens, the existing notion seems to also be optimal on its own merits: It has very desirable defaults, is trivial to explain, and is highly orthogonal to the other record-related features we are contemplating.
"When a type has value equality, equality is computed based on the contents of its fields." That's the whole feature explained!
LDM notes: