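The mathematical properties of equality:
- Reflexive: any object is equal to itself (a == a).
- Symmetric: if a == b, then b == a.
- Transitive: if a == b and b == c, then a == c.
Semantics of equality:
- Reference equality: two references refer to the same object.
- Value equality: two objects have the same type and the same contents.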
We have four functions that determine equality in C#:
    public static bool ReferenceEquals(object left, object right);
    public static bool Equals(object left, object right);
    public virtual bool Equals(object right);
    public static bool operator ==(MyClass left, MyClass right);
Never redefine the following static methods:
- ReferenceEquals(object left, object right)
- Equals(object left, object right)
Implement IEquatable<T> when you override Equals() in your type.
When are two values considered equal?
The Object.ReferenceEquals method returns true if two references refer to the same object, i.e., if they have the same object identity. Whether the arguments are reference types or value types, ReferenceEquals always tests object identity, not object contents. As a result, ReferenceEquals always returns false when comparing value types, even when you compare a value type to itself, because each argument is boxed into a separate object.
    var a = 2;
    var b = 2;
    var r1 = Object.ReferenceEquals(a, b); // false
    var r2 = Object.ReferenceEquals(a, a); // false
The static Object.Equals method determines whether two objects are equal when you don't know the runtime type of the two arguments. After handling identical and null references, the static Equals method delegates to the instance Equals method of the left argument to determine whether the two objects are equal.
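The delegation described above can be sketched as follows. This is only an illustration of the documented behaviour, not the framework's source; the names EqualityDemo and StaticEqualsSketch are hypothetical.

```csharp
using System;

public static class EqualityDemo
{
    // Hypothetical helper that mirrors the documented behaviour of the
    // static Object.Equals(object, object). It is a sketch, not framework code.
    public static bool StaticEqualsSketch(object left, object right)
    {
        // Same instance, or both references null: equal.
        if (object.ReferenceEquals(left, right))
            return true;

        // Exactly one side is null: not equal.
        if (left is null || right is null)
            return false;

        // Otherwise defer to the instance Equals of the left argument.
        return left.Equals(right);
    }
}
```

Because the final decision is made by the left argument's instance Equals, overriding that instance method is enough to change what the static method reports.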
Override the instance version of Equals when the default behaviour is inconsistent with your type. The default implementation of Equals for reference types behaves exactly as Object.ReferenceEquals: it uses object identity to determine whether two references are equal.
On the other hand, System.ValueType overrides the instance version of Equals so that two values are considered equal if they have the same type and the same content. On the downside, this implementation can be inefficient: because it has to work for any derived type, in the general case it uses reflection to compare all the member fields.
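Similarly, pay attention to operator== for value types. C# does not supply operator== for user-defined structs, so comparing two of your values with == does not compile until you define it. When you create a new value type, define operator== (together with operator!=) and have it call your type-safe Equals. Note that you don't need to redefine operator== every time you override Equals, only when you create a new value type.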
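A minimal sketch of a value type that follows this advice is shown below. The Size struct and its members are hypothetical names chosen for illustration; the hash code simply XORs the two fields.

```csharp
using System;

// Hypothetical value type used only for illustration.
public readonly struct Size : IEquatable<Size>
{
    public Size(int width, int height)
    {
        Width = width;
        Height = height;
    }

    public int Width { get; }
    public int Height { get; }

    // Type-safe comparison: no boxing and no reflection.
    public bool Equals(Size other) => Width == other.Width && Height == other.Height;

    // Replaces the System.ValueType implementation and forwards to the typed version.
    public override bool Equals(object obj) => obj is Size other && Equals(other);

    // Equal values must produce equal hash codes.
    public override int GetHashCode() => Width ^ Height;

    // Operators forward to the type-safe Equals.
    public static bool operator ==(Size left, Size right) => left.Equals(right);
    public static bool operator !=(Size left, Size right) => !left.Equals(right);
}
```

With these members in place, new Size(2, 3) == new Size(2, 3) evaluates to true without boxing either operand.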
For reference types, you rarely need to override operator==, because the default implementation follows reference semantics, which is what .NET classes expect.
- For value types, override Equals and the operator== for efficiency.
- For reference types, override Equals when you want to use value semantics instead of reference semantics.
- When you override Equals you should also override GetHashCode.
For reference types, whenever you override the instance version of Equals, implement IEquatable<T>. It tells the compiler that your type supports a type-safe equality comparison.
    public class A : IEquatable<A>
    {
        public override bool Equals(object right)
        {
            // Check if the 'right' reference is null. No need to check the 'this'
            // reference because it is never null.
            if (Object.ReferenceEquals(right, null))
                return false;

            // Check if the two references are the same by testing object identity.
            // Equal object identity guarantees equal contents.
            if (Object.ReferenceEquals(this, right))
                return true;

            // Call GetType because the type may be derived from A.
            if (this.GetType() != right.GetType())
                return false;

            // Compare the type's contents. We can use this.Equals here because we
            // know that 'this' and 'right' are of the same type.
            return this.Equals(right as A);
        }

        // IEquatable<T> implementation.
        public bool Equals(A other)
        {
            // ...
            return true;
        }
    }
Never throw an exception in Equals. Return false for failures, such as null references or wrong argument types.
Implement the IStructuralEquatable interface in types that implement value semantics. In .NET it is implemented by System.Array and the Tuple<> classes. You will rarely want to implement IStructuralEquatable in your own types. It is needed only for lightweight types.
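As an illustration of what the interface provides, the sketch below compares two arrays by content through the IStructuralEquatable implementation on System.Array. The StructuralDemo class and its method name are hypothetical; the helper assumes that the left array is not null.

```csharp
using System;
using System.Collections;

public static class StructuralDemo
{
    // Hypothetical helper: compares two arrays element by element.
    // Assumes 'left' is not null.
    public static bool ArraysEqualByContent(int[] left, int[] right)
    {
        // Array implements IStructuralEquatable explicitly, so the members are
        // reachable only through the interface.
        IStructuralEquatable structural = left;

        // The supplied comparer decides how individual elements are compared.
        return structural.Equals(right, StructuralComparisons.StructuralEqualityComparer);
    }
}
```

By contrast, calling the instance Equals on an array directly falls back to reference semantics, so two distinct arrays with the same elements are not equal.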
The GetHashCode method is used to define the hash value for keys in a hash-based collection such as the HashSet<T> or Dictionary<K,V> containers.
For reference types, the base class implementation of GetHashCode is correct but inefficient. For value types, the base class implementation is often incorrect. For types that are never used as the key in a container, GetHashCode does not matter and you can rely on the base class implementation.
If you need to override the base class implementation of GetHashCode, follow these rules:
- If two objects are equal (as defined by the Equals method), they must generate the same hash value.
- GetHashCode must be an instance invariant, i.e., it must always return the same value for a given object. It follows that after an object is created, its hash code never changes.
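The second rule is easiest to appreciate by breaking it. The sketch below uses a hypothetical MutableKey class whose hash code depends on a settable property; once the property changes, a HashSet<T> can no longer find the object it contains.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical key type that violates the instance-invariant rule on purpose.
public sealed class MutableKey
{
    public int Id { get; set; }

    public override bool Equals(object obj) => obj is MutableKey other && Id == other.Id;

    // The hash code follows a mutable property, so it can change over time.
    public override int GetHashCode() => Id;
}

public static class HashInvariantDemo
{
    // Returns whether the set still finds the key after the key was mutated.
    public static bool LookupAfterMutation()
    {
        var key = new MutableKey { Id = 1 };
        var set = new HashSet<MutableKey> { key };

        // The set recorded the hash code computed at insertion time (1).
        key.Id = 2;

        // The lookup now searches with hash code 2 and misses the entry.
        return set.Contains(key);
    }
}
```

LookupAfterMutation returns false even though the set still holds a reference to the very same object, which is why fields that feed GetHashCode should be immutable.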
System.ValueType overrides GetHashCode and returns the hash code from the first field defined in the type. It follows that the first field in the struct has to be immutable. Otherwise, if the value of the first field changes, the hash code also changes, which breaks the rule that GetHashCode must be an instance invariant.
Default implementations of GetHashCode:
- System.Object uses the object identity, which does not change during the object's lifetime.
- System.ValueType uses the first field in the struct. If you plan to use your value type as a key in a hash container, make your type immutable.
One technique to enforce field immutability is to introduce an explicit ChangeValue method (ChangeName in the example) that returns a new instance instead of modifying the existing object:
    public class Company
    {
        public Company(string name) => this.Name = name;

        public string Name { get; }

        public override int GetHashCode() => Name.GetHashCode();

        // The ChangeName method allows us to make the Name field immutable.
        public Company ChangeName(string newName) => new Company(newName);
    }

    ...

    // Create a company and add employees.
    var c1 = new Company("C1");
    dictionary.Add(c1, employees);

    // Change the company's name.
    var c2 = c1.ChangeName("C2");

    // Move employees to the company with the new name and remove the entry
    // for the company with the old name.
    var empl = dictionary[c1];
    dictionary.Remove(c1);
    dictionary.Add(c2, empl);
A commonly used algorithm to generate a uniform distribution of keys is to XOR the return values of GetHashCode for all immutable fields in a type. Keep in mind that if the fields' values are related, this algorithm will cluster hash codes. As a result, your container will have few buckets, each with many items.
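The XOR technique and its clustering weakness can be sketched with a small immutable struct. GridCell and its members are hypothetical names; the example relies on the fact that int.GetHashCode returns the integer itself.

```csharp
using System;

// Hypothetical immutable value type with two related integer fields.
public readonly struct GridCell : IEquatable<GridCell>
{
    public GridCell(int row, int column)
    {
        Row = row;
        Column = column;
    }

    public int Row { get; }
    public int Column { get; }

    public bool Equals(GridCell other) => Row == other.Row && Column == other.Column;

    public override bool Equals(object obj) => obj is GridCell other && Equals(other);

    // XOR of the hash codes of all immutable fields.
    public override int GetHashCode() => Row.GetHashCode() ^ Column.GetHashCode();
}
```

Here the cells (1, 2) and (2, 1) are different values but share the hash code 3, and every cell on the diagonal, such as (5, 5), hashes to 0. When fields are correlated like this, many keys land in the same bucket, which is the clustering the paragraph above warns about.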