Equality in C#

The mathematical properties of equality:

  • reflexive - any object is equal to itself
  • symmetric - the order of comparison does not matter:
    • if a == b is true, then b == a is also true
    • if a == b is false, then b == a is also false
  • transitive - states that:
    • if a == b and b == c are both true, then a == c must also be true

Semantics of equality:

  • value semantics - comparing contents
  • reference semantics - comparing object identity

We have four functions that determine equality in C#:

public static bool ReferenceEquals(object left, object right);
public static bool Equals(object left, object right);
public virtual bool Equals(object right);
public static bool operator ==(MyClass left, MyClass right);
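To see the four functions side by side, here is a minimal sketch that applies each of them to two distinct string objects with the same contents. The class name FourChecksDemo and the method Compare are made up for this illustration. String is used because it is a reference type that overrides Equals and defines operator== with value semantics.

```csharp
using System;

// Hypothetical demo class; not part of the original notes.
public static class FourChecksDemo
{
    // Returns the result of each of the four checks for two distinct
    // string objects that hold the same characters.
    public static (bool Reference, bool StaticEquals, bool InstanceEquals, bool Operator) Compare()
    {
        string a = new string('x', 3);
        string b = new string('x', 3);

        return (
            Object.ReferenceEquals(a, b), // false: two separate objects
            Object.Equals(a, b),          // true: forwards to a.Equals(b)
            a.Equals(b),                  // true: String compares contents
            a == b);                      // true: String defines operator==
    }
}
```

Only ReferenceEquals reports a difference here, because it is the only one of the four that always uses reference semantics.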

Never redefine the following static methods:

  • static bool ReferenceEquals(object left, object right)
  • static bool Equals(object left, object right)

Implement IEquatable<T> when you override Equals() in your type.

When are two values considered equal?

  • By default, two values of a reference type are equal if they refer to the same object; this is called object identity.
  • By default, two values of a value type are equal if they have the same type and the same contents.

The Object.ReferenceEquals method returns true if two references refer to the same object, i.e., if they have the same object identity. Whether the types are reference types or value types, ReferenceEquals always tests object identity, not object contents. As a consequence, ReferenceEquals always returns false when comparing value types, even when you compare a value to itself, because each argument is boxed into a separate object.

var a = 2;
var b = 2;
 
var r1 = Object.ReferenceEquals(a, b); // false
var r2 = Object.ReferenceEquals(a, a); // false

The Object.Equals method determines whether two objects are equal when you don't know the runtime type of the two arguments. The static Equals method uses the instance Equals method of the left argument to determine whether two objects are equal.
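The documented behaviour of the static Equals method can be approximated as follows. This is a sketch, not the framework's source code; the names StaticEqualsSketch and AreEqual are hypothetical.

```csharp
// Hypothetical helper that mirrors the documented behaviour of
// Object.Equals(object, object); it is not the framework source.
public static class StaticEqualsSketch
{
    public static bool AreEqual(object left, object right)
    {
        // Same object, or both null.
        if (object.ReferenceEquals(left, right))
            return true;

        // Exactly one side is null.
        if (left is null || right is null)
            return false;

        // Delegate to the instance Equals of the left argument.
        return left.Equals(right);
    }
}
```

Because the null checks happen first, the instance Equals is only ever called on a non-null left argument.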

Override the instance version of Equals when the default behaviour is inconsistent with your type. The default implementation of Equals for reference types behaves exactly as Object.ReferenceEquals - it uses object identity to determine whether two references are equal.

On the other hand, System.ValueType overrides the instance version of Equals: two values of a value type are considered equal if they have the same type and the same contents. On the downside, this implementation can be inefficient because it has to work for any derived type, so it may fall back to reflection to compare all the member fields.
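A small sketch of the default behaviour, using a made-up struct (Distance) that overrides nothing:

```csharp
// Hypothetical struct that does not override Equals or GetHashCode.
public struct Distance
{
    public int Meters;
}

public static class DefaultStructEqualityDemo
{
    public static bool SameContentsAreEqual()
    {
        var a = new Distance { Meters = 5 };
        var b = new Distance { Meters = 5 };

        // Resolves to ValueType.Equals(object): 'b' is boxed and the
        // contents of the two values are compared.
        return a.Equals(b);
    }

    public static bool DifferentContentsAreEqual()
    {
        var a = new Distance { Meters = 5 };
        var b = new Distance { Meters = 6 };
        return a.Equals(b);
    }
}
```

The comparison is correct without any extra code, but every call boxes the argument, which is one reason to supply your own Equals.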

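Unlike Equals, operator== has no default implementation for user-defined value types: a == b does not compile for a struct until you define the operator. When you create a new value type, define operator== (together with operator!=) so that it agrees with Equals. Note that overriding Equals does not by itself oblige you to define operator==; the rule applies when you create a new value type.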

For reference types, you rarely need to define operator==, as the default implementation follows reference semantics, which is what .NET classes expect.

  • For value types, always override the instance Equals and define operator== for efficiency.
  • For reference types, override the instance Equals when you want to use value semantics instead of reference semantics.
  • When you override Equals you should also override GetHashCode.
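The guidelines above for value types can be sketched with a hypothetical Point struct. The type and its hash formula are assumptions made for illustration only.

```csharp
using System;

// Hypothetical value type; not part of the original notes.
public readonly struct Point : IEquatable<Point>
{
    public Point(int x, int y) { X = x; Y = y; }

    public int X { get; }
    public int Y { get; }

    // Type-safe comparison: no boxing and no reflection.
    public bool Equals(Point other) => X == other.X && Y == other.Y;

    // Returns false (never throws) for null or for a different type.
    public override bool Equals(object obj) => obj is Point other && Equals(other);

    // Equal points produce equal hash codes because only X and Y are used.
    public override int GetHashCode() => unchecked((X * 397) ^ Y);

    public static bool operator ==(Point left, Point right) => left.Equals(right);
    public static bool operator !=(Point left, Point right) => !left.Equals(right);
}
```

All four members agree with each other, so the type behaves the same whether it is compared with ==, with Equals, or as a key in a hash-based collection.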

For reference types, whenever you override the instance version of Equals, implement IEquatable<T>. It tells the compiler that your type supports a type-safe equality comparison.

using System;

public class A : IEquatable<A>
{
    public override bool Equals(object right)
    {
        // Check if the 'right' reference is null. No need to check the 'this' reference because 
        // it is never null.
        if (Object.ReferenceEquals(right, null))
            return false;
 
        // Check if the two references are the same by testing object identity. Equal object identity
        // guarantees equal contents.
        if (Object.ReferenceEquals(this, right))
            return true;
 
        // Call GetType because the type may be derived from A.
        if (this.GetType() != right.GetType())
            return false;
 
        // Compare the type's contents. We can use this.Equals here because we know that 'this' and 'right' 
        // are of the same type.
        return this.Equals(right as A);
    }
 
    // IEquatable<T> implementation.
    public bool Equals(A other)
    {
        // ...
        return true;
    }
}

Never throw an exception in Equals. Return false for failures, such as null references or wrong argument types.
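As a sketch of this rule, the hypothetical Person type below returns false for null, for an unrelated type, and for a derived type, without throwing. Person, Employee and the Name property are invented for the example.

```csharp
using System;

// Hypothetical reference type with value semantics.
public class Person : IEquatable<Person>
{
    public Person(string name) { Name = name; }

    public string Name { get; }

    // 'as' yields null for a wrong argument type, so no cast can fail.
    public override bool Equals(object right) => Equals(right as Person);

    public bool Equals(Person other)
    {
        if (other is null)
            return false;                       // null is never equal
        if (ReferenceEquals(this, other))
            return true;                        // same object
        if (GetType() != other.GetType())
            return false;                       // derived types are not equal
        return Name == other.Name;
    }

    public override int GetHashCode() => Name?.GetHashCode() ?? 0;
}

// Hypothetical derived type, used to show the GetType check.
public class Employee : Person
{
    public Employee(string name) : base(name) { }
}
```

Putting the checks in the typed overload means both entry points, Equals(object) and Equals(Person), are protected in the same way.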

The IStructuralEquatable interface lets a type offer value semantics without enforcing them for every comparison. In .NET it is implemented by System.Array and the Tuple<> classes. You will rarely want to implement IStructuralEquatable in your own types; it is needed only for lightweight types.
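A short sketch of how structural comparison is requested for arrays, using System.Collections.IStructuralEquatable and StructuralComparisons from the framework. The wrapper class StructuralDemo is hypothetical.

```csharp
using System.Collections;

// Hypothetical wrapper around the framework's structural comparison.
public static class StructuralDemo
{
    public static bool SameContents(int[] left, int[] right)
    {
        // Arrays implement IStructuralEquatable explicitly, so the
        // interface reference is needed to reach this Equals overload.
        IStructuralEquatable structural = left;
        return structural.Equals(right, StructuralComparisons.StructuralEqualityComparer);
    }
}
```

The plain instance Equals on the same two arrays still compares object identity, which is the sense in which value semantics are available but not enforced.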

GetHashCode

The GetHashCode method is used to define the hash value for keys in a hash-based collection such as the HashSet<T> or Dictionary<K,V> containers.

For reference types, the base class implementation of GetHashCode is inefficient. For value types, the implementation is often incorrect. For types that are never used as a key in a container, GetHashCode does not matter, and you can rely on the base class implementation.

If you need to override the base class implementation of GetHashCode, follow these rules:

  • If two objects are equal (determined by the instance Equals method), they must generate the same hash value.
  • GetHashCode must be an instance invariant i.e., it must always return the same value for a given object.
  • The hash function should generate a uniform distribution among all integers for all typical input sets.

It follows that after an object is created, its hash code never changes.

System.ValueType overrides GetHashCode and, in many cases, returns the hash code of the first field defined in the type (the exact algorithm is a runtime implementation detail). It follows that the first field in the struct has to be immutable. Otherwise, if the value of the first field changes, the hash code also changes, which breaks the rule that GetHashCode is an instance invariant.

Default implementations of GetHashCode:

  • System.Object uses the object identity which does not change during the object's lifetime.
  • System.ValueType typically uses the first field in the struct.

If you plan to use your value type as a key in a hash container, make your type immutable.
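The sketch below shows what goes wrong when a key is mutable. MutableKey is a deliberately bad, hypothetical type whose hash code depends on a settable property.

```csharp
using System.Collections.Generic;

// Hypothetical key type that violates the instance-invariant rule.
public class MutableKey
{
    public int Id { get; set; }

    public override bool Equals(object obj) => obj is MutableKey other && Id == other.Id;

    public override int GetHashCode() => Id;
}

public static class MutableKeyDemo
{
    public static bool FoundAfterMutation()
    {
        var key = new MutableKey { Id = 1 };
        var set = new HashSet<MutableKey> { key };

        // The hash code changes while the key is stored in the set.
        key.Id = 2;

        // The set recorded the key under its old hash code, so the
        // lookup with the new hash code misses the stored entry.
        return set.Contains(key);
    }
}
```

The object is still in the set, but it can no longer be found through the set's own lookup.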

One technique to enforce field immutability is to introduce an explicit change method, such as ChangeName in the example, that returns a new instance of the object instead of modifying the existing one:

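public class Company
{
    public Company(string name) => this.Name = name;
 
    // Get-only property: the name cannot change after construction.
    public string Name { get; }
 
    // Equals and GetHashCode are overridden together so that equal companies hash alike.
    public override bool Equals(object obj) => obj is Company other && Name == other.Name;
 
    public override int GetHashCode() => Name.GetHashCode();
 
    // Instead of mutating Name, ChangeName returns a new Company instance.
    public Company ChangeName(string newName) => new Company(newName);
}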
...
// Create a company and add employees.
var c1 = new Company("C1");
dictionary.Add(c1, employees);
 
// Change the company's name.
var c2 = c1.ChangeName("C2");
 
// Move employees to the company with the new name and remove the entry for the company with the old name.
var empl = dictionary[c1];
dictionary.Remove(c1);
dictionary.Add(c2, empl);

A commonly used algorithm to generate the uniform distribution of keys is to XOR all the return values from GetHashCode on all immutable fields in a type. Keep in mind that if the fields' values are somehow related, this algorithm will cluster hash codes. As a result, your container will have few buckets each with many items.
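A minimal sketch of the XOR approach and of the clustering it can cause. The second method, OrderSensitiveHash, is an assumption added for comparison (a common multiply-and-add scheme), not something taken from these notes.

```csharp
// Hypothetical helpers for comparing two ways of combining field hashes.
public static class HashCombineDemo
{
    // XOR of the field hash codes, as described above.
    public static int XorHash(int x, int y) => x.GetHashCode() ^ y.GetHashCode();

    // Assumed alternative: multiply-and-add with small primes, so the
    // order of the fields affects the result.
    public static int OrderSensitiveHash(int x, int y)
    {
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + x.GetHashCode();
            hash = hash * 31 + y.GetHashCode();
            return hash;
        }
    }
}
```

With XOR, the pairs (1, 2) and (2, 1) collide, and every pair with two equal fields hashes to 0, which is the clustering the paragraph warns about. The multiply-and-add variant separates these particular cases, though it gives no guarantee of a uniform distribution either.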

notes/csharp/equality.txt · Last modified: 2020/06/30 by leszek