Open In App

Importance of Hashcode method in Java

Last Updated : 18 Jan, 2018
Improve
Improve
Like Article
Like
Save
Share
Report

Prerequisite: Equals() and hashCode() methods in Java

HashMap and HashSet use hashing to manipulate data. They use hashCode() method to check hash values. The default implementation of hashCode() in Object class returns distinct integers for different objects. Sometimes, we have to implement hashCode method in our program.
Consider the following example




// Java puzzle to illustrate use
// of hashcode() and equals() method
import java.util.*;
public class Name {
    private final String first, last;
    public Name(String first, String last)
    {
        this.first = first;
        this.last = last;
    }
    public boolean equals(Object o)
    {
        if (!(o instanceof Name))
            return false;
        Name n = (Name)o;
        return n.first.equals(first) && n.last.equals(last);
    }
    public static void main(String[] args)
    {
        Set<Name> s = new HashSet<Name>();
        s.add(new Name("Shubham", "Juneja"));
        System.out.println(
            s.contains(new Name("Shubham", "Juneja")));
    }
}


Output:

false
    Explanation:

  • A Name instance consists of a first name and a last name. Two Name instances are equal, as computed by the equals method, if their first names are equal and their last names are equal.
  • First names and last names are compared using the equals method defined in String. Two strings are equal if they consist of the same characters in the same order. Therefore, two Name instances are equal if they represent the same name. For example, the following method invocation returns true: new Name(“Shubham”, “Juneja”).equals(new Name(“Shubham”, “Juneja”)) The main method of the program creates two Name instances, both representing Shubham Juneja.
  • The program puts the first instance into a hash set and then checks whether the set contains the second. The two Name instances are equal, so it might seem that the program should print true. If you run it, it almost certainly printed false.
  • Why not expected output?

    • The bug is that Name violates the hashCode contract. This might seem strange, as Name doesn’t even have a hashCode method, but that is precisely the problem. The Name class overrides the equals method, and the hashCode contract demands that equal objects have equal hash codes. To fulfill this contract, you must override hashCode whenever you override equals.
    • Because it fails to override hashCode, the Name class inherits its hashCode implementation from Object. This implementation returns an identity-based hash code. In other words, distinct objects are likely to have unequal hash values, even if they are equal. Name does not fulfill the hashCode contract, so the behavior of a hash set containing Name elements is unspecified.
    • When the program puts the first Name instance into the hash set, the set puts an entry for this instance into a hash bucket. The set chooses the hash bucket based on the hash value of the instance, as computed by its hashCode method. When it checks whether the second Name instance is contained in the hash set, the program chooses which bucket to search based on the hash value of the second instance. Because the second instance is distinct from the first, it is likely to have a different hash value.

    Solution:

    • If the two hash values map to different buckets, the contains method will return false. Suppose that the two Name instances map to the same bucket.All HashSet implementations that we know of have an optimization in which each entry stores the hash value of its element in addition to the element itself. When searching for an element, the implementation selects the appropriate hash bucket and traverses its entries, comparing the hash value stored in each entry with the hash value of the desired element. Only if the two hash values are equal does the implementation check the elements for equality. This optimization makes sense because it is usually much cheaper to compare hash codes than elements.
    • Because of this optimization, it is not enough for the hash set to search in the right bucket; the two Name instances must have equal hash values in order for the hash set to recognize them as equal. The odds that the program prints true are therefore the odds that two consecutively created objects have the same identity hash code.
    • Results may vary depending on which Java implementation is used, but you are highly unlikely to see the program print true on any JRE we know of. To fix the problem, simply add an appropriate hashCode method to the Name class. Once this method is added, the program will print true as expected:

      public int hashCode() {
      return 63 * first.hashCode() + last.hashCode();
      }
      

Similar Reads