-
Notifications
You must be signed in to change notification settings - Fork 1
HashMap
HashMap works on the principal of hashing
Hashing in its simplest form, is a way to assigning a unique code for any variable/object after applying any formula/algorithm on its properties. A true Hashing function must follow this rule:
Hash function should return the same hash code each and every time, when function is applied on same or equal objects. In other words, two equal objects must produce same hash code consistently.As HashMap also allows null key, so hash code of null will always be 0.
HashMap uses Entry class array as instance variable. **HashMap **stores the Objects as Entry instances, not as key and value
/**.
* The table, resized as necessary. Length MUST always be a power of two.
*/
transient Entry[] table;
The HashMap has an inner class called as Entry Class which hold the key, value stuff. And there is something called as next, hash which you will get to know a bit later.
static class Entry<K,V> implements Map.Entry<K,V>
{
final K key;
V value;
Entry<K,V> next;
final int hash;
//More Code
}
Here is the implementation of put method
public V put(K key, V value)
{
if (key == null)
return putForNullKey(value);
int hash = hash(key.hashCode());
int i = indexFor(hash, table.length);
for (Entry<K,V> e = table[i]; e != null; e = e.next)
{
Object k;
if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
{
V oldValue = e.value;
e.value = value;
e.recordAccess(this);
return oldValue;
}
}
modCount++;
addEntry(hash, key, value, i);
return null;
}
- First, it checks if the key given is null or not. If the given key is null, it will be stored in the zero position, as the hashcode of null will be zero.
- Then it applies the hashcode to the key .hashCode() by calling the hashcode method. In order to get the value within the limits of an array, the hash(key.hashCode()) is called, which performs some shifting operations on the hashcode.
- The indexFor() method is used to get the exact location to store the Entry object. Its calculated using
index = hashCode(key) & (n-1).
- Then comes the most important part what happens if two different object has the same hashcode( eg : Aa,BB will have the same hashcode) and will it be stored in the same bucket. To handle this let's think of the LinkedList in data structure it will have a next attribute which will always point to the next object. The same way the next attribute in the Entry class points to the next object. Using this different objects with the same hashcode will be placed next to each other.
- In the case of the Collision, the HashMap checks for the value of the next attribute if it is null it inserts the Entry object in that location, if next attribute is not null then it keeps the loop running till next attribute is null then stores the Entry object there.
public V get(Object key)
{
if (key == null)
return getForNullKey();
int hash = hash(key.hashCode());
for (Entry<K,V> e = table[indexFor(hash, table.length)];e != null;e = e.next)
{
Object k;
if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
return e.value;
}
return null;
}
- First, it gets the hash code of the key object, which is passed, and finds the bucket location.
- If the correct bucket is found, it returns the value (e.value)
- If no match is found, it returns null.
- If two key has same hasCode then The same collision resolution mechanism will be used here. key.equals(k) will check until it is true, and if it is true, it returns the value of it.
As part of the work for JEP 180, there is a performance improvement for HashMap objects where there are lots of collisions in the keys by using balanced trees rather than linked lists to store map entries. The principal idea is that once the number of items in a hash bucket grows beyond a certain threshold, that bucket will switch from using a linked list of entries to a balanced tree. In the case of high hash collisions, this will improve worst-case performance from O(n) to O(log n).
Basically when a bucket becomes too big (currently: TREEIFY_THRESHOLD = 8), HashMap dynamically replaces it with an ad-hoc implementation of tree map. This way rather than having pessimistic O(n) we get much better O(log n).
Bins (elements or nodes) of TreeNodes may be traversed and used like any others, but additionally support faster lookup when overpopulated. However, since the vast majority of bins in normal use are not overpopulated, checking for existence of tree bins may be delayed in the course of table methods.
Tree bins (i.e., bins whose elements are all TreeNodes) are ordered primarily by hashCode, but in the case of ties, if two elements are of the same “class C implements Comparable“, type then their compareTo() method is used for ordering.
Because TreeNodes are about twice the size of regular nodes, we use them only when bins contain enough nodes. And when they become too small (due to removal or resizing) they are converted back to plain bins (currently: UNTREEIFY_THRESHOLD = 6). In usages with well-distributed user hashCodes, tree bins are rarely used. References