Friday, 23 March 2012

How HashMap works in Java ?

In computer science, a hash table or hash map is a data structure that uses a hash function to map identifying values, known as keys (e.g., a person's name), to their associated values (e.g., their telephone number). Thus, a hash table implements an associative array. The hash function is used to transform the key into the index (the hash) of an array element (the slot or bucket) where the corresponding value is to be sought.

Ideally, the hash function should map each possible key to a unique slot index, but this ideal is rarely achievable in practice (unless the hash keys are fixed; i.e. new entries are never added to the table after it is created). Instead, most hash table designs assume that hash collisions—different keys that map to the same hash value—will occur and must be accommodated in some way.The Java uses

Separate chaining collision resolution strategy to handle such events.

Hash Function

Hash Function calculates an index from the data item's key and use this index to place the data into the array.The syntax of hash function, f is

index = f(key, arrayLength)

Consider one simple hash function

f(key, arrayLength)=key%arrayLength
Assume the size of the associative array is 10.I.e. we have 10 buckets indexed from 0-9 as given below.
  





















buckets
0
1
2
3
4
5
6
7
8
9

We have to store the following key-value pairs in the hash table.


Rollno
Mark
1
58
5
76
8
80
9
65
12
90

Let us find out the associated buckets for each keys using the hash function f

Rollno 1 -> f(1,10)=1%10=bucket 1
Rollno 5 -> f(5,10)=5%10=bucket 5
Rollno 8 -> f(8,10)=8%10=bucket 8
Rollno 9 -> f(9,10)=9%10=bucket 9
Rollno 12 -> f(12,10)=12%10=bucket 2

The marks will be stored in the hash table as given below




58
90


76


80
65
buckets
0
1
2
3
4
5
6
7
8
9

Using this hash table you can easily retrieve the marks associated with a rollno.
Suppose you want to look-up the mark of rollno 12, then find out the hash table index for the rollno 12 using the hash function f.I.e
index=f(12,10)=12%10=bucket 2
The mark of rollno 12 is stored at bucket 2 ,so mark is 90.
In hash table data look-up is in the order of O(1) [ Note: only in best case]

Separate chaining -collision resolution strategy

The above hash table implementation won't work in real-time scenarios.The reason is hash collisions—different keys that map to the same hash value.In the above example , consider you want to store mark 95 of rollno 11.Calculate the corresponding hash table index,

Rollno 11 -> f(11,10)=11%10=bucket 1

Notice that hash collision happened,already there is one mark stored in bucket 1 and is that of rollno 1.The reason is that rollno 1 also generates the same hash index.

Rollno 1 -> f(1,10)=1%10=bucket 1

We can't store more than one data under one array index.Don't worry, Java is using Separate chaining -a collision resolution strategy to handle this issue.

Here each slot of the bucket array is a pointer to a linked list that contains the key-value pairs that hashed to the same location. Lookup requires scanning the list for an entry with the given key. Insertion requires adding a new entry record to either end of the list belonging to the hashed slot. Deletion requires searching the list and removing the element.

The marks will be stored in the hash table as given below



How will you retrieve if two different objects have same hashcode?

Let us check how we will retrieve mark of rollno 11.At first step we will find out hash code of rollno 11,that is f(11,10)=11%10=1.We can understand that mark of rollno 11 is stored in bucket 1.But bucket one contains more than one rollno-mark pairs in the form of a linked list.Now we have traverse this linkedlist and find out a match for the key rollno 11 and retrieve the corresponding mark.

4 comments: