Hashing
Algorithms
Trees
Graphs
Hashing
File Organization
UNIT 3 HASHING
Support very fast retrieval via a key
Contents
1. Hash Table
◻ Hash function, Bucket, Collision, Probe
◻ Synonym, Overflow, Open hashing, Closed hashing
◻ Perfect hash function, Load density, Full table, Load factor, rehashing
2. Issues in hashing
◻ Hash functions- properties of good hash function
◻ Division, Multiplication, Extraction, Mid-square, Folding and
universal, Collision
3. Collision resolution strategies-
◻ Open addressing and chaining
4. Hash table overflow - extended hashing
5. Dictionary- Dictionary as ADT, ordered dictionaries
6. Skip List- representation, searching and operations- insertion,
removal.
Searching is one of the most frequent and time-consuming tasks:
finding a particular data record in a large amount of data.
Consider the problem of searching an array for a given value.
If the array is not sorted, the search requires O(n) time
If the value ISN’T there, we need to search all n elements
If the value IS there, we search n/2 elements on average
If the array is sorted, we can do a binary search
A binary search requires O(log n) time
About equally fast whether the element is found or not
Can we get even better performance?
How about an O(1), that is, constant time search?
We can do it if the array is organized in a particular way
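As a rough illustration of the three search costs discussed above, the following sketch compares a linear scan, a binary search, and a hash-based lookup. Python's built-in dict stands in for a hash table here; the sample data and the function names are made up for this example and are not part of the original slides.

```python
from bisect import bisect_left

def linear_search(items, target):
    """O(n): check each element in turn; works on unsorted data."""
    for i, value in enumerate(items):
        if value == target:
            return i
    return -1

def binary_search(sorted_items, target):
    """O(log n): repeatedly halve the range; requires sorted data."""
    i = bisect_left(sorted_items, target)
    if i < len(sorted_items) and sorted_items[i] == target:
        return i
    return -1

data = [42, 7, 19, 73, 5]        # hypothetical sample data
sorted_data = sorted(data)       # [5, 7, 19, 42, 73]

# A dict is hash-based: lookup by key takes O(1) time on average.
index_by_value = {value: i for i, value in enumerate(data)}

print(linear_search(data, 73))         # position in the unsorted list
print(binary_search(sorted_data, 42))  # position in the sorted list
print(index_by_value.get(19, -1))      # position found via hashing
```

The dict lookup does not scan the data at all: the key itself determines where to look, which is the idea the rest of this unit develops.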
Search performance
The hash function provides a way for assigning numbers to the input
such that the data can be stored at the array index corresponding to
the assigned number.
Hashing is similar to indexing as it involves associating a key with a
relative record address.
With hashing, the address generated appears to be random: there is
no obvious connection between the key and the location of the
corresponding record. For this reason hashing is sometimes referred
to as randomizing.
With hashing, two different keys may be transformed to the same
address, so two records may be sent to the same place in a file.
This is called a collision.
Two or more records that hash to the same address are known as
synonyms.
Hash Function
[Figure: Key -> Hash function -> Hash value]
For a string key: hash value = (∑ ASCII codes of its characters) % table_size
Example: 519 % 12 = 3
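The string hash described above (sum of the ASCII codes, reduced modulo the table size) can be sketched as follows. The function name `string_hash` is an assumption made for this example, and the slides do not say which string produces the sum 519, so a different key is used here.

```python
def string_hash(key: str, table_size: int) -> int:
    """Sum the character codes of key, then reduce modulo table_size."""
    return sum(ord(ch) for ch in key) % table_size

# "abc" -> 97 + 98 + 99 = 294, and 294 % 12 = 6
print(string_hash("abc", 12))

# Any rearrangement of the same characters has the same sum,
# so "cba" hashes to the same address as "abc" (a collision).
print(string_hash("cba", 12))
```

This also shows a weakness of the simple sum: keys that are anagrams of each other are always synonyms.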
Term: Definition
Hash Table: An array of size Max, indexed 0 to Max − 1. For better performance, keep the table size a prime number.
Hash Function: A mathematical function that maps an input value to an index / address (i.e. transforms a key into an address).
Bucket: An index position in a hash table that can store more than one record.
When two keys map to the same index, both records are stored in the
same bucket. With a bucket size of 1, this is a collision.
Alternative: buckets that can hold multiple records.
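The effect of bucket size can be sketched with a small table whose buckets hold a fixed number of records. This is a minimal sketch, assuming integer keys and the division hash key % table_size; the class name `BucketTable` and its methods are hypothetical.

```python
class BucketTable:
    """Hash table whose every index is a bucket of fixed capacity."""

    def __init__(self, table_size: int, bucket_size: int):
        self.table_size = table_size
        self.bucket_size = bucket_size
        self.buckets = [[] for _ in range(table_size)]

    def insert(self, key: int) -> bool:
        """Return True if the key was stored, False if its bucket is full."""
        bucket = self.buckets[key % self.table_size]
        if len(bucket) >= self.bucket_size:
            return False          # overflow: the bucket has no room left
        bucket.append(key)
        return True

single = BucketTable(table_size=10, bucket_size=1)
print(single.insert(25), single.insert(55))   # second insert does not fit

double = BucketTable(table_size=10, bucket_size=2)
print(double.insert(25), double.insert(55))   # both fit in bucket 5
```

With a bucket size of 1 the second synonym has nowhere to go, while a bucket size of 2 absorbs it and only overflows on a third synonym.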
Key Terms
Example with hash function Key % Table_size and Table_size = 10:
25 % 10 = 5
55 % 10 = 5
Both keys map to index 5: COLLISION
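The collision between keys 25 and 55 can be reproduced with a short sketch that groups keys by their hash address and reports the groups with more than one key (the synonyms). The function name `find_synonyms` and the extra keys 12 and 37 are assumptions added for illustration.

```python
from collections import defaultdict

def find_synonyms(keys, table_size):
    """Map each address to its keys; keep only addresses with collisions."""
    groups = defaultdict(list)
    for key in keys:
        groups[key % table_size].append(key)
    return {addr: ks for addr, ks in groups.items() if len(ks) > 1}

# 25 % 10 and 55 % 10 are both 5; 12 and 37 land on their own addresses.
print(find_synonyms([25, 55, 12, 37], 10))
```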
Open / External Hashing
Hash function: Key % 10
Collisions are stored outside the table.
Application - LINUX
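Storing collisions outside the table is commonly done by chaining: each table slot refers to a separate list that holds all keys hashing to that slot. The following is a minimal sketch under that assumption, using integer keys and the hash Key % 10; the class name `ChainedHashTable` and its methods are hypothetical.

```python
class ChainedHashTable:
    """Open (external) hashing: synonyms go into a per-slot chain."""

    def __init__(self, table_size: int = 10):
        self.table_size = table_size
        self.chains = [[] for _ in range(table_size)]

    def insert(self, key: int) -> None:
        chain = self.chains[key % self.table_size]
        if key not in chain:      # ignore duplicate keys
            chain.append(key)

    def search(self, key: int) -> bool:
        # Only the chain at the key's home slot needs to be scanned.
        return key in self.chains[key % self.table_size]

table = ChainedHashTable()
for key in (25, 55, 12):
    table.insert(key)
print(table.chains[5])     # both synonyms live in the chain for slot 5
print(table.search(55), table.search(45))
```

Because the chains live outside the table, the number of stored keys is not bounded by the number of slots.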
Closed hashing: a collision results in storing one of the records at
another slot in the table.
The number of records is therefore limited by the table size.
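Placing a colliding record in another slot of the same table needs a rule for choosing that slot. The sketch below assumes linear probing (try the next slot, wrapping around), which is one common rule and not necessarily the one intended here; the class name `LinearProbingTable` and its methods are hypothetical.

```python
class LinearProbingTable:
    """Collisions are resolved inside the table by probing the next slot."""

    def __init__(self, table_size: int = 10):
        self.table_size = table_size
        self.slots = [None] * table_size

    def insert(self, key: int) -> int:
        """Return the slot used, or -1 if the table is full."""
        home = key % self.table_size
        for step in range(self.table_size):
            idx = (home + step) % self.table_size
            if self.slots[idx] is None or self.slots[idx] == key:
                self.slots[idx] = key
                return idx
        return -1                 # every slot is occupied

    def search(self, key: int) -> int:
        """Return the slot holding key, or -1 if it is absent."""
        home = key % self.table_size
        for step in range(self.table_size):
            idx = (home + step) % self.table_size
            if self.slots[idx] is None:
                return -1         # an empty slot ends the probe sequence
            if self.slots[idx] == key:
                return idx
        return -1

table = LinearProbingTable()
print(table.insert(25))   # home slot 5 is free
print(table.insert(55))   # slot 5 is taken, so the next slot is used
```

Since every record occupies a slot of the table itself, the table can never hold more records than it has slots. Deletion is left out of this sketch because it needs extra care so that probe sequences are not broken.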
Key Terms used in Hashing