Collision Resolution in Hashing

 

🔍 What is a Collision in Hashing?

A collision occurs when two different keys produce the same hash value (i.e., they map to the same index in a hash table).

🧪 Example:

If h(25) = 5 and h(35) = 5, then both values are trying to go to index 5.


🚧 Collision Resolution Techniques

There are two primary ways to resolve collisions:

  1. Open Addressing (Closed Hashing)

  2. Chaining (Open Hashing)


1️⃣ Open Addressing

In open addressing, when a collision occurs, the algorithm searches for the next available slot in the hash table until an empty slot is found.

The entire table is treated as a single array, and only one key is stored at each index. So at any point, the size of the table must be greater than or equal to the total number of keys (the table can be grown by rehashing the old data into a larger array if needed). This approach is also known as closed hashing.



📌 General Formula:


h(k, i) = (h(k) + probe(i)) % m

Where:

  • h(k) is the base hash function

  • probe(i) determines how we look for the next slot

  • i is the probe number (attempt number)

  • m is the table size


🔄 Types of Open Addressing

a) Linear Probing

Check the next slot (sequentially) until an empty spot is found.


h(k, i) = (h(k) + i) % m

🧪 Example:

If h(25) = 5, and index 5 is full, check 6, then 7, etc.

✅ Advantage:

  • Simple and easy to implement

❌ Drawback: Primary Clustering

  • Prone to clustering, where consecutive slots get filled, potentially leading to further collisions.
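The linear probing scheme above can be sketched in Python. This is a minimal illustration (the base hash h(k) = k mod m and the table size are assumed, not taken from a specific library):

```python
def linear_probe_insert(table, key, m):
    """Insert key into table (a list of size m, None = empty) using linear probing."""
    index = key % m                  # base hash h(k) = k mod m (assumed)
    for i in range(m):
        slot = (index + i) % m       # h(k, i) = (h(k) + i) % m
        if table[slot] is None:
            table[slot] = key
            return slot
    raise RuntimeError("hash table is full")

table = [None] * 10
for key in [25, 35, 15]:             # all three hash to index 5
    linear_probe_insert(table, key, 10)
print(table)   # [None, None, None, None, None, 25, 35, 15, None, None]
```

Note how the three colliding keys fill consecutive slots 5, 6, 7: exactly the primary clustering the drawback describes.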

b) Quadratic Probing

Search for empty slots using a quadratic function.


h(k, i) = (h(k) + i²) % m

🧪 Example:

Try h(k), h(k)+1², h(k)+2², etc.

✅ Advantage:

  • Reduces clustering

❌ Drawback:

  • May not examine every slot unless the table size is prime (so the table size should be a prime number). Secondary clustering can still occur, and the quadratic function must be chosen carefully to avoid infinite probe loops.
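The "may not examine all slots" drawback is easy to demonstrate by listing the probe sequence (a quick sketch; h(k) = k mod m is assumed):

```python
def quadratic_probe_sequence(key, m, attempts):
    """Slots examined by quadratic probing: h(k, i) = (h(k) + i^2) mod m."""
    h = key % m
    return [(h + i * i) % m for i in range(attempts)]

# With m = 8 (not prime), only a few distinct slots are ever probed:
print(sorted(set(quadratic_probe_sequence(0, 8, 8))))   # [0, 1, 4]
# With a prime m = 7, more of the table becomes reachable:
print(sorted(set(quadratic_probe_sequence(0, 7, 7))))   # [0, 1, 2, 4]
```

Even with a prime table size, only about half the slots are reachable by i², which is why the load factor must be kept low for quadratic probing.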


c) Double Hashing

Uses a second hash function to compute the probe step size.


h(k, i) = (h1(k) + i * h2(k)) % m

Where:

  • h1(k) is the primary hash

  • h2(k) is the secondary hash, which must never evaluate to 0

✅ Advantage:

  • Better distribution, minimal clustering

❌ Disadvantage:

  • Requires a good choice of the second hash function; if poorly designed, it may still lead to clustering.
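A double-hashing probe sequence can be sketched the same way. The secondary hash h2(k) = 1 + k mod (m − 1) used here is one common textbook choice (an assumption, not the only option); it guarantees h2(k) is never 0:

```python
def double_hash_sequence(key, m, attempts):
    """Slots probed by double hashing: h(k, i) = (h1(k) + i*h2(k)) mod m."""
    h1 = key % m                 # primary hash (assumed)
    h2 = 1 + key % (m - 1)       # secondary hash; never 0
    return [(h1 + i * h2) % m for i in range(attempts)]

# key 25, m = 7: h1 = 4, h2 = 2, so the step size is 2
print(double_hash_sequence(25, 7, 7))   # [4, 6, 1, 3, 5, 0, 2]
```

Because m is prime and the step h2(k) is nonzero, the sequence visits all seven slots: no two keys with different h2 values follow the same probe path, which is why clustering is minimal.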

1️⃣ Insertion in Open Addressing

Algorithm:
  1. Compute hash index: index = h(k)
  2. If the slot is empty, insert the key there.
  3. If occupied, probe the next slot using linear probing: (index + 1) % m, and so on.
  4. Stop when an empty slot is found or the table is full.

2️⃣ Search in Open Addressing

Algorithm:
  1. Compute hash index: index = h(k)
  2. Check if the key is at that index.
  3. If not, probe the next slot: (index + 1) % m until:
  • You find the key → Success
  • You find an empty slot → Key not found
  • You loop back to the starting index → Key not found
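The search steps above can be sketched as follows (a minimal version using linear probing with h(k) = k mod m assumed; it returns the slot index or −1):

```python
def probe_search(table, key, m):
    """Search an open-addressed table with linear probing."""
    index = key % m
    for i in range(m):
        slot = (index + i) % m
        if table[slot] is None:      # empty slot -> key cannot be further along
            return -1
        if table[slot] == key:
            return slot
    return -1                        # looped over the whole table

table = [None] * 10
table[5], table[6] = 25, 35          # 35 collided at 5 and was placed at 6
print(probe_search(table, 35, 10))   # 6
print(probe_search(table, 45, 10))   # -1 (probe stops at empty slot 7)
```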

3️⃣ Deletion in Open Addressing

Problem:

You can't simply mark the slot empty (e.g., set it back to None), because that breaks the probe chain for future searches: a later search would stop at the emptied slot and miss keys placed beyond it.

✅ Solution:

Use a special marker like "DELETED" to indicate that the slot is available for insertion but not empty for search purposes.

Algorithm:

  1. Search for the key using probing.

  2. If found, replace it with "DELETED".

  3. For insertion, you may reuse "DELETED" slots.
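The tombstone scheme above can be sketched like this (the string "DELETED" as the marker is an assumption; any sentinel distinct from None and from valid keys works):

```python
DELETED = "DELETED"                   # tombstone marker (assumed sentinel)

def probe_delete(table, key, m):
    """Delete with linear probing; leaves a tombstone so searches keep working."""
    index = key % m
    for i in range(m):
        slot = (index + i) % m
        if table[slot] is None:
            return False              # hit an empty slot: key not present
        if table[slot] == key:
            table[slot] = DELETED     # mark, don't empty
            return True
    return False

table = [None] * 10
table[5], table[6], table[7] = 25, 35, 15   # all three hashed to index 5
probe_delete(table, 35, 10)
print(table[6])                       # DELETED
```

A search for 15 still succeeds afterwards, because slot 6 holds the tombstone rather than None, so probing continues past it to slot 7.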

⚠️ Notes on Open Addressing:

  • Deletion is tricky: Requires special markers (e.g., “DELETED” flags)

  • Table must not be full for successful insertion

  • Load factor (number of elements / table size) must be kept low (typically < 0.7)

2️⃣ Chaining (Open Hashing)

In chaining, each slot in the hash table points to a linked list (or chain) of entries that hash to the same index.



🧪 Example:

If h(25) = 5 and h(35) = 5, both are stored in a list at index 5.


Index 5 → 25 → 35 → ...

✅ Advantages of Chaining:

  • No limit on the number of keys stored (as long as memory allows)

  • Deletions are simple

  • Works well even if the load factor > 1


❌ Disadvantages of Chaining:

  • Extra memory needed for pointers

  • May result in longer search time if chains are long


1️⃣ Insertion in Chaining

🧠 Idea:

  • Compute the hash index

  • Insert the key at the beginning (or end) of the linked list at that index

📌 Algorithm:


INSERT(key):
  1. index ← h(key)
  2. if key is not already in chain[index]:
       insert key at the beginning of chain[index]

Note: Inserting at the beginning is faster in linked lists (O(1))

 

2️⃣ Search in Chaining

🧠 Idea:

  • Compute the hash index

  • Traverse the list at that index to find the key

📌 Algorithm:


SEARCH(key):
  1. index ← h(key)
  2. Traverse the linked list at chain[index]
  3. If key is found, return True
  4. Else return False



3️⃣ Deletion in Chaining

🧠 Idea:

  • Compute hash index

  • Traverse the list at that index

  • Remove the node if found

📌 Algorithm:


DELETE(key):
  1. index ← h(key)
  2. Traverse the linked list at chain[index]
  3. If key is found, delete it from the list

This is standard linked list deletion (requires pointer manipulation)
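The three chaining operations above can be combined into one small sketch. Python lists stand in for the linked lists here (an implementation convenience, not a requirement of the technique):

```python
class ChainedHashTable:
    """Separate chaining; each slot holds a chain of keys that hash to it."""

    def __init__(self, m):
        self.m = m
        self.chains = [[] for _ in range(m)]

    def insert(self, key):
        chain = self.chains[key % self.m]     # index = h(key) = key mod m
        if key not in chain:
            chain.insert(0, key)              # insert at the head, O(1) in a linked list

    def search(self, key):
        return key in self.chains[key % self.m]

    def delete(self, key):
        chain = self.chains[key % self.m]
        if key in chain:
            chain.remove(key)                 # standard list/linked-list deletion

t = ChainedHashTable(10)
for k in (25, 35, 15):                        # all hash to index 5
    t.insert(k)
print(t.chains[5])      # [15, 35, 25]  (head insertion reverses arrival order)
t.delete(35)
print(t.search(35))     # False
```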

๐Ÿ“ Summary Table

Operation    Time Complexity (Average)    Time Complexity (Worst)
Insertion                O(1)        O(n) (if poorly distributed)
Search                O(1)            O(n)
Deletion                O(1)        O(n)

📊 Summary Table

Technique          Storage Structure       Handles Full Table?   Deletion Complexity   Clustering
Open Addressing    Same array              No                    Difficult             Can be an issue
Chaining           Array + Linked Lists    Yes                   Easy                  Less likely

💡 When to Use What?

Use Case                          Best Method
Low memory system                 Open Addressing
Unpredictable number of keys      Chaining
Simpler deletion needed           Chaining
Faster search when load is low    Open Addressing

๐Ÿ” Hashing Techniques: Chaining vs Open Addressing

Feature                      Chaining (Separate Chaining)                       Open Addressing
Storage Structure            Uses linked lists at each index                    All elements stored within the table itself
Collision Handling           New elements are added to the list at the index    Probing (linear/quadratic/double hashing) to find the next slot
Table Size                   Can handle more elements than table size           Limited to table size (n ≤ m)
Memory Usage                 Extra memory for pointers in linked lists          No extra memory (in-place storage)
Insertion Time (avg)         O(1)                                               O(1) if load factor is low
Search Time (avg)            O(1)                                               O(1) if load factor is low
Worst-case Time              O(n) – if all keys hash to the same index          O(n) – if the table is full or many collisions occur
Deletion Complexity          Easier – just remove from the linked list          Harder – needs special handling (e.g., tombstones)
Load Factor (α)              Can go beyond 1 (α = n/m can be > 1)               Must be ≤ 1 (all data stored in the array)
Cache Performance            Poor (due to linked list dereferencing)            Better (elements are in contiguous memory)
Implementation Complexity    Requires linked lists / dynamic memory             Simple array-based implementation
Ideal For                    Unpredictable or large data sets                   Small or memory-constrained environments

✅ Summary:

  • Chaining is more flexible and simpler for deletion.

  • Open Addressing is space-efficient but needs careful probing and deletion logic.


Example Problems

University Questions

Let the size of a hash table be 10. The index of the hash table varies from 0 to 9. Assume the keys 73, 54, 15, 48, 89, 66, 37, 18, 41, 22, 62 are mapped using the modulo operator.

Show how the keys are distributed using the chaining method.


Apply the hash function h(x) = x mod 7 for linear probing on the data 2341, 4234, 2839, 430, 22, 397, 3920 and show the resulting hash table.

Note: a colliding key is placed in the next free position found by the linear probing technique.
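The resulting table can be checked with the same linear probing insertion sketched earlier, now with m = 7 and h(x) = x mod 7:

```python
def linear_probe_insert(table, key, m):
    """Insert key with linear probing; None marks an empty slot."""
    index = key % m                  # h(x) = x mod 7
    for i in range(m):
        slot = (index + i) % m
        if table[slot] is None:
            table[slot] = key
            return

m = 7
table = [None] * m
for x in [2341, 4234, 2839, 430, 22, 397, 3920]:
    linear_probe_insert(table, x, m)
print(table)   # [397, 22, 3920, 2341, 2839, 430, 4234]
```

For example, 430 mod 7 = 3 collides with 2341, probes index 4 (taken by 2839), and lands at index 5; 397 and 3920 similarly wrap around to indices 0 and 2.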

Example:
Hash the following keys using the open chaining method and the closed linear probing method. The size of the table is 7 and the hash function is H(K) = K mod 7. Keys = {16, 21, 23, 50, 19, 26}

Linear probing

Chaining


Example:
Hash the following keys using the open chaining method and the closed linear probing method. The size of the table is 11 and the hash function is H(K) = K mod 11.
Keys = {17, 22, 34, 23, 19, 66}

Linear probing

Chaining

