Hash Tables — O(1) Lookups and How They Work
What Is a Hash Table?
A hash table (also called a hash map, dictionary, or associative array) stores key-value pairs with O(1) average-case lookup, insert, and delete. That’s the fastest possible for a general-purpose data structure.
Python’s dict is a hash table. JavaScript’s objects and Map are hash tables. Java’s HashMap is a hash table. You use them constantly — this lesson explains how they actually work underneath.
The core idea: a hash function converts a key into an array index. Instead of scanning every element to find a value, you compute exactly where it lives.
How Hashing Works
The Hash Function
A hash function takes a key (string, number, whatever) and returns an integer. That integer modulo the table size gives the array index. A good hash function distributes keys evenly across the array.
Collisions
Two keys can hash to the same index — this is called a collision. There are two main strategies:
Chaining: each bucket holds a linked list (or array). Colliding keys are appended to the list at that index.
Open addressing: if the target bucket is occupied, probe for the next empty slot (linear probing, quadratic probing, or double hashing).
Load Factor
The load factor is the ratio of stored elements to table size. When it exceeds a threshold (typically 0.7), the table resizes — usually doubles — and rehashes all entries.
class HashTable:
def __init__(self, size=8):
self.size = size
self.buckets = [[] for _ in range(size)]
self.count = 0
def _hash(self, key):
return hash(key) % self.size
def put(self, key, value):
index = self._hash(key)
for i, (k, v) in enumerate(self.buckets[index]):
if k == key:
self.buckets[index][i] = (key, value)
return
self.buckets[index].append((key, value))
self.count += 1
if self.count / self.size > 0.7:
self._resize()
def get(self, key):
index = self._hash(key)
for k, v in self.buckets[index]:
if k == key:
return v
return None
def _resize(self):
old = self.buckets
self.size *= 2
self.buckets = [[] for _ in range(self.size)]
self.count = 0
for bucket in old:
for key, value in bucket:
self.put(key, value)
ht = HashTable()
ht.put("name", "Alice")
ht.put("age", 25)
ht.put("city", "Sydney")
print("name:", ht.get("name"))
print("age:", ht.get("age"))
print("missing:", ht.get("email"))
print("Table size:", ht.size)| Operation | Average | Worst Case | Notes |
|---|---|---|---|
| Get | O(1) | O(n) | Worst case when all keys collide |
| Put | O(1) | O(n) | Amortised O(1) with resizing |
| Delete | O(1) | O(n) | Same as get + removal |
| Contains | O(1) | O(n) | Same as get |
Key Takeaways
- Hash tables provide O(1) average-case lookup, insert, and delete.
- A hash function converts keys to array indices. Good hash functions distribute keys evenly.
- Collisions are handled by chaining (linked lists) or open addressing (probing).
- The load factor triggers resizing — keeping the table efficient at the cost of occasional O(n) rehash.
- Python’s
dictandsetare hash tables. So is Java’sHashMapand JavaScript’sMap.