Week 7 - Homework 5 (unassessed)
Week 8 - Bloom Filters
This week we will be focusing on hash functions. First we will look at a probability question.
Balls in bins problem
If you randomly throw
We are going to show that …
So first look at
So by letting
Best of 2 approach
Best of 2(n)
Input: positive integer n
Output: allocation of n balls to n bins
1. For i = 1 -> n
2. Randomly select two bins j & k.
3. If load(j) < load(k) assign ball to bin j
4. Otherwise assign ball to bin k
In this approach we get the max load is
Note if we increase 2 to
Hashing
Unacceptable passwords
We want to check a user doesn’t put in a password that is considered unacceptable. We have a huge universe of possible passwords
. We want to maintain of unacceptable passwords. For a given password
we want to be able to answer is ?
In Chain Hashing, we use a hash table (
For notation let
In Chain Hashing the length of the linked lists behave similarly to the balls in the bins problem. Therefore, if
Best of 2 hashing
Instead of using a hash table with only one hash function instead lets use two independent hash functions
- To add
we: - Compute
and , and - add
to the least loaded of and .
- Compute
- To check if
is in - Compute
and , and - check for
in and .
- Compute
From the analysis we did above if
Bloom filters
Bloom filters are another way to solve the unacceptable passwords problem.
- Checks take
time, - It uses less space than Chain Hashing, and
- it is a simpler algorithm.
However, there is a small probability of false positives
but we say it is!