Are Bitvectors Optimal?
Static Membership problem
Given
Universe U = [m]
Set S containing keys of U
Query
For some u, is u in S?
Main goal
Use few probes
Small storage space
Errors in Responding to Query
One-sided
if u is in S, correct ("yes")
if u is not in S, sometimes correct ("no"), sometimes incorrect ("yes")
Two-sided
if us is in S, sometimes correct ("yes"), sometimes incorrect ("no")
if us is not in S, sometimes correct ("no"), sometimes incorrect ("yes")
Space requirements
One-sided Error (minor result)
S is a subset of U.
Let G = (U,V) be a bipartite graph where U is the universe and
V is locations in memory.
For each u in S, color each of u's neighbors 1.
Note that if u is NOT in S, there may nevertheless
exist an edge from u to one of its neighbors.
Thus the one-sided error.
Query
Suppose we are given the query "is u in S?" for some u in U.
Then we probe a random neighbor of u. If this neighbor is 1, then
we return true, else false.
Result
We aim to exhibit a construction of G such that for all u' not in S,
fewer than ε |Γ(u')| neighboring locations are 1.
This guarantees that when probing u', the probability of an incorrect
response is less than ε |Γ(u')|.
We assume that |S| ≤ n.
Our construction is an (m,l,a)-design .
A family F is an (m,l,a)-design if F has m sets each of size l,
and any two sets in F have at most a elements in common.
A Theorem from an earlier work shows that an (m,l,a)-design can be guaranteed
when m, s, a and l are related in a certain way.
We use an (m,l,a)-design with a = log m, l = (n log m)/ε.
Correctness
If u is in S, then all of Γ are 1, thus we correctly return "yes".
If u is NOT in S, then we have that |U ∩ Γ(u)| ≤ na. Thus
we have that the probability of erroneously getting a 1 is bounded by
na/|Γ(u)| = na/l ≤ ε.
Two-sided Error (main result)
Consider bipartite graph (U, V, E) where U=[m] and V=[s].
We store the set S, where |S| ≤ n.
We use the same query scheme as before: for the query "is u in S?" we return
true if a random vertex of Γ(u) is colored 1, false otherwise.
We want the error probability to be at most ε. We seek a two-coloring
&X:V →{0,1} such that:
∀ u ∈ S, at least (1-ε)|Γ(u)| locations are colored 1
∀ u' ∉ S, at least (1-ε)|Γ(u)| locations are colored 0
Definition: Let C1, C0 be subsets of 2V. We say that
(C1,C0) is ε-2-colorable if there exists
X:V →{0,1} such that:
Further, we say that C is (n,ε)-2-colorable if for all subsets
(C1,C0) of C
such that |C1| ≤ n and C0=C-C1, we have that
(C1,C0) is ε-2-colorable.
Our Goal
We wish to find a bipartite graph G=(U,V,E) such that U=[m], V=[s]
and {Γ(u)}u in U is (n,ε)-2-colorable. This will give
ensure that our query procedure is never wrong more than ε fraction of the
time.
We show that such a graph exists, with |V|=O(n/ε2 log m):
G = (U, V, E) is an (m, s, n, d ε)-expander
G=(U,V,E) is an ε-expander => {Γ(u)}u in U has (n,ε)
intersection property
This is proved via induction on |C1| and |C0|.
C has (n,ε)-intersection property => C is (n,ε)-two-colorable
Definition: G=(U,V,E) is an (ms,n,d,ε)-expander if U=[m], V=[s],
each vertex in U has degree d and for all subsets S of size less than or equal
to 2n we have that |Γ(s)| ≥ (1-ε/2) |S|d .
We prove the contrapositive. We assume that we do not have the (n,ε)-intersection
property. Then we show that:
|Γ(S)| < (1-ε/2)|S|d, a violation of the definition of an expander.
Given
Universe U = [m]
Set S containing keys of U
Query
For some u, is u in S?
Main goal
Use few probes
Small storage space
Errors in Responding to Query
One-sided
if u is in S, correct ("yes")
if u is not in S, sometimes correct ("no"), sometimes incorrect ("yes")
Two-sided
if us is in S, sometimes correct ("yes"), sometimes incorrect ("no")
if us is not in S, sometimes correct ("no"), sometimes incorrect ("yes")
Space requirements
| Error | Probes | Space (upper) | Space (lower) |
| 2-sided | 1 | O(n/ε2 log m) | Ω (n/(ε log(1/ε)) log m) |
| 1-sided | 1 | O((n/ε)2 log m) | Ω(n2/(ε2 log n/ε)log m) |
One-sided Error (minor result)
S is a subset of U.
Let G = (U,V) be a bipartite graph where U is the universe and
V is locations in memory.
For each u in S, color each of u's neighbors 1.
Note that if u is NOT in S, there may nevertheless
exist an edge from u to one of its neighbors.
Thus the one-sided error.
Query
Suppose we are given the query "is u in S?" for some u in U.
Then we probe a random neighbor of u. If this neighbor is 1, then
we return true, else false.
Result
We aim to exhibit a construction of G such that for all u' not in S,
fewer than ε |Γ(u')| neighboring locations are 1.
This guarantees that when probing u', the probability of an incorrect
response is less than ε |Γ(u')|.
We assume that |S| ≤ n.
Our construction is an (m,l,a)-design .
A family F is an (m,l,a)-design if F has m sets each of size l,
and any two sets in F have at most a elements in common.
A Theorem from an earlier work shows that an (m,l,a)-design can be guaranteed
when m, s, a and l are related in a certain way.
We use an (m,l,a)-design with a = log m, l = (n log m)/ε.
Correctness
If u is in S, then all of Γ are 1, thus we correctly return "yes".
If u is NOT in S, then we have that |U ∩ Γ(u)| ≤ na. Thus
we have that the probability of erroneously getting a 1 is bounded by
na/|Γ(u)| = na/l ≤ ε.
Two-sided Error (main result)
Consider bipartite graph (U, V, E) where U=[m] and V=[s].
We store the set S, where |S| ≤ n.
We use the same query scheme as before: for the query "is u in S?" we return
true if a random vertex of Γ(u) is colored 1, false otherwise.
We want the error probability to be at most ε. We seek a two-coloring
&X:V →{0,1} such that:
∀ u ∈ S, at least (1-ε)|Γ(u)| locations are colored 1
∀ u' ∉ S, at least (1-ε)|Γ(u)| locations are colored 0
Definition: Let C1, C0 be subsets of 2V. We say that
(C1,C0) is ε-2-colorable if there exists
X:V →{0,1} such that:
| For all T in C0 | |X-1(1) ∩ T| ≤ ε |T| |
| For all T in C1 | |X-1(0) ∩ T| ≤ ε |T| |
Further, we say that C is (n,ε)-2-colorable if for all subsets
(C1,C0) of C
such that |C1| ≤ n and C0=C-C1, we have that
(C1,C0) is ε-2-colorable.
Our Goal
We wish to find a bipartite graph G=(U,V,E) such that U=[m], V=[s]
and {Γ(u)}u in U is (n,ε)-2-colorable. This will give
ensure that our query procedure is never wrong more than ε fraction of the
time.
We show that such a graph exists, with |V|=O(n/ε2 log m):
| G = (U, V, E) is an (m, s, n, d, ε)-expander |
| |
| {Γ(u)}u in U has the (n, ε)-intersection property |
| |
| {Γ(u)}u in U is (n, ε)-2-colorable |
| |
| (n, m, s, 1)-randomized scheme with error ε |
G = (U, V, E) is an (m, s, n, d ε)-expander
G=(U,V,E) is an ε-expander => {Γ(u)}u in U has (n,ε)
intersection property
This is proved via induction on |C1| and |C0|.
C has (n,ε)-intersection property => C is (n,ε)-two-colorable
Definition: G=(U,V,E) is an (ms,n,d,ε)-expander if U=[m], V=[s],
each vertex in U has degree d and for all subsets S of size less than or equal
to 2n we have that |Γ(s)| ≥ (1-ε/2) |S|d .
We prove the contrapositive. We assume that we do not have the (n,ε)-intersection
property. Then we show that:
|Γ(S)| < (1-ε/2)|S|d, a violation of the definition of an expander.

0 Comments:
Post a Comment
Links to this post:
Create a Link
<< Home