We began by defining a negligible function. A function β(n) is said to be negligible if, for every c > 0, β(n) < 1/n^c for all sufficiently large n.
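As a quick sanity check of the definition, here is a small Python sketch (illustrative only: "for sufficiently large n" is tested over a finite range, so this is evidence rather than proof, and the helpers `below_from`, `neg`, and `poly` are my own examples):

```python
# Numeric illustration (not a proof): beta(n) = 2^(-n) is negligible --
# for fixed c it eventually stays below 1/n^c -- while 1/n^3 is not,
# since it never falls below 1/n^4.

def below_from(beta, c, n_max=200):
    """First n such that beta(m) < 1/m^c for all m in [n, n_max]
    (returns n_max + 1 if no such n exists in range)."""
    good = n_max + 1
    for n in range(n_max, 1, -1):
        if beta(n) < 1.0 / n ** c:
            good = n
        else:
            break
    return good

neg = lambda n: 2.0 ** -n      # negligible
poly = lambda n: 1.0 / n ** 3  # inverse-polynomial: not negligible

assert below_from(neg, c=4) == 17    # 2^-n < 1/n^4 exactly from n = 17 on
assert below_from(poly, c=4) == 201  # 1/n^3 never dips below 1/n^4
```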
We then defined a one-way permutation: a polynomial-time computable permutation f is
α(n)-one way (denoted by α-OWP) if for any probabilistic polynomial-time algorithm A:
Pr[A(f(x)) = x] ≤ 1 - α(n) + β(n)
where the probability is over all strings x of length n
and the coin-flips used by the algorithm, and β(n) is some negligible function. Intuitively, f is α-OWP if the fraction of 'hard' inputs is close to α.
Finally, for sets X and Y where X ⊆ Y, we say that X has density μ (in Y) if |X| = μ|Y|.
It is not obvious that one-way permutations exist, but our first theorem (due to Yao) shows that we can construct strong one-way permutations from weak ones. Given f, a (1/p(n))-OWP on n-bit strings for some polynomial p(n), we define:
F(x_1, x_2, ..., x_t) = f(x_1) f(x_2) ... f(x_t)
where |x_i| = n, and t = n⋅p(n). That is, F is a function on n^2⋅p(n)-bit strings, formed by applying f to each part of size n and concatenating the results. To invert F, we must invert f on each part; the large number of parts means that we expect several of them to be hard for f. It can be shown that, in fact, F is hard to invert on all but a negligible fraction of the range.
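To make the construction concrete, here is a toy Python sketch. Everything in it (the block size, the block count, the stand-in permutation f) is invented for illustration, and the stand-in f is of course not one-way:

```python
# A toy sketch of the direct-product construction (illustrative only:
# the stand-in f below is an affine map mod 2^n and is NOT one-way;
# a real instantiation would substitute a conjectured one-way permutation).
import itertools

N_BITS = 8          # n: block size in bits
T = 4               # t: number of blocks (n * p(n) in the text; tiny here)

def f(x):
    """Stand-in permutation on n-bit strings: x -> 5x + 3 mod 2^n."""
    return (5 * x + 3) % (1 << N_BITS)

def F(blocks):
    """Apply f independently to each n-bit block and concatenate."""
    assert len(blocks) == T
    return [f(x) for x in blocks]

# F is a permutation of (t*n)-bit strings because f permutes each block;
# spot-check injectivity over a small sample of inputs.
sample = list(itertools.product(range(4), repeat=T))
assert len({tuple(F(list(b))) for b in sample}) == len(sample)
```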
The chief problem with this method is that the input length blows up by a large polynomial factor. If f used n bits, F uses n^2⋅p(n) bits. Therefore, the hardness of inverting F on large inputs (say of length m) relates to the hardness of inverting f on much smaller inputs (those of length m^ε, where ε is a constant ≤ 1/2).
A better idea to amplify hardness is to run f on many inputs, but instead of selecting the inputs independently at random, construct an expander graph where each node corresponds to an n-bit input string. Now, select the first node randomly, and then perform a random walk in the graph. Run f on every node v in the walk; the nodes are 'independent enough' that with good probability, we will find some node v such that f(v) is hard to invert.
Lemma 1: Let α(n) be at least 1/p(n) for some polynomial p. The following are equivalent:
(a) f is α-OWP
(b) For any polynomial-time randomized algorithm A, ∃ H ⊆ Σ^n of density ≥ α such that Pr[A(f(x)) = x | x ∈ H] is negligible.
Intuitively, H is the 'hard' set for algorithm A; that is, the lemma says that if f is an α-OWP, there is some set H (of density at least α) of inputs that is hard to invert, and conversely. It follows almost immediately from the definition of an α-OWP that (b) implies (a). The other direction is also not difficult to prove: if there is an algorithm A whose largest hard set has density < α, repeated applications of this algorithm allow us to invert f with probability > 1 - α, contradicting the fact that f is an α-OWP.
Lemma 2: Let G be an (N, d, λ) expander. (G has N vertices, is d-regular, and λ is the normalized second-largest eigenvalue.) For any subset S of the vertices, with density μ, the probability that a random walk of length k is contained entirely within S is ≤ μ^(k/2).
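The lemma can be spot-checked numerically. The sketch below uses the complete graph K_8 as a stand-in expander (its normalized second eigenvalue is 1/7, comfortably below 1/2) and computes the containment probability exactly by alternating a walk step with a projection onto S; the graph, the set S, and the walk length are all arbitrary toy choices:

```python
# Numeric check of Lemma 2 on a tiny graph (a sketch, not a proof).
# Stand-in expander: the complete graph K_8 (second eigenvalue 1/7).

N = 8                                  # number of vertices
K = 4                                  # walk length
S = {0, 1, 2, 3}                       # target set, density mu = 1/2
MU = len(S) / N

def step(p):
    """One step of the walk on K_N: move to a uniformly random neighbour."""
    total = sum(p)
    return [(total - p[v]) / (N - 1) for v in range(N)]

def project(p):
    """Zero out the probability mass outside S."""
    return [p[v] if v in S else 0.0 for v in range(N)]

# Start from the uniform distribution; after k (step, project) rounds the
# total remaining mass is the probability the walk stayed inside S.
p = [1.0 / N] * N
for _ in range(K):
    p = project(step(p))
prob_stay_in_S = sum(p)

# Lemma 2 predicts the containment probability is at most mu^(k/2).
assert prob_stay_in_S <= MU ** (K / 2)
```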
The proof of this lemma uses a technique we saw earlier. Let P be the projection matrix with 1s in the diagonal positions corresponding to elements of S, and A the normalized adjacency matrix of G. If u is the uniform distribution vector, the desired probability is ||(PA)^k u||_1. (To see this, observe that we begin with a vertex picked uniformly at random; the probability that we are in S after a single step is given by the L_1-norm of PAu, because Au gives us the probability distribution over vertices after a single step, and P keeps only the entries corresponding to S, so summing them gives the probability of being in S.)
The algebra required to bound ||(PA)^k u|| is a little too tedious for a blog post (with applications of the law of cosines, the Cauchy-Schwarz inequality, etc.), but it is fairly straightforward.
Theorem 1: (Main) If f is a (1-α)-OWP on n-bit input strings (where α ≥ 1/2), we can construct F_k, which is a (1 - α^(k/2))-OWP on input strings of length n + k log d, where k is a polynomial and d a constant.
Proof: Let {G_n} be an expander family on 2^n vertices, with degree d and λ ≤ 1/2. We label edges such that R_{G_n}(v, i) = (u, i) for all v and i. That is, we label edges such that if the edge between v and u is the i-th edge of v, it is also the i-th edge of u; an alternate way to think about this is that the edge labels give us d disjoint perfect matchings in the graph. We now define F_k(x, σ_1, σ_2, ..., σ_k) (where |x| = n, and each σ_i ∈ {1, ..., d}) as the output of the function:

for i ← 1 to k
    x ← f(x)
    x ← R_{G_n}(x, σ_i)
return (σ_1, σ_2, ..., σ_k, x)
That is, we apply f to x, take a step in the graph, and repeat this process k times. Another interpretation is that we take a k-step walk, based on the σ_i's, in the graph G'_n, which has the same vertex set as G_n but has an edge between u and v ⇔ G_n has an edge between f(u) and v. Since f is a permutation, G'_n is also an expander with the same properties as G_n.
To see that the function F_k is a permutation, it is sufficient to see that it is injective. Suppose two different inputs produced the same output; this implies that the walks ended at the same vertex and that the sequences of σ_i's are identical. But two such walks that begin at distinct vertices can never converge, because f is one-to-one, and the σ_i-th edges from two distinct vertices cannot lead to the same vertex, because of the way edges are labelled.
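Here is a toy Python sketch of F_k. A small hypercube stands in for the expander (its bit-flip edge labels form disjoint perfect matchings, exactly as required, though a hypercube is a poor expander), and f is again a linear stand-in; the final assertion checks the permutation property exhaustively at toy size:

```python
# Toy sketch of the walk-based construction F_k (illustrative only).
import itertools

N_BITS = 4              # vertices are n-bit strings (16 vertices)
D = N_BITS              # degree: matching i flips bit i
K = 3                   # walk length

def f(x):
    """Stand-in permutation on n-bit strings (NOT one-way)."""
    return (5 * x + 3) % (1 << N_BITS)

def rot(v, i):
    """R_{G_n}(v, i): follow the i-th edge of v, i.e. flip bit i.
    Each label is a perfect matching, so rot(rot(v, i), i) == v."""
    return v ^ (1 << i)

def F_k(x, sigmas):
    """Apply f, take the sigma_i-th edge, repeat; return (sigmas, end)."""
    for s in sigmas:
        x = rot(f(x), s)
    return sigmas + (x,)

# F_k is a permutation: distinct inputs always give distinct outputs.
outputs = {F_k(x, s)
           for x in range(1 << N_BITS)
           for s in itertools.product(range(D), repeat=K)}
assert len(outputs) == (1 << N_BITS) * D ** K   # 16 * 4^3 = 1024
```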
We show that F_k is a (1 - α^(k/2))-OWP by using any algorithm A that inverts F_k to construct an algorithm A' that inverts f; if the hard set for A has density < 1 - α^(k/2), the hard set for A' will have density < 1 - α. Given A, the algorithm A' works as follows to invert an input y:
Pick i uniformly at random from {1, 2, ..., k}
∀ 1 ≤ j ≤ k, pick σ_j uniformly at random from {1, 2, ..., d}
Set y' ← the vertex output by F_{k-i}(R_{G_n}(y, σ_i), σ_{i+1}, ..., σ_k)
Use A to find x' such that F_k(x', σ_1, σ_2, ..., σ_k) = (σ_1, σ_2, ..., σ_k, y')
Return x ← the vertex output by F_{i-1}(x', σ_1, σ_2, ..., σ_{i-1})
Intuitively, to find the pre-image of y, A' picks i to be some position in a k-step walk. The algorithm follows the walk to the end, inverts it, and then follows the first part of the walk, stopping just before it reaches y. This vertex will be the predecessor of y in the walk. If A successfully inverts F_k, A' will invert f.
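The reduction can also be sketched in Python at toy size. Here the oracle A is played by brute-force search (so this A' always succeeds; in the real argument A only succeeds on part of its domain), and the expander and f are toy stand-ins (a labelled hypercube and a linear permutation):

```python
# Toy sketch of the reduction A' (illustrative only).
import random

N_BITS, D, K = 4, 4, 3          # tiny toy parameters

def f(x):
    """Stand-in permutation (NOT one-way)."""
    return (5 * x + 3) % (1 << N_BITS)

def rot(v, i):
    """Hypercube edge labels: the i-th matching flips bit i."""
    return v ^ (1 << i)

def walk(x, sigmas):
    """The final vertex of the F-walk: apply f, step, repeat."""
    for s in sigmas:
        x = rot(f(x), s)
    return x

def A(sigmas, end):
    """'Oracle' inverting F_k, played here by brute force."""
    return next(x for x in range(1 << N_BITS) if walk(x, sigmas) == end)

def A_prime(y):
    """Invert f on y by embedding y at a random position i of a walk."""
    i = random.randrange(1, K + 1)
    sigmas = tuple(random.randrange(D) for _ in range(K))
    # Finish the walk from position i: take the sigma_i-th edge out of y,
    # then run the remaining k - i rounds.
    end = walk(rot(y, sigmas[i - 1]), sigmas[i:])
    # Invert the whole walk, then replay its first i - 1 rounds.
    x_start = A(sigmas, end)
    return walk(x_start, sigmas[:i - 1])

# With a perfect oracle, A' recovers the unique pre-image every time.
for x in range(1 << N_BITS):
    assert A_prime(f(x)) == x
```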
Since f is a (1-α)-OWP, A' must have a hard set of density ≥ 1-α. Let H be the set of vertices corresponding to the inputs in the hard set, and H* be the set of length-k walks in the graph that intersect H. From lemma 2, we know that H* has density at least 1 - α^(k/2): a walk avoids H only if it stays entirely within the complement of H, which has density ≤ α. If we can show that H* is a hard set for A, it follows that F_k is a (1 - α^(k/2))-OWP.
If H* is not a hard set for A, then the algorithm succeeds with non-negligible probability (≥ 1/p(n)) on H*. For every invocation of A by A', with probability ≥ 1/k, the index i points to some vertex x ∈ H. Considering all paths in H*, the probability of A succeeding is ≥ 1/p(n). Therefore, the probability of A' succeeding on H is ≥ 1/(k⋅p(n)), which is non-negligible; this implies that H is not hard for A'. But this is a contradiction, and so H* must be a hard set for A. This completes our proof of the main theorem.
Theorem 2: If f is a (1/p(n))-OWP for some polynomial p(n), then for any polynomial q(n), ∃ a (1 - 1/q(n))-OWP with only an O(log n) (additive) increase in input length.
Proof: We use the construction of theorem 1 in two stages. First, starting with a (1 - (1 - 1/p(n)))-OWP, set k = O(1) and construct a (1 - (1 - 1/p(n))^(k/2))-OWP by applying theorem 1; this only increases the input length by Θ(1) bits. Repeating this process O(log p(n)) = O(log n) times, we can achieve a (1 - 1/2)-OWP, with a total of O(log n) increase in input size. Now, set k = O(log q(n)) = O(log n) and apply the construction to the (1 - 1/2)-OWP; this gives us the desired (1 - 1/q(n))-OWP. The total increase in input length in the iterations of stage 1 was O(log n), as it was in the single iteration of stage 2.
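A quick numeric check of the stage-1 bookkeeping (here k is fixed at 4, so each application maps α to α^(k/2) = α^2; the helper and the constant in the bound are my own):

```python
# Back-of-the-envelope check: starting from alpha = 1 - 1/p and squaring
# alpha each round, O(log p) rounds suffice to drive alpha down to 1/2.
import math

def rounds_to_half(p):
    """Applications of theorem 1 (with k = 4, so alpha -> alpha^2 each
    time) needed to bring alpha = 1 - 1/p down to 1/2."""
    alpha = 1.0 - 1.0 / p
    count = 0
    while alpha > 0.5:
        alpha *= alpha          # one application: alpha -> alpha^(k/2)
        count += 1
    return count

# The count grows like log p(n) = O(log n), not like p(n).
for p in (10, 100, 1000, 10 ** 6):
    assert rounds_to_half(p) <= 2 * math.log2(p) + 2
```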