Home Random Page


CATEGORIES:

BiologyChemistryConstructionCultureEcologyEconomyElectronicsFinanceGeographyHistoryInformaticsLawMathematicsMechanicsMedicineOtherPedagogyPhilosophyPhysicsPolicyPsychologySociologySportTourism






Iteration of Morphisms

Many public-key cryptosystems based on the theories of automata and formal languages have been proposed. Some of them will be discussed in this and the next section. As we have emphasized already before, the purpose is rather to give a feeling of the diverse possibilities to construct public-key cryptosystems than to evaluate the resulting systems. Apart from security issues, such an evaluation should take into account also other aspects: ease of legal application, length of cryptotexts, etc. Some of these aspects will be mentioned below. Language- theoretic notions will be explained to the extent they are needed for the under­standing of the systems. Some further language theory will be used without detailed explanations, for instance, in cryptanalysis. As regards language theory, the inter­ested reader may consult [Sal].

Let I and A be alphabets. Recall that Z* denotes the set of all words over Z, including the empty word /. In what follows, Z and A may be equal, disjoint or partially overlapping. A mapping h: Z* A* is termed a morphism iff h(xy) = h(x)h(y) holds for all words x and y over Z. It follows that always /?(/) = / and that a morphism is completely determined by its values for the letters of Z. A finite substitution ct is a mapping of Z* into the set of finite subsets of A* such that o(xy) = a(x)(j(y) holds for all words x and y over Z. The two conclusions made for morphisms hold also now. For instance, let Z = A = {a, b} and

a (a) = {a, ab}, a(b) = {b, bb} .

Then

a(ab) = {ab, abb, abbb} .

Observe that a(ab) contains only three elements because the word abb is obtained in two different ways. In the sequel it will sometimes be convenient to use the notation (x)h instead of h(x), and similarly for finite substitutions. If L is a language, then

<j(L) = {y\y e a{x), for some x e L} .

We now begin the description of the cryptosystem. Consider two morphisms h0, hx : Z* -» Z*, as well as a nonempty word w over Z. We say that the quadruple G = (X, h0, h1, w) is backward deterministic iff the condition

(w)hh ...hin = (w)hu ...hjm

always implies the condition

i, . . . in ...jm.

Here each i, and j, belongs to the set of indices {0,1}. Thus, backward determinism means that in an application of a sequence of morphisms, one after the other, the outcome uniquely determines the sequence; it is not possible that two different sequences lead to the same outcome.

Example 5.1. Consider the morphisms defined by

h0(a) = ab, h0(b) = b, ht(a) = a, hl{b) = ba.

If we choose w = a, then the resulting quadruple is not backward deterministic because the outcome a is obtained by a sequence of l's of any length. The same conclusion holds if w = b. On the other hand, the quadruple ({a, b}, h0, h}, ab) is backward deterministic. This follows because the last letter of a word reveals the morphism last applied. Using this principle one can "parse" a word w' back to the initial word, provided w' was obtained by some sequence of morphisms from w. □



Backward deterministic quadruples G can be used as classical cryptosystems in the following obvious fashion. A sequence of bits i, . . . i„ is encrypted as the word (w)hu . . . hin. Backward determinism guarantees that decryption will be unique.

For instance, if G is the quadruple in Example 5.1 with w = ab, then some plaintexts are encrypted as follows.

Plaintext Cryptotext
abb
aba
abbb
ababa
abbab
abaa
abaabaa

 

Of course, G has to be kept secret if it used as a classical cryptosystem in the sense described. Otherwise, there will be no difference between legal decryption and cryptanalysis. Cryptosystems of this type are referred to as functional. In general, a Junctional cryptosystem is specified by two functions f0 and /, and an initial value x. A sequence of bits i1 ... i„ is encrypted as the value (x)fh .. ,fin. A condition corresponding to the backward determinism defined above has to be satisfied to guarantee the uniqueness of decryption. More than two functions are needed if plaintexts contain more than two characters.

An obvious way to transform a functional cryptosystem into a public-key one is to provide a trapdoor leading from the publicized functions and values to some easily parsable situations. More specifically, we know the initial value x and functions f0,flt as well as a value y such that

y = (x)fi,

for some composition of the functions f0, /,. With this information it is hard to find the sequence of bits i, . . . i„ determining the composition, although we know the sequence is unique. However, with the trapdoor information the equation can be transformed into the form

/ = (x')gh . . . gin,

where x', y', g0, are known. Moreover, now the sequence of bits (which is the same as the original sequence) can be found easily.

Let us see how the trapdoor is constructed when the two functions are morphisms. In fact, the trapdoor will lead to two easily parsable morphisms. The publicized setup uses an alphabet much bigger than I, and two finite substitutions instead of two morphisms. The substitutions and the initial word are defined in such a way that the bit sequence remains unaltered when the trapdoor is used to go from the "public" equation to the easily parsable one.

More specifically, let G = (£, h0, h1, w) be backward deterministic. Let A be an alphabet of a much greater cardinality than I. Typically, I consists of five letters, whereas zl consists of 200 letters. Let g:A*—>Z* be a morphism mapping every letter to a letter or to the empty word in such a way that g~x{a) is nonempty for all letters a of Z. This means that every letter d of A is either a descendant of some letter in I or a dummy. The letter d is a descendant of a if g(d) = a. The additional condition of g'1(a) being nonempty implies that every letter of Z has at least one descendant. The letter d is a dummy if g{d) = X.

Consider a quadruple H= (A, <j0, ap u), where <r0 and c, are finite substitu­tions on A defined below and u is a word over A satisfying </(w) = w. Equivalently, u belongs to g~1 (w). In general, u is not unique because dummies may occur in arbitrary positions and each descendant may also be chosen arbitrarily.

Also the finite substitutions ct0 and are not unique. For each d in A, a0(d) is a nonempty finite set of words y such that if h0 maps g(d) into x in Z*, then g(y) = x. Equivalently, aQ(d) is a finite nonempty subset of g~ l(h0(g(d))). (Custom­arily we write the arguments of functions on the right as here. It should cause no confusion that we write them on the left while encrypting, in order to preserve the proper order in the bit sequence.) A substitution a, is defined in the same way, using hl.

The quadruple H = (A, <r0, au u) is publicized as the encryption key. A bit sequence ij . . . i„ is encrypted by choosing an arbitrary word x from the finite set

("K-, ■ • • ffi„ ■

If the bit sequence is long, it can be divided in an arbitrary fashion into blocks that are encrypted separately.

Everything else, that is, Z,h0,hl,w,g remains as a secret trapdoor. The essen­tial item is the "interpretation" morphism g: all other items can be computed from g and the public information. We mention in passing that in the terminology of L-systems G is a DTOL-system and H a TOL-system. L-systems, named after A. Lindenmayer, are mathematical models very suitable for computer simulation of biological growth processes. The reader is referred to [RS] for details.

The idea behind the public-key cryptosystem just described is that a crypt­analyst has to parse according to the messy TOL-system H, whereas the legal receiver who knows the trapdoor can operate in the simple and easily parsable DTOL-system G. More comments will be given below. That the public-key cryptosystem works as intended is a consequence of the next lemma.

Lemma 5.2. Let G = (Z, h0, ht, w) be backward deterministic, and let g and H = (A, a0, <x, , u) be defined as above. Use G and H to encrypt bit sequences in the way described above. Then decryption according to H is unique. Moreover, if the bit sequence . . . in is encrypted as y according to H, then ij . . . i„ is the decryption of g(y) according to G.

Proof. Consider the last sentence. Assume that y is a word in the set (u)<7M . . . oin. Then

g(y) = (g(u))hu . . . hin = (w)hu . . . hin.

This follows by the definition of the substitutions and u. (An algebraically minded reader will notice that substitutions, as well as morphisms, commute with catena­tion by their very definition.)

To prove the uniqueness of decryption according to H, assume that some y can be decrypted both as the bit sequence i and the bit sequence j. By the last sentence, g(y) is decrypted both as i and j according to G. Since decryption according to G is unique by backward determinism, we must have i = j. □

Continuing Example 5.1, let A = {c,, c2, c3, c4, c5} and define the interpreta­tion morphism by

g(cx) = b, g(c2) = g(cj = a, g(c3) = g(c5) = A .

Thus, c2 and c4 are descendants of a, ct is the only descendant of b, and c3 and c5 are dummies. We choose u = c4c3cj, then g(u) — ab = w. To construct the substi­tutions, recall that the morphisms were defined by

h0\a-*ab, b b; ht:a->-a, b-*ba.

We now define <j0 and a^ using the same descriptive notation.

o0:c1-+c1,c3c1 Cj: Cj —» CjC2, c3CjC4

c2 —* C4C1(C2c1c5 c2 —» c2, c3c5c4

c3->cs,c3c3 c3^c3,c5c5

c4 ► C4Cj, , C4C j c3 c4 *c2,c4c3

cs —* C5, C3C5C3 c5—>c3,c5c3

This is a correct definition because when the interpretation morphism g is applied, a0 and reduce to hg and ht:

cr0: b -* b, b ct, : b ba, ba

a-> ab,ab a->a,a

X-> A, A k-*k,X

a -* ab, ab, ab a a, a

A->A,A A-*A,A

To encrypt 011 using the public-key, we first choose the word y, = c4clc5c1 from (u)ff0, then the word y2 = c2c3c1c4c3c1c2 from (y1)al and, finally, the word

y = C2C5C5C1C2C2C3C1C2C2

from (>'2)cri ■ The legal receiver may compute

g(y) = abaabaa ,

from which the plaintext 011 can be immediately recovered using the special property of h0 and h1 mentioned above. □

Not all DTOL-systems, that is, quadruples G = (Z,h0,h1,w) are backward deterministic. For instance, if all words h0(a), where a ranges over letters of I, are powers of the same word x, then G cannot be backward deterministic. This follows because it is easy to verify that

(w)hQhxh0h0 = (w)/i0/i0/i,/i0 .

On the other hand, backward determinism does not guarantee easy parsing. For this purpose, the notion of strong backward determinism is more appropriate.

By definition, a quadruple G = (I, h0, ht, w) is strongly backward deterministic iff the condition

(w)hh ...hin = (x)h,

always implies the conditions

t = in and x = (w)his . . . hin , .

Thus, every word generated by a strongly backward deterministic G has a unique predecessor in I*, and is derived from this predecessor by a unique morphism. This means that the parsing sequence of a word in a strongly backward deterministic DTOL-system depends only on the word and, consequently, parsing (decryption) can be carried out from right to left without any look-ahead. This is not necessarily true if G is only backward deterministic. In order to find the last bit, one may even have to go back to the axiom.

Example 5.2. Clearly, every strongly backward deterministic DTOL-system is backward deterministic. Consider G = ({a, b}, h0, ht, ab), where

hQ: aab, b -» bb; hx\ a -» bb, b^ab .

That G is backward deterministic is easy to show by an inductive argument: a counter example immediately leads to a shorter counter example, which is of course impossible. On the other hand, G is not strongly backward deterministic because

(ab)h0h1 = bbababab = (abbb)hl = (baaa)h0 .

One can prove that strong backward determinism is a decidable property, whereas backward determinism is undecidable. L

An issue important in system design is the word length. Cryptotexts should not be too long compared with plaintexts. Fortunately, there are big classes of DTOL- systems with linear growth rate. In the transition to TOL-systems, the growth rate remains essentially the same as regards descendants of letters. The substitutions for the dummies should be defined in such a way that exponential growth is not likely to occur. Besides, block division of the plaintext can always be used to reduce growth.

As regards cryptanalysis, preprocessing is not likely to succeed. Consider trapdoor pairs (G, g) such that G is a DTOL-system resulting from H by the interpretation morphism g. Given H, there may be several such trapdoor pairs. Only one of them, say (Gl,g1), has been used by the cryptosystem designer. If some other pair (G2, g2) giving rise to H is found, it can be used in decryption with the following warning. G2 is not necessarily backward deterministic and, therefore, a cryptotext may lead to several plaintexts. However, the correct plaintext is always among them.

This observation does not make the cryptanalysis by preprocessing essentially easier. It can be shown that it is an /VP-complete problem to find any trapdoor pair. (Some other preprocessing method might still exist.) Consequently, also finding the dummy letters is an /VP-complete problem. For if the dummies have been found, the construction of a trapdoor pair will be easy. This result means that it does not help much to know that dummies always have to be replaced by words consisting of dummies.

Altogether, the cryptosystem seems to be safe against cryptanalysis by pre­processing: trapdoor pairs are not easy to be found. A cryptanalytic algorithm running in time kn3, where n is the length of the intercepted cryptotext and the constant k is fairly large, can be constructed using the theory of finite automata, [Kar2],

In the following generalization, finding the dummy symbols is no longer sufficient for successful cryptanalysis.

Recall that the interpretation morphism g was supposed above to assume as its values only letters or the empty word. Such a very restrictive definition is not necessary. We now assume that the interpretation morphism is any surjective morphism g: A* -» I*. This means that all words of I* appear as values of g, a condition certainly satisfied by our original definition. Otherwise, the crypto­system design remains unaltered. However, let us be more specific.

As before, assume that G = (I, h0, h^, w) is backward deterministic (preferably: strongly backward deterministic). Choose A to be much bigger than I, and let the morphism g: A* -> I* be surjective. Let u be a word over A such that g(u) = w. Such a u exists because g is surjective. For d in A, let a^d), i = 0,1, be a finite nonempty set of words x with the property

g(x) = hMd)) ■

Again, the surjectivity of g is needed to assure that there are such words x. As before, the decryption of a cryptotext y can be carried out by parsing the word g(y) according to G.

Lemma 5.2 remains valid also now. Cryptanalysis by preprocessing seems to be more difficult. However, the cubic-time algorithm mentioned above for analyzing intercepted cryptotexts is applicable also to the generalized system.

Example 5.3. Consider G = ({a, b}, h0, hi,ba), where the morphisms are defined by

h0: a^ab hl: a ba b->b b-*a

Then G is strongly backward deterministic by the obvious reasons: the last letter of a word determines the morphism used, and the predecessor of a word is unique because both morphisms are injective. We choose A = {c,,. .., c10} to consist of


5.2 Iteration of Morphisms 173 ten letters, and define the interpretation morphism by

C1 -> ab c6 -* A
C2 b c7 -» bab
C3 -* A C8 -»A
C4 ->b c9 -»ba
C5 -* a C10 aa

 

Next we may choose u = cg because g(c9) = ba. To complete the definition of H, we define the substitutions a0 and a, by

Ci -+c 1C4 ffl: C1 ~~* C4C10' c9c5
c2 -"C6c2 C2 c8c5
c3   C3 "►C6C8
C4 —►c2, c3c4 C4 -♦C3C5
c5 —► c,, c5c2   —► C9, c2c5
c6 -*cac3 C6 ->C6C3
c7 —* c7c8c4 c7 C1C10
c8   C8 -"cg
c9 ^ c3c7, c4cj c9 —> cjc6c5, c5c,
C10 -» c5c7, c!c5c2 C10  

 

This definition is correct because, for all i and d, a^d) contains only words x satisfying g(x) = h^gid)). For instance, from the first and two last lines we obtain

0(c4clo) = g(c9c5) = baa = h^ab) = Mfffci)) > 3(c3c7) = g(cAc,) = bab = h0(ba) = h0(g(c9)), 0(C5C7) = 9(cic5c2) = abab = h0(aa) = h0(g(clQ)).

The plaintext 01101 is encrypted according to the public key H, for instance, as follows:

C9 —* C^Cj —*C 3C ^C9C ^ * c^cgc9c ^c9c9 —* CgC3CsC3C-jC1C4C1C4Cl

c8c6c8c8c6c8c1c10c4c10c3c5c9c5c3c5c4c10 = ^ ■

For the legal decryption one first computes

g(y) = abaabaaabaaabaa .

By the parsing rule of G, one further obtains the equations hi(bababbabbab) = g(y), h0(baababa) = bababbabbab , h^abaa) = baababa , h^bab) = abaa , h0(ba) = bab ,

where the indices of h give the plaintext 01101.

In the generalized version of the cryptosystem, where the interpretation mor­phism is chosen more freely, dummies are not essential for safety, as they are in the basic version. On the contrary, careless use of dummies may be a security risk. In the illustration above, some cryptanalytic conclusions can be based on the first six characters of the cryptotext y. This issue will become clearer if we assume that in the cryptosystem design the choice was u = c9c3 instead of u = c9. Then one is immediately to separate the suffix z of any cryptotext, generated by the letter c3 in u. This follows because c3 generates only letters c3, c6, c8 and, moreover, the last letter of any word generated by c9 is not among the three mentioned. In z all occurrences of cs may be ignored, and parsing can be based on the morphisms s0 and Sj determined by c0 and a^.

V C3~*C3C6 Sl' Ci~*C6

c6 ->Ci C6~* C6C3

For instance, z = c3c3c3c3c6c3c3c3c3c6c3 can be analyzed as follows:

sO(C6C6C6C3£'6C6C6C3C6) = 2 '

S1 (C3<-3C6C3C3C6C3) = C6C6C6C3C6C6C6C3C6

so(C6C3C6C3C6) = C3C3C6<-3C3C6C3 '

«l(C6C6C3)= C6C3C6C3C6 ,

Si(c3c6) = C6C6C3 ,

5o(C3) = C3C6 ■

The plaintext 011010 can be read from the indices of s. In many cases this decryption method might be even easier than legal decryption based on G!


Date: 2015-02-16; view: 664


<== previous page | next page ==>
Chapter 5. Other Bases of Cryptosystems | Automata and Language Theory
doclecture.net - lectures - 2014-2024 year. Copyright infringement or personal data (0.012 sec.)