
Wednesday, May 26, 2010

How To Password Protect Your Pen Drive

A nice how-to article on protecting USB drives with a password and encryption using BitLocker on Windows Vista or Windows 7.

Do you carry sensitive data on your pen drive? Then you should keep it safe. Not always that careful? Then I would suggest that you password protect your pen drive. Yes folks, you can do this by a simple method. Windows Vista and Windows 7 users have an added advantage here: they can easily password protect their pen drives with BitLocker Drive Encryption, a built-in feature of both operating systems.

 


Tuesday, April 13, 2010

AES-128 versus AES-256

Last November I posted about an email response by Jon Callas, CTO of PGP, trying to dispel the perception that a government agency might have the computing power to break 128-bit keys. People seemed concerned that while breaking 128-bit keys is beyond the resources of most people or groups, government agencies still had a good shot. He thought this extremely paranoid, using the example that a computing cluster enveloping the entire earth to a height of one metre would still require 1,000 years on average to recover a single 128-bit key.

I just put up on Scribd a 2008 whitepaper from Seagate that discusses their reasoning for choosing AES-128 over AES-256 for hard disk encryption. The whitepaper states that
  • NIST has concluded and recommended that all three key-lengths (128-bit, 192-bit and 256-bit) of AES provide adequate encryption until beyond calendar year 2031.
  • NIST's recommendation above includes the threat model not only of predicting the key, but also of cracking the encryption algorithm. The difference between cracking the AES-128 algorithm and the AES-256 algorithm is considered minimal: whatever breakthrough might crack 128-bit keys will probably also crack 256-bit keys.
Further, Seagate wanted to maximize the success of its solution by considering the additional business-side concerns:
  • Must promote compliance with laws controlling export from the U.S. and import to other nations
  • Must be cost-optimized
  • Must be able to meet the needs of ALL target markets
AES-128 meets or exceeds all of the above criteria.
They also went on to discuss the computational task of recovering 128-bit keys, where assuming
  • Every person on the planet owns 10 computers
  • There are 7 billion people on the planet
  • Each of these computers can test 1 billion key combinations per second
  • On average, you can crack the key after testing 50 percent of the possibilities
it follows that the earth's population can crack one encryption key in 77,000,000,000,000,000,000,000,000 years! The graphic form of the argument looks like

[image: graphic summary of the brute-force calculation]

Nonetheless AES-256 is being widely deployed since it conveniently lies at the intersection of good marketing and pragmatic security. In upgrading from AES-128 to AES-256, vendors can legitimately claim that their products use maximum-strength cryptography, and key lengths are doubled (squaring the effort of a brute force attack) for a modest 40% performance hit.

Sunday, March 21, 2010

In Search of Encrypted Search

Last December Moxie Marlinspike announced a $34 cloud service for auditing the strength of WPA passwords. The service tries to reconstruct the password from the user-supplied WPA session material and a hand-crafted dictionary of 135 million passwords. Marlinspike's cloud service searches through the optimised dictionary in just 20 minutes, a computation that would otherwise take 5 days or so on a standard desktop. Notice then that Moxie learns the password if it is found in his dictionary. This might be an example of where customers would prefer Moxie to perform an encrypted search on their behalf – the parameters are encrypted and Moxie just learns a simple yes or no as to whether the corresponding password was found.

This is an encrypted search problem that could be solved in principle by the “fully homomorphic” encryption system devised by IBM researcher Craig Gentry in June of last year. There was much fanfare around the breakthrough and googling “ibm homomorphic encryption” returns over 21,000 results today. There is a long and informative article in Forbes under the headline IBM's Blindfolded Calculator, and you can also see a lecture by Gentry on his result here, given at the Fields Institute in Toronto last December.

Homomorphisms in RSA

But what is homomorphic encryption? Well, RSA gives us a nice example. Recall that the "textbook" RSA encryption of a message M is given as

E(M) = M^e mod N

It is not hard to verify that

E(M1) * E(M2) = E(M1 * M2)

which in words states that the product of two encryptions is equal to the encryption of their product. And this is the defining property of a homomorphism – given a function (here encryption) and two inputs (here messages) with a binary operation defined on those inputs (here multiplication), we can interchange the order of applying the function and the operation to get the same result. So we say that the RSA function is a homomorphism with respect to multiplication, or more simply, just a multiplicative homomorphism.

So for an unknown message M2, if we are given E(M2) we can multiply E(M2) by E(M1) to produce E(M1*M2) without needing to decrypt M2. In the early days of RSA this homomorphic property was deemed to be undesirable, due to posited scenarios like the following one (which in fact seems more likely today than it did 20 years ago). Let's say Alice is going to transfer $M to the account of Bob and sends an encrypted message E(M) instructing her bank to perform the transfer. But Bob intercepts the message and multiplies it by E(2), the encryption of 2. The new message becomes E(2)*E(M) = E(2*M) and Bob doubles his money without having to decrypt E(M). The PKCS #1 format removes the threat of manipulating messages in this way by adding padding and fixed fields to messages, which allows such (multiplicative) manipulation to be detected.

While RSA is multiplicatively homomorphic it is not additively homomorphic, since in general

E(M1) + E(M2) != E(M1 + M2).
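To make the homomorphic property concrete, here is a small sketch using textbook RSA with toy parameters (the primes, exponent and messages below are illustrative only; real RSA uses much larger moduli and padded messages):

```python
# Textbook RSA with toy parameters, just to exhibit the homomorphic behaviour
# discussed above. The primes, exponent and messages are illustrative only;
# real RSA uses much larger moduli and padded messages (e.g. PKCS #1).
p, q = 61, 53
N = p * q                              # 3233
e = 17
d = pow(e, -1, (p - 1) * (q - 1))      # private exponent (Python 3.8+)

def E(m):                              # textbook RSA encryption
    return pow(m, e, N)

def D(c):                              # textbook RSA decryption
    return pow(c, d, N)

M1, M2 = 42, 55

# Multiplicative homomorphism: E(M1) * E(M2) = E(M1 * M2)
assert (E(M1) * E(M2)) % N == E((M1 * M2) % N)
assert D((E(M1) * E(M2)) % N) == (M1 * M2) % N

# Bob's "double your money" manipulation: multiply a ciphertext by E(2)
assert D((E(2) * E(M1)) % N) == (2 * M1) % N

# But RSA is not additively homomorphic
assert (E(M1) + E(M2)) % N != E((M1 + M2) % N)
```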

Getting back to the encrypted search problem, does RSA being a multiplicative homomorphism help us? Well not really. Say we are searching for a string S in some data D, where we are only given E(D). We could encrypt S to get E(S), and then compute

E(S) * E(D)  =  E(S*D)

but E(S*D) won’t tell us that S is somewhere in D because treating S and D as numbers then multiplying them is not related to searching. So we need to find an encryption scheme that has more homomorphic properties than just multiplication if we are to perform encrypted search. And this is what Mr. Gentry did.

Fully Homomorphic Encryption

This elusive property is what Gentry has created in his "fully homomorphic" encryption system, a system which can support the secure homomorphic evaluation of any function on a given input, such as search over data, and not just simple arithmetic operations like multiplication. The underlying computational problem that furnished this property was neither RSA nor a discrete logarithm system but rather a geometric construction referred to as an ideal lattice. Lattices are special vector spaces that furnish computationally difficult problems, have been used in several other cryptosystems such as NTRU, and provide an alternative to the "standard" hard problems from number theory.

Using the ideal lattice Gentry was able to show how to compute a “circuit” for problems such as encrypted search that can be evaluated without decrypting the data. Both the encrypted term and data are used as inputs to the circuit, and the trick is to devise a method for securely and correctly evaluating the gates of the circuit. Your computer uses built-in gates and circuits to evaluate computations, such as performing the multiplications and divisions to evaluate an RSA encryption or decryption during an SSL session for example. And as such the computations are small and fast. However for encrypted search the corresponding circuit needs to be built and represented in software, which leads to large and inefficient computations. So this is why the result is a breakthrough but not a practical one at the moment. Gentry has estimated that building a circuit to perform an encrypted Google search with encrypted keywords would multiply the current computing time by around 1 trillion.

The Outlook for Encrypted Search

It remains to be seen whether Gentry’s theoretical breakthrough can be converted into a practical service, in the cloud or otherwise. Gentry has worked with some other researchers from IBM and MIT to remove the requirement of using lattices and to just treat the fully homomorphic encryption as operating over the integers, which provides a simpler conceptual framework. Making the central ideas more accessible will increase the chances of finding more practical solutions. There is a recent review of homomorphic encryption by Craig Stuntz, giving more details and promising code in an upcoming post. Stuntz also points to an article by Gentry in the respected Communications of the ACM which begins

Suppose that you want to delegate the ability to process your data, without giving away access to it. We show that this separation is possible: we describe a "fully homomorphic" encryption scheme that keeps data private, but that allows a worker that does not have the secret decryption key to compute any (still encrypted) result of the data, even when the function of the data is very complex. In short, a third party can perform complicated processing of data without being able to see it. Among other things, this helps make cloud computing compatible with privacy.

Reading further requires a subscription but it is clear that the work of Gentry and his co-authors is being positioned as a core service for cloud platforms, and I suspect IBM's platform in particular. However, just at the moment, you will have to trust Moxie with your WPA password.

Wednesday, December 2, 2009

Google Maps and Crypto Laws Mashup

Simon Hunt, VP at McAfee, has a great Google map mashup application that visually maps crypto laws to countries around the world, including individual US states. The map was last updated in September.

[image: screenshot of the crypto laws map]

Tuesday, November 17, 2009

More on TMTO and Rainbow Tables

I recently posted about the new project to compute rainbow tables for the A5/1 encryption algorithm of GSM. As is often the case, I end up writing more than can fit in a reasonable length post (though that does not stop me from occasionally writing apparently unreasonable length posts such as here and here). Rainbow tables, and more generally Time-Memory trade-off (TMTO) tables, are fascinating subjects that I have a bit more to say about.

Babies and Giants

I think the best place to start with understanding TMTO is to return to one of its first applications, which turns out to have implications for the security of public key systems. The Baby-steps Giant-steps method was proposed by the number theorist Daniel Shanks in the early 70's to solve the discrete logarithm problem (DLP). Recall that in the DLP we are given a generator g of the multiplicative group modulo a prime p and a value 0 < X < p, and we seek the value x such that

X = g^x mod p

Since g is a generator then { g, g^2, g^3, ..., g^(p-1) } is equal to {1, 2, 3, ..., (p-1)}. That is, every non-zero number less than p can be expressed as a power of g mod p. So we know that X has a logarithm to the base g, but how do we find it? The brute force approach is to just start searching through 1, 2, 3, ... until we reach x, which is equivalent to computing the following sequence of exponentiations (note that we omit the "mod p")

g, g^2, g^3, ... , g^{x-1}, g^x = X

That is, we start with g and keep multiplying by g until we reach X. But there is another approach to finding X based on the following sequence

g*X, g^2*X, g^3*X, ... , g^k*X = g^{p-1} = 1

In this second sequence we start with X and keep multiplying by g and stop when we reach 1. We know from Fermat's Little theorem that the log of 1 must be (p-1), and since g is a generator, no other value smaller than (p-1) will exponentiate to 1. Now to determine x, if it took k multiples to reach 1 from X then it follows that x = (p-1) - k.

So which approach is better? Well if we write the sequences as exponents, then the first sequence is 1, 2, 3, ..., x while the second is x, x+1, x+2, ..., (p-1). On average both sequences are about p/2 steps in length if we assume x is randomly chosen. However the second sequence contains the germ of a good idea, which is as follows. Even though we do not know the value of x we can increment it towards a value whose log we do know, and then subtract off the number of increments to determine x. Put another way, repeatedly multiplying X by g increments the target value x towards 1, which has the known log of (p-1), and x is just (p-1) less the number of increments.

Now we can generalise this approach by creating more values with known logs. For example, we can first compute g^{(p-1)/2} and then start our search sequence as before

g*X, g^2*X, g^3*X, ...

but this time stopping if we reach 1 = g^{p-1} or g^{(p-1)/2}. If we consider the sequence 1, 2, 3, ..., (p-1), then (p-1)/2 is the midpoint and (p-1) is the endpoint, and x will reach (p-1)/2 first if it is in the first half of the sequence, and will reach (p-1) first if it is in the second half. The point is that the average distance from x to one of these two known points is now about p/4, and in fact x must reach one of the known values after at most p/2 steps.

But we can just keep going and introduce more known points. For 3 points we could use the exponents (p-1)/3, 2(p-1)/3 and (p-1), rounding to the nearest integer as required. Now the search sequence x, x+1, x+2, ... can be stopped whenever it encounters one of these three values, which will occur after at most p/3 steps, or p/6 steps on average. In general, k known points that partition 1, 2, 3, ..., (p-1) into k non-overlapping segments can be pre-computed by taking multiples of (p-1)/k (rounded to the nearest integer).

As it turns out the optimal choice of k that trades off the pre-computation time of the known points against the search time is k = sqrt(p). The Baby-steps Giant-steps (BSGS) name comes from first computing the exponent sequence sqrt(p), 2*sqrt(p), 3*sqrt(p), ..., which are the giant steps, spaced at intervals of sqrt(p). Then to find the nearest giant step we compute the exponent sequence x, x+1, x+2, ..., x + sqrt(p), which we think of as baby steps of unit increments.
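Here is a minimal sketch of the method just described, using a toy prime and generator chosen purely for illustration:

```python
# Baby-steps Giant-steps as described above: pre-compute the "giant" points
# g^(i*m) for i = 1..m, then take unit "baby" steps forward from X until one
# of the giant points is hit. The prime and generator are toy values.
from math import isqrt

def bsgs_dlog(g, X, p):
    """Solve g^x = X (mod p), assuming g generates the multiplicative group mod p."""
    m = isqrt(p - 1) + 1                  # roughly sqrt(p)
    giant = {}                            # maps g^(i*m) -> i  (the pre-computation)
    step = pow(g, m, p)
    acc = 1
    for i in range(1, m + 1):
        acc = (acc * step) % p
        giant[acc] = i
    cur = X % p                           # baby steps: multiply X forward by g
    for j in range(m + 1):
        if cur in giant:                  # X * g^j = g^(i*m), so x = i*m - j
            return (giant[cur] * m - j) % (p - 1)
        cur = (cur * g) % p
    return None                           # cannot happen if g is a generator

p, g = 1019, 2                            # 2 generates the group mod the prime 1019
x = 547
X = pow(g, x, p)
assert bsgs_dlog(g, X, p) == x
```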

What principle is at work here?

The point is to improve over the brute force approach of just searching through 1, 2, 3, ... until we reach x, and by "improve" we mean reduce the number of elements to be searched before we find x. In the BSGS method the improvement is achieved by searching forward from X to a known value rather than starting at an arbitrary value and searching forward towards X. We can pre-compute a set of values or points in the keyspace which limits the number of values that need to be considered in the searching forward process. But the whole searching forward process exists because we are dealing with a cyclic group, and repeated multiplication by the generator g moves us forward through the keyspace. The more values that are pre-computed, the shorter the time for the forward search.

Hellman's TMTO for Block Ciphers

In 1980 (around the time Kerberos was coming into being) the famous cryptographer Martin Hellman, co-inventor of public key cryptography, proposed what he called a time-memory trade-off (TMTO) for recovering keys of block ciphers. His goal was to pre-compute a set of values that would limit the fraction of the keyspace that would need to be searched to find a particular key. This sounds like BSGS, but there is a big problem in that practical block ciphers are not cyclic - no single encryption transformation can be used to generate the set of all possible encryptions under different keys.

The Ideal TMTO

To explain TMTO let's consider a cipher for which the plaintext, ciphertext and key all have the same length. This is the case for AES-128 for example, but it is not true for DES or AES-256, and the TMTO attack has to be modified for these latter cases. The idea is to build a TMTO table that covers all the possible encryptions of a given plaintext P.

The attacker starts by selecting a random key R1 and then encrypts P to produce C1 = E(R1, P), which can also be interpreted as another key K1 = C1 since keys and ciphertexts are bit strings of the same length. The attacker then encrypts P with K1 to yield K2 = E(K1, P), and so on. Assume that the attacker repeats this operation N times to produce a sequence of keys

R1, K1, K2, ... , K(N-1), KN = S1

where R1 is called the start point and S1 is called the endpoint, and the overall sequence of encryptions is called a chain. The attacker then stores the pair (R1, S1) to represent the first row of the TMTO table. Then the attacker selects another start point key R2, and produces another chain of length N represented by the pair (R2, S2). More chains are generated, using different random keys as starting points, until all or a large fraction of the key space is expected to be covered by (or represented in) the chains of the table.

Let the attacker produce M table rows in this fashion, at a computational cost of M chains of length N (about N*M encryptions) and a storage cost of 2*M values (the start and end point for each chain). How large should N and M be made? Well the attacker wants each possible key to appear in some chain of the table. So if the key has n bits, then selecting N = M = sqrt(2^n) = 2^{n/2} will ensure that N*M is approximately 2^n. Note that, if everything works out nicely, we get a compressed form of the keyspace, where the keys have been reordered into chains and each chain is compressed just by storing its start and end points.

Computing these tables will be a large effort, proportional to the number of possible keys. But the advantage is that once computed, the tables can be repeatedly used to break keys efficiently, which is achieved as follows. Say an attacker has a plaintext-ciphertext pair (P, C) computed under some unknown key K. The attacker then creates a chain beginning with C as the start point, and extends the chain until an endpoint in the table is reached. The found endpoint defines the row of the table where K lies, and the attacker can then use the start point of that row to re-compute the keys of the row, searching until K is found. The advantage over a brute-force search is that the attacker only needs to search over one row of the table, which is much smaller than searching over the whole key space.
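The sketch below walks through the idea on a deliberately tiny scale - the "cipher" is a hash-based stand-in with a 20-bit key space, and the chain and table sizes are arbitrary illustrative choices, so this shows the principle of Hellman's tables rather than a real attack:

```python
# A toy version of the table just described. The "cipher" is a hash-based
# stand-in with a 20-bit key space, P0 is the fixed known plaintext, and the
# chain/table sizes are arbitrary illustrative choices.
import hashlib, random

KEY_BITS = 20
KEY_SPACE = 1 << KEY_BITS
P0 = b"known-plaintext"

def toy_encrypt(key, plaintext=P0):
    """Stand-in block cipher mapping a 20-bit key to a 20-bit 'ciphertext'."""
    digest = hashlib.sha256(key.to_bytes(4, "big") + plaintext).digest()
    return int.from_bytes(digest[:4], "big") % KEY_SPACE

def build_table(num_chains, chain_len):
    """Pre-computation: store only the (endpoint -> startpoint) pairs."""
    table = {}
    for _ in range(num_chains):
        start = k = random.randrange(KEY_SPACE)
        for _ in range(chain_len):
            k = toy_encrypt(k)            # each ciphertext is reused as the next key
        table[k] = start
    return table

def lookup(table, chain_len, ciphertext):
    """Given C = E(K, P0), try to recover K (or an equivalent key) from the table."""
    point = ciphertext
    for _ in range(chain_len):
        if point in table:                # hit an endpoint: K may lie on this chain
            k = table[point]
            for _ in range(chain_len):    # regenerate the chain from its startpoint
                if toy_encrypt(k) == ciphertext:
                    return k              # the predecessor of C in the chain
                k = toy_encrypt(k)
            # false alarm (merged chains): keep walking forward
        point = toy_encrypt(point)
    return None

random.seed(1)
table = build_table(num_chains=1 << 10, chain_len=1 << 10)   # aims at ~2^20 coverage
secret = random.randrange(KEY_SPACE)
found = lookup(table, 1 << 10, toy_encrypt(secret))
print(found is not None and toy_encrypt(found) == toy_encrypt(secret))
```

Note that the lookup can fail for some keys, which previews exactly the coverage problems discussed in the next sections.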

The TMTO Principle at Work

Before we consider the practical issues with implementing TMTO as described above, let's step back and look at the principle of TMTO and draw out its advantages. If an attacker is given a plaintext-ciphertext pair (P,C) then to perform an exhaustive search the attacker selects some order for the keys, and then performs trial encryptions until a key K is found that encrypts P to C. So the attacker has only one point in the key space which can be used to stop the search. This is pretty much the same as trying to solve the DLP as described above by searching through 1, 2, 3, ..., x - you only have one stopping point in your search.

The principle of TMTO is to identify more keys that can be used to terminate a search or to define a relatively small interval of the keyspace where K must lie. Each (key) chain of a TMTO table defines such an interval, and when the attacker starts a chain using C as the initial key, reaching an endpoint indicates the row that contains K. In the ideal case described above, the attacker can store about 2^{n/2} endpoints which can then be used to identify a subset of 2^{n/2} keys that must contain K. Thus the search for the key K is reduced to about 2^{n/2} encryptions as opposed to 2^{n-1} encryptions on average for an exhaustive search. So the endpoints in the table correspond to the giant steps in the BSGS method, and the searching forward from a given ciphertext roughly corresponds to the baby steps.

TMTO in Practice

We have left out many details in the description above since we are mainly giving the idea behind the TMTO principle. The main problem in practice is that the chains are not very well-behaved. What we want is for each chain to produce distinct values, and for distinct chains to produce distinct values as well - in short, one new and distinct key for each chain encryption. However due to the random properties of encryption we cannot expect this, which reduces the coverage of the keyspace in the tables. More chains need to be generated to compensate.

The diagram below (taken from here) shows nicely behaved chains, with startpoints (SP) and endpoints (EP) that neatly cover some distinct subset of the keyspace. The red dot shows a given ciphertext that when iterated will lead to the endpoint EP1, and then the chain can be reconstituted from SP1 and searched forward for the key K, which is the predecessor to C in the chain.

[image: well-behaved chains running from startpoints (SP) to endpoints (EP)]

In practice, constructing TMTO tables that give a high probability of recovering keys is much more messy than described for the ideal case. The main issue is that each key from the keyspace should appear in some row of the TMTO table. However chains based on iterating (repeatedly applying) an encryption function need not produce distinct keys in a given chain, or produce distinct keys between different chains. A given chain can loop back upon itself or two chains may hit upon a common key and then merge in the table. In both cases the coverage of the keyspace in the TMTO table is reduced. It is therefore possible that the chain constructed by the attacker using C as the initial key may not reach an endpoint in the table. Even if it does, regenerating the endpoint chain may not lead to C since the chain for C may start outside the table and only merge into some row of the table. In practice chains can look more like the diagram below, snaking and merging so that K need not even be a predecessor on the chain beginning at SP, even though iterating C forward leads to EP.

[image: chains that loop and merge, so that iterating C forward to EP need not lead back to K]

Also we have assumed that each encryption in a chain defines a new key, but this is only true when keys and ciphertexts are the same bit length. When this is not the case then an additional function must be used to reduce (or expand) the ciphertext to match the stated key length.

Most of these difficulties are addressed in a collection of optimizations on TMTO tables that result in rainbow tables, originally devised by Philippe Oechslin in 2003 in the context of improving TMTO for recovering Windows NT passwords. The optimizations are subtle but the basic idea is to introduce a series of iteration functions (not just encryption) that also incorporate the position in the chain as a parameter, which reduces the looping and merging of chains. Even so, the parameters need to be carefully chosen to ensure appropriate coverage of the keyspace.

Rainbow Tables for A5/1

Given this background we can now take a more informed look at the rainbow chain parameters being proposed by Karsten Nohl in his recent project for producing rainbow tables for A5/1. The opening sentence on the page states that the tables are a mixture of 2 concepts, namely distinguished points and rainbow tables.

Distinguished Points (DP) are an optimisation, suggested by Ron Rivest, to reduce the number of endpoint lookups. In theory such a lookup must be made after each iteration, which in practice means a costly disk access outside the cosy realm of RAM. The DP optimisation is to create endpoints which have a simple distinguishing property, such as the first or last 10 bits being all zero. Thus the iteration function need only be interrupted for a disk search if the next chain value has this distinguished property.

Each A5/1 chain has an average length of 2^{20}, and each rainbow table is expected to have 2^{28.5} such chains. To get the right coverage just over 2^8 tables need to be generated and stored, for a total effort of 2^{57}. These 2^8 tables will be generated in a distributed manner (by different nodes), and it is part of Nohl's project to find volunteers for this effort. The total storage for the rainbow tables is about 2.5 TB, or just 6 GB per node for the distributed network that Nohl is proposing. To find a key from a given ciphertext requires examining about 3.5 billion chain values, which on one machine with a good graphics card will take about 2 hours, or just over a minute using the network.

However the project page cautions about the accuracy of these estimates and some data is presented on the likelihood and extent of chain merging. The objective of the above chain calculation is to cover just under 2^{49} elements in the keyspace, but this apparently is optimistic. Some heuristics for coping with degenerate chains are discussed but it seems that we will have to wait for the project to complete (that is, for the tables to be generated) to determine the success rate.

Final Remarks

Research into TMTO-based cryptanalysis continues since, on the one hand, there are many variations and optimizations of TMTO that can be applied, while on the other, cheaper hardware and faster processors make this attack approach more viable, particularly for ciphers like A5/1 whose key length puts brute force on the boundary of a feasible computation. If you want to read more about TMTO then Karsten Nohl has a great article here on the topic, and there is also an excellent survey paper from Mark Stamp.

Thursday, September 17, 2009

The Luxembourg Attacks and AES-256

There have been several attacks on AES-256 in the last few months, referred to as the "Luxembourg Attacks". I had hoped to write about the attack announced in June before I went on summer holiday, but ran short on time - and then another attack was announced in July. It is the first attack that I wish to address here to emphasize some points I made in an earlier post.

In May I wrote about the topic of AES-256 and Reputational Risk where I argued that the huge 256-bit key makes it possible for someone to construct an attack which has "big savings" over exhaustive search yet still does not effectively change the margin of security. What suffers in this case is the reputation of AES-256, and the post concluded with

Imagine you posed the following question to a group of top physicists. You asked them to present you with ideas for new research projects where they could assume that the budget included all the money that we have, all the money that has ever been, and the total financial assets of the world for the next 10 million years. Would the resulting proposals be credible?

AES-256 puts cryptanalysts on the same research agenda.

So the first Luxembourg attack gives you an example of the type of arguments that can be proposed when working with 256-bit keys. The attack needs to work with 2^{119} plaintext/ciphertext pairs, requiring about 2^{77} storage, all to yield 156 bits of the key. The remaining 100 unknown bits of the key are guessed, at an average cost of 2^{99} encryptions. This guessing step is hardly worth mentioning as compared to the main 2^{119} computation, since it represents something like one part in a million of the total work.

Some Details of the Attack

The basis of the attack is differential cryptanalysis (DC), a powerful attack method that became widely known around 1990 and was standard fare by the time AES was announced. In this attack the cryptanalyst finds a pair of values (~P, ~C) such that two plaintexts P1 and P2 with difference ~P = P1 + P2 are biased towards producing ~C as the difference C1 + C2 of their corresponding ciphertexts (here the difference operator + usually means XOR, but it varies according to how the key is combined internally in the cipher). In short, plaintexts of difference ~P produce ciphertexts of difference ~C with some probability that is much higher than 2^{-N}, where N is the block size of the cipher. Given such a pair of differences (~P, ~C) an attacker can use statistical methods to gain information about parts of the key that were used in the last or next-to-last round, for example. The remaining bits are then guessed.

DC is typically quite data intensive. Even though plaintexts of difference ~P may be biased towards producing ciphertexts of difference ~C, this may only happen with probability 2^{-50} for example, which sounds quite low but it is much higher than you would expect for a 128-bit block cipher. Therefore the attacker will need to obtain  2^{50} plaintext pair encryptions of difference ~P to get on average just one ciphertext pair of difference ~C. Part of a successful DC attack is designing a filtering strategy to discard most ciphertext pairs that don't have ~C as a difference before they are spuriously used as the basis to give information about some part of the key. So a large number of encryptions are needed, where the majority will be winnowed away, leaving a precious few that lead to information about the key.

In the  first Luxembourg attack (LA1), the basic DC attack is upgraded to a boomerang attack which eliminates the need to thread the differences (~P, ~C) to each other through all the intermediate rounds of the cipher. LA1 also uses related keys to coerce differences into a desired form - in the vanilla DC attack we are trying to keep plaintext differences in a known state, but by using related keys, the key can also be used to bring the internal state of the cipher into a known difference.

The basic step of LA1 is to obtain 2^{61} encryptions of specially crafted quartets (4 related values), and then combine these values amongst themselves to generate the required internal differences. It is this combination process and its associated filtering which drives up the cost of the attack to 2^{119}. The result is that the remaining unfiltered quartets identify all but 100 bits of the key which, as we mentioned above, can be guessed at a small marginal cost.

Luxembourg 2.0

Even though the attack has unrealistic resource requirements, it works on the full version of AES-256, which is a significant result – something that no other researchers have been able to do – and that's not due to lack of trying. And the reputation of AES-256? Well the Register, for example, reported that LA1 represents an improvement over brute force search for AES-256. Even that is debatable, since you need a huge amount of plaintext/ciphertext encrypted under several related keys to mount the attack, while brute force needs just a few blocks to execute. What's in the future then? Well I guess it's here already in the second Luxembourg attack, which is on my reading list.


Tuesday, August 25, 2009

Solo Desktop Factorization of an RSA-512 Key

It was recently announced that Benjamin Moody has factored a 512-bit RSA key in 73 days using only public software and his desktop computer. The first RSA-512 factorization, in 1999, required the equivalent of 8,400 MIPS years over an elapsed time of about 7 months. The announcement was made all the more intriguing in that it came from a mailing list associated with a user forum for Texas Instruments calculators.

The public key in question is used to verify signed OS binaries for the TI 83 Plus and the factorization means that

… any operating system can be cryptographically signed in a manner identical to that of the original TI-OS. Third party operating systems can thus be loaded on any 83+ calculators without the use of any extra software (that was mentioned in recent news). Complete programming freedom has finally been achieved on the TI-83 Plus!

The original post from Moody was not very informative, but the subsequent Q & A thread drew out more details. Moody used public source factoring software on a dual-core Athlon64 at 1900 MHz. Just under 5 gigabytes of disk was required and about 2.5 gigabytes of RAM for the sieving process. The final computation involved finding the null space of a 5.4 million x 5.4 million matrix.

Most security people would rightly point out that RSA-512 is known to provide little security as a key length; however, this does not mean that 512-bit public keys are not still in use today by companies other than Texas Instruments. By the way, someone is offering an RSA-512 factoring service with a "price indication" of $5000.

CORRECTION, Feb 19, 2010: The original post said the factoring was done in 73 hours when the correct value is 73 days, as pointed out in the comment below. Thank you to Samuel for pointing out the mistake.

Self-Destructing Digital Data with Vanish

University of Washington researchers recently announced a system for permanently deleting data from the internet. The solution, called Vanish, can be used for example to delete all copies of an email cached at intermediate relay sites and ISPs during transmission from sender to receiver. Advertising that Vanish provides self-destructing data conjures up the digital equivalent of a tape that bursts into flames after its message has been played. But data protected by Vanish neither self-destructs nor does the system actively seek out data at third party sites for deletion.

Vanish works by encrypting data with a secret key (say an AES-256 key), splitting the key into secret shares, and then storing the shares at a randomly selected set of nodes in the distributed hash table (DHT) of a public P2P network. In this way Vanish creates an encrypted object for inclusion in email, for example, that the sender can transmit to a receiver or group of receivers.
[image: Vanish system overview]
When a receiver opens the encrypted object, Vanish attempts to access a sufficient number of DHT nodes with shares so that the key can be recovered and the data decrypted.

The self-destructing aspect is that the key shares will be deleted as part of the natural node churn in the DHT, quite independent of the actions of both the sender and the receiver. The lifetime of a share is about 8 to 9 hours in the DHT, after which there is a low probability that there will be a sufficient number of shares to reach the recovery threshold.

So the encrypted data does not self-destruct - but rather key recovery information is placed in volatile public storage that is very likely to delete that information after a short delay as part of its normal operation. And with that deletion also disappears logical access to all copies of the unencrypted data.
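To make the key-splitting step concrete, here is a minimal sketch of threshold secret sharing in the spirit of what Vanish pushes into the DHT (Shamir sharing over a prime field); the prime, share count and threshold are illustrative assumptions rather than Vanish's actual parameters:

```python
# A minimal sketch of splitting a data key into threshold shares, in the
# spirit of what Vanish pushes into the DHT. This is plain Shamir sharing
# over a prime field; the prime, share count and threshold are illustrative
# assumptions, not Vanish's actual parameters.
import secrets

PRIME = 2**521 - 1        # a Mersenne prime comfortably larger than a 256-bit key

def split_key(key_int, n, k):
    """Split key_int into n shares such that any k of them recover the key."""
    coeffs = [key_int] + [secrets.randbelow(PRIME) for _ in range(k - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):        # Horner evaluation of the polynomial at x
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares

def recover_key(shares):
    """Lagrange interpolation at 0 recovers the constant term, i.e. the key."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num = den = 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret

data_key = secrets.randbits(256)          # e.g. the AES key protecting the email
shares = split_key(data_key, n=10, k=7)   # shares pushed to random DHT nodes
assert recover_key(shares[:7]) == data_key     # any 7 surviving shares suffice
assert recover_key(shares[3:]) == data_key
```

Once DHT churn has deleted enough shares to drop below the threshold, the key - and with it the data - is unrecoverable.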

The full paper that describes the Vanish system has extensive results from experimenting with the Vuze P2P network, as well as other local simulations, examining the interplay of various parameters such as the number of shares versus the deletion rate.

Friday, April 3, 2009

Three Security maps in FreeMind and Flash

Here are the Mind Maps that I constructed to help me sift through information on three security topics from last year - an improved attack on A5/1, the Cold Boot Attack, and the Debian crypto flaw. In each case there is considerably more detail, and there are more references, in the Mind Maps than in the posts that were derived from them.

In February 2008 an improved attack was announced on A5/1, the cipher used to encrypt GSM mobile phone traffic. While A5/1 is not considered strong, the new attack claimed faster recovery of keys using fewer assumptions and less data. This Mind Map provides an overview of the issues and what was claimed.

Also in February last year, the Cold Boot Attack was devised by Princeton researchers. This Mind Map gives an overview of what was claimed, what the reactions were, and a lot of opinion on how this attack came about. In short, many professionals knew the attack could work in principle but it took an actual demonstration to convince them thoroughly.

In May 2008 Luciano Bello discovered a flaw in the random number generator of OpenSSL, which led to the discovery that the Debian Etch distribution had been producing weak (public) keys for well-known protocols such as SSL and SSH over the previous 18 months. This Mind Map provides an annotated set of links on the topic.



Tuesday, August 19, 2008

Quantum Computing: are you Shor?

Quantum computing ringing the imminent death knell of encryption is a familiar refrain in the popular press. The end of a dynasty is close at hand. A recent article in IT News is typical when it states that advances in quantum computing are likely to render "digital encryption techniques all but obsolete". Not quite.

Quantum computers are very different from the ones we have on our desktops now, usually referred to as classical computing devices. A potential misconception is to think that quantum computers will teleport us to a future where the benefits of Moore's Law compounded over several centuries can be reaped. Intel's leading chip line, the x86, has steadily progressed from 286, 386, 486, Pentium and so on, but quantum computers will not be "1000-86" devices - unimaginably faster versions of what we have today.

Quantum computers are built from different computational primitives than our present devices. It is difficult to explain these differences formally, but the main idea is that quantum computers can operate in many states simultaneously whereas classical computers are built from components that are interpreted to operate in one of two states - 0 or 1, on or off.

The massive parallelism furnished by quantum logic cannot as yet be used to solve general computational problems. A specific quantum algorithm must be tailored for each problem class of interest - such algorithms are more a bespoke technology than an off-the-shelf one. The quantum algorithm that has received the most attention with respect to security was devised by Peter Shor, and shows that factoring is efficient on a quantum computer. When you read that current encryption methods will be rendered useless by quantum computation, the authors are usually referring to Shor's algorithm.

Shor's Algorithm

How does Shor's algorithm work? Well it uses a quantum algorithm to solve a particular equation whose solution is likely to factor a number. For the last 30 years or so, most modern factoring algorithms have been converging on a single idea, which is the following. Say that N = P*Q is the number that you wish to factor. Assume that you can find two numbers X and Y such that

X^2 = Y^2 (mod N)

which means that the squares of X and Y leave the same remainder when divided by N. It then follows that

X^2 - Y^2 = 0 (mod N)

(X - Y)(X + Y) = 0 (mod N)

so that N divides (X - Y)(X + Y). It can be shown that there is at least a 50% chance that X is not equal to Y or -Y modulo N, in which case both (X - Y) and (X + Y) share a nontrivial factor with N. If this is the case, then N can be factored by computing gcd(X - Y, N) and gcd(X + Y, N).

 

Let's call X and Y a magic pair, and if we can generate enough magic pairs for a given N then we can factor it (think of flipping a coin and waiting for it to turn up heads). So given that factoring is generally believed to be difficult then generating magic pairs must be difficult as well. Most modern factoring algorithms are in fact clever ways to generate magic pairs. While these methods are much faster than brute force approaches, they are far from being efficient.

The breakthrough from Shor was to devise an efficient quantum method for generating magic pairs, which therefore also makes factoring efficient. He did this by reducing the problem of finding magic pairs to what is called computing orders.

Let Z be a number such that 1 < Z < N and gcd(N, Z) = 1 (that is, a number less than N and relatively prime to N). When these conditions hold, there is a smallest number K < N such that

Z^K = 1 (mod N)

This value of K is called the order of Z (modulo N). If K is even, say equal to 2*D, then we have that

Z^(2*D)  =  (Z^D)^2  =  1 (mod N).

But we can rewrite 1 as 1^2 and express the equation as

(Z^D)^2 = 1^2 (mod N)

to yield a magic pair by setting X = Z^D and Y = 1. So finding a number with even order is the same as finding a magic pair.

Shor invented a brilliant quantum algorithm for efficiently computing orders (the details of which are too complicated to reproduce or even summarise here). Then to factor a number N, just generate a random number Z and quantum-compute its order. If the order is even, then create a magic pair and try to factor N; if N does not factor, or the order is not even, then choose another Z and repeat. You won't have to do many iterations before N is factored with high probability.
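Here is a toy classical walk-through of that recipe, where the order is found by brute force (the one step Shor's algorithm performs efficiently on a quantum computer) and the modulus is a small illustrative value:

```python
# A toy classical run through the recipe above: the order of Z is found by
# brute force below (the step Shor performs efficiently on a quantum computer),
# an even order yields a magic pair, and gcd extracts a factor.
from math import gcd
import random

def order(z, n):
    """Smallest k > 0 with z^k = 1 (mod n); brute force, so only for tiny n."""
    k, acc = 1, z % n
    while acc != 1:
        acc = (acc * z) % n
        k += 1
    return k

def factor_via_orders(n, attempts=50):
    for _ in range(attempts):
        z = random.randrange(2, n)
        if gcd(z, n) > 1:
            return gcd(z, n)              # lucky: z already shares a factor with n
        k = order(z, n)
        if k % 2:
            continue                      # odd order: try another z
        x = pow(z, k // 2, n)             # magic pair (x, 1): x^2 = 1 (mod n)
        if x == n - 1:
            continue                      # x = -1 (mod n) only gives trivial factors
        return gcd(x - 1, n)
    return None

n = 3127                                  # 53 * 59, a toy RSA-style modulus
p = factor_via_orders(n)
if p:
    print(p, n // p)                      # expect 53 and 59 (in either order)
```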

So it is not so much that factoring can be solved easily on a quantum computer, but rather that computing orders is easily solved on a quantum computer and factoring can be reduced to computing orders. Any cryptosystem whose security rests on a problem that can be reduced to computing orders will also be insecure. Unfortunately this is the case for most well-known public key cryptosystems, including RSA, discrete logarithm based systems like Diffie-Hellman and DSA, as well as elliptic curve systems. It is easy to see how Shor's algorithm quickly becomes a the-sky-is-soon-to-be-falling story line in the press.

However, there are public key systems that are not impacted by Shor's algorithm. One such scheme is the McEliece cryptosystem whose security is derived from intractable problems in coding theory, which has nothing to do with computing orders. The McEliece cryptosystem is not commonly deployed but perhaps its day will come.

Quantum Speculation 

We remarked at the beginning of this post that quantum computing is not a technology for cheating Moore's Law - quantum computers are not general devices that will bring with them exponential improvements over what we have today. We need to craft individual quantum algorithms for each class of problem that we want to solve rapidly on a quantum platform. Computing orders is one such general problem whose quantum solution has dire implications for most of today's common public key systems.

The quantum future that brings the end to the RSA dynasty is quite a few  decades away, if not more. Your great grandchildren's RSA keys may be under threat, if we are still using RSA then. You may also have some reason for concern if you are encrypting or signing data with public keys that you are assuming will be secure for the next 50 years or so.

Today from a risk perspective quantum computers are a low-probability high-impact event - a coming catastrophe for many extant public key systems. In 100 years quantum computers are likely to be a high-probability high-impact event - that is, a commonplace tool being used for better or worse.

Our RSA keys are safe for some time yet, until quantum computation matures, and it may even turn out that large scale implementations with many quantum logic gates are difficult to engineer. Certain public key systems will survive even sophisticated quantum computer deployments in any case, unless as yet undiscovered specific quantum algorithms can be tailored to these systems as well. One can speculate endlessly.

Friday, August 8, 2008

Long Tail of Vulnerability for A5/1

Stream ciphers are a special class of cipher, often used for fast encryption of data streams such as dedicated network links or fax lines. Stream ciphers produce a key stream that is simply added (XOR-ed) with the data stream to produce ciphertext, and as such can achieve high encryption rates especially if implemented in hardware.
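Here is a minimal sketch of that construction - a keyed keystream generator whose output is simply XORed onto the data. The single LFSR below is loosely modelled on one of A5/1's 19-bit registers, but the other two registers and the irregular clocking are omitted, so this illustrates the principle only and is in no way A5/1 itself (nor secure):

```python
# A minimal sketch of the keystream-XOR construction. The single LFSR below is
# loosely modelled on one of A5/1's 19-bit registers, but the other two
# registers and the irregular majority clocking are omitted.
def keystream(key, nbits):
    """Generate nbits of keystream from a 19-bit key using one LFSR."""
    state = (key & 0x7FFFF) or 1          # non-zero 19-bit starting state
    bits = []
    for _ in range(nbits):
        bits.append(state & 1)            # emit the low bit
        fb = ((state >> 18) ^ (state >> 17) ^ (state >> 16) ^ (state >> 13)) & 1
        state = (state >> 1) | (fb << 18)
    return bits

def xor_crypt(key, data):
    """Encrypt (or decrypt) data by XORing it with the keystream."""
    bits = keystream(key, 8 * len(data))
    ks = bytes(sum(bits[8 * i + j] << j for j in range(8)) for i in range(len(data)))
    return bytes(d ^ k for d, k in zip(data, ks))

key = 0x5A5A5
frame = b"toy GSM frame payload"
ciphertext = xor_crypt(key, frame)
assert xor_crypt(key, ciphertext) == frame    # the same operation decrypts
```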

Below is a diagram of the A5/1 stream cipher, invented in 1987 to protect GSM voice and data traffic. A5/1 loads a 64-bit secret key into its 3 shift registers, which are then clocked and combined with a frame vector to produce an initial state. A5/1 is then clocked as required to produce keystream bits to encrypt the next 114 bit GSM frame.

For over 10 years theoretical attacks on A5/1 have been postulated that lower the cost of attack significantly over searching the 64-bit key space directly. Amongst cryptographers and mobile security people, A5/1 is considered to be insecure and its 64-bit key size woefully inadequate by today's standards.

In February David Hulton, director of applications for the high-performance computing company Pico, and Steve Muller, a researcher for mobile security firm CellCrypt, announced new results which claim that A5/1 keys can be recovered in 30 minutes for $1000. The attack is both passive (no network injection or basestation masquerading) and can be executed remotely (no proximity to the target device required). Further, Hulton & Muller (H&M) are patenting their optimizations and intend to sell an A5/1 key recovery tool commercially. Cell call snooping may soon be affordable by many groups and people, not just well-resourced intelligence agencies.

The attack is based on pre-computing rainbow tables that enable the direct "lookup" of A5/1 keys when indexed by a small amount of observed traffic (3 to 4 GSM frames). While it remains impractical to create a rainbow table that has one entry for each of the possible 2^(64) A5/1 keys, H&M have devised a method that only requires a table of size 2^(58). Dedicated FPGA devices are currently being deployed to speed up the required rainbow table computations, which are expected to be completed by the end of March. By way of comparison, it is estimated that attempting the same computation on a lone PC would take 33 thousand years.

The resulting rainbow tables will be several terabytes in size, and will be made available to the public. H&M also intend to sell a fast cracking version of the software that can break GSM encryption in just 30 seconds, charging between $200,000 and $500,000 for this privilege. Perhaps they will also offer a service for recovering GSM keys from submitted traffic samples, giving rise to an instance of cryptanalysis-as-a-service (CaaS).

While researchers and security professionals view the H&M approach as cheaper and more efficient than previous attacks, the attack itself comes as no surprise given the known weaknesses of A5/1. The practicality of the H&M approach should be the final piece of evidence to convince operators to move off A5/1 as a basis for secure and private GSM communication. David Pringle, a spokesperson for the GSM Association (GSMA), has declined to comment on the H&M attack until further details are known.

A5/1 has operated unchanged for the last 21 years but it has now reached its cryptographic end-of-life, engulfed by the march of Moore's Law. However, the operational end-of-life of A5/1 may still be decades away as there are approximately 2 billion GSM subscribers, commanding about 80% of the global mobile market. This would be a tough product recall indeed. A5/1 is well-positioned to become the NT of the mobile crypto world, and I see the makings of a long tail of GSM vulnerability.

You can find the research used to produce this post as a FreeMind mindmap rendered into Flash here.



Tuesday, July 22, 2008

The Cold Boot Attack

In February, a team led by researchers at Princeton announced how to recover laptop encryption keys directly from the DRAM (dynamic RAM) of the device. This attack, dubbed "Cold Boot", set off a flurry of posts and articles expressing dismay, seeking clarification and listing defences. The contents of DRAM take several minutes to clear after power is shut off, and an attacker with physical access to the machine has an opportunity to recover the contents of memory over those few minutes. Contrary to popular belief, dynamic RAM is only somewhat dynamic.

Reproduced below are a series of images from the Princeton team depicting the DRAM decay of a Mona Lisa image over a period of 5 minutes. The striped image on the far right shows the ground states for the DRAM, which are the 0 or 1 states that the memory elements will eventually return to after power is removed. By the time DRAM reaches the striped ground state all information on encryption keys (and any other parameters) in memory is lost.

[image: DRAM decay of a Mona Lisa image over 5 minutes, ending in the striped ground state]

A common reaction from experienced security professionals was to state that Cold Boot attacks were well known to exist in principle, and that the Princeton work has filled in the practical details. "[The cold boot attack has] been in the toolbox of forensics examiners for some time," said Murugiah Souppaya, a researcher at the National Institute of Standards and Technology. The attack is more of a wake-up call than a bolt of lightning.

Hard-disk encryption vendors were also quick to point out that Cold Boot attacks exploit hardware vulnerabilities and not encryption weaknesses. The true risk comes from software that relies on power loss to purge keys from memory rather than the software explicitly clearing memory itself. The lesson for users is to ensure that power is shut off to DRAM when they are not using their laptops (either by shutting down the laptop or by putting it into hibernate mode).

When the key bits are extracted from memory, some will already have false values due to ground state decay. Pictured left is a decayed Mona Lisa image which still contains sufficient information to recognize the original. The Princeton researchers have devised algorithms for detecting and recovering key bits corrupted by ground state decay, which, after stripping away all the fanfare of the attack, is the most significant contribution of the work. How do we recover the true key from a corrupted copy?

Coding theory is the study of exactly this problem, in the context of sending and receiving messages where messages bits can be corrupted (flipped) during transmission. Additional information, known as redundancy, must be added to each message so that errors can be detected and corrected. Without some redundancy there is little hope to find a handle on what errors have occurred.

Luckily the keys do in fact come with their own redundancy in DRAM. A user-supplied key to a block cipher is converted into an internal format that matches the operation of the cipher. For example, in the case of DES the user-supplied 56-bit key is converted into a 768-bit extended key, corresponding to the 48-bit sub-keys used in each of the 16 rounds of DES. Now the sub-keys themselves consist of bits that are just copies of bits from the user-supplied key, and each bit of the user-supplied key occurs around 14 times in the extended key.

For efficiency reasons, the extended key is stored in DRAM rather than the user-supplied key. So for DES, and its triple-DES variants, each key bit will be stored around 14 times in DRAM. Thus there are roughly 14 copies of each key bit to work with, which we can think of as 13 bits of redundancy for each key bit. PGP CTO Jon Callas likened this key decoding process to completing the remaining squares of a Sudoku game. While the analogy is not perfect, it is suggestive of the type of work that must be performed to decode partially decayed keys from memory.
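A toy sketch of the kind of redundancy-based recovery this makes possible is shown below. It assumes unidirectional decay toward a ground state of 0 and exactly 14 stored copies of each key bit - both simplifying assumptions for illustration, not the Princeton reconstruction algorithm itself:

```python
# Toy redundancy-based key recovery: stored 1s may decay to the ground state 0,
# but 0s stay put, and every key bit is stored in 14 places.
import random

COPIES = 14             # assumed copies of each key bit in the DES key schedule
DECAY_PROB = 0.3        # chance that any stored 1 has decayed to the ground state 0

def decay(bit):
    """Simulate unidirectional DRAM decay of one stored copy of a bit."""
    return 0 if (bit == 1 and random.random() < DECAY_PROB) else bit

def recover_bit(copies):
    """With decay only toward 0, any surviving 1 proves the original bit was 1."""
    return 1 if any(copies) else 0

random.seed(0)
key = [random.randint(0, 1) for _ in range(56)]                # a 56-bit DES key
memory = [[decay(b) for _ in range(COPIES)] for b in key]      # decayed copies
recovered = [recover_bit(copies) for copies in memory]
print(recovered == key)   # almost always True: all 14 copies of a 1 rarely decay
```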

For me, the Cold Boot incident is reminiscent of Richard Feynman's involvement in the 1986 Challenger disaster inquiry. There was a long debate on whether the O-rings would lose their resilience under cold conditions, and Feynman demonstrably settled the question by placing a clamped O-ring sample into a glass of ice water and showing that it did not spring back when removed. At the time Freeman Dyson noted that this was an example of Nature providing a simple answer when asked a simple question.

The Princeton team asked Nature the simple question of whether DRAM is cleared on power loss, and the simple answer is no.

You can find the research used to produce this post as a FreeMind mindmap rendered into Flash here.


Sunday, July 13, 2008

A Blackish Swan for Debian Crypto

In May Luciano Bello discovered a flaw in the Debian version of Linux which, as described in CVE-2008-0166, yielded "a random number generator that generates predictable numbers, which makes it easier for remote attackers to conduct brute force guessing attacks against cryptographic keys". While this sentence is unlikely to win any awards for clarity, the gravity of the situation should nonetheless be apparent.

Since September 2006 the Debian Etch distribution has been producing weak (public) keys for well-known protocols such as SSL and SSH, as well as certificates for personal encryption and signing. A few days after the announcement SANS raised its INFOCon level to yellow, indicating an increased threat against the general internet infrastructure (the INFOCon level has since returned to green). While the offending Debian code is easily patched (only two lines need to be uncommented) it will take much longer to assess the impact of over 18 months of weak keys being handed out by Debian.

The Best Laid Plans

The root cause of the incident centres around memory allocation. Programs frequently request additional memory when they are running. The operating system (OS) satisfies such requests by finding an available block of memory, marking the block as allocated, and returning the starting address of the block to the requesting program for access.

The memory allocated to a program request will not be empty - the block will always contain some values in terms of zeros and ones. These values are typically what was left behind by the last program that used the allocated locations.

When a program is allocated memory it is good programming style for the programmer to assign known values to the locations so that subsequent computations involving these locations begin from a known state. For example, if one of the allocated locations was to be used for a variable hits to count the number of visitors to your web site, then hits should be initialised to zero. If the last value assigned to the memory location of hits was 192 or -517, then simply incrementing hits for each new web site visitor will not achieve the desired result.

Code scanning tools are able to detect instances of memory that are allocated to a program but are not initialised before being used. The scanning tools will return warnings and advise that initial values be assigned. In general this is good advice unless the programmer had an unconventional reason for using uninitialised memory.

But the programmers of the OpenSSL PRNG apparently had such a reason. They actually wanted to have memory allocated for the seed to the PRNG and to use the previous values contained in those locations as part of the seed. Later, a Debian programmer, seeing that code scanning tools were complaining about uninitialised memory in the OpenSSL PRNG, took the well-intentioned step of commenting out the offending code, removing it from the resulting executable program.

The offending code consisted of changes to just two lines.

Commenting out these two lines (unfortunately) did not stop the PRNG from working but had the effect of reducing its seed to be just the process identifier (PID) for the program. The PID is a value used by the OS to distinguish one process from another for administration purposes. The PID is typically a 15-bit value that is able to represent numbers in the range 0 to 32,767, or less than 33,000 distinct values in total. Cryptographic keys that were then generated from such a seeded PRNG are selected from only a small number of possible values and can therefore be easily broken. Instead of keys being based on hundreds or thousands of bits of entropy, keys were based on at most 15 bits of entropy.

So in summary: the seed generation process for the OpenSSL PRNG was designed to rely on the previous values of dynamically allocated memory. This unusual programming practice was not well communicated, and a Debian programmer subsequently removed the two lines of code that seeded the PRNG through memory allocation. The effect was to produce a PRNG that could be seeded with only about 33,000 different values, which leads to a predictably small number of cryptographic keys.
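A toy illustration of why 15 bits of seed entropy is fatal follows; the hash-based key derivation is a stand-in for illustration, not OpenSSL's actual code - the only thing that matters is that an attacker can enumerate every possible PID:

```python
# Toy illustration: if a key depends only on a 15-bit PID, an attacker can
# simply try every PID. The "key generation" here is a stand-in hash, not
# OpenSSL's actual code.
import hashlib

def toy_keygen(pid):
    """Stand-in for deriving a key from a PRNG seeded only by a 15-bit PID."""
    return hashlib.sha256(b"seed:" + str(pid).encode()).hexdigest()

victim_key = toy_keygen(21931)        # generated on a flawed system, PID unknown

# The attacker simply tries all 32,768 possible PIDs.
recovered_pid = next(pid for pid in range(2**15) if toy_keygen(pid) == victim_key)
print(recovered_pid)                  # 21931 - the whole "key space" falls instantly
```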

The impact of changing two lines of code


The scope of the impact of Debian creating weak keys is multidimensional: the time scales, other code distributions, security protocols, security credentials and finally data. With respect to time, the PRNG programming flaw has been present in Debian since September 2006 (the Etch release), and was not publicly detected until May this year, meaning that weak keys have been generated on the Debian distribution for over 18 months now. Further, other Linux distributions derived from Debian, including versions of Knoppix and Ubuntu, have also inherited the flaw over this time period.

Security protocols whose underlying cryptography was weakened include VPN, SSH, SSL and its derivatives such as FTPS, as well as any application relying on X.509 key material (general public/private key pairs). The full list of impacted security protocols and applications is quite extensive. Further, weak SSL certificates bound to a domain name, such as www.mycompany.com, on internet-facing servers will need to be revoked and reissued for the company administering that domain. An article in the Register, called Debian's Epic SSL Blunder, states that the number of SSL certificates that may need replacing could be in the hundreds of thousands or even millions. So while the OpenSSL PRNG code can be easily patched, identifying and replacing all the weak keys generated by the flawed code is a big operational headache. It may be months or years before all the weak keys and their corresponding certificates are tracked down and replaced.

Another serious threat is the potential exposure of data, including security credentials such as passwords, that was transported over public networks under the false assumption that it was protected by strong cryptography. It has already been suggested in a Heise Security article that secret service organisations have been exploiting the weak keys to eavesdrop on such communications.

A Blackish Swan?

Black Swan

According to Wikipedia, a black swan "is a large-impact, hard-to-predict, and rare event beyond the realm of normal expectations". Are we dealing with a Debian Black Swan? It is too early to judge this incident as an outright Black Swan, or the catalyst for an imminent Black Swan, since the consequences have not fully played out. Certainly the incident has highlighted potential weaknesses in the open source programming model, if not in general, then at least in the interaction between open source developers and vendor developers. Ben Laurie of OpenSSL originally blasted Debian for changing code that they "did not understand", but in a subsequent post his tone was more conciliatory as evidence of the change being discussed on OpenSSL mailing lists was presented. His final remark was "I welcome any suggestions to improve this situation".

Flaws related to encryption always make good copy, and on occasion, strike at the heart of our fundamental beliefs in security. When encryption falters the whole edifice of security seems shaken.

You can find the research used to produce this post as a FreeMind mindmap rendered into Flash here.


Sunday, June 22, 2008

RSA is 30, and counting

rsa-photo

This year is the 30th anniversary of the journal publication for the RSA cryptosystem:

R. Rivest, A. Shamir, L. Adleman. A Method for Obtaining Digital Signatures and Public-Key Cryptosystems. Communications of the ACM, Vol. 21 (2), pp. 120–126, 1978.

This is one of the most cited papers in all of computer science. The authors are pictured above around the time their RSA paper was published (Shamir, Rivest, Adleman - left to right). I learned about the mathematics of RSA in 1985 as part of a cryptography course based on Dorothy Denning's excellent introductory text Cryptography and Data Security. Over the years quite a few security professionals have confided in me that they do not really understand the basic mathematics of how RSA works. In this post I hope that I can tell the RSA story one more time, in a way that makes it easier to understand.

Background

Let E be a generic encryption operation, and let D be its corresponding decryption operation. For any message M, a basic property of any cryptosystem is that

M = D(E(M))

or in words, that decrypting an encrypted message yields the original message. We can then think of D(E( )) as a self-mapping, taking messages M to themselves under the action of some key. To explain RSA we are going to look at several well-known self-mapping functions from number theory, and then show how we can decompose these self-maps into two parts, which can be thought of as encryption and decryption.

All references below to numbers will mean non-negative integers: 0, 1, 2, 3 and so on. Number theory is mainly concerned with these numbers and their properties. Also, we will use the terms number and message interchangeably, since any message has a binary representation that can be interpreted as a number (see the standard PKCS #1 for a specific method of doing this).

Modular Exponentiation

Our source of self-mappings will be derived from modular exponentiation, the fundamental operation in many public key systems including RSA, DSA, Diffie-Hellman, and El Gamal systems.

We start with the simpler notion of exponentiation. Exponentiation is the operation of raising a base M to an exponent e, written as M^e, which simply means taking the product of e copies of M. When e = 2 the operation is called squaring, and when e = 3 it is called cubing. For larger values of e we also describe M^e as "raising M to the e-th power".

There are two rules which can be used to simplify exponentiations, both of which are used in RSA: the product rule

M^(e*d) = (M^e)^d

and the sum rule

M^(e + d) = (M^e) * (M^d).

In modular exponentiation we are given an additional parameter m and we need to compute M^e mod m. For any value x, x mod m means to return the remainder of dividing x by m. So 17 mod 5 = 2 since 17 = 3*5 + 2, or in words, 5 goes into 17 three times with 2 left over (the remainder). Here m is called the modulus. So M^e mod m means to first compute M^e and then return the remainder upon dividing by m.
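Both rules, and the behaviour of mod, are easy to check numerically. The sketch below uses Python's built-in three-argument pow(base, exponent, modulus), which computes modular exponentiation directly:

# Checking the exponentiation rules and modular reduction on small numbers.
M, e, d, m = 7, 5, 3, 11

assert M**(e * d) == (M**e)**d           # product rule
assert M**(e + d) == (M**e) * (M**d)     # sum rule

assert 17 % 5 == 2                       # 17 = 3*5 + 2, so 17 mod 5 = 2
assert pow(M, e, m) == (M**e) % m        # M^e mod m via three-argument pow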

Flawed RSA v1

After that introduction, we need one simple result to take us forward, called Fermat's Little Theorem: let p be a prime and let M be any value in the range 0 < M < p, then

M^(p - 1) = 1 mod p

Since M^p = M^(p-1) * M by the exponentiation sum rule, and M^(p-1) = 1 mod p it then follows that

M^p = M mod p.
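Both statements are easy to verify numerically for a small prime (a spot check, not a proof):

# Checking Fermat's Little Theorem, and its consequence M^p = M mod p, for p = 61.
p = 61
for M in range(1, p):                    # every M with 0 < M < p
    assert pow(M, p - 1, p) == 1         # M^(p-1) = 1 mod p
    assert pow(M, p, p) == M             # M^p = M mod p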

This is good news since we have now found a self-mapping function. Can we somehow decompose this self-map into two parts, one that we can call encryption and the other decryption?

Well imagine that Alice can find two values e and d such that e*d = p, and let Alice publish the pair (e, p) as her public exponent and her public modulus, respectively. If Bob wants to send her a message M she instructs him to encrypt it using the following modular exponentiation

C = M^e mod p

and Bob then sends C to Alice. When Alice receives C she recovers M by the following modular exponentiation

M = C^d mod p.

But why does this operation return M to Alice? Well, it's because

C^d = (M^e)^d = M^(e*d) = M^p = M mod p.

Here we have used the exponentiation product rule: the exponentiation applied by Bob is itself exponentiated by Alice to return the original message.

But we have a snag. We said that Alice found e and d such that e*d = p, but since p is prime then the only possible solutions for e and d are 1 and p. But if e = 1 or e = p then the encryption operation M^e mod p applied by Bob has no effect on M since

M^1 = M^p = M mod p.

Encryption would be doing nothing.

But here is the key observation.

The property that permits Alice to recover M is not e*d = p but rather that

e*d = (p - 1) + 1 = 1*(p-1) + 1

and Alice can still recover M if e*d is equal to 2*(p-1) + 1, or 3*(p-1) + 1, or k*(p-1) + 1 for any k > 0. Why? Well using the product rule we have that

M^(k*(p-1)) = M^((p-1)*k) = (M^(p-1))^k = 1^k = 1 mod p,

and then using the sum rule

M^(k*(p-1) + 1) = M^(k*(p-1)) * M = 1 * M = M mod p.
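A quick spot check of this argument in Python:

# M^(k*(p-1) + 1) = M mod p for any k > 0, checked for a few values.
p = 61
for k in (1, 2, 3):
    for M in (2, 10, 42):
        assert pow(M, k * (p - 1) + 1, p) == M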

So we are back in business if we can find e and d such that e*d = k*(p-1) + 1, for some k > 0. But using modular arithmetic, this is the same as saying that

e*d = 1 mod (p - 1).

As it turns out there is something called the Euclidean algorithm (in its extended form), which is a method for efficiently solving for x in modular equations of the form a*x = b mod m whenever a and m share no common factors. So we can just pick a random value for e in the range 1 < e < p - 1 that shares no factors with p - 1, and then use the Euclidean algorithm to find the corresponding value of d for which e*d = 1 mod (p - 1).

Let's see an example of this. Let p = 61 and select e = 17. Then d = 53 since 17 * 53 = 1 mod 60, or more specifically, 17 * 53 = 15 * 60 + 1. Encryption would then be defined as C = M^17 mod 61 and decryption as M = C^53 mod 61. Here our messages would need to be values less than 61, so at most 6 bits in length.
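Here is that toy scheme in Python. The built-in pow(e, -1, m) (Python 3.8 and later) computes the modular inverse, playing the role of the Euclidean algorithm described above:

# Flawed "v1" scheme: a single prime modulus p = 61 with public exponent e = 17.
p, e = 61, 17
d = pow(e, -1, p - 1)         # d = 53, since 17*53 = 1 mod 60
assert d == 53

M = 42                        # any message with 0 < M < p
C = pow(M, e, p)              # Bob encrypts: C = M^17 mod 61
assert pow(C, d, p) == M      # Alice decrypts: C^53 mod 61 recovers M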

But now we have another snag - quite a serious one this time. When Alice publishes her public key (e, p), what distinguishes her from other people is her knowledge of her decryption key d. But since e*d = 1 mod (p - 1) then another person Carol can easily determine d by simply using the Euclidean algorithm to solve e*d = 1 mod (p - 1) since e and p - 1 can be directly derived from the public (e, p).

To fix this snag we need to change our tactics and move from a public modulus that is a prime p to a value n = p*q that is the product of two primes p and q.

Correct RSA v2

The importance of Fermat's Little Theorem in the last section is that it gave us a self-mapping modulo a prime p. There is a generalisation of this result to composite numbers known as Euler's Theorem: for any n, let a be any value in the range 0 < a < n that shares no factors with n; then

a^phi(n) = 1 mod n

where phi(n) is the number of values in the range 1 to n that share no factors with n. Note that for a prime p we have phi(p) = p - 1, so Euler's Theorem includes Fermat's Little Theorem as a special case.

If we let n = p*q for two distinct primes p and q then it is easy to show that phi(n) = (p-1)*(q-1). Now we can repeat the construction of the previous section, replacing the computations based on p - 1 (from Fermat's Little Theorem) with computations based on phi(n). To this end we select a random e in the range 1 < e < phi(n) that shares no factors with phi(n), and use the Euclidean algorithm to solve for d such that

e*d = 1 mod phi(n) or e*d = 1 mod (p-1)(q-1).

Alice's public key becomes (e, n) and Bob can encrypt a message M for her as C = M^e mod n. Alice decrypts the message as M = C^d mod n, which works because

C^d = (M^e)^d = M^(e*d) = M^(k*phi(n) + 1) = (M^phi(n))^k * M = 1^k * M = M mod n.

Let's illustrate these principles with the RSA example from Wikipedia. Let Alice choose as her two primes p = 61 and q = 53. Then n = 61 * 53 = 3233 and phi(n) = (61 - 1)*(53 - 1) = 3120. If Alice selects her public exponent to be e = 17 then d = 2753 since

17 * 2753 = 46801 = 15 * 3120 + 1.

So the public key of Alice is then (17, 3233) and anyone can encrypt a message for Alice by computing C = M^17 mod 3233. Alice recovers the message by computing M = C^2753 mod 3233.
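Putting the whole worked example together in Python (toy parameters only; real deployments use much larger primes and a padding scheme such as PKCS #1):

# The Wikipedia-sized RSA example, end to end.
p, q, e = 61, 53, 17
n = p * q                     # 3233, the public modulus
phi = (p - 1) * (q - 1)       # 3120
d = pow(e, -1, phi)           # 2753, Alice's private exponent
assert d == 2753

M = 65                        # any message with 0 < M < n
C = pow(M, e, n)              # anyone can encrypt with the public key (17, 3233)
assert pow(C, d, n) == M      # only the holder of d = 2753 can decrypt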

By now you should be able to follow this T-shirt description of RSA.

rsa

Let's think about Carol again. Can she easily determine Alice's private key d from her public parameters (e, n)? Well, if she could determine phi(n) from n then she could use the Euclidean algorithm to solve for d in the equation e*d = 1 mod phi(n). But as was shown in the original RSA paper, deriving phi(n) from n is equivalent to being able to factor n, and factoring is believed to be computationally infeasible when p and q are large and properly chosen.
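Of course, with the toy modulus used above that hardness assumption fails completely: n = 3233 can be factored instantly by trial division, which is exactly how Carol would recover d. A small sketch:

# Carol's attack on the toy key: factor n, rebuild phi(n), and solve for d.
n, e = 3233, 17
p = next(f for f in range(2, n) if n % f == 0)   # 53, found immediately
q = n // p                                       # 61
phi = (p - 1) * (q - 1)                          # 3120
d = pow(e, -1, phi)                              # 2753 -- Alice's "private" exponent
assert d == 2753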

We have left out many technicalities in this description of RSA, and to implement RSA in practice you need to solve many other issues, such as how to find large primes. For RSA to be secure, n must be sufficiently large to defeat the best known factoring algorithms. Using current estimates p and q should be at least 500 bits in length, and often over 1000 bits for additional security. The construction for RSA as outlined above is completely general - you can pick the primes as small or large as you like and the math still goes through. However, larger keys take longer to process.