Diffie-Hellman

Introduction

The Diffie-Hellman key exchange is an asymmetrical crypto system that allows one to establish a common secret (usually a common secret key) over an insecure channel. As such, the Diffie-Hellman system doesn't, at first sight, allow one to encrypt or sign directly some data, as one would expect from a crypto system, but can allow this indirectly by establishing a common secret key that can be used afterwards with a symmetrical crypto system. The D-H system is historically based upon the multiplicative group of a modular field.

There are extensions to the original D-H system which do allow one to encrypt data, and to sign data ; there are also extensions of the D-H principle to other cyclic groups than the multiplicative group of a modular field. In this contribution, we will concentrate on the historical version in the multiplicative group of a modular field, and just in its function as a common key establishment mechanism. Even though this seems to be rather limited, one shouldn't underestimate the power of the D-H key exchange: it allows one to establish a secure common secret key over an insecure connection, which was the "question begging itself" in symmetric cryptography: if you have a secure channel, why using cryptography, and if you don't have a secure channel, how do you share the secret key ? D-H is the solution to the latter riddle.

We will implement a version of the D-H key exchange principle in our big number code system.

Diffie-Hellman system set-up

Contrary to the RSA crypto system, where there's no a priori convention to be made, apart the very fact of using the RSA concept, the D-H key exchange needs both parties to agree upon a specific set up. The set-up in standard D-H key exchange is the choice of a particular modular field (hence, a particular prime number) and a particular generator of the multiplicative group (or a sub-group !). This set-up is entirely public and it is tempting to go for a "known" D-H set-up which is published. However, there are reasons to consider finding your own D-H set-up. Of course, pedagogically, it is always good to be able to do things "from scratch". This also helps you to be independent of anything but your own knowledge and computing. But there are other reasons to prefer one's own set up. Indeed, it is now believed that the "commonly used" D-H set-ups, well-designed as they may be, are so much used that specific attacks against a particular set-up, needing serious investment, are worth-while because there are so many D-H systems using it. By investing in an attack on a specific group, one opens the possibility of cracking a lot of communications. It is believed that certain state-sponsored agencies have invested in exactly that. If you generate your own system, chances are nobody is going to invest (or already have a solution) of your particular system.

That said, one has to be careful when setting up a D-H system, for two reasons. If we have a modular field, it means that the modulus is a prime number p. But the multiplicative group of the modular field has one element less than the field: zero is excluded. As such, the order of the multiplicative group is p - 1, which is an even number, and hence cannot be a prime. As such, it has non-trivial divisors, and for each of these divisors, there is a cyclic sub group. If one now picks a "generator" randomly in the group, it can just as well be a generator of a cyclic subgroup than it is a genuine generator of the whole group. The actual group one is working in, is then possibly much, much smaller than the order of the multiplicative group of the field, and the "impossibly difficult" discrete logarithm problem on which the security of D-H is based, may in fact be not so hard at all. Second, even to a D-H system with a good generator (that is, an actual generator of the whole group, and not just of a sub group), the attacks are easier if there are subgroups. So, one should find a prime number p such that p - 1 doesn't factor in too many relatively small numbers. Given that p - 1 is even, it will already contain a factor of 2, that's unavoidable. However, if we have picked a very large prime, it is essentially impossible to find the big prime factors of p - 1. The whole security of the RSA system is based upon that assumption ! So if we pick a big prime, we have in fact no idea what are the subgroups of the multiplicative group. This is why it is better to construct this prime, in the following way:

we pick a large prime q.
we calculate p = 2q + 1
we test whether p is a prime or not. If not, we go back to 1.

As such, we know the prime factors of p - 1: it is 2 (unavoidable: even number) and q, a prime ! Primes constructed that way are called "safe primes" (sometimes confused with "strong primes").

Given that we know the factors of the order of the multiplicative group, namely q and 2, we know that there are only two cyclic subgroups. A group of order 2, and a group of order q. This is about the best situation we can have for the D-H problem. Ideally, our generator is a genuine generator of the whole group. But even if it is only a generator of the subgroup of order q, that is still a very large group. So we can pick just any random number in the group as a generator (then we have about one chance out of 2 to be in the subgroup of order q, or have a genuine generator of the whole group). Or we can test whether this number, to the power q, gives us 1, in which case we know it is a generator of the subgroup, and pick another number. We should ideally also test whether it is not THE generator of the group of order 2, by testing whether its square isn't 1 ; that said, the probability to fall onto this single element is essentially zero.

If we find a big, safe prime, and on top of that, we pick a good group generator, we have a good D-H set-up.

This is what we establish with the following piece of code, where we would like to remind: Code examples

 
void diffiesetup(int nbits, number* dhprime, number* dhgen)
/* this procedure generates a Diffie-Hellman setup */
  {
  number oldmodulo;
  number zero;
  number one;
  number two;
  number nbitsbig;
  number allbits;
  number ran1;
  number dummy;
  number ranlim;
  number firstprime;
  number firstprime2;
  number test;
  int i;

  /* we put the current modulus away and set the modulus to its maximum */
  for (i = 0 ; i < nofdigits ; i++)
    {
    oldmodulo.digit[i] = modulo.digit[i];
    modulo.digit[i] = basis - 1;
    zero.digit[i] = 0;
    one.digit[i] = 0;
    two.digit[i] = 0;
    }
  one.digit[0] = 1;
  two.digit[0] = 2;
  /* we calculate the highest possible number in agreement with our number of bits (minus 1) */
  makebig(nbits-1,&nbitsbig);
  power(&two,&nbitsbig,&allbits);
  /* we try to find a strong prime, that is a prime number which is equal to 2*q + 1
     where q is also a prime number */
  do
    {
    /* we find the prime firstprime */
    do
      {
      randomgen(&ran1);
      dividenum(&ran1,&allbits,&dummy,&ranlim);
      findnextprimerabin(&ranlim,&firstprime,10);
      }
    while (isbigger(&firstprime,&allbits) != -1);
    //printf("got a first prime \n");
    //printnum(&firstprime);
    /* now, we repeat, such that 2*firstprime + 1 is also a prime: dhprime */
    timesnum(&firstprime,&two,&firstprime2);
    sumnum(&firstprime2,&one,dhprime);
    }
  while (isprimerabin(dhprime,10) == 0);
  /* we now put our Diffie-Hellman strong prime as a modulus */
  for (i = 0 ; i < nofdigits ; i++)
    {
    modulo.digit[i] = dhprime->digit[i];
    }
  /* we find the generator dhgen and test whether it is a primitive root */
  do
    {
    randomgen(&ran1);
    // note that at this point, firstprime2 is dhprime minus 1
    dividenum(&ran1,&firstprime2,&dummy,dhgen);
    power(dhgen,&firstprime,&test);
    }
  while (isbigger(&test,&one) == 0 || isbigger(dhgen,&zero) == 0);
  /* we now have a dhprime which is a strong prime, and

  /* we put the original modulo parameter back */
  for (i = 0 ; i < nofdigits ; i++)
    {
    modulo.digit[i] = oldmodulo.digit[i];
    }   
  } // end diffiesetup

/* ----------------------------------------------- */

The fact of being "master" of the set-up makes that once the principle is understood, we are free of any externally given information. But, we can also use established D-H set-ups which are published.

Diffie-Hellman key pair generation

The set-up being public, there's no real need for a cryptographically secure random number generator. However, for the secret key, this is essential. We warn the reader that the code here uses the standard C-library random number generator, which is absolutely not cryptographically secure, and that its usage is only for illustrative purposes. Don't generate a real private key in a real application with this generator !

The key generation itself in D-H is in fact quite easy: the secret key is a random element of the group, and the public key is the generator of the set-up, raised to the power of the secret key.

 
void diffiekeypair(number* dhprime, number* dhgen, number* pkey, number* skey)
/* This routine generates a key pair if a given Diffie-Hellman setup is specified */
  {
  number oldmodulo;
  number ran1;
  number dummy;
  number one;
  number dhprimem1;
  int i;  

  /* we put the current modulus away and set the modulus to its maximum */
  for (i = 0 ; i < nofdigits ; i++)
    {
    oldmodulo.digit[i] = modulo.digit[i];
    modulo.digit[i] = dhprime->digit[i];
    one.digit[i] = 0;
    }
  one.digit[0] = 1;
  difnum(dhprime,&one,&dhprimem1);
  do
    {
    randomgen(&ran1);
    dividenum(&ran1,&dhprimem1,&dummy,skey);
    }
  while (isbigger(skey,&one) != 1);
  power(dhgen,skey,pkey);

  /* we put the original modulo parameter back */
  for (i = 0 ; i < nofdigits ; i++)
    {
    modulo.digit[i] = oldmodulo.digit[i];
    }   
  } // end diffiekeypair

The code speaks for itself.

The generation of the common secret key

If two users of a D-H system want to establish a common secret key over an insecure channel, they have to communicate of course which set-up they use (this can be implicit if they use a given software package, but there needs to be some dialogue of the set-up in one way or another), and they send each-other their respective public keys. Using the other party's public key, and one's own secret key, a common secret key can be calculated shared between both parties, without the need to communicate anything sensitive. This is what the following piece of code establishes. The variable names are inspired from the point of view of user B.

 
void diffiecommon(number* dhprime,number* pkeyA, number* skeyB, number* comkey)
/* given one's own private key, and the other party's public key 
the common secret key is found within a given DH setup */
  {
  number oldmodulo;
  int i;

  /* we put the current modulus away and set the modulus to its maximum */
  for (i = 0 ; i < nofdigits ; i++)
    {
    oldmodulo.digit[i] = modulo.digit[i];
    modulo.digit[i] = dhprime->digit[i];
    }

  power(pkeyA,skeyB,comkey);

  /* we put the original modulo parameter back */
  for (i = 0 ; i < nofdigits ; i++)
    {
    modulo.digit[i] = oldmodulo.digit[i];
    }     
  } // end diffiecommon

Comments

The conceptual simplicity of the Diffie-Hellman key exchange in its classical form has a similar consequence as the simplicity of the RSA crypto system: it is not difficult to implement it "from scratch" just with the conceptual knowledge of how it works. As such, this kind of cryptography is forever out of the Box of Pandora, and there is no way to disallow people to use it, as no "common software" is necessary to set it up, whose central distribution could be put in difficulties by authorities wanting to outlaw secure cryptography.

That said, the classical form of the Diffie-Hellman key exchange needs very large keys (at least 1024 bits, and if one wants to have some serious security, more than 2048 bits), and is hence computationally expensive. Moreover, contrary to the RSA system which is a full crypto system, in its classical form, the Diffie-Hellman system only allows people to have a common secret key, but now they still have to use that key to communicate or do something else.

There are two visions on the D-H system. One is that it is a general way to have public keys amongst many people, so that each individual pair of users can have its own secret key. This implies that all people agree upon a given choice of the D-H set-up, but has the advantage to have generally known public key distribution systems. Another view is to use the D-H system as a temporary system to set up a common secret for one single communication, between two parties that couldn't exchange securely a common key in advance. In this case, one of the parties has to take the initiative to start the conversation, ideally by sending a proposed D-H set-up (a safe prime and a generator), together with a public key. The other party can then take that set-up, generate one's own key pair, and reply with his public key. That is sufficient to have a common secret key, that can be used for a symmetric cypher communication. It is this form of communication that doesn't need any other convention, software or whatever.