Home
swineone's Journal
 
[Most Recent Entries] [Calendar View] [Friends]

Below are the 3 most recent journal entries recorded in swineone's LiveJournal:

    Thursday, July 7th, 2005
    12:34 am
    A cryptographic data escrow protocol
    So here's what I've been working on:

    For a P2P protocol that I'm trying to design, I needed a secure way of `selling' a piece of data to an untrusted person (the lack of trust is symmetrical, in fact: the buyer doesn't trust the seller either). Immediately one might suggest the introduction of a third party that's trusted by both of them, to perform an escrow service. OK, that's fine, except that this is going to be used in a P2P protocol, so you can't shift back the load to a central server. It occurred to me that the seller might send an encrypted version of the data, then use the escrow authority to perform an escrow of the key only. While not completely decentralized, this seems suitable.

    There are two problems with the naive protocol as stated:

    • Why should the buyer trust that the seller will not simply send random data instead of what was promised?

    • Why should the seller trust that the buyer is not going to lie about receiving correctly the data and kill the transaction?

    Note that this isn't about being sure that the encrypted data was correctly transmitted; that's easy to do. The point is whether the seller even bothered to encrypt the correct data at all, and if they didn't, it's obvious that no amount of error correction is going to change that.

    Now let's suppose we had a hash function h() with an interesting property: given a message m, an encryption function E_k() where k is the key, and the hash h(m), then h(E_k(m)) can be efficiently deduced from k and h(m) (besides being directly computable given just E_k(m)). Obviously h() should be a strong hash function otherwise: hard to find collisions and preimages, etc. In that scenario, assuming the escrow authority knows h(m), the escrow protocol could proceed like this:

    1. Escrow authority generates a key k and sends it to the seller, and computes h(E_k(m)), given that it knows k and h(m);

    2. Seller encrypts m with k: c = E_k(m), and sends it to the buyer;

    3. Buyer computes h(c) = h(E_k(m)) and checks with the escrow authority whether it matches the expected value;

    4. If affirmative, buyer is sure that seller was honest and completes the transaction.

    Initially it occurred to me that one could seek a linear hash: one for which h(m XOR s) = h(m) XOR h(s). This, combined with a strong stream cipher (or a block cipher in a mode of operation that emulates a stream cipher, say CTR mode) would do the job. My advisor did have one at a hand (a MAC, rather): the Galois/Counter Mode (GCM) of operation for a block cipher. If the message m could be written as the concatenation of block m0, m1, m2, ..., then GCM computes something like m0 H + m1 H2 + m2 H3 + ..., where H is a polynomial in GF(2128), and the entire computation is performed in GF(2128) -- that is, modulo an irreducible binary polynomial of degree < 128. The polynomial H is the encryption of a string of zeros under some key (recall that this is a MAC, not an ordinary hash). The problem is, once this key (and hence H) is revealed, it becomes easy to forge a hash: just equate the expression above (m0 H + m1 H2 + m2 H3 + ...) with the desired hash value, then fix all coefficients (the mi's) save for one, then solve the corresponding polynomial equation.

    As a sidenote, I have an argument that the inherent idea of a linear hash is flawed (but not a MAC: rest assured that GCM is still unbroken), which I'm sure has been published already, but here goes: if your hash has b bits, consider the vector space GF(2)b (note: this isn't the same as GF(2b)). Generate a few random random bitstrings (all of the same length) and map them to GF(2)b using your hash, and take note of these preimages. Now, one expects that b+1 or b+2 bitstrings should be necessary until you have a set of b linearly independent vectors in GF(2)b. Now, take these as a basis of your vector space, and perform a change of base to the canonical basis, applying the same operations to the preimages that were stored before. Now you will have a bitstring which hashes to 100..000, another which hashes to 010...000, and so on. It's easy to XOR these together to obtain any desired hash value. (Note that the change of basis is not strictly necessary, but it's certainly easier than solving a linear system each time a preimage is desired; either way the hash is broken.)

    But I digress. So it seems that the whole idea has to be scrapped? I believe not. To fake a hash in GCM, one must solve polynomial equations in GF(2128), which is easy. In fact, it can be as simple as solving a linear equation of the form ax = b for x, which requires computing an inverse of a. To prevent inverse computation, I thought of hiding the group/ring/field order away. The only algebraic structure I'm aware of where this would work is the multiplicative group (Z/nZ)* for n = pq a product of two large primes -- the same setting as RSA. The group order is phi(n) = (p-1)(q-1), so anyone who doesn't know the factors of n can't compute phi(n) either (or so it is hoped). In fact, the whole premise behind RSA is that, given the public exponent e, one can't compute the private exponent d which is the inverse of e modulo phi(n).

    Initially I thought of computing something similar to a GCM hash, say h(m) = h(m0 || m1 || m2 || ...) = bm0 H + m1 H2 + m2 H3 + ... mod n for some integers b and H. Now, if one replaces the XOR operation is a stream cipher by integer addition, then h(m + s) = b(m0 + s0) H + (m1 + s1) H2 + (m2 + s2) H3 + ... mod n = h(m) h(s). In fact one doesn't need to break up the message at all: just interpret it as an integer in binary (in fact that's a special case of this definition, for a choice of H as a certain power of two.) Then h(m) is simply b^m mod n.

    The only problem with this scheme is efficiency: for each bit of the message, the buyer needs to compute a big modular multiplication (512-bit at the very least). The throughput for such a calculation should be less than a megabyte per second, I believe. The escrow authority is privileged in that it knows phi(n) and can reduce the message modulo phi(n) to gain a lot of time, and the seller doesn't need to compute this hash at all. The buyer though doesn't have a choice. Well, at least it's a start, and I'm thinking of ways to improve this without losing security.

    Current Music: Stand Away (Angra)
    Saturday, June 25th, 2005
    9:57 pm
    First (real) post
    To kick off the journal, I'm going to mention a couple of mistakes that I've made in some C++ code that almost drove me up the walls here. Hope that's helpful to someone out there.

    Consider this piece of code:

      if (switchedToLList)
        if(!llist[x-minx].empty())
          for (std::list::iterator my_iter = llist[x-minx].begin(); my_iter != llist[x-minx].end(); my_iter++)
            fprintf(fptr, "\n%d %d", x, *my_iter);
      else
        for (y=minyforx[x-minx]; y<=maxyforx[x-minx]; y+=2);
          if (array[x-minx][Y_BYTE] & Y_BIT)
            fprintf(fptr, "\n%d %d", x, y);

      whatever();

    Now if the linked list in question is empty, then the whatever() statement is executed, right? Wrong! The indentation in this code does not reflect the way the compiler parses it; the else statement is linked to the first if, not the second. Hence, for this code to perform what the indentation implies that it does, one should include explicit braces like this:

      if (switchedToLList)
      {
        if(!llist[x-minx].empty())
          for (std::list::iterator my_iter = llist[x-minx].begin(); my_iter != llist[x-minx].end(); my_iter++)
            fprintf(fptr, "\n%d %d", x, *my_iter);
      }
      else
        for (y=minyforx[x-minx]; y<=maxyforx[x-minx]; y+=2);
          if (array[x-minx][Y_BYTE] & Y_BIT)
            fprintf(fptr, "\n%d %d", x, y);

      whatever();

    Took me forever to figure this out. In general, if you have nested if statements, the else statement is linked to the innermost if; remember to use explicit braces if that's not what you want.

    On to the second problem. Suppose you have some code like

      std::list mylist;
      // ...
      for (std::list::iterator i = mylist.begin(); i != mylist.end(); i++)
      {
        // Operate on i
        // ...

        if (some_condition)
          mylist.erase(i);
      }

    This code is going to segfault. Now that should be obvious, it's an elementary mistake, but sometimes things are not so `in-your-face' as in this example, and it's not the first time I've been bitten by this kind of bug. Just remember to do something like

      std::list mylist;
      // ...
      for (std::list::iterator i = mylist.begin(); i != mylist.end(); i++)
      {
        // Operate on i
        // ...

        if (some_condition)
        {
          i--;
          mylist.erase(i);
        }
      }

    That's all for today, folks. See you next time.

    Current Music: Six Degrees of Inner Turbulence (Dream Theater)
    9:28 pm
    Hello!
    Welcome to my journal. Here I'll attempt to document the geeky things that I've been doing. Those might range from weird mathematics to low-level programming and everything in between. When I'm not in a computer/maths geek mood, you might read about the latest album or movie that I downloaded, or what song I'm learning to play in the guitar (hopefully keyboard as well soon). Please leave your comments so I can pretend I'm not writing to myself (ha, who am I kidding?)

    Current Music: The Great Debate (Dream Theater)
About LiveJournal.com

Advertisement