Post by Ilya O. LevinI would like to bring to your attention the paper
"Sapparot-2. Fast Pseudo-Random Number Generator".
The purpose of this paper is to introduce an enhanced
version of Sapparot. It is a high-speed pseudo-random
number generator efficient in both software and hardware
yet very simple.
The paper is available at
http://www.literatecode.com/get/sapparot2.pdf
Your paper talks about this as a PRNG, and yet you have a section in it
about security, and you're posting to sci.crypt.research; both of these
lead me to the impression that you think it might be useful as a stream
cipher. By the standards stream ciphers are currently held to, I don't
think it is. Although it might be a perfectly good PRNG for statistical
or gaming applications.
There is a class of attacks called "Guess-and-Determine" attacks. In
these we first tally up what we know about the various unknown values,
usually considered as single bits. Things like "this xor that xor carry
= known". Choose the equation with the least unknowns, and guess the
value of one of them. Eventually this will allow us to infer
("determine") the value of some other unknown bit, so we get to a place
where we start to get two-for-the-price-of-one. Soon after that
happens, we start to get to a point where we can check whether the
results seem to "add up" -- where, if we guessed wrong, we'll get a
contradiction, and can backtrack. Our paper
http://www.qualcomm.com.au/PublicationsDocs/new_snow_gd3.pdf shows one
application of this technique in all its glory, where we manage to
recover the key for the original version of SNOW in
less-than-brute-force time. (And I'm not talking about time-memory
tradeoffs here, which you mention in your paper.)
I can see a fairly straightforward way to apply this kind of attack to
Sapparot-2. There's only one thing in there that prevents it being
pretty easy, and that's the variable rotation by the B column, applied
to the C column. (The carries from the integer additions are only a
minor complication, best handled by working from least significant bits
upward where possible.) So, we work around the rotation. Assuming that
it really is a fairly OK PRNG, we can assume that any desired value for
that rotation occurs uniformly at random. (In all the discussion that
follows, I'll use numbers for the t = 32 bit version.) It seems to me
that it will be advantageous to the attacker to keep the incoming "A"
column and the C column "lined up", so we will first look for places
where the high order 5 bits of the incoming B are "00111". This will
rotate the C column by 7 places, same as A. To do this we'll have to
gather about 33 consecutive words of output from the generator, and try
to apply the attack to each pair; generally it will fail, but we expect
it to work for one of them. This multiplies the effective complexity of
the attack by 2^5, but then gives us 5 bits of information we can use
already.
Anyway, we now can assume the value for those five bits is what we
needed it to be. Note that these are also the same five bits that get
xored to the low-order values taken from A into the B column! (This is
lucky for this attack; you might want to change that 5 bit rotation to
something different.) Now look at the least significant bit of the
first output. It's the xor of the three LSBs. But the first thing you
do to it xor A into C (we're talking about the LSB here, so add is
equivalent to xor since there's no carry.) So we know that that bit is
actually, now, just the LSB of B xor C... one variable has already
disappeared! If we guess the LSB of input B, for example, we now know
the corresponding LSB of input C, which becomes bit 7 of output C. Now
look at the B column; we already guessed that LSB, so we know it. It
gets xored with a 1 bit (that we knew from the rotation amount) and a
constant 1 bit from the A column, which cancel out, so the LSB of
output A is now known. ...
Now this kind of analysis is most amenable to machine optimization, but
even in the example I started above, where I just consider one pair of
consecutive outputs, we have 64 bits of information about the state,
and only 96 bits of state. Our experience tells me that we can probably
recover the internal state in something not much worse than 2^32
guesses, counting backtracking when we can check internal
contradiction; the depth of the search tree will probably be greater
than 32, but many branches will get pruned before they are anywhere
near that deep. Note: I'm describing the attack as if I just look at a
single pair of words, but in reality I think it will probably work
better on three or four consecutive outputs, with corresponding
requirements for more plaintext. The tradeoff here is that you get much
more information about the bits, but have to manage more complexity. I
wish I had finished my "generic G&D attack" program, this would be a
great exercise for it. But I never did, and I don't have time to go off
and implement this specific attack.
Hmmm... going back and looking again, I think that choosing the
rotation to be 5, and not 7, actually works even better. But anyway.
Any honours student out there who wants to take my start and run with
it?
Anyway, I could be wrong, particularly about the expected complexity of
such an attack, but I really don't think this generator is
cryptographically sound.
Greg.