Modular Exponents in Racket

DRAFT

Come August, I’ll be back as the TF for CS201, which is taught in Racket. As a way of refreshing my functional programming (FP) memory, I want to implement efficient algorithms for two problems:

- exponentiation: computing $\text{base}^{\text{exp}}$
- modular exponentiation: computing $\text{base}^{\text{exp}} \bmod p$

The constraint is that both functions must run in $O(\log \text{exp})$ time (whenever I refer to $\log$ in this post, I mean $\log_2$). The identical time complexities hint that we can solve both problems with the same basic approach.

Along the way, I want to introduce some practical tips.

Exponents

In Lisp-based languages, the purely recursive solution to $\text{base}^{\text{exp}}$ is elegant; however, it runs in linear time (there are $\text{exp}$ multiplications).

; recursive O(exp) solution
(define (power base exp)
  (cond
    [(= exp 0) 1]
    [else (* base (power base (sub1 exp)))]))
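
To see why this is costly in memory as well as in time, expand a small call by hand. Each multiplication is left pending until the recursion bottoms out, so the stack grows with exp:

; (power 2 4)
; → (* 2 (power 2 3))
; → (* 2 (* 2 (power 2 2)))
; → (* 2 (* 2 (* 2 (power 2 1))))
; → (* 2 (* 2 (* 2 (* 2 (power 2 0)))))
; → (* 2 (* 2 (* 2 (* 2 1))))
; → 16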

One way to improve on the memory demands of power is to make it tail-recursive, so that the multiplication operation does not have to wait around for the result of the recursive call to bubble back up. We can keep the canonical signature in a wrapper, pow1, and then add a parameter that remembers the running total:

; tail-recursive O(exp) solution
; wrapper for keeping the function signature clean
(define (pow1 base exp)
  (power1 base exp 1))

; note how final application is simply procedure itself
(define (power1 base exp result)
  (cond
    [(= exp 0) result]
    [else (power1 base (sub1 exp) (* base result))]))
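
Expanding the same small call for the tail-recursive version shows the accumulator at work; nothing is left pending between calls:

; (pow1 2 4)
; → (power1 2 4 1)
; → (power1 2 3 2)
; → (power1 2 2 4)
; → (power1 2 1 8)
; → (power1 2 0 16)
; → 16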

Can we do even better? Think about an expression like $3^8$. We could rewrite this as $3^4 \cdot 3^4$, using the identity $a^b = a^{b/2} \cdot a^{b/2}$. Recursively, $a^2$ will become the new $a$ and $b/2$ the new $b$, and we will continue until $b = 0$. This will take $O(\log b)$ steps.

; tail-recursive O(log exp) solution using identity a^b = a^(b/2)*a^(b/2) 
(define (pow2 base exp)
  (power2 base exp 1))

(define (power2 base exp result)
  (cond
    [(= exp 0) result]
    [(odd? exp) (power2 (* base base) (quotient exp 2) (* base result))]
    [else (power2 (* base base) (quotient exp 2) result)]))

Why does this work? Notice that base is multiplied with its current value at each level of the recursion. This squared value becomes the new base. We have already shown that there will be $O(\log \text{exp})$ levels, since we are repeatedly dividing exp by two and then truncating (incidentally, this is why we need to use quotient and not /). We can represent the successive squarings of base by writing base raised to the power $2^{\text{depth}}$. Since the original root is base, we then have:

$$\text{base}^{2^{\log_2 \text{exp}}} = \text{base}^{\text{exp}}$$

This is because $2^{\log_2 \text{exp}} = \text{exp}$ from the definition of what a logarithm is (i.e. the undoing of exponentiation). So we can be satisfied that the updates to base in the recursive calls are correct and will yield an answer of the right magnitude. But what about result and the odd? predicate?
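
As a quick sanity check in the REPL (using Racket’s built-in expt), three successive squarings take 3 all the way to $3^{2^3} = 3^8$:

; ((3^2)^2)^2 = 3^8
(expt (expt (expt 3 2) 2) 2) ; 6561
(expt 3 8)                   ; 6561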

Notice that in my example of decomposing $3^8$ into two halves of $3^4$, everything is nice and symmetrical. But what about $3^7$ or $3^9$? It’s easier to consider the case $3^7$. In fact, as we divide exp by two, the final two values of exp will inevitably be $1$ and $0$. You can verify this for yourself by checking the odd and even cases. When exp equals one, we can think of there being an extra base factor hanging around that needs to be incorporated into the result of the exponentiation.

Put more generally, the number of times that the odd? predicate will evaluate to true is a measure of how far exp is from being a power of 2 (in which case we could imagine a perfectly balanced binary tree being combined into the solution: the leaves have the value base and these join with their siblings to become $\text{base}^2$, in turn $\text{base}^4$, and so on). But to evaluate something like $3^7$, there are three extra base factors hanging around since seven is not quite $2^3 = 8$. In fact, when the initial value of exp is of the form $2^k - 1$, exp will always be odd until the maximum depth is reached and exp is 0. The values of the arguments throughout the computation (for our example $3^7$) are:

depth  exp  odd?   base  result
    0    7  true      3       1
    1    3  true      9       3
    2    1  true     81      27
    3    0  false  6561    2187
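
For the curious, the table can be reproduced with a lightly instrumented copy of power2. (power2-trace is a throwaway helper introduced just for this trace; note that it prints Racket’s #t/#f in the odd? column.)

; power2 with an extra depth parameter and a printf at each call
(define (power2-trace base exp result depth)
  (printf "~a ~a ~a ~a ~a\n" depth exp (odd? exp) base result)
  (cond
    [(= exp 0) result]
    [(odd? exp) (power2-trace (* base base) (quotient exp 2) (* base result) (add1 depth))]
    [else (power2-trace (* base base) (quotient exp 2) result (add1 depth))]))

; (power2-trace 3 7 1 0) prints the four rows above and returns 2187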

At the top level (depth = 0), we have an odd exponent, so we need to update result so that at the next call (depth = 1) the “missing” factor is accounted for (again, by “missing” I mean “distance from being a perfect power of 2”). How many factors we are missing scales with $2^{\text{depth}}$ since this is a doubling algorithm. Thus, we pick up $2^1 = 2$ of the three missing 3’s when we are at depth 1 and odd? is true. The exact same insight would apply to any base raised to an arbitrarily large odd exponent: since base can only take on values of the form $\text{base}^{2^{\text{depth}}}$, we need to pick up $2^{\text{depth}}$ of the missing original base factors at that depth and fold them into result. By the way, this is the motivation for returning result and not base in the terminating case.
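
Multiplying out the factors that result accumulates in the table confirms the bookkeeping: one 3 at depth 0, two more at depth 1 (as $3^2 = 9$), and finally the balanced $3^4 = 81$ portion at depth 2:

; 3^1 * 3^2 * 3^4 = 3^(1 + 2 + 4) = 3^7
(* 3 9 81) ; 2187
(expt 3 7) ; 2187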

Modular exponents

As it turns out, we can use the structure of power2 to solve a related problem: what is the value of $\text{base}^{\text{exp}}$ modulo some integer $p$? The very helpful algorithms compendium at GeeksforGeeks explains the practical importance of being able to calculate this quickly.

TODO: linger on this and show the C and Racket source code for expt.

Moving on to the algorithm, we have a wrapper function pow3 that hides the result parameter. If your recursion involves multiplication, the initial value of your accumulator (here it’s called result) is typically 1. Why? A related question: what would be a good initial value for the running total if the algorithm involved addition?

; for x^n % p, if x > p, we can just evaluate (x % p)^n % p 
(define (pow3 base exp p)
  (power3 (modulo base p) exp 1 p))

Lingering on this simplification will give us a way into the modular arithmetic identities we’ll use for the main algorithm. Why are we allowed to substitute base modulo p for base? Doing so implies that $\text{base}^{\text{exp}} \bmod p = (\text{base} \bmod p)^{\text{exp}} \bmod p$. Or, using the congruency symbol:

$$\text{base}^{\text{exp}} \equiv (\text{base} \bmod p)^{\text{exp}} \pmod{p}$$

Here’s a concrete example. The claim is that $5^4 \equiv 2^4 \pmod{3}$ since $5 \bmod 3 = 2$ (this is the substitution that pow3 performs). As a sanity check:

$$5^4 \bmod 3 = 625 \bmod 3 = 1 \qquad \text{and} \qquad 2^4 \bmod 3 = 16 \bmod 3 = 1$$
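
The same check in Racket (the exponent 4 is arbitrary; any exponent works):

; both sides of the claimed congruence
(modulo (expt 5 4) 3) ; 1
(modulo (expt 2 4) 3) ; 1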

For me, it wasn’t immediately obvious that this relation should hold for any value of the exponent $n$. Breaking it down, an inductive proof suggests itself when you consider that the value of the exponent is irrelevant:

$$\underbrace{5 \cdot 5 \cdots 5}_{n \text{ factors}} \equiv \underbrace{2 \cdot 2 \cdots 2}_{n \text{ factors}} \pmod{3}$$

As we hold $p = 3$ constant, we can pile up 5’s on the top row and 2’s on the bottom row and they will always have the same answer modulo 3. Adding an $n$-th 5 and 2, respectively, will preserve the congruence. We can get the induction going by considering the base case $n = 1$ for our example:

$$5 \equiv 2 \pmod{3}$$

This follows immediately from the definition of the modulo operation. Clearly, $5$ “wraps around” the same distance that $2$ does. Put another way, the remainder of $5 \bmod 3$ is congruent (modulo $3$) to $5$ itself, since this remainder is what defines the value of $5 \bmod 3$ in the first place. Basically, after having proved the base case $n = 1$, we want to show that adding a factor preserves the congruency. The statement $5^n \equiv 2^n \pmod{3}$ is the inductive hypothesis. Having assumed it, we claim that:

$$5^{n+1} \equiv 2^{n+1} \pmod{3}$$

Let’s switch up our notation slightly so that we don’t need to work with exponents. We can set $a = 5^n$ and $b = 2^n$ (remember that this is for some arbitrary exponent $n$). Now we prove that $5a \equiv 2b \pmod{3}$. Using our updated notation, we will show that the congruences can be multiplied (much like two equalities). In fact, the proof involves briefly moving out of the congruence notation and into that of equalities. Consider that if $a \equiv b \pmod{3}$ then $a$ can be expressed as the sum of $b$ and some multiple of $3$:

$$a = b + 3k \quad \text{for some integer } k$$

Note that I have only been using integers (the set $\mathbb{Z}$) for all bases, exponents, and values of $p$. The intuition is that $a$ and $b$ remain the same number of “spots” from all the multiples of $3$ on the number line. We can express both of the following congruences in this new form:

$$5 = 2 + 3j \qquad \text{and} \qquad a = b + 3k$$

Since we are allowed to multiply equalities, we distribute and then factor out a $3$:

$$5a = (2 + 3j)(b + 3k) = 2b + 6k + 3jb + 9jk = 2b + 3(2k + jb + 3jk)$$

We’ve just shown that we can obtain an equality in terms of $5a$ and $2b$ that is identical to the form used to convert to an equality from a congruence: namely, a remainder added to a multiple of $3$. This implies that $5a$ is congruent to $2b$ modulo $3$ and vice versa (by the symmetry of congruences). If we recall that $a = 5^n$ and $b = 2^n$, we’ve shown that if $5^n \equiv 2^n \pmod{3}$ then it follows that:

$$5^{n+1} \equiv 2^{n+1} \pmod{3}$$

That completes the inductive proof that we can simplify base to base modulo p in a modular exponentiation function such as pow3. Whew! A more general (and very useful) identity for multiplying congruences is:

$$a \equiv b \pmod{p} \ \text{ and } \ c \equiv d \pmod{p} \implies ac \equiv bd \pmod{p}$$
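
A quick spot check of the general identity in Racket, with $p = 7$, $10 \equiv 3$, and $12 \equiv 5$ (the numbers are arbitrary):

; ac ≡ bd (mod p)
(modulo (* 10 12) 7) ; 1
(modulo (* 3 5) 7)   ; 1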

Melvin Hausner has an excellent set of notes on the basics of modular arithmetic. They were helpful in formulating and checking this post.

We can use this identity to complete our solution for modular exponentiation. Note that this function has the desired time complexity, but we are just using Racket’s built-in modulo operator. Can we do better, following a trick of Hausner?

Actually: built-in modulo handles giant exponents with ease! Here is good motivation for the profiling part. Maybe extract that out into a separate post? (Don’t ruin it for 201 students.)

https://github.com/racket/math/blob/master/math-lib/math/private/number-theory/modular-arithmetic.rkt

Link to Rosetta Code and then benchmark expt vs modular-expt in Racket
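
When I get to that benchmark, a minimal sketch along these lines should do (the numbers are arbitrary, pow3 is the function completed below, and modular-expt comes from math/number-theory, the library linked above):

(require math/number-theory)

; naive: build the (enormous) intermediate bignum first, then reduce
(time (modulo (expt 3 1000000) 1000003))

; keep every intermediate value smaller than p
(time (pow3 3 1000000 1000003))
(time (modular-expt 3 1000000 1000003))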

; as before, this has complexity O(log exp)!
(define (power3 base exp result p)
  (cond
    [(= exp 0) result]
    [(odd? exp)
     (power3 (modulo (* base base) p)
             (quotient exp 2)
             (modulo (* base result) p)
             p)]
    [else
     (power3 (modulo (* base base) p)
             (quotient exp 2)
             result
             p)]))
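
A quick check of the finished function against the built-ins (values chosen at random):

; both should agree
(pow3 5 117 19)          ; 1
(modulo (expt 5 117) 19) ; 1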

TODO

08 Jul 2017