Prime Primer

The fundamental theorem of arithmetic states that every positive integer (except the number 1) can be represented in exactly one way apart from rearrangement as a product of one or more primes .
— Hardy and Wright, “An Introduction to the Theory of Numbers”

Prime numbers are powerful, unwieldy things, and whether loved or hated, they demand our respect. They are the foundation of arithmetic, the crux of cryptography, and possibly the language of first contact with intelligent extraterrestrial life.

Since even a short excursion into number theory will reveal the difficulties primes have imposed on even the most brilliant mathematical minds, this blog post simply pays respect to the giants on whose shoulders we stand. Problem 3 of Project Euler is an opportunity to put into practice a concept we all learned in third grade, but probably didn’t fully appreciate until studying the set of prime numbers in depth: factorisation.

First of all, determining primality can be accomplished by one of two strategies, test, or proof. In a computer program, determining primality by testing is almost always used since it is most the general, but therefore inefficient. To alleviate this, computer scientists divide primality tests into dozens of subtypes, which arrive at answers more quickly for certain specialized subsets of the prime numbers set. Primality testing, in other words, is a huge field within mathematics and computer science, and well outside the scope of this blog post, which again, is limited to the more humble aim of prime factorisation. The work of primality testing should therefore be treated separately, and is addressed here only in a standard and perfunctory way using the Ruby method .prime? to facilitate testing.

First, let’s look at two methods that find the largest prime factor without using factorisation to help.

Brute Force Solution

This brute force method will check if every number is both a factor and prime number. The first one will therefore be the largest prime factor.

def largest_prime_factor_brute_force (number)
(3..Math.sqrt(number).to_i).to_a.reverse.each do |num|
if number % num == 0 && num.prime?
return num
end
end
end

Flatiron Solution

This solution offered by Flatiron exhausts all numbers that are both a factor and prime, but only explicitly tests that they are factor. The primality test is implicitly built into the division co-assignment because by definition prime numbers are only self-divisible. The last such prime factor will therefore be the largest.

def largest_prime_factor_flatiron(input)
prime = input
(2..Math.sqrt(input).to_i).each do |i|
break if prime <= 1
prime /= i while (prime > i && prime % i == 0)
end
prime
end

Factorisation Solution

Explaining factorisation is much easier and faster by example. Remember: the prime factorisation of any given integer is unique. Almost everyone has done this before. This is the prime factorization of 29250:

29250 == 2 * (3 * 3) * (5 * 5 * 5) * 13 #=> "true story!"

When factorisation is done by hand, it usually accomplished using a tree. The problem is there are many kinds of trees that could potentially be traversed before reaching the prime factorisation. Here are two ways to decompose 29250 into its prime factors:

Two Prime Factorisation Trees for 29250

The second tree is probably more natural when working by hand, but it is very deep, which means more linear calculations; it also very narrow, which means it must traversed without taking much advantage of the parallel processing which advanced iterators such as Ruby’s collect happily offer us. Therefore, when designing a function to decompose a number into its prime factors we must keep the tree broad and shallow by finding the largest, rather than the smallest, divisors first.

def prime_decomposition (factors)
factors.collect do |factor|
if factor == 1 || factor.prime?
factor
else
divisor = 2
until factor % divisor == 0 do
divisor += 1
end
dividend = factor / divisor
prime_decomposition([divisor, dividend])
end
end
end

This function is recursive, because the design to find the largest divisor has the effect that we are not at first concerned over its primality; since we do not know whether we have hit the bottom, we must therefore continue to iterate recursively until we reach the base case: primality. You may have also noticed that the use of the collect iterator, though efficient, required an input of an array, and results in an output of a potentially nested array. That in itself has two consequences. The first is that the array must be flattened:

def prime_factors (natural_number)
prime_decomposition([natural_number]).flatten
end

It is fair to wonder whether nesting and then flattening is an inefficiency that needs to be refactored. Not so in this case, as the cost of flattening at every step of a recursion of unknown but potentially great depth is higher than taking advantage of a highly optimised and specially-built flattening method as the one given to us by Ruby. However, of course, there will be a stack depth for which the memory will overflow, and built-in flattening cannot be avoided. This leads us to the second consequence of nesting, which is that it preserves the structure of the tree that lead to prime factorisation, which means it the tree can be reconstructed if desired. However, the problem asked for the largest prime factor, not all of them. Let’s solve it now:

def largest_prime_factor(input)
prime_factors(input).max
end

It should not be counterintuitive any longer that finding the maximum of the set of all of prime factors by first decomposing a given number can be more efficient than attempting find the largest prime factor directly or by exhaustion. The key is in the design of the algorithm that traverses the prime factorisation tree. The tree must be traversed broadly to keep it as shallow as possible in order to leverage the ability of parallel computing implementations of iteration (vis a vis collect) to break up the problem into small, manageable, and quickly solvable parts — in other words: divide and conquer.

Some closing thoughts: it’s not a race. In my opinion, the factorisation method is best appreciated in the sense that it makes use of simple mathematics with which most of us are familiar, and presents an opportunity to apply important computer science concepts such as recursion and flattening. However, it’s important to make sure that our methods are not only beautiful, but practical. Here are some benchmark results.

largest_prime_factor_brute_force:  0.150000 (  0.154819)
largest_prime_factor_flatiron:     0.050000 (  0.053146)
largest_prime_factor:             0.000000 (  0.000240)

The brute force method is 3 times slower that the flatiron method, and the flatiron method is 221 times slower than the factorisation method. This makes the brute force method approximately 650 times slower than the factorisation method.

Again, this all has to be taken with a grain of salt. Depending on the type of prime number, and the range of prime numbers under consideration, some methods, especially very advanced and specialised ones not considered here, or even the simplest ones, will have very specific regimes of optimality. The most useful takeaway is not which one is the ‘best,’ but that each has strengths and weaknesses which should be assessed and used as appropriate.