
Pythonic Tips & Tricks – Basic Cryptography

Using Python to Decipher Encoded Messages

Tales from the Crypt

Photo by Arget on Unsplash

One of the challenges of being a data scientist is solving unique problems that leave most people scratching their heads. These range from seemingly innocuous textbook exercises to complex riddles left unsolved for years. In this article we shall go over some light to intermediate problems that will help you better understand how to decipher encrypted messages. We shall then apply our learnings to solve a basic deciphering exercise.

Getting the Sum of all Digits in an Integer

Let’s say the question was:

"Get the sum of all the digits in an integer, make the solution generalizable enough to work with any integer"

Assuming that the number is given to us as an integer, we can convert it into a string and use a list comprehension to get a list of its digits. All we would then have to do is apply the sum function.

def digit_sum_v1(number):
    number = [c for c in str(number)]
    return sum(number)

digit_sum_v1(369)
Output of Function

It seems we have come across an error. Fortunately it is an easy one to figure out. We simply have to look at the output of our function before the sum.

Output of Function
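
The original screenshot is not reproduced here, so as a rough sketch, inspecting the list before summing might look like the following. The variable names digits and error_message are introduced only for illustration.

```python
# Inspect the intermediate list that the first digit_sum_v1 builds before summing.
digits = [c for c in str(369)]
print(digits)  # ['3', '6', '9']

# sum() starts from the integer 0, so adding a string to it fails.
try:
    sum(digits)
except TypeError as err:
    error_message = str(err)
    print(error_message)
```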

We can see that all the elements of the list are actually strings. This means that to fix our function we simply have to convert every element of the list into an integer before we apply the sum function.

def digit_sum_v1(number):
    number = [int(c) for c in str(number)]
    return sum(number)
digit_sum_v1(369)
Corrected Output

Excellent, our function does exactly what we want it to do. Let’s do a sanity check.

Sanity Checking the Function
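
One possible way to run such a check, sketched under the assumption that an independent arithmetic implementation is a fair yardstick. The helper digit_sum_arithmetic is hypothetical and not part of the original article.

```python
def digit_sum_v1(number):
    # Corrected version from the article: cast each character to int.
    number = [int(c) for c in str(number)]
    return sum(number)

def digit_sum_arithmetic(number):
    # Hypothetical cross-check: peel off the last digit with divmod.
    total = 0
    while number > 0:
        number, digit = divmod(number, 10)
        total += digit
    return total

# Compare both implementations on a handful of non-negative integers.
for n in [0, 7, 369, 1000, 987654321]:
    assert digit_sum_v1(n) == digit_sum_arithmetic(n)
    print(n, digit_sum_v1(n))
```

Note that a negative input such as -369 would raise a ValueError in the string-based version, because the minus sign is not a digit. Wrapping the input in abs() is one possible workaround.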

All seems well. That was a rather simple exercise, so let us make it a little more complex.

Getting the Sum of all Digits in an Integer while incorporating If-Else Logic

Let’s say the question was:

"Get the sum of all the even digits and the product of all odd digits in an integer. If the integer is greater than 5000, get the product of all even digits and the sum of all odd digits instead. Make the solution generalizable enough to work with any integer"

Whew, that sure got a lot more complex. Realistically, this is closer to what would be required of a data scientist than a simple digit summation function. So let us solve this challenge.

from numpy import prod

def digit_sum_v2(number):
    # Split the digits into one list of evens and one list of odds
    digit_even = [int(c) for c in str(number) if (int(c)%2 == 0)]
    digit_odd = [int(c) for c in str(number) if (int(c)%2 != 0)]

    # Above 5000 the sum and the product swap roles
    if number > 5000:
        even = prod(digit_even)
        odd = sum(digit_odd)
    else:
        even = sum(digit_even)
        odd = prod(digit_odd)
    return even, odd

Take note that we used the prod function found in NumPy (on Python 3.8 and later, math.prod from the standard library is an alternative). Essentially what we are doing is creating two different lists, one populated by all the even digits and one populated by all the odd digits. We then apply the sum and prod functions according to the size of the integer.

Let us run our function through several integers and see if it gives us what we want.

print(digit_sum_v2(1212))
print(digit_sum_v2(1313))
print(digit_sum_v2(3131))
print(digit_sum_v2(1515))
print(digit_sum_v2(5151))
Output of the Function

Nice, our function works as intended. Notice that a feature of our function is that it is agnostic to the position of the digits. This means that numbers that have exactly the same digits should produce the exact same results, UNLESS of course they fall on opposite sides of the 5000 threshold.
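
To make the position-agnostic property concrete, here is a minimal sketch that checks every rearrangement of the digits 1, 5, 1, 5. It restates the function with math.prod from the standard library (Python 3.8+) instead of NumPy, which is a substitution made here for illustration. The names rearranged, results_low and results_high are likewise illustrative.

```python
from itertools import permutations
from math import prod

def digit_sum_v2(number):
    # Compact restatement of the article's function, using math.prod.
    digits = [int(c) for c in str(number)]
    even = [d for d in digits if d % 2 == 0]
    odd = [d for d in digits if d % 2 != 0]
    if number > 5000:
        return prod(even), sum(odd)
    return sum(even), prod(odd)

# Every distinct integer that can be formed from the digits 1, 5, 1, 5.
rearranged = sorted({int("".join(p)) for p in permutations("1515")})

# Group the results by which side of the 5000 threshold the number falls on.
results_low = {digit_sum_v2(n) for n in rearranged if n <= 5000}
results_high = {digit_sum_v2(n) for n in rearranged if n > 5000}

print(rearranged)    # [1155, 1515, 1551, 5115, 5151, 5511]
print(results_low)   # {(0, 25)}
print(results_high)  # {(1, 12)}
```

Each side of the threshold collapses to a single result. There are no even digits here, so the empty sum is 0 and the empty product is 1.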

Now you may be asking yourself, what does this have to do with encoded messages? Well, let's up the complexity one more time with this final question.

"Get the sum of all the digits equal to or less than 5 and the product of all digits above 5 in an integer. If the integer is greater than 5000, get the product of all the digits equal to or less than 5 and the sum of all the digits above 5 instead.

Map the results to the Alphabet DataFrame with the logic below:

The Below 5 Output indexes the DataFrame's rows

The Above 5 Output indexes the DataFrame's columns

Both outputs index through the DataFrame cyclically

You should only use a single function that will take as inputs a list of integers and the Alphabet DataFrame"

import numpy as np
import pandas as pd
alphabet_matrix = pd.DataFrame(np.array(list('abcdefghijklmnñopqrstuvwxyz_'))
             .reshape(4,7))
alphabet_matrix
Alphabet DataFrame
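
Since the rendered table is not reproduced here, the sketch below rebuilds it and illustrates one reading of "cyclically": each output is wrapped around the table's dimensions with the modulo operator. The helper lookup is hypothetical and only meant to show the idea.

```python
import numpy as np
import pandas as pd

# Same construction as in the article: 28 characters in a 4 x 7 grid.
alphabet_matrix = pd.DataFrame(
    np.array(list('abcdefghijklmnñopqrstuvwxyz_')).reshape(4, 7)
)
print(alphabet_matrix)

def lookup(row_value, col_value, df):
    # Hypothetical helper: wrap each value around the table's dimensions.
    row = row_value % df.shape[0]  # 4 rows
    col = col_value % df.shape[1]  # 7 columns
    return df.iloc[row, col]

print(lookup(1, 1, alphabet_matrix))  # i
print(lookup(5, 8, alphabet_matrix))  # i, since 5 % 4 == 1 and 8 % 7 == 1
print(lookup(3, 6, alphabet_matrix))  # _
```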

Note that for the purposes of this exercise, the underscore will represent a space. Alright, let's solve this problem.

def decoder(list_of_ints, df_reference):
    # prod is the NumPy function imported earlier
    references = []
    for i in list_of_ints:
        # Split the digits at the 5 threshold
        below_5 = [int(c) for c in str(i) if (int(c) <= 5)]
        above_5 = [int(c) for c in str(i) if (int(c) > 5)]
        # Above 5000 the sum and the product swap roles
        if i > 5000:
            bel_5 = prod(below_5)
            abv_5 = sum(above_5)
        else:
            bel_5 = sum(below_5)
            abv_5 = prod(above_5)
        references.append((bel_5, abv_5))

    # Select the column first, then the row; the modulo wraps both cyclically
    final_message = [df_reference
                    [r[1]%(df_reference.shape[1])]
                    [r[0]%(df_reference.shape[0])]
                    for r in references]

    return ' '.join(final_message)

Let us run the function and see what the hidden message is.

message = decoder([1080, 1116, 1099, 1108, 1280, 9904, 1611, 119, 2199, 561, 17, 181, 61], alphabet_matrix)
print(message)
How Sweet
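
For readers who want to follow the decoding step by step, here is a sketch that traces each integer to its row, column and letter. As a simplification it uses math.prod and a flat string in place of NumPy and the DataFrame. The names ALPHABET, N_ROWS, N_COLS and decode_one are assumptions made for this illustration.

```python
from math import prod

# Same 28 characters as the Alphabet DataFrame, read row by row (4 rows x 7 columns).
ALPHABET = 'abcdefghijklmnñopqrstuvwxyz_'
N_ROWS, N_COLS = 4, 7

def decode_one(i):
    # Split the digits at the 5 threshold, as in the article's decoder.
    digits = [int(c) for c in str(i)]
    below_5 = [d for d in digits if d <= 5]
    above_5 = [d for d in digits if d > 5]
    if i > 5000:
        row_value, col_value = prod(below_5), sum(above_5)
    else:
        row_value, col_value = sum(below_5), prod(above_5)
    # Wrap around the grid, then convert (row, col) to a flat string index.
    row, col = row_value % N_ROWS, col_value % N_COLS
    return row, col, ALPHABET[row * N_COLS + col]

encoded = [1080, 1116, 1099, 1108, 1280, 9904, 1611, 119, 2199, 561, 17, 181, 61]
for i in encoded:
    print(i, decode_one(i))

decoded = ''.join(decode_one(i)[2] for i in encoded)
print(decoded.replace('_', ' '))  # i love python
```

The article's decoder joins the letters with a space, so its printed result has a space between every character. The sketch above joins them directly and swaps the underscores for spaces.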

In Conclusion

We’ve constructed some interesting functions that help us decipher encoded messages. As one can imagine, most encrypted messages would not be this easy to decode. Nonetheless, I believe this article gives you a basic idea of how to go about studying the field of Cryptography. In future articles we shall go over more complex functions and tackle more interesting challenges.

