Ste·ga·no·graph·y / stegəˈnägrəfi / (noun): the practice of concealing messages or information within other nonsecret text or data.
Synopsis: using deep learning, it’s possible to effectively conceal sensitive information inside language without raising an eavesdropper’s suspicion. This has important implications for secure communication.
Edit 09/06/20: Check out the end of the article for a machine-generated summary! More to come in that direction soon. 🙂
Steganography has been used for ages to communicate information in a hidden manner. A natural question: how is this different from cryptography? Isn’t cryptography a very well-studied field in which two individuals aim to share information with each other without an eavesdropper being able to discover this information? Indeed, these two areas are very similar, but there’s an interesting property of steganography that takes information sharing to a whole different level: the information is shared without an eavesdropper even knowing that anything secret is being shared. What’s the use of, say, Shor’s algorithm (for breaking RSA encryption in polynomial time using a quantum computer) if you don’t even know what to decrypt?
Steganography has long been associated with painting and visual art. Painters have often hidden signatures, self-portraits, and other secret messages within their works as an "inside joke". One example is Jackson Pollock’s "Mural", in which Pollock hid his entire name in plain sight in the curvatures of the work.

Until recently, however, computational steganography methods for images (such as appending bits at the end of a .jpg file or applying mathematical functions to select RGB pixel values) have been easy to detect and uncover, while hand-crafted methods are laborious to design and do not scale.
In 2017, Shumeet Baluja proposed the idea of using deep learning for image steganography in his paper "Hiding Images in Plain Sight: Deep Steganography" [1]. In this paper, a first neural network (the hiding network) takes in two images, a cover and a message, and produces a third image, the container, that is visually similar to the cover and that a second neural network (the revealing network) can use to reconstruct the message image as a revealed image, without any knowledge of either the original message or the original cover. The loss is defined by how similar the cover and container images are and how similar the message and revealed images are. The concept was expanded upon a few months later by Zhu et al. to allow arbitrary data to be encoded into images [2]. The results were astounding: the network was able to create container images that looked very much like the cover yet allowed the revealing network to reconstruct the message very closely.
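The two similarity terms can be folded into a single training objective. Below is a minimal PyTorch sketch of such a loss; the MSE terms and the weight beta are illustrative assumptions, not the paper's exact formulation.

```python
import torch.nn.functional as F

def deep_stego_loss(cover, container, message, revealed, beta=0.75):
    """Sketch of a Baluja-style combined loss: the container should resemble
    the cover, and the revealed image should resemble the original message.
    `beta` (a hypothetical weight) trades off the two objectives."""
    cover_loss = F.mse_loss(container, cover)      # cover vs. container similarity
    message_loss = F.mse_loss(revealed, message)   # message vs. revealed similarity
    return cover_loss + beta * message_loss

# Usage, assuming hiding_net and revealing_net are defined elsewhere:
#   container = hiding_net(cover, message)
#   revealed  = revealing_net(container)
#   loss = deep_stego_loss(cover, container, message, revealed)
```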

While this result was very interesting, we felt that the utility of steganography specifically for images was limited.
A natural question arises: what are the limits of this approach in other information domains?
More specifically, we aim to apply it to human language, in the form of text and audio, as a stepping stone toward implementing this procedure for general information.
Deep Steganography for written language and other textual information
Text is harder to perform steganography with: images are dense while text is sparse; images aren’t affected much by small changes in pixel values, while text is greatly affected by small changes in token values. While various methods for text-based steganography exist, they face substantial challenges: (1) classical heuristic-based approaches are often easy to decode because they rely on fixed, easily reversible rules, and (2) current approaches do not exploit the structural properties of the text, producing container texts that are not semantically coherent to humans.
Recent deep learning approaches [3, 4, 5, 6] rely on using generative models to hide the "secret" text in meaningless groupings of words. Here, we propose a transformer-based approach to address both problems at once. We explore using a transformer to combine the desired secret text with some human-readable, coherent cover text in order to generate a new container text that both properly encodes the hidden message and remains nearly identical to the cover text, retaining the cover text’s original semantic structure and legibility. In addition to the transformer used for encoding, we leverage a second transformer model to decode the container text and recover the hidden message.

Because transformers are big and bulky, we first tested our luck with a much simpler 1D-convolution character-based approach, CharCNN [7].
In this character-based approach, the idea is that a model learns a statistical profile of character choices in a string of text and then modifies those characters, through additions, substitutions, or removals, in a way that signals the hidden message.
Consider a trivial example in which the message is two bits. To communicate the secret message, our encoding function is the length of the container message modulo 4. More specifically, let l represent the number of characters in the container message: l ≡ 0 (mod 4) yields 00, l ≡ 1 (mod 4) yields 01, l ≡ 2 (mod 4) yields 10, and l ≡ 3 (mod 4) yields 11. The model would accordingly add characters to or remove characters from the cover to communicate the secret message. In practice, we would ideally have a more robust model that encodes much more complex secret messages in the container texts. This approach gave some recognizable results on both steganographic and reconstructive metrics:
Cover: I can not believe my eyes, what I saw in the forest was far beyond the reaches of my imagination.
Secret: meet at river tomorrow at sunset.
Container: I mac now bleiave mye eey, waht I sa inn tee freost ws fara beymdo tee racheas of ym imaingaiton.
Revealed Secret: eemt a th rivre tomowro tt snseht.
Surprisingly, this information is still somewhat decodable (even though it’s clear that the message has been modified).
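To make the toy length-mod-4 scheme above concrete, here is a minimal sketch in plain Python (no learned model involved; padding with spaces is just the simplest way to adjust the length):

```python
def encode_two_bits(cover: str, bits: str) -> str:
    """Pad the cover so that len(container) % 4 encodes two bits
    (a toy illustration of the length-mod-4 scheme described above)."""
    assert len(bits) == 2 and set(bits) <= {"0", "1"}
    target = int(bits, 2)                 # 00 -> 0, 01 -> 1, 10 -> 2, 11 -> 3
    container = cover
    while len(container) % 4 != target:
        container += " "                  # naive: a model would edit characters instead
    return container

def decode_two_bits(container: str) -> str:
    return format(len(container) % 4, "02b")

container = encode_two_bits("I can not believe my eyes.", "10")
print(decode_two_bits(container))         # "10"
```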
An interesting (likely open) information-theoretic question arises in this area: given an information domain (images, text, etc.), how much secret information can a model hide in given cover information in the average case? We started to see that with larger secret message sizes and a fixed cover message size, the model had an increasingly difficult time both hiding the information and reconstructing the hidden message; how good it was at each depended on how we weighted the two tasks in the loss function.
Next, we decided to investigate a heftier model for performing steganography in text. The primary approach we propose for text-based steganography leverages two NMT (neural machine translation) models: one transformer to encode the hidden message and a second to decode it. We hypothesize that this transformer-based approach can succeed at encoding a secret text within a cover text to produce a container text that closely matches the semantic structure of the cover text. An additional benefit is that no custom dataset is needed: any collection of sentences or phrases, paired with randomly generated secret messages, will do.
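As a rough illustration of how training pairs for the encoding model could be assembled, here is a small sketch; the four-digit secret follows the description below, while the [SEP]-style delimiter and the exact serialization are assumptions for illustration:

```python
import random

def make_training_pair(cover: str):
    """Serialize a (secret, cover) pair as input for a hypothetical encoding
    NMT model: the secret is four base-10 digits prepended to the cover."""
    secret = " ".join(str(random.randint(0, 9)) for _ in range(4))
    encoder_input = f"{secret} [SEP] {cover}"
    return secret, encoder_input

secret, encoder_input = make_training_pair(
    "celebrate the opportunity you have to eat this.")
print(encoder_input)   # e.g. "8 4 4 4 [SEP] celebrate the opportunity ..."
```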
What does "similarity" between cover and container mean in this case? We no longer have a simple metric like edit distance or the L2 norm between pixel values. In our new scheme, the sentences "Don’t eat, my friends!" and "Don’t eat my friends!" mean very different things, whereas "Don’t eat, my friends!" and "Please abstain from eating, ye for whom I truly care!" have similar meanings. As a similarity metric, we leverage BERT (Bidirectional Encoder Representations from Transformers [8]), a pre-trained language model that can represent a sentence as a real-valued vector (using the [CLS] token’s vector), where the cosine similarity between two such vectors is a good indication of how similar the sentences are in meaning.
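As a concrete sketch of this metric, the snippet below embeds each sentence with the [CLS] vector of a BERT-base model (via the Hugging Face transformers library) and compares the embeddings with cosine similarity:

```python
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
bert = BertModel.from_pretrained("bert-base-uncased")

def sentence_embedding(text: str) -> torch.Tensor:
    """Represent a sentence by the [CLS] token vector of BERT-base."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        outputs = bert(**inputs)
    return outputs.last_hidden_state[0, 0]   # vector at the [CLS] position

def semantic_similarity(a: str, b: str) -> float:
    return torch.cosine_similarity(sentence_embedding(a),
                                   sentence_embedding(b), dim=0).item()

print(semantic_similarity("Don't eat, my friends!",
                          "Please abstain from eating, ye for whom I truly care!"))
```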
The results presented in Neural Linguistic Steganography [6], the work most closely related to our own, indicate that state-of-the-art transformer-based language models such as GPT-2 can be leveraged to generate convincing cover texts that hide secret messages. In our implementation, our first NMT transformer model reads in the concatenated secret message (four digits in base 10) and cover text and translates them into a container text. Our second transformer reads in the container text and translates it into a reconstruction of the original secret message. The loss function we use is a linear combination of the similarity between the cover text and the container text (using BERT to produce Loss_Stego) and the edit distance between the reconstructed secret message and the original secret message. The loss function is formulated as

Loss(c, c′, s, s′) = α · Loss_Stego(c, c′) + β · EditDist(s, s′)
where c is the cover instance, c′ is the container instance, s is the secret message, and s′ is the reconstructed message. α and β are parameters we can fix or vary as a function of the epoch or of the loss’s rate of change. We define the similarity between the stegotext (container text) and the cover text, with respect to meaning, as the cosine similarity of the two sequences’ embeddings produced by a pre-trained BERT-base model.
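Putting the pieces together, here is a small sketch of how this loss could be computed for evaluation. Turning the BERT similarity into a loss via (1 − similarity) is an assumption on our part, as is treating the (non-differentiable) edit distance as a plain number rather than a term to backpropagate through:

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (standard dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def stego_text_loss(cover_container_sim, secret, reconstructed, alpha=1.0, beta=1.0):
    """Loss(c, c', s, s') = alpha * (1 - BERT cosine similarity) + beta * edit distance.
    `cover_container_sim` is the BERT cosine similarity between cover and container."""
    return alpha * (1.0 - cover_container_sim) + beta * levenshtein(secret, reconstructed)

print(stego_text_loss(0.92, "8 4 4 4", "8 4 1 4"))  # ~1.08 (one substituted digit)
```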
We found that the model we used, an LSTM seq2seq model with attention for both the hiding network and the revealing network, was not powerful enough to generate good containers and was faulty in reconstructing the secret message. The loss converged quickly, at a fairly high value. We additionally hypothesize that driving the loss low would likely turn the hiding network into an adversarial example generator for BERT: producing sentences that are meaningless to humans yet whose embeddings have high cosine similarity to the cover texts’ embeddings as evaluated by BERT. Below is one output example:
Cover: celebrate the opportunity you have to eat this.
Secret: 8 4 4 4.
Container: blessing pedals ampoule mbi mbi jharkhand ampoule coring substantive substantive tranquil steadfast murdoch cleverness germane obeng.
Revealed Secret: 8 4 1 4.
Despite this weak example, we believe that with a sufficiently powerful NMT model and enough compute, we would start to observe useful natural-language steganography on textual and general information. This technique could potentially revolutionize communication in the presence of adversarial eavesdroppers, especially in a quantum era of fast decryption of common cryptographic protocols. This implementation is left as future work for folks who have an abundance of compute. (Code for the seq2seq model is viewable in this Colab.)
Deep Steganography for speech and other audio information
While performing deep steganography for text may be a ways off, a similar approach for sonic information (sound files) is certainly within reach. The core reason text is so much harder to perform steganography with than images is that text is sparse: in a line of text, each word can be represented by a natural number, and there are typically no more than 100 words in a given sentence. Images, by contrast, are dense: an image is represented by a very large number of pixels, each having 16,777,216 (2^24) possible values (for RGB images).
Sound waves are similarly dense, depending on how they’re represented. We’re going to look at two of the ways that sonic information can be represented: spectrograms and waveforms.
- A spectrogram is a visual representation of audio. Specifically, the spectrum of frequencies of a signal as it varies with time is represented in the spectrogram by colors and intensities of a time series heatmap. Below is a spectrogram of myself saying, "this is a secret message: my secret message". For the sake of visuals in this article, we focus on spectrograms.
- A waveform is the shape of the graph of a sonic signal as a function of time. Waveforms can be visualized with oscilloscopes to be analyzed for properties such as amplitude, frequency, rise time, time interval, distortion, and others.

Following the approach proposed in Hiding Images in Plain Sight: Deep Steganography [1], the spectrogram can be operated on as an image via its 2D feature map, which can be linearly or logarithmically scaled. The secret message can be an image, text, or other audio data. There are three primary ways of extracting spectrograms from sound: (1) the Short-Time Fourier Transform (STFT), with filters distributed uniformly along a linear frequency axis; (2) the log STFT, the same as (1) but with filters equally spaced along a logarithmic frequency axis; and (3) the Random Matrix Transform (RMT), related to (1) and (2) but with a random matrix R in place of the discrete Fourier transform matrix.
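As a quick sketch of these representations (using librosa; the file name, sample rate, and window sizes are arbitrary choices, and the random-matrix variant is only a rough illustration of the idea):

```python
import numpy as np
import librosa

# Load a mono recording (hypothetical file) and compute the linear-frequency STFT.
y, sr = librosa.load("secret_message.wav", sr=16000)
stft = librosa.stft(y, n_fft=512, hop_length=128)
linear_spec = np.abs(stft)                        # (1) linear STFT magnitude
log_spec = librosa.amplitude_to_db(linear_spec)   # (2) log-scaled version

# (3) Rough RMT illustration: apply a random matrix R to each frame of samples
# in place of the discrete Fourier transform matrix.
frames = librosa.util.frame(y, frame_length=512, hop_length=128)
R = np.random.randn(512, 512)
rmt_spec = R @ frames

print(linear_spec.shape, log_spec.shape, rmt_spec.shape)
```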
Here, we can use a 2D U-Net architecture [9] to learn steganography with the spectrogram.
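A minimal PyTorch sketch of such a concealer (a toy two-level network with a single skip connection, far smaller than a real U-Net) could look like this; the channel counts and shapes are illustrative assumptions:

```python
import torch
import torch.nn as nn

class TinyUNet(nn.Module):
    """Toy U-Net-style concealer: input is the cover spectrogram concatenated
    with the secret spectrogram along the channel axis; output is a container
    spectrogram with the same height and width as the cover."""
    def __init__(self, in_ch=2, out_ch=1, base=16):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(in_ch, base, 3, padding=1), nn.ReLU())
        self.enc2 = nn.Sequential(nn.Conv2d(base, base * 2, 3, stride=2, padding=1), nn.ReLU())
        self.dec1 = nn.Sequential(nn.ConvTranspose2d(base * 2, base, 4, stride=2, padding=1), nn.ReLU())
        self.out = nn.Conv2d(base * 2, out_ch, 3, padding=1)  # skip connection doubles channels

    def forward(self, x):
        e1 = self.enc1(x)                              # (B, base, H, W)
        e2 = self.enc2(e1)                             # (B, 2*base, H/2, W/2)
        d1 = self.dec1(e2)                             # (B, base, H, W)
        return self.out(torch.cat([d1, e1], dim=1))    # skip connection from e1

cover = torch.randn(4, 1, 128, 128)    # batch of cover spectrograms
secret = torch.randn(4, 1, 128, 128)   # batch of secret spectrograms
container = TinyUNet()(torch.cat([cover, secret], dim=1))
print(container.shape)                 # torch.Size([4, 1, 128, 128])
```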

We can also use a convolutional model to learn steganography on the STFT of raw audio, as proposed by Kreuk et al. (2020) in Hide and Speak: Towards Deep Neural Networks for Speech Steganography [10], with impressive results. In either case, the deep learning training procedure is largely the same: a concealer network and a revealer network are trained in parallel so that the concealer can generate container audio that is aurally similar to the cover audio it is given (in practice, this perturbed container audio carries a tiny bit of "fuzzy" sound), and the revealer can then recover the secret audio from that container.
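A compact sketch of one such parallel training step on spectrogram tensors (the concealer and revealer can be any image-to-image networks, e.g. the toy U-Net above; the MSE losses and the beta weight are illustrative assumptions):

```python
import torch
import torch.nn.functional as F

def train_step(concealer, revealer, optimizer, cover_spec, secret_spec, beta=0.75):
    """One parallel training step: the concealer hides the secret spectrogram
    inside the cover spectrogram, and the revealer recovers it from the
    container. Tensors are shaped (batch, 1, freq_bins, time_frames)."""
    container = concealer(torch.cat([cover_spec, secret_spec], dim=1))
    revealed = revealer(container)
    loss = F.mse_loss(container, cover_spec) + beta * F.mse_loss(revealed, secret_spec)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```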
Once the concealer and revealer networks are trained, an end-to-end system built on them can be used as illustrated below:

As shown, Bob is trying to hide an image of me inside a spectrogram. The concealer model receives the cover spectrogram and the secret image and yields a modified container spectrogram with small perturbations. This spectrogram is sent across the internet to Alice, who uses the revealer model to recover a reconstructed (slightly modified) secret image from the container spectrogram. A similar process can be carried out with a convolutional architecture on the audio STFT.

To conclude…
Deep steganography has important implications for secure communication, as well as for watermarking and service accountability for machine learning cloud providers. These are only preliminary, unpublished results and have not been incorporated into any sort of product yet, but they provide a hopeful look at the potential deep steganography offers to keep our information safe in the future.
Shoutouts to Mario Srouji, Dian Ang Yap, Michele Catasta, Brad Efron, Ashish Kundu, Mustafa Canim, and Christopher Manning for their involvement and contributions!
I’m on LinkedIn, Medium, and Twitter.
Machine-Generated Summary
As promised, here is the machine-generated summary, minimally edited (replacing [UNK] tokens here and there, capitalizing words, removing extra spaces…not cherry-picked other than paragraph selection!). This is another project I’m working on…more to come soon. The original text is 2400+ words.
Summary (473 words):
People have been using steganography for ages to communicate information in a hidden way. It has long been associated with painting and visual art. Baluja proposed the idea of using deep learning for image steganography in his paper. A first neural network takes in two images to create a third one that can be used by a second neural network to reveal the message via the revealed image. The concept was expanded upon by et a few months later to allow for arbitrary data encoding into images.
For written language the idea is that a model would learn a statistical profile of character choices in a string of text and modify them in a way that sends a signal to the hidden message through character or character. The current approaches rely on using generative models to hide the hidden messages. The new approach uses a second model to combine the desired text with some coherent cover text to create a container text. L represents the number of characters in the container. The model would remove or add characters from the cover to communicate the secret. The approach has given some recognizable results but the information is still hidden.
The primary approach for performing a in-depth analysis of secret messages consists of two models: one reads in the hidden message and the second one translates it into a container text. The loss function consists of a linear combination of the similarity functions between the cover text and the container text to create a secret message. The model with attention for the hiding network was not powerful enough to generate good containers and was faulty in the secret. The low loss is probably the result of a generative adversarial model for finding sentences that have small similarity in their similarity. A sufficiently powerful and powerful model and enough people will start to observe useful natural language. The implementation is left as future work to people who have a lot of money.
Text is more difficult to perform deep steganography for text than for images. Images are represented by a very large number of words. Waves are similarly represented. The deep learning model training procedure is largely the same: a concealer network and the revealer network. It is able to generate container audio data that is similar to the cover audio it is given. It can then be used by the network to reveal the secret audio. A product created using the system can be used as illustrated after training the concealer and when training the secret. A model is trying to hide an image of me inside a spectrogram concealer model. The model is sent across the internet to someone who uses the model to convert the container into a reconstructed spectrogram revealer secret.
Deep steganography offers to keep people’s information safe in the cloud.
[1] Baluja, S. (2017). Hiding images in plain sight: Deep steganography. In Advances in Neural Information Processing Systems (pp. 2069–2079).
[2] Zhu, J., Kaplan, R., Johnson, J., & Fei-Fei, L. (2018). HiDDeN: Hiding data with deep networks. In Proceedings of the European Conference on Computer Vision (ECCV) (pp. 657–672).
[3] Yang, Z.L., Guo, X.Q., Chen, Z.M., Huang, Y.F., & Zhang, Y.J. (2018). RNN-Stega: Linguistic steganography based on recurrent neural networks. IEEE Transactions on Information Forensics and Security, 14(5), 1280–1295.
[4] Fang, T., Jaggi, M., & Argyraki, K. (2017). Generating steganographic text with LSTMs. arXiv preprint arXiv:1705.10742.
[5] Chang, C.Y., & Clark, S. (2010). Linguistic steganography using automatically generated paraphrases. In Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics (pp. 591–599).
[6] Ziegler, Z., Deng, Y., & Rush, A. (2019). Neural linguistic steganography. arXiv preprint arXiv:1909.01496.
[7] Zhang, X., Zhao, J., & LeCun, Y. (2015). Character-level convolutional networks for text classification. In Advances in neural information processing systems (pp. 649–657).
[8] Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
[9] Ronneberger, O., Fischer, P., & Brox, T. (2015, October). U-net: Convolutional networks for biomedical image segmentation. In International Conference on Medical image computing and computer-assisted intervention (pp. 234–241). Springer, Cham.
[10] Kreuk, F., Adi, Y., Raj, B., Singh, R., & Keshet, J. (2020). Hide and speak: Towards deep neural networks for speech steganography.