Opinion

Should Deepfakes Be Open-Sourced?

A discussion about the pros and cons of opening up deepfakes

Jack Saunders
Towards Data Science
5 min read · May 25, 2023


Image Generated using DreamStudio

I am a PhD researcher who creates what could be considered deepfakes for my research. I’ve been fascinated by the ability to create lifelike digital doubles and to level up entertainment. Before getting into my research, I had assumed that these models could cause far too much harm to be released to a general audience. Over the past few months, I’ve noticed an ever-increasing number of top voices arguing that Artificial Intelligence as a field should make open-source software one of its core tenets. While this discussion centres almost exclusively on LLMs, the idea seems common across the field. I am fully in favour of open-sourcing almost all AI models, but when it comes to my own research area, I am less sure.

There are few areas in which the potential for misuse is as high as for the creation of deepfakes.

My approach so far has been to walk a middle ground: trying to communicate exactly how deepfakes work at a level that does not require a PhD to understand. Yet I often wonder if this is the correct approach. The purpose of this article is to try to start a conversation about the direction we deepfake researchers should take. With that in mind, I cover some of the pros and cons of open-sourcing below.

The Pros of Open-Sourcing

We hear a lot about the negatives when it comes to deepfakes. One would be forgiven for asking why researchers like me should even consider creating open-source models. There are, however, a lot of legitimate reasons to do so:

  • Transparency: For me, this is the big one. If most major deepfake models are completely open-sourced, then they become transparent. It becomes possible for regulators to understand what they are dealing with and for other researchers to develop better detection algorithms. When it comes to deepfakes, there will be an arms race between those who wish to do harm using them (bad actors) and those trying to prevent that harm (good actors). You can be sure “bad actors” will develop their deepfakes whether we release our models or not. By open-sourcing, we give “good actors” more data on which to build their harm-prevention models.
  • Fairness: If we choose not to open-source, only those institutions with access to talent and computational resources can create deepfakes. Speaking from experience, it takes a long time to learn to develop these models, and without open-source software, not many people can do so. This serves to further concentrate power in the hands of the already powerful. Deepfakes can be used in several markets and could have a potential value in the tens of billions of dollars. For example, the dubbing market alone is estimated to be worth over $3.5bn. If only the likes of Google can create deepfakes, then only the likes of Google can benefit from them.
  • Awareness: Deepfakes are advancing at a rapid pace. It is likely that we will soon reach a point where you cannot trust any video you see online unless it is verified in some other way. While many people have a vague idea of this, I do not think many fully understand the implications. As deepfake researchers, it is our responsibility to help educate people. We need to encourage everyone to practise good digital scepticism: to check the sources of any media they see online and to really question its authenticity. Open-source software helps. It becomes much easier to educate when the models are freely available online. If you can create your own deepfakes, you will naturally be on the lookout for others.

The Cons

There are, of course, a lot of potential downsides to open-sourcing deepfake models ranging from the obvious to the more subtle.

  • People will misuse them: Regardless of how much we regulate or how good detection models become, a small group of people will use deepfakes for the worst possible reasons. From revenge porn to disinformation, there are some truly nasty applications of this technology. If we open-source the models, we make them easier for everyone to access, which will undoubtedly cause harm. It is true that some of these bad actors, particularly large criminal or state organisations, would be able to build deepfakes anyway, but many of those seeking to do harm would not have been able to do so without open-source models.
  • Safeguards can be removed: One of the better ways of protecting against the misuse of deepfakes is to introduce safeguards. In particular, methods such as watermarking are being used by most groups that create deepfakes. Watermarking adds data to the created videos in a way that is invisible to humans and most software but can be easily identified by those who hold the “key” to unlock it. This means, for example, that YouTube or Twitter could quickly detect whether a video was created by a deepfake platform and remove it. Because watermarks can only be seen by those given this secret key, bad actors cannot remove them. If we open-source deepfake generation, however, bad actors can simply skip adding the watermarks, making their deepfakes undetectable.
  • One-way sharing: If we consider again an arms race between so-called good and bad actors, we can see another drawback of open-sourcing. If we, as the good actors, open-source all of our software, then the bad actors can build on top of it. Bad actors, on the other hand, will not open-source their models, meaning information is shared in one direction only. This gives bad actors a significant advantage.
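To make the keyed-watermark idea above concrete: the sketch below embeds a pseudorandom bit pattern, seeded by a secret key, into the least-significant bits of a video frame, and detection checks how well those bits match the keyed pattern. This is a deliberately minimal toy illustration of the principle, not the robust watermarking any real deepfake platform uses (production schemes must survive compression and re-encoding); all function names and parameters here are hypothetical.

```python
import numpy as np

def embed_watermark(frame: np.ndarray, key: int) -> np.ndarray:
    """Toy watermark: overwrite each pixel's least-significant bit
    with a pseudorandom bit pattern derived from a secret key."""
    rng = np.random.default_rng(key)
    pattern = rng.integers(0, 2, size=frame.shape, dtype=np.uint8)
    # Clear the LSB of every pixel, then write in the keyed bit.
    return (frame & np.uint8(0xFE)) | pattern

def detect_watermark(frame: np.ndarray, key: int,
                     threshold: float = 0.9) -> bool:
    """Check what fraction of LSBs match the keyed pattern.
    A watermarked frame matches ~100%; an unrelated frame ~50%."""
    rng = np.random.default_rng(key)
    pattern = rng.integers(0, 2, size=frame.shape, dtype=np.uint8)
    match_fraction = np.mean((frame & 1) == pattern)
    return bool(match_fraction > threshold)
```

Only a party holding the key can regenerate the pattern and run the check, which is why a closed pipeline can enforce watermarking, and why an open-sourced one cannot: anyone with the code can simply never call `embed_watermark`.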

Summary

As can be seen, this is not an easy question to answer. There are many pros and cons, and the potential for harm is high in either case. During the course of writing this article, I have had many conversations. One thing that has surprised me is the number of open-source absolutists. An argument I have often heard repeated is that deepfakes are here and there is no putting the genie back in the bottle. In the opinion of many, myself included, we will reach a time when deepfakes are indistinguishable from reality. If by that time we are not all aware enough to question them, we may be in great trouble as bad actors operate without detection. This is a point I think recent calls for pauses in AI development overlook. Yet, while open-sourcing may reduce this long-term harm, it opens the door to short-term harm from those who want to abuse deepfakes but lack the technical skills.

While I remain undecided on the issue of open-sourcing, I feel more confident than ever that this discussion needs to be had, and that, at the least, deepfake research needs to be done in the open and communicated to the public. I strongly encourage everyone to have a say. If you have an opinion I haven’t covered here, or any questions, please either leave a comment or contact me directly; I really do want to hear from as many people as possible.


PhD student researching deep learning for digital humans, particularly on style and speech in human facial animation