Art Guard: Protecting Your Online Images From Generative AI

A lot of artists are worried about generative AI. They are concerned about web crawlers scraping images from their web pages to train AI models without permission or compensation.
I spent the last four weeks researching this topic and found a lot of information on how these models work and how to prevent the bots from stealing your work.
TL;DR – The easiest thing you can do is turn off AI crawler access to your website using settings in your hosting service, such as SquareSpace. More details on this and other steps you can take are below.
In this article, I will provide some background on how text-to-image AI models work, including Stable Diffusion, Midjourney, and DALL-E 3. Next, I will show you how to detect if these models have been using your images for training. Finally, I will give you advice and steps that you can take to prevent the bots from stealing your stuff.
Text-to-image Generative AI Models
One of the first text-to-image AI models, called alignDRAW, was created by Elman Mansimov et al. and described in a 2016 paper called "Generating Images from Captions with Attention." [1] Here’s what the authors said about their model.
…we introduce a conditional alignDRAW model, a generative model of images from captions using a soft attention mechanism. The images generated by our alignDRAW model are refined in a post-processing step by a deterministic Laplacian pyramid adversarial network. We further illustrate how our method, learned on Microsoft COCO, generalizes to captions describing novel scenes that are not seen in the dataset, such as "A stop sign is flying in blue skies" – Elman Mansimov, et al.
Here are three sample images from their paper. Note that the images are very small, 32×32 pixels.

Even at this small size, you can see some red objects in the sky. The objects don’t have octagonal shapes or any discernible letters, but they look a little like stop signs.
Since 2016, a lot has happened with text-to-image AI models. For comparison, here are images from the latest versions of Stable Diffusion [2], Midjourney [3], and DALL-E [4] generated from the same prompt.



These images are much better. They were all generated at 1024×1024 pixels with a lot of details. The word "STOP" is rendered nicely by all three models, a relatively new development. Earlier versions of these models would show shapes that kinda looked like letters but were often not entirely legible. The image on the right by DALL-E 3 looks more like an illustration than a photo, but my prompt was ambiguous, so that’s the choice it made.
Text-Guided Diffusion Models
The current crop of text-to-image generation systems are all variations of text-guided diffusion models. These models are trained to start with random noise and iteratively refine it into a realistic image that matches a given text prompt.
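To make the denoising loop concrete, here is a minimal sketch using the Hugging Face diffusers library with a small, public, unconditional diffusion model; text-guided models add a text conditioning signal at each step, but the core loop is the same.

import torch
from diffusers import DDPMScheduler, UNet2DModel

model_id = "google/ddpm-cat-256"  # small public demo model
unet = UNet2DModel.from_pretrained(model_id)
scheduler = DDPMScheduler.from_pretrained(model_id)
scheduler.set_timesteps(50)  # fewer steps than training, for speed

sample = torch.randn(1, 3, 256, 256)  # start from pure random noise
for t in scheduler.timesteps:
    with torch.no_grad():
        noise_pred = unet(sample, t).sample  # predict the noise at step t
    sample = scheduler.step(noise_pred, t, sample).prev_sample  # denoise a bit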
One of the first text-guided diffusion models is GLIDE [5], which I wrote about in my BIG.art article in 2022. Here’s an overview of the work.
Diffusion models have recently been shown to generate high-quality synthetic images, especially when paired with a guidance technique to trade off diversity for fidelity. We explore diffusion models for the problem of text-conditional image synthesis and compare two different guidance strategies: CLIP guidance and classifier-free guidance. We find that the latter is preferred by human evaluators for both photorealism and caption similarity, and often produces photorealistic samples. – A. Nichol, et al.
Note that CLIP stands for Contrastive Language-Image Pre-training [6], a technique that trains a pair of encoder models that can measure the similarity between a text phrase and an image.
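As a concrete illustration, here is a minimal sketch of scoring text-image similarity with the public CLIP checkpoint on Hugging Face; the filename and caption are placeholders.

from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Load the public CLIP checkpoint (a text encoder and an image encoder)
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("blue_sculpture.jpg")  # placeholder filename
inputs = processor(text=["A blue Cubist sculpture of a man"],
                   images=image, return_tensors="pt", padding=True)
score = model(**inputs).logits_per_image  # higher score = better text-image match
print(score)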
Training Text-to-image Generative AI Models
The first step in training a text-to-image model is gathering training data, which consists of pairs of images with appropriate text captions. Here’s a flow diagram showing the data filtering process used to train the Stable Diffusion 2.1 model [7].

All the images and text used to train Stable Diffusion 2.1 originated on the Internet, which comprises roughly 50 billion web pages. CommonCrawl, a US-based non-profit group, collects datasets of about 3 billion web pages scraped from the Internet using its CCBot, which checks each site’s robots.txt file first to see if it is allowed to scrape the pages. LAION, a German-based group, built their 5B dataset of caption/image pairs by scanning the CommonCrawl dataset for images that met three criteria: each image must have alt-text assigned to it, must meet a minimum size, and the text must match the image with a minimum CLIP score. Stability.AI trained their Stable Diffusion model on images from the LAION 5B dataset, further filtering them to be at least 512×512 pixels and passing an NSFW filter to remove any toxic content. They then trained the model for 1.4 million steps using 32 systems with 8 A100 GPUs in each system.
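Here is an illustrative sketch of those filtering criteria in code; the exact thresholds below are assumptions for illustration, not the precise values LAION or Stability.AI used.

def keep_pair(alt_text, image, clip_score,
              min_side=512, min_clip_score=0.28):
    # The image must have alt-text assigned to it
    if not alt_text or not alt_text.strip():
        return False
    # The image must meet a minimum size (assumed threshold)
    if min(image.size) < min_side:
        return False
    # The text must match the image with a minimum CLIP score (assumed threshold)
    return clip_score >= min_clip_score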
Release Licenses
CommonCrawl limits the use of its "crawled content" under its terms of use, which prohibit using it for any illegal purpose or for the other prohibited uses listed in the terms.
LAION released their 5B dataset under the Creative Commons CC-BY 4.0 license, which means people can use it for commercial purposes, but they must give attribution to the creators of the images.
Stability.AI released its Stable Diffusion 2.1 model under its CreativeML Open RAIL++-M license, allowing commercial use but spelling out specific prohibited uses.
Running Stable Diffusion 2.1
The easiest way to run the model is to go to a free site on Hugging Face, type in a prompt, and hit enter. Here are the results of running the prompt, "A photo of an abstract Cubist bust of a man rendered as a 3D-printed object in blue plastic, on a pedestal, in a room with white walls." I tried it with three versions of Stable Diffusion, SD1.5, SD2.1, and 3.0-medium.



These images look OK but could be better. Yes, they look like 3D-printed busts of men in blue, but they are not exactly Cubist. The general quality of the images improves with each version, and the walls are indeed white.
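If you would rather run the model locally instead of using the hosted demo, here is a minimal sketch using the diffusers library; it assumes a machine with a CUDA GPU.

import torch
from diffusers import StableDiffusionPipeline

# Load Stable Diffusion 2.1 from Hugging Face and move it to the GPU
pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1", torch_dtype=torch.float16)
pipe = pipe.to("cuda")

prompt = ("A photo of an abstract Cubist bust of a man rendered as a "
          "3D-printed object in blue plastic, on a pedestal, in a room "
          "with white walls.")
pipe(prompt).images[0].save("blue_bust.png")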
Now that you know how these text-to-image models work, I’ll show you how to check if the artwork on your website is being used to train these models to mimic your visual style.
Check to See if Gen AI is using Your Artwork
To see if your art was scanned for training generative AI models, you can follow the steps laid out in the filtering process mentioned above.
I will use examples from three artists to illustrate how to do this: Auguste Rodin was a 19th-century French sculptor; Kara Walker is an American artist known for her work with silhouettes; Anna Kristina Goransson is an artist who works with textiles and fiber arts.
Checking CommonCrawl
The first step is to check whether your site was scanned by the CommonCrawl bot. To do that, head to index.commoncrawl.org, choose the latest crawl, and type in your URL.
For example, here are results from the Rodin Museum, RodinMuseum.org.


You can see that CommonCrawl has captured the content from the RodinMuseum.org site. I also checked the websites WalkerArt.org and AnnaKristinaGoransson.com and found that they are in the CommonCrawl dataset, too.
However, here are the results from my website, RobGon.com.


You can see that no captures were found for my site. This is either because the bot wasn’t aware of it or because I turned off AI bot scanning in my site settings. More on this later in the article.
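You can also run this check programmatically against CommonCrawl’s CDX index API; the crawl ID below is an example, so substitute the latest one listed at index.commoncrawl.org.

import requests

crawl_id = "CC-MAIN-2024-33"  # example crawl ID; use the latest one
api = f"https://index.commoncrawl.org/{crawl_id}-index"
resp = requests.get(api, params={"url": "rodinmuseum.org/*",
                                 "output": "json"}, timeout=30)
if resp.ok and resp.text.strip():
    print(f"{len(resp.text.splitlines())} captures found")
else:
    print("No captures found")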
Checking Alt Text
The alt attribute in HTML has been around since 1993. It allows web designers to specify text associated with images that will be shown if the image can’t be displayed. It also allows people with impaired vision to hear the alternate text read aloud using screen reading apps. Here’s what it looks like in HTML.
<html>
  <head></head>
  <body>
    <img src="blue_sculpture.jpg" alt="A blue Cubist sculpture of a man">
  </body>
</html>
In the img tag, the src attribute specifies the name of the image file, and the alt attribute specifies the alternative text.
As I mentioned earlier, the LAION dataset comprises images with captions scraped from the alt text. In other words, if there was no alt text, the images were not collected.
You should check your website to see if you are using alt text for the images of your artwork. Browser plug-ins like the Image Alt Text Viewer for Chrome make it easy to read the alt text on web pages. When you enable the plug-in, it shows the alt text in the images, if there is any.

The Alt Text Viewer shows the alternate text above the images in green if it exists but warns in yellow if there is no alt text. You can also check this in the HTML page source code, but using the viewer plug-in is easier.
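If you prefer to check programmatically, here is a small sketch using requests and BeautifulSoup that flags images with missing alt text; the URL is a placeholder for your own site.

import requests
from bs4 import BeautifulSoup

html = requests.get("https://www.example.com", timeout=30).text
soup = BeautifulSoup(html, "html.parser")
for img in soup.find_all("img"):
    alt = (img.get("alt") or "").strip()
    # Report each image source with its alt text, or flag it as missing
    print(img.get("src"), "->", f'alt="{alt}"' if alt else "MISSING ALT TEXT")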
My advice is to add alt text to images on web pages to help people with impaired vision, but to describe only what is in the image, without identifiers like the name of the artist or the title of the piece. There is an opportunity to include those in captions and descriptive text elsewhere on the page.
Checking if Generative AI Can Mimic an Artist’s Style
The most effective way to see if text-to-image AI models can mimic a particular style is to try it. I used prompts like "a sculpture by <artist’s name>" with Stable Diffusion 3.0, Midjourney v6, and DALL-E 3 to see if these generative AI models could replicate the styles of specific artists.
For reference, here are some real sculptures by Auguste Rodin.



Rodin’s work shows the realism of muscular bodies with irregular but natural supports. Here are the results from the three generative AI models from the prompt, "A sculpture by Auguste Rodin."



All three AI models seemed to generate versions of Rodin’s The Thinker. The images from Stable Diffusion and Midjourney show a sculpture cast from metal, whereas the sculpture from DALL-E appears to be carved in stone. All three show muscled men perched on irregular surfaces, but it’s unclear what’s going on with the legs in the Stable Diffusion version. Nevertheless, all three created sculptures in the style of Auguste Rodin.
Here are some works on paper by Kara Walker, with permission from the artist.



As you can see in the images above, Kara Walker investigates race, gender, sexuality, and violence through silhouetted figures.
Here are the results from the generative AI models from the prompt, "A drawing by Kara Walker."



The generative AI systems seem to know Kara Walker’s style. They all feature silhouetted figures set in a historical context. The Stable Diffusion model only showed a solo figure, whereas Midjourney and DALL-E 3 showed groups of multiple people in an outdoor setting, which is more common in Walker’s work. However, none of the AI systems showed anything overtly violent or sexual, likely because such content was filtered during training and is prohibited by the systems’ terms of service.
Here are three works by Anna Kristina Goransson, with permission from the artist.



Goransson constructs three-dimensional felt sculptures and often dyes them with graded colors, typically presented as wall installations. Let’s see if the AI models can mimic her style with the prompt, "A sculpture by Anna Kristina Goransson."



Neither Stable Diffusion nor Midjourney knows about Goransson’s style; they both generated an abstract sculpture of a woman. Because DALL-E 3 is driven by OpenAI’s GPT-4 model, it seems to know about her work from text sources on the Internet, but it doesn’t really know her particular visual style, and the generated image reflects this. Although it shows some color grading in the 3D forms, the style is quite different.
Protecting Your Artwork from Generative AI
In this section, I will discuss steps you can take to protect your art from being used to train generative AI systems that would mimic your style.
Blocking the Bots
The first and easiest step you can take is to block AI crawlers from grabbing your work. You can do this by editing your site’s robots.txt file, either directly or indirectly through your hosting service’s settings.
For example, here are the SquareSpace settings for my site and the corresponding robots.txt file.


SquareSpace has a setting for Crawlers that allows me to block search engine crawlers, known AI crawlers, or both. In my case, I don’t block search crawlers like Google and Bing, but I do block AI crawlers like the CCBot and others. The robots.txt file shows a list of user agents, which are the names of the various AI crawlers. The "Disallow: /" line below the crawlers tells them not to scrape my site’s pages. Instructions on blocking AI crawlers for other web hosting services are here.
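For reference, here is what a minimal robots.txt of this kind might look like; the user agents below are a few well-known AI crawlers, not the complete list that SquareSpace blocks.

User-agent: CCBot
User-agent: GPTBot
User-agent: Google-Extended
User-agent: anthropic-ai
Disallow: /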
Use Appropriate Alt Text
As I mentioned above, it’s OK to use alt text that describes what’s going on in your images. I advise excluding information that identifies the artist in the alt text. You can use other text on the web page for that.
Here’s a screenshot of the SquareSpace UI for entering alt text.

In SquareSpace, I edited a page, selected an image, and, using the Edit menu, entered text into the Image Alt Text field. The change is automatically saved.
In general, my approach is to provide alt text that describes what’s in the picture, no more, no less.
Countermeasures to Protect Artists
While researching this subject for my article, I learned about several systems that provide countermeasures against generative AI.
Three such systems are Glaze [8], Anti-DreamBooth [9], and Mist [10]. All three systems propose making subtle changes to artists’ images to help prevent mimicry if these images are later used to train or fine-tune diffusion models. Here’s what the authors of Glaze said about their system.
Recent text-to-image diffusion models such as MidJourney and Stable Diffusion threaten to displace many in the professional artist community. In particular, models can learn to mimic the artistic style of specific artists after "fine-tuning" on samples of their art. In this paper, we describe the design, implementation and evaluation of Glaze, a tool that enables artists to apply "style cloaks" to their art before sharing online. These cloaks apply barely perceptible perturbations to images, and when used as training data, mislead generative models that try to mimic a specific artist. – Shawn Shan, et al. [8]
This sounds impressive! The authors made the Glaze system accessible as a Windows and Mac application.
Testing Glaze
To test the Glaze system, I created a series of images with a similar style and fine-tuned Stable Diffusion 2.1 using the term "by robgon-art" in the captions. I then applied Glaze to the images, reran the fine-tuning, and compared the models trained on non-Glazed and Glazed images.
For testing, I created a series of pictures of "Blue Contemplation" sculptures. The images depict busts of people based on one of my previous projects on using AI to create 3D-printed objects. I used DALL-E 3 to create these images. They all show faceted plastic sculptures in blue on a simple pedestal.

Based on my prompts, the system created sculptures of people of different ages, genders, and ethnicities with a distinct visual style.
For example, the prompt for the first image was "Generate a photo of a modern, abstract, 3D-printed cubist bust in blue plastic of a teenage girl from Mexico, 17 years old, dressed in a denim jacket, with long curly hair. The bust is rotated 20 degrees to the right, giving the appearance that she is looking to the right. The sculpture features a smooth surface with subtle indications of assembly from smaller pieces, creating intricate geometric forms and facets. It is displayed on a white pedestal with a wide-angle view, highlighting the detailed shapes and angles of the bust. The background is minimal and clean, drawing full attention to the artwork."
The Glaze application has three strength settings: Low, Default, and High. I used the Default setting for my tests. Here is the resultant image before and after running Glaze.


It’s subtle, but you can see from the images above that the original image on the left looks sharp and has smooth whitespace, while the Glazed image seems a little soft and has a barely noticeable swirling pattern in the white space. You can click on the images to see the details.
Fine-tuning Stable Diffusion
I used the DreamBooth script to fine-tune the Stable Diffusion 2.1 model using the 12 Blue Contemplation sculpture images. I ran the script twice to create two different model variations, one with Glaze protection and one without. During training, I used shorter prompts like this, "a sculpture of a teenage girl from Mexico, 17 years old by robgon-art." The system learned to associate the phrase "by robgon-art" with the visual style of the training images.
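After fine-tuning, generating from the new model looks like the sketch below; the output directory path is a placeholder for wherever the DreamBooth script saved the weights.

import torch
from diffusers import StableDiffusionPipeline

# Load the fine-tuned weights saved by the DreamBooth script
pipe = StableDiffusionPipeline.from_pretrained(
    "./robgon-dreambooth", torch_dtype=torch.float16).to("cuda")
prompt = "a sculpture of a teenage girl from Mexico, 17 years old by robgon-art"
pipe(prompt).images[0].save("blue_contemplation_test.png")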
The Test Results
Here are three images that were not used in the fine-tuning run, for reference. Although there is a slight variation, all three images show a unique visual style.



Below are images generated by Stable Diffusion 2.1 fine-tuned on my Blue Contemplation images. Here are the prompts I used to generate the images.
- "a sculpture of a middle-aged man from South Korea, 40 years old by robgon-art"
- "a sculpture of a middle-aged woman from Nigeria, 45 years old by robgon-art"
- "a sculpture of an elderly man from France, 72 years old by robgon-art"
First, here are images generated by Stable Diffusion 2.1 fine-tuned on the Blue Contemplation images without Glaze protection.



The model picked up on the style of the training images. The quality of these images is not as good as the training images created with DALL-E 3, but the system did a reasonable job of depicting people with the Blue Contemplation style. Also, the contrast for all images was cranked up a bit compared to the training images.
Here are images from Stable Diffusion 2.1 fine-tuned on the same images protected with Glaze using the same prompts listed above.



Running Glaze on the training images had a visible effect on fine-tuning. Although the system picked up the style from the Blue Contemplation series, the results didn’t exactly follow the prompts, and the images were less appealing. For example, there is a discernible noise pattern in the middle image, and the extra hand popping up in the image on the right is bizarre. It seems Glaze didn’t prevent mimicking the visual style, but it made the fine-tuned model less useful.
Criticism of Glaze
I found a research paper critical of Glaze and the other systems that purport to protect artists’ images from mimicry. It has the ominous title "Adversarial Perturbations Cannot Reliably Protect Artists From Generative AI," and it comes from ETH in Zurich, Switzerland [11]. Here’s what the authors said.
Artists are increasingly concerned about advancements in image generation models that can closely replicate their unique artistic styles. In response, several protection tools against style mimicry have been developed that incorporate small adversarial perturbations into artworks published online. In this work, we evaluate the effectiveness of popular protections – with millions of downloads – and show they only provide a false sense of security. … In this work, we show that state-of-the-art style protection tools – Glaze, Mist and Anti-DreamBooth are ineffective when faced with simple robust mimicry methods. – Robert Honig, et al.
In their paper, the authors experiment with various "purification" methods to remove the protections against generative AI. One technique that seems to work well is called "noisy upscaling," in which Gaussian noise is added to the image and a Stable Diffusion upscaler model is then applied.
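Here is a rough sketch of the noisy-upscaling idea using the public Stable Diffusion x4 upscaler; the noise level, prompt, and filename below are assumptions for illustration, not the paper’s exact settings.

import numpy as np
import torch
from diffusers import StableDiffusionUpscalePipeline
from PIL import Image

pipe = StableDiffusionUpscalePipeline.from_pretrained(
    "stabilityai/stable-diffusion-x4-upscaler",
    torch_dtype=torch.float16).to("cuda")

img = Image.open("glazed_art.png").convert("RGB")  # placeholder filename
arr = np.asarray(img).astype(np.float32) / 255.0
arr = np.clip(arr + np.random.normal(scale=0.1, size=arr.shape), 0.0, 1.0)
noisy = Image.fromarray((arr * 255).astype(np.uint8))

# The upscaler denoises as it upscales, washing out the perturbations
purified = pipe(prompt="a photo", image=noisy).images[0]
purified.save("purified.png")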
This is part of an interesting cat-and-mouse game where some researchers are trying to generate images, others are trying to protect images, and yet others are trying to defeat the protections.
Conclusion and Final Thoughts
In this article, I provided some background on how text-to-image AI models were trained and how they work. I also showed some ways to check whether your images were scraped and used to train AI models and some steps to protect your images from generative AI.
I mentioned it twice already, but I’ll say it a third time: The easiest way to protect your work is to block access to AI crawlers on your website. As mentioned above, I recommend checking and updating the alt text for your images. Whether using a protection tool like Glaze is worth the effort is unclear. It would take a lot of effort, and there seem to be relatively easy ways to defeat these protections.
Acknowledgments
I thank Jennifer Lim for her help with this project. I also thank Kara Walker and Anna Kristina Goransson for allowing me to use some of their artwork in this article. Finally, I thank Prof. Ben Zhao from the University of Chicago for answering my questions about Glaze.
References
[1] E. Mansimov, E. Parisotto, J. Lei Ba, and R. Salakhutdinov, Generating Images from Captions with Attention (2016), International Conference on Learning Representations
[2] A. Sauer et al., Fast High-Resolution Image Synthesis with Latent Adversarial Diffusion Distillation (2024)
[3] Midjourney (2024)
[4] J. Betker et al., Improving Image Generation with Better Captions (2024)
[5] A. Nichol et al., GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models (2021)
[6] A. Radford et al., Learning Transferable Visual Models From Natural Language Supervision (2021)
[7] R. Rombach, A. Blattmann, et al., High-Resolution Image Synthesis with Latent Diffusion Models (2022)
[8] S. Shan et al., Glaze: Protecting Artists from Style Mimicry by Text-to-Image Models (2023), 32nd USENIX Security Symposium
[9] T. V. Le et al., Anti-DreamBooth: Protecting Users from Personalized Text-to-Image Synthesis (2023), Proceedings of the IEEE/CVF International Conference on Computer Vision
[10] C. Liang et al., Adversarial Example Does Good: Preventing Painting Imitation from Diffusion Models via Adversarial Examples (2023), International Conference on Machine Learning
[11] R. Honig, J. Rando, N. Carlini, and F. Tramer, Adversarial Perturbations Cannot Reliably Protect Artists From Generative AI (2024)