
For the last four years, I have been experimenting with and writing about the creative uses of AI and ML. I have researched and kept up-to-date with the latest tools for 2D and 3D image editing, music composition, and creative writing. My work has focused on how these tools can assist and enhance various forms of artistic expression.
For this article, I explored the application of AI/ML for image editing, evaluating both commercial and open-source tools. I examined Adobe Photoshop and Runway ML alongside open-source systems like Stable Diffusion 2 and InstructPix2Pix. Each tool offers various image editing features, such as inpainting and style transfer. My goal was to assess their practical applications, usability, and limitations to understand how they can aid in creative expression. I experimented by editing images I had created for prior articles, showing the potential for these tools to enhance and modify existing artwork.
In the sections below, I will show how AI models can be used to inpaint and edit images with text prompts. I will also discuss the license agreements of these systems and the societal impact of using AI to edit pictures. I will provide some final thoughts at the end of the article.

Inpainting Images
The first feature I looked at is inpainting images with text prompts. This capability allows creators to modify visuals by selecting an area of an image and describing the desired change in text form. I examined three systems: Stable Diffusion 2 Inpainting (an open-source model), Runway ML’s Erase and Replace, and Adobe Photoshop’s Generative Fill (both commercial offerings). I’ll describe each system using an example image from my article "Digital Art Showdown" from November 2022, in which I created a picture of a "splatter painting with thin yellow and black lines" using Midjourney version 3. After describing the three systems, I will provide some side-by-side examples for comparison.
Stable Diffusion 2 Inpainting
The first system I looked at is Stable Diffusion 2, an open-source AI model that generates and inpaints images guided by user-specified image areas and text prompts. The model builds on the latent diffusion work of CompVis, the Computer Vision and Learning research group at Ludwig Maximilian University of Munich, described in their paper "High-Resolution Image Synthesis with Latent Diffusion Models" [1]. Here’s what the authors said in the paper.
To enable [diffusion model] (DM) training on limited computational resources while retaining their quality and flexibility, we apply them in the latent space of powerful pretrained autoencoders. In contrast to previous work, training diffusion models on such a representation allows for the first time to reach a near-optimal point between complexity reduction and detail preservation, greatly boosting visual fidelity. By introducing cross-attention layers into the model architecture, we turn diffusion models into powerful and flexible generators for general conditioning inputs such as text or bounding boxes and high-resolution synthesis becomes possible in a convolutional manner. Our latent diffusion models (LDMs) achieve new state-of-the-art scores for image inpainting and class-conditional image synthesis … while significantly reducing computational requirements compared to pixel-based DMs. – Robin Rombach, et al.
Autoencoders are ML models that compress images into an abstract "latent" space and then reconstruct close approximations of the original images as output. Cross-attention layers in diffusion models inject information from the conditioning input, such as a text prompt, into the image-generation process.
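To make the idea concrete, here is a minimal sketch of a convolutional autoencoder in PyTorch. This is my own illustrative toy example, not the actual autoencoder used in Stable Diffusion, but it shows the encode-to-latent / decode-back pattern that latent diffusion models build on.

```python
# Toy autoencoder sketch (illustrative only, not Stable Diffusion's autoencoder).
import torch
from torch import nn

class TinyAutoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        # Encoder: compress a 3x256x256 image into a compact 4x32x32 latent tensor.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=4, stride=2, padding=1),   # 256 -> 128
            nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2, padding=1),  # 128 -> 64
            nn.ReLU(),
            nn.Conv2d(64, 4, kernel_size=4, stride=2, padding=1),   # 64 -> 32
        )
        # Decoder: reconstruct an approximation of the image from the latent.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(4, 64, kernel_size=4, stride=2, padding=1),   # 32 -> 64
            nn.ReLU(),
            nn.ConvTranspose2d(64, 32, kernel_size=4, stride=2, padding=1),  # 64 -> 128
            nn.ReLU(),
            nn.ConvTranspose2d(32, 3, kernel_size=4, stride=2, padding=1),   # 128 -> 256
            nn.Sigmoid(),  # keep pixel values in [0, 1]
        )

    def forward(self, images):
        latents = self.encoder(images)           # abstract latent representation
        reconstructions = self.decoder(latents)  # approximate reconstruction
        return latents, reconstructions

model = TinyAutoencoder()
batch = torch.rand(1, 3, 256, 256)  # one random "image" for demonstration
latents, recon = model(batch)
print(latents.shape, recon.shape)   # [1, 4, 32, 32] and [1, 3, 256, 256]
```

A latent diffusion model runs its denoising process on tensors shaped like the latents above rather than on full-resolution pixels, which is where the computational savings described in the quote come from.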
To demonstrate the model’s abilities, I built a Google Colab that takes an input image, a user-specified area, and a text prompt. I added another black splatter to the abstract painting. The Stable Diffusion 2 model generated three versions of the modified area. Here are some screenshots of the Colab.


The screenshot on the left shows how I specified the area for the new black splotch in white. The screenshot on the right shows how I used the prompt "black ink splot with small white speckles" to generate three output images with Stable Diffusion 2. I liked the first output image the best. Here is a comparison of the image before and after the modification.


The new paint splotch is in the upper right. It mostly follows the style of the other splatters in the painting, including the white speckling details. The model made an unexpected creative choice to show some of the light gray background color underneath the splotch, but overall, the modification seems aesthetically pleasing.
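If you want to reproduce this step, the heart of such a Colab can be just a few lines using the Hugging Face diffusers library. The sketch below is a simplified version of the approach, assuming the published stabilityai/stable-diffusion-2-inpainting checkpoint; the image and mask file names are placeholders.

```python
# Minimal sketch of inpainting with Stable Diffusion 2 via Hugging Face diffusers.
# pip install diffusers transformers accelerate torch pillow
import torch
from diffusers import StableDiffusionInpaintPipeline
from PIL import Image

# Load the Stable Diffusion 2 inpainting checkpoint (fits on a Colab T4 GPU in fp16).
pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-inpainting",
    torch_dtype=torch.float16,
).to("cuda")

# The source image and a mask: white pixels mark the area to repaint.
# (Placeholder file names.)
image = Image.open("splatter_painting.png").convert("RGB").resize((512, 512))
mask = Image.open("splatter_mask.png").convert("RGB").resize((512, 512))

# Generate three candidate edits for the masked region.
results = pipe(
    prompt="black ink splot with small white speckles",
    image=image,
    mask_image=mask,
    num_images_per_prompt=3,
)
for i, out in enumerate(results.images):
    out.save(f"inpainted_{i}.png")
```

The mask follows the same convention as my screenshot above: white pixels mark the region to repaint, and the rest of the image is left untouched.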
Runway ML Erase and Replace
The second inpainting system I tested was the Erase and Replace feature in Runway ML’s online service. Runway ML has many AI-based tools for creating and editing images and videos. Users can sign up for free and get 105 image generations. If they need more generations, various subscription plans start at US$12 per month.
I uploaded the original yellow and black splatter painting, chose Edit Images, and then opened the Erase and Replace feature. I selected the area for the new element, typed in the prompt "black ink splot with small white speckles," and clicked the Generate button. The system then generated four options for me to choose from.


I selected the best option and downloaded the modified image. Here are the before and after images.


This new paint splotch looks really good! It matches the style of the splatters in the original image. The only minor issue is that the yellow color of the modified area doesn’t match the surrounding color. It’s subtle but noticeable, and I could quickly fix it in an image-editing program like Adobe Photoshop.
Adobe Photoshop
The third system I tested was Adobe Photoshop, widely recognized for its extensive digital image editing capabilities, including selection tools, layering options, and numerous filters and adjustments. It has been shipping commercially since February 1990. Adobe offers a seven-day free trial, and monthly subscriptions start at US$23.
I used the Photoshop 25.11 beta version to try out their new Firefly Image 3 Model for generative fill. However, before using the feature, I needed to agree to Adobe’s user guidelines for generative AI. I’ll discuss ethical considerations further in the section below.

After agreeing to the guidelines, I imported the splatter painting into Photoshop and selected the area to be modified. Next, I entered "black ink splot with small white speckles" into the Contextual Task Bar and clicked the Generate button. The system created a new layer with three options.


I chose the third variation, shown in the Layers panel, which looked the best to me. Here are the before and after images.


This paint splotch looks good, too. The style is a little different from the other splatters; it seems to have been rendered as a close-up, without matching the scale of the rest of the image. However, the yellow background color matches perfectly.
Comparison of the Three Systems
I tested the three systems using examples from prior articles. For each example, I show the original image alongside the three modifications from Stable Diffusion 2, Runway ML, and Photoshop, with a second row showing a detailed view of the changed area in each.
Splatter Painting
As mentioned above, the first example shows modifications to a digital abstract painting from my 2022 article "Digital Art Showdown." I created the image with Midjourney v3 using the prompt "splatter painting with thin yellow and black lines."


The original image is on the left, with the modification area marked with a gray box.
Stable Diffusion 2 produced an aesthetically pleasing result in the second column. It integrated the new paint splatter into the original artwork, though it introduced an unexpected light gray background color.
The best results came from Runway ML’s Erase and Replace feature in the third column. Despite the slight mismatch in the yellow background color, the generated splatter matched the style and detail of the original image.
Adobe Photoshop’s Generative Fill in the fourth column provided a well-matched yellow background. Still, the scale of the paint splatter differed slightly from the other elements, making it appear more of a close-up.
Freddy the Fox
The second example is from my article "Using ChatGPT as a Creative Writing Partner—Part 3: Picture Books," which featured illustrations from Midjourney v4 with prompts written by GPT-3.5 Turbo. I used the two systems together to create a children’s book called Freddy the Fox and the Magical Forest. Here is the prompt for the cover illustration.
Create an illustration for the cover of a picture book. Show a talking fox, who is brave and adventurous, standing in the middle of a magical forest. A young girl is standing next to the fox, looking up at him with a look of wonder on her face. The girl has red hair, is around 8 years old, and is wearing a green dress. The background is a beautiful, colorful sunset, with the trees and other elements of the forest casting long shadows across the ground. The illustration should be eye-catching and full of whimsy and wonder, inviting young readers to dive into the magical world within the pages of the book.
You can see the resultant image from Midjourney v4 below. Note that the fox does not have a pupil for his eye. I used Stable Diffusion 2, Runway ML, and Photoshop to generate the fox’s pupil.


The results are shown in the first row of images, with the corresponding areas of detail below. The first column shows the original image.
The second column shows the results from Stable Diffusion 2. The new eye looks excellent. It appears that the fox is looking at the girl in the picture. The eye’s style fits in well with the rest of the illustration.
The third column shows the results from Runway ML. The blue pupil looks OK but not very intense. It seemed like the system pasted a generic pupil into the picture.
The fourth column shows how Photoshop handled the prompt. The color choices are more in line with the illustration, but it also looks like a copy/paste job.
Cubist Head Sculpture
My third and final example came from my article "Molding the Imagination: Using AI to Create New 3D-Printable Objects," published in February 2024. One of the examples in that article used a text-to-3D AI model called MVDream. I used the prompt "a 3d-printed Cubist-styled sculpture of a male bust, in light-gray plastic, on a simple light-gray pedestal, dark-gray background" to generate views of a physical object.
The image came out well, but the nose of the sculpture lacked texture. The original image is on the left, followed by modifications from Stable Diffusion 2, Runway ML, and Photoshop.


In the second column, the results from Stable Diffusion 2 added a textured pattern to the nose that fit in with the existing sculpture’s design, although the added texture didn’t cover the entire nose.
In the third column, Runway ML’s Erase and Replace feature produced the best result by enhancing the nose with a more pronounced and cohesive texture that matched the geometric style of the original sculpture.
In the fourth column, Adobe Photoshop’s Generative Fill also added texture to the nose, but it appeared less prominent.
Overall, Runway ML provided the most aesthetically pleasing and consistent modification for this example.

Editing Images with Text Prompts
The second set of experiments I ran involved editing images with text prompts. Unlike inpainting, these AI models change every pixel in the input image to follow the prompt’s instructions while keeping the general form of the input image. I looked at two systems, the open-source InstructPix2Pix and the commercial Runway ML Image to Image feature.
My first example uses an image from my first article on Medium, "MachineRay: Using AI to Create Abstract Art," published in August 2020, where I trained a Generative Adversarial Network (GAN) on public domain paintings from wikiart.org.
InstructPix2Pix
A team of researchers from UC Berkeley developed the InstructPix2Pix model and wrote about it in their paper, "InstructPix2Pix: Learning to Follow Image Editing Instructions" [2]. Here’s what they said about the model.
We propose a method for editing images from human instructions: given an input image and a written instruction that tells the model what to do, our model follows these instructions to edit the image. … Since it performs edits in the forward pass and does not require per-example fine-tuning or inversion, our model edits images quickly, in a matter of seconds. We show compelling editing results for a diverse collection of input images and written instructions. – Tim Brooks et al.
The authors based their model on the Stable Diffusion model mentioned above. Instead of creating a new image or inpainting part of an image, they designed it to modify an entire image based on a written instruction.
I created a Google Colab to experiment with InstructPix2Pix. Note that there are two parameters to adjust: the image weight and the text weight, known in the literature as "classifier-free guidance" scales. Increasing the image weight keeps more of the original image intact, while increasing the text weight emphasizes following the text prompt.
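For reference, here is a minimal sketch of how those two weights map onto the Hugging Face diffusers implementation of InstructPix2Pix: the text weight corresponds to the guidance_scale argument and the image weight to image_guidance_scale. The input file name below is a placeholder.

```python
# Minimal sketch of text-prompt editing with InstructPix2Pix via diffusers.
# pip install diffusers transformers accelerate torch pillow
import torch
from diffusers import StableDiffusionInstructPix2PixPipeline
from PIL import Image

# Load the pretrained InstructPix2Pix checkpoint published by the authors.
pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(
    "timbrooks/instruct-pix2pix",
    torch_dtype=torch.float16,
).to("cuda")

# The image to edit (placeholder file name).
image = Image.open("machineray_abstract.png").convert("RGB")

# guidance_scale is the text weight; image_guidance_scale is the image weight.
results = pipe(
    prompt="an abstract oil painting with swirling texture",
    image=image,
    guidance_scale=7.5,        # how strongly to follow the text instruction
    image_guidance_scale=1.0,  # how strongly to stay close to the input image
    num_images_per_prompt=3,
)
for i, out in enumerate(results.images):
    out.save(f"edited_{i}.png")
```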
For this example, I used the prompt "an abstract oil painting with swirling texture" and set the image weight to 1.0 and the text weight to 7.5, quite a bit higher. Here are the results.

I set the number of samples to three, showing the original image on the left and three edited variations to the right. I liked the first variation, so I downloaded it. Here are the original and modified images.


The results look fantastic! I liked how InstructPix2Pix kept the general form of the image but added the paint-like swirls of texture. It looks a lot like an oil painting.
Runway ML Image to Image
The other image editing system I tested was the Image to Image feature in Runway ML’s online service. I uploaded the original abstract painting, chose Generate Images, and opened the Image to Image feature. I loaded the source image and typed in the prompt "an abstract oil painting with swirling texture in red and blue with saturated colors." Note that I added the bit about the colors because otherwise the system seemed to ignore the colors in the original image. I then clicked the Generate button, and the system generated three options for me to choose from.

I chose the second image because it had the most contrast. Here are the original and modified images.


OK, this is quite different from the original abstract painting. It seems that the colors from the original image didn’t come over at all, but a trace of the form did. It looks like an abstract oil painting with swirling texture and refined details.
Comparison of the Two Systems
Here are the results of my experiments with InstructPix2Pix and Runway ML’s Image to Image feature using three images from my prior articles.
Abstract Painting in Red and Blue
Here is the original painting from MachineRay compared to the modifications of the two image editing systems.



InstructPix2Pix turned the image into a vibrant abstract oil painting with a swirling texture while mostly maintaining the original color forms. The changes look good, with added depth and fluidity to enhance the look.
Runway ML’s Image to Image feature also created an abstract oil painting with a swirling texture but introduced a new color scheme dominated by muted reds. This modification diverged more from the original, emphasizing the swirling textures to produce more details.
Overall, InstructPix2Pix preserved more of the original image’s essence, while Runway ML’s Image to Image provided a more dramatic change in terms of color and texture.
Contemplation in Blue
The second image editing example is from my most recent article, "Contemplation in Blue: how to design large 3D sculptures with AI," which describes my use of DALL-E 3 to design a cubist sculpture I constructed with a 3D printer. Below is a photograph of the finished piece next to images created with InstructPix2Pix and Runway ML’s Image to Image feature using the prompt, "a brass sculpture of a cubist head."



InstructPix2Pix transformed the sculpture into a dull brass version while maintaining the cubist design and structure. The subtle change gave the sculpture a metallic look without significantly altering its original form.
Runway ML’s Image to Image feature produced a shiny brass sculpture with a more pronounced transformation, including turning the wood pedestal into brass. The model softened the cubist features, smoothing the edges for a polished look.
Overall, InstructPix2Pix preserved the essence of the original sculpture with a slight material change, while Runway ML offered a more dramatic transformation into a brass sculpture.
Cubist Mona Lisa
For my third and final example, I ran an experiment I first tried in my article "Exploring Midjourney V4 for Creating Digital Art," published in December 2022.
Below is the original painting, followed by modifications by InstructPix2Pix and Runway ML using the prompt "cubist painting."



These results are very different. InstructPix2Pix mostly kept the general form of the original painting intact but turned the macro-level features into angular geometric shapes. It also boosted the saturation levels of the color scheme.
The Runway ML system seemed to have reimagined Da Vinci’s painting to give it a much more modern look. It rebuilt the basic form with curved shapes and superimposed blocks of color, perhaps in response to the "cubist" prompt.
Ethical Considerations
Ethical considerations are important when using AI for image editing. Here are excerpts from the authors’ and vendors’ statements, with my commentary.
Stable Diffusion v2 Inpainting
The creators of Stable Diffusion v2 acknowledge potential biases in their models. Here is an excerpt from the authors’ model card on Hugging Face.
While the capabilities of image generation models are impressive, they can also reinforce or exacerbate social biases. Stable Diffusion v2 was primarily trained on subsets of LAION-2B(en), which consists of images that are limited to English descriptions. Texts and images from communities and cultures that use other languages are likely to be insufficiently accounted for. This affects the overall output of the model, as white and western cultures are often set as the default. Further, the ability of the model to generate content with non-English prompts is significantly worse than with English-language prompts. Stable Diffusion v2 mirrors and exacerbates biases to such a degree that viewer discretion must be advised irrespective of the input or its intent. – Robin Rombach, et al.
These points highlight the importance of considering social biases in AI-generated content and the need for careful use and evaluation of these models.
Runway ML
Runway ML is mostly silent on the ethical use of AI to create content and its potential societal impacts. The only pertinent information is a boilerplate section of their Terms of Use.
You acknowledge that all images, video, audio, audio-visual, text, materials and other content, is the sole responsibility of the party from whom such Content originated. This means that you, and not Company, are entirely responsible for all Content that you upload, post, e-mail, transmit or otherwise make available through Company Properties, and that you and other Registered Users of Company Properties, and not Company, are similarly responsible for all Content that you and they Make Available through Company Properties. Your Content also includes any Content that you create or generate through the use of the Company Properties, other than the Company Properties themselves.
I’m not a lawyer, but this seems to suggest that the end users are solely responsible for the content they create with Runway ML’s services.
Adobe Photoshop
Unlike Runway ML, Adobe has a lot to say on this subject. For example, they have a dedicated landing page explaining their take on the ethics of AI.
AI is transforming the way we create, work, and communicate. By taking a thoughtful and comprehensive approach to AI ethics, Adobe is committed to ensuring this technology is developed responsibly and respects our customers and our communities. – Dana Rao, Adobe
They also posted a lengthy whitepaper on AI ethics at Adobe, focusing on responsibility, accountability, and transparency. Here’s an excerpt from the paper.
… we recognize the potential challenges inherent in this powerful technology. AI systems are based on data, and that data can be biased. AI systems trained on biased data can unintentionally discriminate or disparage, or otherwise cause our customers to feel less valued. Therefore, we are committed to maintaining a principled and ethically sound approach to ensure our work stays aligned with our intended outcomes and consistent with our values. And we are actively participating in government discussions around the world to shape AI Ethics regulation for the good of the consumer and effectiveness in the industry. – Adobe
As mentioned above, Adobe posted AI user guidelines for their generative systems. These guidelines prohibit using their generative AI features to train other AI models. They prohibit the use of their models to create images that contain abusive, illegal, or hateful content. This includes things like porn, violence, and fraud. They also advise users to be authentic and respectful and to use good judgment when using their system.
InstructPix2Pix
The authors added this brief note to their paper.
… there are well-documented biases in the data and the pretrained models that our method is based upon, and therefore the edited images from our method may inherit these biases or introduce other biases. – Tim Brooks, et al.
This highlights the ongoing need for awareness and mitigation of biases in AI development and application.
Final Thoughts
I found that these AI models were very useful for editing images. In general, the results from Runway ML’s Erase and Replace and Image to Image features were excellent. I think the US$12 monthly cost is reasonable, especially with the other features like super-resolution image resizing and video generation. There is certainly more convenience with Photoshop’s built-in Generative AI features, but the US$23 per month seems a bit high to me. The open-source models performed well, and they’re effectively free. (Note that both models I tested run on the T4 GPU available in Google Colab’s free tier.) Using a Colab for a user interface is a bit wonky, but you can’t beat the price. 🙂
I really like Adobe’s approach to addressing the ethical issues associated with using generative AI systems. They are doing much more than describing the biases in their models. They are promoting transparency and a shared commitment to using AI models responsibly. I think Adobe’s guidelines are excellent, and more companies and researchers should adopt similar policies.
Source Code and Colab
This project’s code is available on GitHub. I released the software under the MIT open-source license.

Acknowledgments
I want to thank Jennifer Lim for her help with this project.
References
[1] R. Rombach, A. Blattmann, D. Lorenz, P. Esser, and B. Ommer, High-Resolution Image Synthesis With Latent Diffusion Models (2022), Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2022, pp. 10684–10695
[2] T. Brooks, A. Holynski, and A. A. Efros, InstructPix2Pix: Learning to Follow Image Editing Instructions (2023), Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2023, pp. 18392–18402