
In this post, we will learn how to use Python for basic web scraping, image manipulation, and string handling while playing a game known as ‘fake album covers’. The idea of the game is to generate a fake cover for your own band’s CD by following the steps below –
- Get a random image from Lorem Picsum, which will be the cover of the album.
- The band name on the cover will be the title of a random Wikipedia page.
- The album name is supposed to be the last four words of the very last quote on a random page from a quotations site. That random link doesn’t work anymore, and I found on Reddit that designers now use a Wikiquote page instead. So for this post I have used the title of a randomly generated Wikiquote page.
To start the game we need to get the image from the site mentioned above, and for that we need some very simple web scraping with Python. The idea of web scraping, in short, is that instead of a human browsing and copy-pasting relevant information into a document, a computer program does the same thing, much faster and more accurately, saving time and effort. Even though we are not using web scraping here for data analysis, it is still useful to remember that websites are mostly written in HTML. These are structured documents, not a familiar CSV or JSON format that can be handled directly with a data-frame library like pandas.
To start with, the first question is which library will help us with web scraping in Python. Three of the most used libraries are BeautifulSoup, Requests, and Scrapy. Here we will learn to use Requests, which lets us send HTTP requests to fetch HTML files.
First we can start off by getting a random picture from the Lorem Picsum website –
import requests
response_obj = requests.get('https://picsum.photos/g/500/?random')
Here we created a response object ‘response_obj’ using requests.get. Now, to get the information from this object, we need to access the response body, and for that we will use content. Below is an example of saving the response body as a .png image –
name = 'first_pic.png'
with open(name, 'wb') as raw_file:
    # 'wb': write binary; binary mode is necessary because it is a PNG file
    raw_file.write(response_obj.content)
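As a side note, the write-binary pattern can be sketched without any network call; the byte string below is just the standard 8-byte PNG file signature, used here for illustration –

```python
# A minimal sketch (no network needed) of the same write-binary pattern:
# 'wb' writes raw bytes exactly as received, which is what a PNG needs.
data = b'\x89PNG\r\n\x1a\n'          # the 8-byte PNG file signature
with open('demo.png', 'wb') as f:
    f.write(data)

with open('demo.png', 'rb') as f:    # read it back in binary mode
    header = f.read(8)
```

Opening the file in text mode instead would risk newline translation mangling the image bytes.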
I got a random image as below

Now, as a rule of the game, we need to put the album name and band name on this image, and for that we will use the Python Imaging Library (PIL).
from PIL import Image
First, open and identify the image file using Image.open –
img = Image.open("first_pic.png")
Since we want to draw (write) text on this image to complete the cover, we will use the ImageDraw module from the PIL library.
from PIL import ImageDraw
draw = ImageDraw.Draw(img)
This 'draw' object can be used later for inserting text.
The next objective is to specify the path to the font files we will use to write the band name and album name; I encourage you to play around with font styles and be creative. To load a font file and create a font object, we again use the PIL library’s ImageFont module.
from PIL import ImageFont
band_name_font = ImageFont.truetype("path_to_font/font1.ttf", 25)
album_name_font = ImageFont.truetype("path_to_font/font2.ttf", 20)
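In case you don’t have a .ttf file handy, here is a small sketch using the bitmap font Pillow ships with via ImageFont.load_default() (its size is fixed, though); the image, text, and coordinates are made up for illustration –

```python
from PIL import Image, ImageDraw, ImageFont

# Sketch: create a grayscale test canvas and draw text with the
# built-in fallback font instead of a .ttf file.
img = Image.new('L', (500, 500), color=128)   # 'L' = 8-bit grayscale
draw = ImageDraw.Draw(img)
fallback_font = ImageFont.load_default()
draw.text((20, 20), 'My Fake Band', fill=255, font=fallback_font)
```

For the real cover you will of course want a proper TrueType font, since truetype() lets you control the size.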
Before we place the text using these fonts, we need to turn to web scraping again to extract text from websites written in HTML.
As before with the image, we create a response object for the random Wikipedia page –
wikipedia_link='https://en.wikipedia.org/wiki/Special:Random'
r_wiki = requests.get(wikipedia_link)
To read the content of the response, one can use text as below –
wiki_page = r_wiki.text
The Requests library automatically decodes content from the server. When we make a request, Requests makes an educated guess about the encoding of the response based on the HTTP headers, which we can inspect using r_wiki.headers. The text encoding guessed by Requests is used when we access r_wiki.text. Let’s check the encoding type –
print "encoding type", r_wiki.encoding
>>> encoding type UTF-8
print type(wiki_page)
>>> <type 'unicode'>
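The relationship between content and text can be sketched without any network call: text is essentially content decoded with the guessed encoding. The byte string below is made up for illustration –

```python
# Sketch: r_wiki.text is essentially r_wiki.content decoded with the
# guessed encoding. Mimic that step with hard-coded UTF-8 bytes.
raw_bytes = b'Menyamya District \xe2\x80\x93 demo'  # \xe2\x80\x93 is a UTF-8 en dash
decoded = raw_bytes.decode('utf-8')
```

If the server's headers had declared a different charset, Requests would have used that instead of UTF-8.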
Let’s see what the decoded content looks like –
print (wiki_page)
I get an output as shown below

Our idea is to use the title of the Wiki page as the name of the band. As you can see in the above output, the title of the article is surrounded by the <title> and </title> XML nodes. Before slicing it out, let’s convert the Unicode text to a plain string –
wiki_page = wiki_page.encode("ascii", errors='ignore')
print type(wiki_page)
>>> <type 'str'>
After this we are ready to use the string method find() to locate the tags and then just slice out the specific part of the XML, as below –
xml_b = wiki_page.find('<title>')     # index where <title> begins
xml_e = wiki_page.find('</title>')    # index where </title> begins
close_tag_len = len('</title>')
total_len = xml_e + close_tag_len     # index just past the closing tag
title_wiki = wiki_page[xml_b:total_len]
print "title including wiki: ", title_wiki
>>> title including wiki: <title>Menyamya District - Wikipedia</title>
The remaining task is to get rid of the unnecessary parts so that we are left with only the title. You can take this as a small assignment in string manipulation with Python, as there are tons of ways to do it; check the GitHub page later to compare solutions. We follow the same procedure for the Wikiquote page and extract its title, which will be our album name.
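If you want to compare notes afterwards, here is one of many possible clean-ups (yours may well differ), applied to the title string from the run above –

```python
# One possible solution: strip the tags and Wikipedia's " - Wikipedia"
# suffix with replace(), then strip() any leftover whitespace.
title_wiki = '<title>Menyamya District - Wikipedia</title>'
band_name = (title_wiki
             .replace('<title>', '')
             .replace('</title>', '')
             .replace(' - Wikipedia', '')
             .strip())
```

Other approaches include slicing by index or splitting on the tag characters.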
Finally, once we have the band name and album name from web scraping, we write (draw) them as text on the image file. For this, we use the text method on the draw object we defined before. Since I’m working only with grayscale images, I can play around with text colors; but if you want a colored image as the fake cover, the best option is to write the text in a way that keeps it visible against a background of any color. Such a format is provided by Alec Bennett, and you can follow it and improvise. Once done, we are ready to check our fake CD cover. Remember that the image, band name, and album name are all subject to change on every run, so have fun! Below is the one I got in one such run.

While playing this fun game, what have we learnt so far –
- Web scraping using the Requests module.
- String handling: basic operations like replace, strip, etc. (assignment for you).
- Image manipulation using the PIL library.
Quite a lot, isn’t it? The full notebook is available on my GitHub page. This whole game was part of the final assignment in the Python for Applied Data Science course on Coursera, and as always, these courses contain some fantastic lab sessions for playing and learning with Python. However, instead of the Wikiquote page, we used a Wikipedia page twice in the assignment, to generate both the album and band names.
Stay tuned for more!
About Me: I was involved in the science analysis of the CALET electron spectrum to search for Dark Matter signatures. Find me on LinkedIn.