Data Scraping Android Apps

Analyzing App Network Traffic with apk-mitm and mitmproxy

Roger Pharr
Towards Data Science


Apps, windshields and web pages can all be scraped

I’ve seen more than one web scraping article on here lately, so I thought I’d add to the canon with a quick article on app scraping — or scrapping.

If you want to get your hands on the data that your app is consuming, you need to be able to see where it’s coming from. This means examining network traffic. On Android, Google puts up a few barriers to doing this. Luckily for us, the community has our back. I’m going to show you how to chain together a few tools to get the treasure you seek.

Besides data scraping, you may want to look at your app’s network traffic to see what data it is collecting on you. I’m pretty paranoid about the information my devices emit, so I want to know what these things are trying to get from me and how.

Thanks to Niklas Higi (shroudedcode on GitHub), we’ve got a dead simple, cross-platform way to see this traffic for Android apps. He released a tool called apk-mitm that patches the network security configuration inside your app’s APK file so that the app trusts user-installed certificates and its HTTPS traffic can be inspected. I’m going to show you how to use it now.

Ingredients

Instructions

1. Create and Start Your Android Virtual Device: Inside Android Studio, start the AVD Manager from the Tools menu to create your emulator. There are no special settings you need to worry about here; you can use any device and Android version. I picked a Pixel 3 with the preview version of Android (named “R”). When I was done, I saw a screen like the one below.
Your AVD is ready to go

2. Get the APK File and Modify It with apk-mitm: Next, we need an APK file to run on our emulator. I just searched for a site that offers APK downloads and came across https://apkgk.com. There I found and downloaded the file I needed. Once it’s downloaded, using apk-mitm is just one line.

npx apk-mitm <file-name>
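If you want to script this step, here is a minimal Python sketch that wraps the same command. It assumes Node.js (which provides npx) is installed, and it assumes apk-mitm writes its result next to the input with a "-patched" suffix, so check the tool's own output for the real path. The function names are mine, not part of apk-mitm.

```python
import shutil
import subprocess
from pathlib import Path


def patched_apk_path(apk):
    """Guess where the patched APK lands (assumed '<name>-patched.apk' convention)."""
    apk = Path(apk)
    return apk.with_name(f"{apk.stem}-patched{apk.suffix}")


def build_command(apk):
    """Return the same command shown above as an argument list."""
    return ["npx", "apk-mitm", str(apk)]


def patch_apk(apk):
    """Run apk-mitm on the given APK and return the expected output path."""
    if shutil.which("npx") is None:
        raise RuntimeError("npx not found; install Node.js first")
    subprocess.run(build_command(apk), check=True)
    return patched_apk_path(apk)
```

Calling patch_apk("example.apk") would run the one-liner above and hand back the path you'll need again in step 6.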

3. Set Up the Proxy: Of the mitmproxy tools, I prefer to use mitmweb, since I never learned the keyboard shortcuts for the terminal interface. To start it on Ubuntu, I open a terminal, navigate to where I downloaded the tools, and type “./mitmweb”. Once it’s started, you should have a screen like below.

./mitmweb
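Before moving on, it can help to confirm that the proxy is really listening. This is a small sketch using only the Python standard library. It assumes mitmweb is running on the same machine with its default proxy port of 8080; the helper name is made up for this example.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


def describe_proxy(host="127.0.0.1", port=8080):
    """Build a one-line status message for the assumed proxy address."""
    state = "listening" if is_port_open(host, port) else "not reachable"
    return f"proxy at {host}:{port} is {state}"
```

Printing describe_proxy() after starting mitmweb tells you whether the emulator will have something to talk to.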

4. Set Your AVD to Use the Proxy: Open the emulator’s extended controls (the three-dot button on the emulator toolbar), go to Settings, then Proxy. Choose the manual proxy configuration, use 127.0.0.1 as the host name and 8080 as the port number, then hit Apply. You may have to reboot your AVD at this point; long press the power button to do so.

easy as one, two, three
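If you'd rather skip the settings dialog, the emulator can also be launched from the command line with its -http-proxy option. The sketch below only builds and starts that command from Python. It assumes the emulator binary from the Android SDK is on your PATH, and the AVD name in the usage note is a placeholder for whatever you named yours.

```python
import subprocess


def emulator_command(avd_name, proxy_host="127.0.0.1", proxy_port=8080):
    """Build an emulator launch command that routes traffic through the proxy."""
    proxy_url = f"http://{proxy_host}:{proxy_port}"
    return ["emulator", "-avd", avd_name, "-http-proxy", proxy_url]


def start_emulator(avd_name):
    """Start the emulator in the background and return the process handle."""
    # Popen returns immediately, so the emulator keeps running alongside mitmweb.
    return subprocess.Popen(emulator_command(avd_name))
```

For example, start_emulator("Pixel_3_API_R") would boot a hypothetical AVD of that name with the same host and port as the manual setup above.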

5. Install the mitmproxy CA Certificate on the AVD: In your Android emulator, open the browser and navigate to http://mitm.it. Then click the Android icon to download the certificate. It won’t install automatically, so you have to go to Settings -> Security -> Encryption & credentials -> Install a certificate -> CA certificate to install the certificate you just downloaded. You’ll know you’ve done it right if mitmproxy is in your Trusted Credentials.

Trust is a relative term

6. Install the App and Start Viewing Traffic: Now we get to the good part. Install the app by dragging the patched APK you made in step 2 onto the screen of your device emulator. Open it up and you should start seeing some traffic in the mitmweb screen. Here’s a JSON file this one is grabbing from AWS S3. I bet I could get the same file using Python if I wanted.
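Scrolling through every flow by hand gets old quickly, so here is a small mitmproxy addon sketch that prints the URL of each JSON response as it passes through. The file name log_json.py and the class name are my own choices; the script relies on mitmproxy's documented response hook and on starting the proxy with a script, as in ./mitmweb -s log_json.py.

```python
"""log_json.py: a hypothetical mitmproxy addon, loaded with `./mitmweb -s log_json.py`."""


def is_json(content_type):
    """Return True if a Content-Type header value looks like JSON."""
    return "json" in content_type.lower()


class JsonLogger:
    def __init__(self):
        # URLs of every JSON response seen so far.
        self.seen = []

    def response(self, flow):
        """Called by mitmproxy once a full HTTP response has been received."""
        content_type = flow.response.headers.get("content-type", "")
        if is_json(content_type):
            url = flow.request.pretty_url
            self.seen.append(url)
            print(f"JSON response: {url}")


# mitmproxy picks up addons from this module-level list.
addons = [JsonLogger()]
```

The URLs it prints are the candidates for fetching directly later, without the app in the loop.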

Now we know what’s going back and forth from the app.
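To follow up on the hunch in step 6 that the same file could be fetched with Python, here is a minimal standard-library sketch. It assumes the URL you copied from mitmweb is publicly readable and needs no special headers or tokens, which won't be true for every app. The URL in the usage note is a placeholder, not the one from my capture.

```python
import json
import urllib.request


def fetch_json(url, timeout=10.0):
    """Download a URL and parse the response body as JSON."""
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

You would call it as fetch_json("https://example-bucket.s3.amazonaws.com/data.json"), swapping in the real URL from your own capture. If the request fails outside the app, compare the headers mitmweb shows for the original request with what this sketch sends.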

OK. That’s it. Bye.

Hopefully, this helps your projects along and helps you avoid some of the more intrusive apps. Feel free to send me some feedback via the comments, but be sure to send some love over to the makers of these awesome tools.
