PySyft for Android

Extending OpenMined to bring privacy to mobile devices

Jose Corbacho
Towards Data Science

What is PySyft

“PySyft is a Python library for secure, private Deep Learning. PySyft decouples private data from model training, using Multi-Party Computation (MPC) within PyTorch”

PySyft is the core library of the OpenMined family.

What is PySyft for Android

PySyft for Android is the first step towards having different platforms working together in a PySyft setup

PySyft relies on workers to carry out operations. Currently, the workers are implemented in Python and hooked into the PySyft wrapper on top of PyTorch

PySyft for Android expands this system by allowing a mobile device on the edge to execute operations using a different framework. In this PoC, the Android app uses Nd4J and DL4J to compute the operations sent by a PySyft (PyTorch) worker

Applications

In a federated learning setup, the Android app sits on the edge, keeping its data on the device and using it for local training

For MPC, the participants can run on different operating systems and platforms while still taking part in the same computation
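
To illustrate why the platform of each participant does not matter, here is a minimal sketch of additive secret sharing, a building block commonly used in MPC protocols. The modulus `Q` and the function names are assumptions made for this example, not PySyft's API. Each party only needs integer arithmetic, which any platform can provide.

```python
import secrets
from typing import List

Q = 2**61 - 1  # illustrative modulus; an assumption, not PySyft's actual field size


def share(secret: int, n_parties: int) -> List[int]:
    """Split `secret` into n additive shares that sum to it modulo Q."""
    shares = [secrets.randbelow(Q) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % Q)
    return shares


def reconstruct(shares: List[int]) -> int:
    """Recombine the shares; only the full set reveals the secret."""
    return sum(shares) % Q


def add_shares(a: List[int], b: List[int]) -> List[int]:
    """Each party adds its own two shares locally, with no communication."""
    return [(x + y) % Q for x, y in zip(a, b)]


# Three participants, for example two Python workers and one Android device.
a_shares = share(25, 3)
b_shares = share(17, 3)
total = reconstruct(add_shares(a_shares, b_shares))
```

Here `total` equals 42, although no single participant ever saw 25 or 17.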

High-level Architecture

There are three main components in this setup:

  • A PySyft worker implementing WebsocketIOClientWorker acting as a facade
  • A WebsocketIOServerWorker that forwards the requests and responses between the two participants
  • An Android application executing the operations sent from the facade

The following diagram shows how a command sent by the driver or actor is handled in this setup

These objects form a component that offers the same API as any other worker in PySyft via the WebsocketIOClientWorker. From the point of view of the actor, this whole component is just another worker
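
The flow between the three components can be sketched in memory, without any real sockets. The class names `Relay`, `Facade` and `Device` are hypothetical stand-ins for the server worker, the client worker and the Android app; this is not how the actual classes are written, only an illustration of who talks to whom.

```python
from typing import Callable, Dict

Handler = Callable[[bytes], None]


class Relay:
    """Stand-in for the forwarding server: it never inspects the payload."""

    def __init__(self) -> None:
        self._participants: Dict[str, Handler] = {}

    def connect(self, name: str, on_message: Handler) -> None:
        self._participants[name] = on_message

    def forward(self, sender: str, payload: bytes) -> None:
        # Deliver to every participant except the sender (two in this setup).
        for name, handler in self._participants.items():
            if name != sender:
                handler(payload)


class Facade:
    """Looks like a local worker to the driver but delegates over the relay."""

    def __init__(self, relay: Relay) -> None:
        self._relay = relay
        self._reply = b""
        relay.connect("facade", self._on_reply)

    def _on_reply(self, payload: bytes) -> None:
        self._reply = payload

    def execute(self, command: bytes) -> bytes:
        self._relay.forward("facade", command)
        return self._reply  # in-memory delivery is immediate in this sketch


class Device:
    """Stand-in for the Android app: runs the command and answers."""

    def __init__(self, relay: Relay) -> None:
        self._relay = relay
        relay.connect("device", self._on_command)

    def _on_command(self, payload: bytes) -> None:
        self._relay.forward("device", b"done:" + payload)


relay = Relay()
worker = Facade(relay)
Device(relay)
result = worker.execute(b"add")
```

The driver only ever calls `worker.execute`, so it cannot tell that the work happened on another device.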

Architecture of the app

The architecture of the app can be seen in the following diagram. It is based on separation of concerns and organized in layers, following clean architecture. This will allow us to reuse much of the work on other Java/Kotlin platforms

Communication Details

The communication is done using SocketIO, but both the worker on the server side and the Android app could use other socket libraries

The main restriction is imposed by the synchronous nature of the PySyft workers

  • _recv_msg is synchronous, but once the operation is sent to Android the reply comes back asynchronously through the server, so the call cannot simply wait on a direct response. The semaphores in the SocketIO classes bridge this gap by blocking the call until the reply arrives
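
A minimal sketch of that idea, assuming a hypothetical `emit` hook that sends bytes over the socket and a callback invoked on another thread when the reply arrives. `BlockingChannel` is an illustrative name, not a class from PySyft.

```python
import threading
from typing import Callable, Optional


class BlockingChannel:
    """Turns an asynchronous request/response pair into one blocking call."""

    def __init__(self, emit: Callable[[bytes], None]) -> None:
        self._emit = emit                     # hypothetical "send over socket" hook
        self._ready = threading.Semaphore(0)  # starts at 0, so acquire() blocks
        self._response: Optional[bytes] = None

    def on_response(self, payload: bytes) -> None:
        """Called from the socket's callback thread when the reply arrives."""
        self._response = payload
        self._ready.release()

    def send_and_wait(self, message: bytes, timeout: float = 5.0) -> bytes:
        """Synchronous entry point, in the spirit of _recv_msg."""
        self._emit(message)
        if not self._ready.acquire(timeout=timeout):
            raise TimeoutError("no reply from the remote device")
        assert self._response is not None
        return self._response


# Simulate the device answering 50 ms later on another thread.
def fake_emit(message: bytes) -> None:
    threading.Timer(0.05, channel.on_response, args=(b"ack:" + message,)).start()


channel = BlockingChannel(fake_emit)
reply = channel.send_and_wait(b"tensor-bytes")
```

The timeout is a choice made for this sketch; without one, a device that never answers would block the worker forever.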

Using a neutral Tensor representation

Given that Android cannot use PyTorch objects, a neutral representation had to be selected to make the data available on both sides. Fortunately, both PyTorch and DL4J can work with the npy format. Although this conversion costs some performance, it ensures that the data transferred between the two sides is not corrupted
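
To show how little a platform needs in order to read npy, here is a hand-rolled sketch of the version 1.0 layout for float32 data, using only the Python standard library. The helper names `to_npy` and `from_npy` are made up for this example; in practice each side would rely on its own library's npy reader and writer rather than code like this.

```python
import ast
import struct
from typing import List, Tuple

MAGIC = b"\x93NUMPY"


def to_npy(values: List[float], shape: Tuple[int, ...]) -> bytes:
    """Encode a flat list of float32 values as npy version 1.0 bytes."""
    header = "{'descr': '<f4', 'fortran_order': False, 'shape': %r, }" % (shape,)
    # Pad with spaces so the data starts on a 64-byte boundary; end with "\n".
    prefix_len = len(MAGIC) + 2 + 2  # magic, version bytes, header length field
    padding = -(prefix_len + len(header) + 1) % 64
    header_bytes = (header + " " * padding + "\n").encode("latin1")
    preamble = MAGIC + bytes([1, 0]) + struct.pack("<H", len(header_bytes))
    return preamble + header_bytes + struct.pack("<%df" % len(values), *values)


def from_npy(blob: bytes) -> Tuple[List[float], Tuple[int, ...]]:
    """Decode npy 1.0 bytes holding C-ordered little-endian float32 data."""
    if blob[:6] != MAGIC:
        raise ValueError("not an npy payload")
    (header_len,) = struct.unpack("<H", blob[8:10])
    meta = ast.literal_eval(blob[10:10 + header_len].decode("latin1"))
    if meta["descr"] != "<f4" or meta["fortran_order"]:
        raise ValueError("this sketch only handles C-ordered float32")
    count = 1
    for dim in meta["shape"]:
        count *= dim
    values = struct.unpack_from("<%df" % count, blob, 10 + header_len)
    return list(values), tuple(meta["shape"])


payload = to_npy([1.0, 2.0, 3.0, 4.0], (2, 2))
values, shape = from_npy(payload)
```

The header carries the dtype and shape as plain text, which is why the same bytes can be interpreted consistently on both the Python and the JVM side.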

Conversations with the TFLite team about npy support were also encouraging, and implementing it should not be too difficult. This work could also be used for other platforms (KMath-based platforms, for instance)

PoC

This PoC implements the following PySyft operations: send, delete, get and add
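
The semantics of these four operations can be sketched with a toy worker that stores tensors by id. `ToyWorker` and its method signatures are hypothetical and only meant to show what each command does on the remote side; flat Python lists stand in for real tensors.

```python
from typing import Dict, List

Tensor = List[float]  # flat stand-in for a real tensor type


class ToyWorker:
    """Hypothetical remote worker that stores tensors by id."""

    def __init__(self) -> None:
        self._objects: Dict[int, Tensor] = {}

    def send(self, obj_id: int, tensor: Tensor) -> None:
        """The driver pushes a tensor to this worker."""
        self._objects[obj_id] = list(tensor)

    def add(self, left_id: int, right_id: int, result_id: int) -> None:
        """Element-wise add of two stored tensors; the result stays remote."""
        left, right = self._objects[left_id], self._objects[right_id]
        if len(left) != len(right):
            raise ValueError("shape mismatch")
        self._objects[result_id] = [a + b for a, b in zip(left, right)]

    def get(self, obj_id: int) -> Tensor:
        """The driver pulls a tensor back; the worker gives up its copy."""
        return self._objects.pop(obj_id)

    def delete(self, obj_id: int) -> None:
        """Drop a tensor the driver no longer references."""
        self._objects.pop(obj_id, None)

    def holds(self, obj_id: int) -> bool:
        return obj_id in self._objects


android = ToyWorker()
android.send(1, [1.0, 2.0])
android.send(2, [10.0, 20.0])
android.add(1, 2, result_id=3)
summed = android.get(3)
android.delete(1)
```

After this sequence `summed` is `[11.0, 22.0]`, and the worker still holds only the tensor with id 2.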

The following libraries are used:

  • DL4J: This is the machine learning framework that will process the operations, although for this PoC only Nd4J is necessary. The system is open to other frameworks
  • msgpack: PySyft uses it to move data and operations from worker to worker. It had to be implemented on Android too, although other formats could be used if necessary
  • SocketIO: There is a strong dependency on this library, since it is also used by the proxy/server

How to run it?

You must have PySyft installed. Follow the instructions from the repository

Get the code for the application from the following repository

Running this setup requires the following steps:

  1. Start the Android application
  2. Start the server (socketio_server_demo.py shows how to do this in a simple manner)
  3. Start the socketio client
  4. The driver can be run in a notebook. “Socket Bob” can be used as an example
  5. Connect the socketio client (Bob)
  6. Connect the Android app
  7. Start using it!

Coming soon

  • Kotlin multi-platform module that would allow the core implemented for the JVM to be used in other environments
  • TensorFlow Lite implementation
  • KMath implementation
  • Non-blocking socket communication
  • Compression
  • A plan with a model
