
With the latest wave of operating system versions (Big Sur, iOS 14, etc.) announced at the recent WWDC, Apple quite silently introduced a new ML framework to accelerate the training of neural networks across the CPU or one or more available GPUs.
ML Compute is not really a brand-new ML framework but a new API that leverages the high-performance BNNS primitives made available by the Accelerate framework for the CPU, and Metal Performance Shaders for the GPU.
After looking at the documentation and starting to use it in an iOS/macOS application, I understood that this is not really a simple, high-level framework but something more likely targeting the acceleration of existing third-party ML libraries, such as the ONNX Runtime or TensorFlow Lite, on Apple platforms.
Even if Apple's documentation is pretty good, I would say these APIs are not really developer-friendly or swifty for doing generic ML on iOS/macOS. The tensor API, for example, is quite rough, requiring you to deal with unmanaged pointers in Swift. Basically, you are responsible for managing the ownership and lifetime of the memory allocated for objects such as tensors, nodes, and graphs that you pass to these APIs.
More generally, ML Compute by design does not provide high-level ML APIs like Keras, PyTorch, or Swift for TensorFlow that simplify building and training ML models, but low-level APIs for building compute graphs and managing low-level training loops.
For general ML coding on iOS/macOS, I would suggest continuing to use Core ML with tools like CoreMLTools to import models from other frameworks (TensorFlow, PyTorch, etc.), or eventually giving a try to the SwiftCoreMLTools library I developed if you want to build and/or train models entirely on device, avoiding any Python code.
Anyway, my personal opinion after playing with it is that ML Compute could potentially become really powerful even for regular Swift ML developers if, for example, a Swift function builder (DSL) high-level API were added on top of it, like the one I developed for SwiftCoreMLTools, together with a high-level Swift tensor API, hopefully integrated with the Swift Numerics multi-dimensional array.
To quickly test the capabilities of these APIs, I decided to develop and illustrate a PoC app that trains and runs inference with ML Compute, on both iOS and macOS, using a simple shallow model on the MNIST dataset.

Demo ML Compute Playground
Before getting into the details of this MNIST ML Compute app, let me share a quick-and-dirty Swift Playground I initially used to get familiar with the ML Compute tensor and graph APIs.
As you can see, I'm not building a real ML model here yet; I'm simply using an ML Compute graph to run some basic arithmetic operations on tensors. I thought it would be helpful to get familiar with this first.
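The playground code itself is not shown in this excerpt, so here is a minimal sketch of what such a graph-based arithmetic test might look like, assuming the MLCompute API names available on macOS 11 / iOS 14 (MLCGraph, MLCArithmeticLayer, MLCInferenceGraph); the input names "a" and "b" are illustrative:

```swift
import MLCompute

// Sketch: build a graph that computes a + b on two small Float tensors.
// Assumes macOS 11+ / iOS 14+ and the MLCompute framework API.
let device = MLCDevice(type: .cpu)!

let descriptor = MLCTensorDescriptor(shape: [1, 3], dataType: .float32)!
let tensorA = MLCTensor(descriptor: descriptor)
let tensorB = MLCTensor(descriptor: descriptor)

let graph = MLCGraph()
_ = graph.node(with: MLCArithmeticLayer(operation: .add),
               sources: [tensorA, tensorB])

// Even for simple arithmetic, execution goes through an inference graph.
let inferenceGraph = MLCInferenceGraph(graphObjects: [graph])
inferenceGraph.addInputs(["a": tensorA, "b": tensorB])
inferenceGraph.compile(options: [], device: device)

let aData: [Float] = [1, 2, 3]
let bData: [Float] = [10, 20, 30]

// Note: we own the memory behind MLCTensorData — hence the unsafe pointers.
aData.withUnsafeBufferPointer { aPtr in
    bData.withUnsafeBufferPointer { bPtr in
        inferenceGraph.execute(
            inputsData: [
                "a": MLCTensorData(immutableBytesNoCopy: aPtr.baseAddress!,
                                   length: aPtr.count * MemoryLayout<Float>.size),
                "b": MLCTensorData(immutableBytesNoCopy: bPtr.baseAddress!,
                                   length: bPtr.count * MemoryLayout<Float>.size)
            ],
            batchSize: 1,
            options: [.synchronous]) { result, error, executionTime in
                // Copy the result tensor back into a Swift array.
                var sum = [Float](repeating: 0, count: 3)
                result?.copyDataFromDeviceMemory(
                    toBytes: &sum,
                    length: sum.count * MemoryLayout<Float>.size,
                    synchronizeWithDevice: false)
                print(sum)
            }
    }
}
```

Note how much ceremony (descriptors, compile, unsafe buffers) is needed even for a single add, which is exactly the low-level feel described above.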
Prepare the MNIST dataset tensors
Now that we are familiar with the basics of the ML Compute API for building tensors and graphs, let's start to see how to build a first sample shallow model to train on the famous MNIST dataset.
We will first efficiently import the CSV files containing the MNIST training and test datasets, and later see how to transform them into ML Compute tensors.
As you can see, the code below reads the training and test CSV files in parallel, applies some normalization, and converts the images and labels into Swift Float arrays.
A quick note here about concurrency and the Grand Central Dispatch (GCD) API: in order to process batches of data quickly and concurrently, I used the convenient DispatchQueue.concurrentPerform API, which makes it easy to run a parallel for-loop in Swift.
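The parallel pattern can be sketched as follows; the normalization (dividing pixel values by 255) matches the preprocessing described above, while the function name and row layout are illustrative assumptions:

```swift
import Foundation

// Sketch: normalize rows of pixel data (0...255 -> 0...1) in parallel.
// DispatchQueue.concurrentPerform runs the closure concurrently for each
// index and returns only when every iteration has finished.
func normalize(rows: [[Float]]) -> [[Float]] {
    var result = [[Float]](repeating: [], count: rows.count)
    let lock = NSLock()  // serialize writes to the shared result array

    DispatchQueue.concurrentPerform(iterations: rows.count) { index in
        let normalized = rows[index].map { $0 / 255.0 }
        lock.lock()
        result[index] = normalized
        lock.unlock()
    }
    return result
}

let batches: [[Float]] = [[0, 255], [51, 102]]
let normalized = normalize(rows: batches)
print(normalized[0])  // [0.0, 1.0]
```

Unlike a plain `for` loop, `concurrentPerform` gives you a blocking parallel loop without having to manage dispatch groups or semaphores yourself.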
Utility functions
Unfortunately, reading the CSV file buffer into a String and using the split() method to process the rows is not very performant, so I had to develop an alternative method to scan the CSV file line by line using the more efficient C runtime getline() function.
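The getline()-based scanning can be sketched like this; the function name, the temporary file path, and the Float parsing are illustrative assumptions, while `getline` itself is the standard C runtime function:

```swift
import Foundation
#if canImport(Glibc)
import Glibc
#endif

// Sketch: scan a CSV file line by line with the C getline() function,
// avoiding loading the whole file into a single Swift String.
func readLines(path: String, process: (String) -> Void) {
    guard let file = fopen(path, "r") else { return }
    defer { fclose(file) }

    var line: UnsafeMutablePointer<CChar>? = nil
    var capacity = 0
    defer { free(line) }  // getline allocates/grows this buffer for us

    while getline(&line, &capacity, file) > 0 {
        if let line = line {
            process(String(cString: line).trimmingCharacters(in: .newlines))
        }
    }
}

// Usage: parse each row into an array of Floats.
var rows: [[Float]] = []
let path = NSTemporaryDirectory() + "sample.csv"
try? "1,2,3\n4,5,6\n".write(toFile: path, atomically: true, encoding: .utf8)
readLines(path: path) { line in
    rows.append(line.split(separator: ",").compactMap { Float($0) })
}
print(rows)  // [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
```

Because getline() reuses one growing buffer, this avoids the per-row String allocations that make the naive split() approach slow on a file the size of MNIST.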
Other functions below will become useful later when dealing with the labels and the loss function, in order to encode and decode the labels with the one-hot encoding technique required by the loss function (ML Compute does not provide a sparse categorical cross entropy).
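The encode/decode helpers can be sketched as follows, assuming ten MNIST classes; the function names are illustrative. Decoding is just an argmax, so the same helper also works for reading a softmax prediction:

```swift
// Sketch: one-hot encoding/decoding for the 10 MNIST digit classes.
// oneHotEncode turns label 3 into [0,0,0,1,0,0,0,0,0,0];
// oneHotDecode returns the index of the largest value (argmax).
func oneHotEncode(_ label: Int, classCount: Int = 10) -> [Float] {
    var encoded = [Float](repeating: 0, count: classCount)
    encoded[label] = 1
    return encoded
}

func oneHotDecode(_ values: [Float]) -> Int {
    values.indices.max(by: { values[$0] < values[$1] }) ?? 0
}

print(oneHotEncode(3))                // [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
print(oneHotDecode([0.1, 0.7, 0.2]))  // 1
```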
Build computational graph
Now that we have the training and test images and labels loaded into Swift Float arrays, we can finally start building the ML Compute graph that we will use to train and test the model.
The ML model used in this sample application is a very minimal shallow neural network: a dense layer with ReLU activation followed by a final dense layer with softmax activation for the ten digit classes of the MNIST dataset.
As you can see, an ML Compute graph is built by simply adding nodes for layers and activation functions, passing each of them the classic parameters such as the shapes of the input/output tensors, the initial weight/bias matrices, etc.
It is important to pay attention here to the shapes of all the tensors used for the weights and biases, as well as for the input and output of the model. In particular, notice that when building the graph we have to choose the batch size we are going to use later for training and inference, creating all these tensor shapes according to that batch size.
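A sketch of such a two-layer graph follows, assuming 28x28 = 784 input features, a hypothetical hidden size of 128, a batch size of 128, and the MLCompute API names on macOS 11 / iOS 14 (MLCFullyConnectedLayer takes an MLCConvolutionDescriptor even for dense layers):

```swift
import MLCompute

// Assumed model/shape constants (illustrative).
let imageSize = 28 * 28
let hiddenSize = 128
let classCount = 10
let batchSize = 128

let graph = MLCGraph()

// Input shape is tied to the batch size chosen up front.
let inputTensor = MLCTensor(descriptor: MLCTensorDescriptor(
    shape: [batchSize, imageSize], dataType: .float32)!)

// Dense layer 1 + ReLU. Weights/biases get Glorot-uniform initial values.
let dense1 = graph.node(
    with: MLCFullyConnectedLayer(
        weights: MLCTensor(descriptor: MLCTensorDescriptor(
            shape: [1, imageSize * hiddenSize], dataType: .float32)!,
            randomInitializerType: .glorotUniform),
        biases: MLCTensor(descriptor: MLCTensorDescriptor(
            shape: [1, hiddenSize], dataType: .float32)!,
            randomInitializerType: .glorotUniform),
        descriptor: MLCConvolutionDescriptor(
            kernelSizes: (height: imageSize, width: hiddenSize),
            inputFeatureChannelCount: imageSize,
            outputFeatureChannelCount: hiddenSize))!,
    sources: [inputTensor])!

let relu = graph.node(
    with: MLCActivationLayer(descriptor: MLCActivationDescriptor(type: .relu)!),
    source: dense1)!

// Dense layer 2 + softmax over the ten digit classes.
let dense2 = graph.node(
    with: MLCFullyConnectedLayer(
        weights: MLCTensor(descriptor: MLCTensorDescriptor(
            shape: [1, hiddenSize * classCount], dataType: .float32)!,
            randomInitializerType: .glorotUniform),
        biases: MLCTensor(descriptor: MLCTensorDescriptor(
            shape: [1, classCount], dataType: .float32)!,
            randomInitializerType: .glorotUniform),
        descriptor: MLCConvolutionDescriptor(
            kernelSizes: (height: hiddenSize, width: classCount),
            inputFeatureChannelCount: hiddenSize,
            outputFeatureChannelCount: classCount))!,
    sources: [relu])!

_ = graph.node(with: MLCSoftmaxLayer(operation: .softmax), source: dense2)
```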
Build training graph
Now that we have an ML Compute graph, we can build an MLCTrainingGraph, passing it the compute graph and the training parameters.
In this case we will specify softmax cross entropy as the loss function (ML Compute does not provide a sparse categorical cross entropy) and Adam as the optimizer, with standard default parameters.
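A sketch of the training graph setup, assuming the `graph`, `inputTensor`, and shape constants from the graph-building step, the MLCompute API names on macOS 11 / iOS 14, and illustrative input names "image" and "label":

```swift
import MLCompute

// One-hot labels, so the loss label tensor is [batchSize, classCount].
let lossLabelTensor = MLCTensor(descriptor: MLCTensorDescriptor(
    shape: [batchSize, classCount], dataType: .float32)!)

// Softmax cross entropy loss + Adam with standard default parameters.
let trainingGraph = MLCTrainingGraph(
    graphObjects: [graph],
    lossLayer: MLCLossLayer(descriptor: MLCLossDescriptor(
        type: .softmaxCrossEntropy, reductionType: .mean)),
    optimizer: MLCAdamOptimizer(
        descriptor: MLCOptimizerDescriptor(
            learningRate: 0.001, gradientRescale: 1.0,
            regularizationType: .none, regularizationScale: 0.0),
        beta1: 0.9, beta2: 0.999, epsilon: 1e-7, timeStep: 1))

trainingGraph.addInputs(["image": inputTensor],
                        lossLabels: ["label": lossLabelTensor])
trainingGraph.compile(options: [], device: MLCDevice(type: .any)!)
```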
Training the graph
In order to train the MLCTrainingGraph, we need a full training loop iterating, for each epoch, over all the batches, obtained by simply dividing the total number of samples in the training dataset by the batch size we used above when creating the tensors and the graph.
In particular, in the batch loop we will obtain a Swift unsafe buffer over a slice of the image and label data from the Swift Float arrays, construct an MLCTensorData from the unsafe pointer to this slice, and finally pass this tensor data to the training graph's execute method to train the model on the batch data.
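The loop can be sketched as follows, assuming flat `trainingImages` and one-hot `trainingLabels` Float arrays, the `trainingGraph` and shape constants from the previous steps, and an illustrative epoch count:

```swift
import MLCompute

let epochs = 5  // illustrative
let batchCount = trainingImages.count / (imageSize * batchSize)

for epoch in 0..<epochs {
    for batch in 0..<batchCount {
        // Slice out one batch of images and one-hot labels.
        let imageSlice = Array(trainingImages[
            (batch * batchSize * imageSize)..<((batch + 1) * batchSize * imageSize)])
        let labelSlice = Array(trainingLabels[
            (batch * batchSize * classCount)..<((batch + 1) * batchSize * classCount)])

        // Wrap the slices in MLCTensorData via unsafe buffer pointers —
        // we own this memory for the duration of the call.
        imageSlice.withUnsafeBufferPointer { imagePtr in
            labelSlice.withUnsafeBufferPointer { labelPtr in
                trainingGraph.execute(
                    inputsData: ["image": MLCTensorData(
                        immutableBytesNoCopy: imagePtr.baseAddress!,
                        length: imagePtr.count * MemoryLayout<Float>.size)],
                    lossLabelsData: ["label": MLCTensorData(
                        immutableBytesNoCopy: labelPtr.baseAddress!,
                        length: labelPtr.count * MemoryLayout<Float>.size)],
                    lossLabelWeightsData: nil,
                    batchSize: batchSize,
                    options: [.synchronous]) { _, _, _ in
                        // Loss/metrics could be inspected here per batch.
                    }
            }
        }
    }
    print("epoch \(epoch) done")
}
```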
Testing the graph with validation data
Once the model has been trained for a sufficient number of epochs, we will validate it by building a similar MLCInferenceGraph and feeding it all the test data, one batch at a time.
In the inference graph's execute closure callback, we will finally calculate the accuracy of the model by simply counting how many times the prediction matches the test label.
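A sketch of the validation pass, assuming the compute `graph`, `inputTensor`, and shape constants from above, a flat `testImages` Float array, and `testLabels` as plain Int digits; the argmax comparison is the accuracy computation described above:

```swift
import MLCompute

let inferenceGraph = MLCInferenceGraph(graphObjects: [graph])
inferenceGraph.addInputs(["image": inputTensor])
inferenceGraph.compile(options: [], device: MLCDevice(type: .any)!)

var correct = 0
let testBatchCount = testImages.count / (imageSize * batchSize)

for batch in 0..<testBatchCount {
    let imageSlice = Array(testImages[
        (batch * batchSize * imageSize)..<((batch + 1) * batchSize * imageSize)])
    imageSlice.withUnsafeBufferPointer { imagePtr in
        inferenceGraph.execute(
            inputsData: ["image": MLCTensorData(
                immutableBytesNoCopy: imagePtr.baseAddress!,
                length: imagePtr.count * MemoryLayout<Float>.size)],
            batchSize: batchSize,
            options: [.synchronous]) { result, _, _ in
                // Copy the softmax output back, then compare each row's
                // argmax against the expected digit.
                var predictions = [Float](repeating: 0, count: batchSize * classCount)
                result?.copyDataFromDeviceMemory(
                    toBytes: &predictions,
                    length: predictions.count * MemoryLayout<Float>.size,
                    synchronizeWithDevice: false)
                for i in 0..<batchSize {
                    let base = i * classCount
                    let predicted = (0..<classCount).max(by: {
                        predictions[base + $0] < predictions[base + $1] })!
                    if predicted == testLabels[batch * batchSize + i] {
                        correct += 1
                    }
                }
            }
    }
}
print("accuracy: \(Float(correct) / Float(testBatchCount * batchSize))")
```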
The full code
As always, the code for this story is completely open source and available on my personal GitHub account:
Special thanks
Finally, I want to thank the Apple ML Compute team here for their very quick and fully detailed help, providing me with suggestions and insights on the new ML Compute framework.