Background
This year’s August was dedicated to IBM’s Global Quantum Summer School, where I not only learned the basics on a compressed timeline and a tight schedule but also picked up a few applications of quantum computing. The badge one gets after four gruelling weeks is a "quantum experience" in itself: you think you understand what you are doing, yet at the same time you have no idea what is going on. The month moved from quantum circuit basics to variational algorithms at a fast pace, which left only limited time to ‘do your own research’ and get your hands dirty with the application part.
As far as applications are concerned, quantum chemistry, quantum simulations, and a few really complicated modelling tasks fit the bill of problems that can be solved with quantum computers. That said, there is another branch that is burgeoning and drawing a lot of interest from users and researchers: Quantum Machine Learning, or QML for short.
I thought QML should be a logical successor to conventional ML, and I set out to treat it as one. I wanted a problem that wouldn’t be straightforward for ML algorithms to solve, because of the sheer size of the data and its hard-to-identify complex patterns, but something I could still code from the comfort of my humble machine. I looked no further than our old friend, physics, which hides a gamut of complex but interesting problems in its lap; besides, it sounds intellectually cool to work on such problems.
So it goes.
Problem Statement
I decided to deal with the dark matter classification problem examined with data from the OPERA experiment (Oscillation Project with Emulsion-tRacking Apparatus), which received its neutrino beam from CERN.
In short, we will train a classifier to differentiate between the signal and the noise. The signal indicates the presence of dark matter; the noise means an absence of it, or something else altogether, but not the signal.
Quite simple!
Intuition
Let’s elaborate a little on the background of the experiment to develop a little intuition.
So, dark matter is a mysterious and as-yet-undetected form of matter that does not interact with electromagnetic radiation such as light. It is thought to make up roughly 85% of the matter in the universe. It is called "dark" because it cannot be observed directly with telescopes or other instruments that detect electromagnetic radiation.
Why is it challenging to find dark matter?
It’s challenging because we don’t know what we are looking for.
- Invisibility: Dark matter doesn’t interact with light, so we don’t really know what we are looking for; there are many theories around it, but there isn’t any consensus.
- Background noise: Dark matter’s interactions with regular matter are very weak, so experiments designed to detect it must contend with various sources of background noise that can mimic the expected signals. Distinguishing actual dark matter interactions from these background signals is a significant challenge.
- Multiple Possibilities: There are various theoretical candidates for dark matter, which require different detection methods. Scientists are exploring these possibilities, increasing the complexity of the search.
What really happens in the experiment at OPERA?
OPERA is located at the Gran Sasso National Laboratory in Italy. It is a neutrino Physics experiment that primarily focuses on the study of neutrino oscillations. It is not specifically designed for finding dark matter.
When a supposed dark matter particle (which we are trying to find) collides with a lead nucleus, the nucleus emits electrons in the form of a shower that is detected on a screen. This is what we are trying to find. There is a problem though: when a neutrino collides with a lead nucleus, it also produces an electron, and the same kind of electromagnetic shower results, which muddles the signal with the noise. We are trying to differentiate between this signal and the noise.
What we’ll do
Essentially, we have to sift through the data and distinguish the signal from the noise. This can be accomplished with conventional machine learning, but it is still a herculean task: in a dataset of 10 million collisions, hardly 10 thousand would correspond to a signal. Such imbalance and sparsity make it a skewed and difficult problem to solve. And because we like challenges, we will add a cherry on top and use quantum machine learning algorithms instead of conventional ones (apologies for starting a sentence with a conjunction).
Data
There are loads and loads of datasets at your disposal on the CERN Open Data website here; the one used in this experiment can be found here.
License: The dataset is released under CC0 (CC Zero), the Creative Commons Public Domain Dedication (https://opendata.cern.ch/record/16541).
The code is present on my GitHub repo.
The data consists of two h5 files, open30.h5 and test.h5; h5 is HDF, i.e. Hierarchical Data Format, which is used to store and organise large amounts of data in a compressed manner.
The data contains the following variables:
- X – X coordinate of the base track
- Y – Y coordinate of the base track
- Z – Z coordinate of the base track
- TX – Angle from the origin projected on the X-axis
- TY – Angle from the origin projected on the Y-axis
- Signal – A binary variable, 1 indicating signal and 0 indicating noise
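To make the schema concrete, here is a small sketch that builds a stand-in frame with the same columns (synthetic values, not the real CERN data; with the actual files this would just be `pd.read_hdf("open30.h5")`, possibly with a `key` argument, and the column-name casing may differ):

```python
import numpy as np
import pandas as pd

# Tiny stand-in for open30.h5 with the columns listed above.
# The real files live on CERN Open Data; this is purely illustrative.
rng = np.random.default_rng(0)
n = 1000
train = pd.DataFrame({
    "X": rng.normal(size=n), "Y": rng.normal(size=n), "Z": rng.normal(size=n),
    "TX": rng.normal(size=n), "TY": rng.normal(size=n),
    "signal": (rng.random(n) < 0.01).astype(int),  # ~1% signal: heavy imbalance
})

# With the real data: train = pd.read_hdf("open30.h5")
print(train.shape)
print(train["signal"].value_counts())  # shows how rare the signal class is
```

Inspecting `value_counts()` first is worthwhile here precisely because of the imbalance mentioned earlier.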
Library
IBM’s quantum library – qiskit 0.44.3
import qiskit.tools.jupyter
%qiskit_version_table
%qiskit_copyright
A note about the Variational Quantum Algorithms
Quantum algorithms are designed to run on quantum computers, but at the moment we are in the NISQ (Noisy Intermediate-Scale Quantum) era, which makes reproducing results difficult. Current quantum computers are very noise-prone; even a trivial change in thermodynamic conditions or the surrounding circuitry can impact the results considerably. The logic gate we want to apply can effectively turn into something else because of the noise, which is undesirable.
What smart folks working on advanced methods have devised is something called variational algorithms; they use both classical and quantum computers to gain speed and accuracy.
Essentially, all such algorithms use some form of optimisation and parameter adjustment. A variational algorithm uses the quantum computer to evaluate the cost function, calculates new values of the parameters on a classical computer, and then runs on the quantum computer again with the updated values. The calculations are thus split between classical and quantum computers, speeding up the process.
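That loop is easy to sketch classically. In the toy below, an ordinary Python function stands in for the quantum cost evaluation (a real VQC would run a parameterised circuit and measure it), and SciPy’s COBYLA plays the classical optimiser; this is an illustration of the idea, not Qiskit’s actual internals:

```python
import numpy as np
from scipy.optimize import minimize

# Stand-in for the "quantum" step: in a real variational algorithm this
# would execute the parameterised circuit and return the measured cost.
# Here it is just a toy quadratic with its minimum at (1, -2).
def quantum_cost(params):
    return (params[0] - 1.0) ** 2 + (params[1] + 2.0) ** 2

# Classical step: COBYLA proposes new parameters from each measured cost,
# and the quantum/classical loop repeats until convergence.
result = minimize(quantum_cost, x0=np.zeros(2), method="COBYLA",
                  options={"maxiter": 200})
print(result.x)  # approaches (1, -2)
```

Swapping the toy function for a circuit execution is, conceptually, all a variational quantum algorithm does.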
We will use a variational quantum classifier here as the task is classifying signal and noise. For more info on IBM’s VQC: https://learn.qiskit.org/course/machine-learning/variational-classification
Modelling
Let’s just take a look at the data and the variables
Let’s look at the pair plot to see if there is some correlation between the variables.
Okay, there is a pattern but quite a complicated one!
After the usual boilerplate material, including sampling, scaling, and the train-test split, we are ready for the quantum model.
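For reference, that boilerplate looks roughly like this (synthetic data stands in for the sampled OPERA columns; the variable names `train_features`, `test_features`, etc. are the ones the later snippets use):

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the sampled columns (X, Y, Z, TX, TY)
rng = np.random.default_rng(42)
features = rng.normal(size=(500, 5))
labels = rng.integers(0, 2, size=500)

# Scale each feature to [0, 1]; angle-based encodings in the feature map
# behave better with bounded inputs
features = MinMaxScaler().fit_transform(features)

# Stratify so the (imbalanced) class ratio is preserved in both splits
train_features, test_features, train_labels, test_labels = train_test_split(
    features, labels, train_size=0.8, stratify=labels, random_state=42
)
print(train_features.shape, test_features.shape)  # (400, 5) (100, 5)
```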
Before we proceed, let’s run the Support Vector classification algorithm, so we have a baseline of the conventional ML.
from sklearn.svm import SVC

svc = SVC()
model_classical = svc.fit(train_features, train_labels)
print(model_classical.score(test_features, test_labels))  # baseline accuracy
70% accuracy isn’t that great on the test data, but I haven’t done much feature engineering here. It will improve once I do so.
Now, quantum computer’s turn.
The problems are formulated in the form of gates and circuits that are fed qubits (quantum bits) in a quantum computer.
We haven’t dropped any features (TX, TY, X, Y, Z); using all of them means the number of qubits in our circuit will be 5.
from qiskit.circuit.library import ZZFeatureMap

num_features = features.shape[1]  # 5
feature_map = ZZFeatureMap(feature_dimension=num_features, reps=1)
feature_map.decompose().draw(output="mpl", fold=20)
That’s what the circuit looks like. It takes 5 qubits, to which Hadamard and P gates are applied. The Hadamard gate changes the basis of a qubit from |0> to |+> and from |1> to |->, while the P gate performs a single-qubit rotation about the Z axis.
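Both gates are easy to check by hand with their matrices; a small NumPy sketch (not part of the original notebook):

```python
import numpy as np

# Hadamard: maps |0> -> |+> and |1> -> |->
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)

# Phase gate P(phi): a rotation about the Z axis; it leaves |0> alone
# and multiplies the |1> amplitude by exp(i*phi)
def P(phi):
    return np.array([[1, 0], [0, np.exp(1j * phi)]])

ket0 = np.array([1, 0])
ket1 = np.array([0, 1])

print(H @ ket0)              # |+> = (|0> + |1>)/sqrt(2)
print(H @ ket1)              # |-> = (|0> - |1>)/sqrt(2)
print(P(np.pi / 2) @ (H @ ket0))  # phase i applied only to the |1> amplitude
```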
The next step after forming the circuit is the ansatz, quite common vernacular in the quantum world; it’s German for "approach", but in physics and mathematics it means an educated guess.
So that’s what we have to do: make educated guesses for the parameters. This creates a quantum state that is executed on a quantum computer. The result is compared with the desired value and, depending on how far off we are, the optimiser tunes the parameters of the ansatz until we reach a good enough, satisfactory value.
from qiskit.circuit.library import RealAmplitudes
ansatz = RealAmplitudes(num_qubits=num_features, reps=3)
ansatz.decompose().draw(output="mpl", fold=20)
You see there are a lot of R gates applied. Essentially, all the qubits are just rotated around their axis to get some arbitrary starting value and that’s the ansatz.
Now, let’s fit the VQC on the training data.
from qiskit.primitives import Sampler
from qiskit.algorithms.optimizers import COBYLA
from qiskit_machine_learning.algorithms.classifiers import VQC

sampler = Sampler()
optimizer = COBYLA(maxiter=100)

# callback_graph is a small plotting helper (defined in the notebook)
# that records the objective value after each iteration
vqc = VQC(
    sampler=sampler,
    feature_map=feature_map,
    ansatz=ansatz,
    optimizer=optimizer,
    callback=callback_graph,
)
vqc.fit(train_features, train_labels)
Callback graphs are really cool. It’s like getting the results of your deeds in real-time.
Okay, a downward trend is promising, which means it is learning, but only scoring on the test data will tell whether we are overfitting.
train_score_q4 = vqc.score(train_features, train_labels)
test_score_q4 = vqc.score(test_features, test_labels)
print("Quantum VQC on the training dataset:", train_score_q4)
print("Quantum VQC on the test dataset:", test_score_q4)
It is marginally better than the conventional ML. I ran it using a simulator on my local machine. Maybe it would perform better on a real quantum computer (nothing stops us, except that one has to queue up; with the server load these days, even a simple task such as adding two numbers takes a few hours, not because the machine is slow but because more and more people line up for computation time), or with more time spent engineering the features.
Closing Notes
That said, I am quite sure that conventional ML algorithms will outperform quantum algorithms at the moment, especially for classification tasks, since much research and many resources have gone into making them robust and sophisticated. Once QML goes through such an overhaul, it will be a fair competition.
It doesn’t mean that QML will replace ML; that is far from the truth. Instead, quantum computing is for those problems that classical computers can’t really solve in polynomial time or can’t even approximate. Machine learning will keep its place, and QML will make its own place in the larger scheme of things.
This post isn’t meant to exemplify the power of either ML or QML but only to show similarities between the two – how both of them differ in execution but are similar in flavour, how both are dependent on feature engineering and choice of hyperparameters.
Before we close, it would be interesting to see whether clusters exist within the signals. We have separated the signal from the noise; can we cluster the signal points to see whether they form an electromagnetic shower produced by a dark matter particle?
from sklearn.cluster import KMeans
import matplotlib.pyplot as plt

# Cluster the tracks and attach the labels to the frame
kmeans = KMeans(n_clusters=5).fit(train)
train['cluster'] = kmeans.labels_

# Plot a 5% sample of the clustered tracks in 3D
X_train = train.sample(frac=0.05)
fig = plt.figure(figsize=(20, 20))
ax = fig.add_subplot(projection='3d')
ax.scatter(X_train.X, X_train.Y, X_train.Z, c=X_train.cluster)
plt.show()
Not bad! The clusters aren’t distinct, but they aren’t awful either; one can see some patterns. The vertical nature of the clusters is because of the mass and the angle of the trajectory after the collision.
Interesting!
From the noisy data, we have extracted the best possible tracks which could be candidates for dark matter particle interactions.
I hope this small post might encourage you to take your quantum leap. Feel free to contact me on Twitter or by mail; as usual, I am open to criticism and comments that help me grow and learn.
_PS: Once again, the code is present on my GitHub repo here._