(Photo by Hal Gatewood on Unsplash)

“Can I Train my Model on Your Computer?”

The Need for Massive Computational Power

Kemal Tugrul
Towards Data Science
4 min read · Aug 6, 2018

--

Computers are good at one thing: they are fast. They are fast enough to do calculations at lightning speed and fast enough to send information across the globe in a split second. However, they are still not fast enough. The race between software and hardware continues: as hardware gets faster and faster, we keep demanding more and more. If a single machine is not fast enough, we demand more of them to compute in parallel. Gaining access to faster processors has allowed us to push the frontiers of data science. Thanks to the advancement and availability of hardware, AI can beat humans in complex video games (here), synthesize music, and, let us not forget, detect cats in YouTube videos!

(Hardware used in OpenAI Five)

Computational power is correlated with success in AI. In the past, research in machine learning was limited by the available computational power. Even though researchers had a good amount of data and access to the methods we use today, they were not able to exploit their full potential. As a result, the applications and impact of machine learning and data science were limited. As researchers and practitioners gained access to more computational power, the field bloomed with advanced methods and interesting applications.

(Hardware used in a study in 2006)

Frustration over Limits

If you are a practitioner of machine learning or data science, there is a good chance that you have hit the annoying wall of insufficient computational resources. How many times have you waited until midnight to run your experiments? How many times have you reduced the hyper-parameter search space so that the search would finish sooner? This kind of limitation is a great source of frustration for me. Just this month I had to quit a personal project because it would have required a lot of resources. I either had to wait weeks for a single experiment or spend a good amount of money on cloud services. I also remember the facial expression of my manager when I said that each single run of an experiment for the project would cost $200. More than once I have seen people, or been one of them, asking friends or colleagues for the favor of running some experiments on their machines when they were not using them.

The Sleeping Devices

I was spoiled by the high-performance cluster that I had access to when I was in academia, where I could run thousands of jobs in parallel. But when I left, the resources I had access to became scarce. While searching for more computational power, I realized that there are computers lying around doing nothing for a good amount of time! My co-workers shut down their computers every day after work. All the devices that I own just sleep through the night. The devices we have just “sleep”, doing nothing, for a good part of every day.

We constantly waste the resources we carry in our pockets and bags. Your phones, computers, smart TVs, and basically anything else that has a processor are computational devices. They can compute, and that is what we are in need of. Ask yourself: for how much of the day do you actively use your devices? The most basic example is night time. Every night I plug in my phone to charge and lay it on my nightstand for long hours, where it sleeps while I do too.

The idea of using “sleeping devices” is not new. CERN has a network through which people can donate their computers' time for scientists to use (here). Cryptocurrencies depend on people attaching their computers to the network to stay alive. Even though the concept is not new and there are working examples, the problem is not solved yet. It will be solved when attaching all of one's devices to a computing network is the obvious thing to do for everybody who has access to a computational device, electricity, and the internet.

We Need the Resources Now

The resources that we need as data scientists are available now. They are in people’s pockets, on their nightstands, and in locked cabinets in offices. However, we do not have access to them. We are willing to pay for the resources, but they are not accessible. A large portion of the individuals who hold these computational devices are not aware of their value. They could convert those resources into money; however, there is a missing link.

The missing link is between the suppliers and the consumers. Data scientists wish to consume the resource in exchange for money, and I believe suppliers would want to earn credit by lending their devices when they are not using them. I assume the link between the two parties will be established when the imbalance between the suppliers and the consumers becomes too large to be neglected. Until then, I will keep paying for cloud services and asking for favors to use the computers that sleep after working hours.

--

Machine Learning Software Engineer at thebeat.co, Founder at elify.io. A good data scientist, mediocre computer scientist, terrible running partner.