Open source: The magic power of AI research.

Opensource is the key to advancing AI and has been the driver of the majority of innovation in the field. This is the story from an insider’s perspective.

William Falcon
Towards Data Science


Credit: LoveTheWind (with permission)

As an open-source developer, the question I hear the most is “why would you want to give that away for free?”

In the field of AI, there are many reasons why opensource is key. First, the code for building models does not give away any competitive advantage, because the value comes from combining models with your own data. Second, it lets the whole world help you find and correct mistakes. Imagine building a house where every architect in the world can contribute one tiny idea. But more importantly, AI is a really hard problem to solve.

The problems in the field cannot be solved by any one individual or group.

In AI we opensource everything from datasets to models to frameworks that give users incredible sophistication to build machine learning models. As the original creator of one of these frameworks (PyTorch Lightning), I’ve had the fortune to see the beauty of opensource from the inside.

It’s all about the community.

PyTorch Lightning has its humble beginnings as a project that I developed during the first few years of my Ph.D. at NYU CILVR and later at Facebook AI Research. At NYU it gained the powers of rapid iteration and standardization that make Lightning a pleasure to work with today: it standardizes AI research code so everyone’s code is structured the same way, which makes it more readable and reproducible. At FAIR it learned how to train massive neural networks across hundreds of GPUs.

But had I remained the only developer of the project, it would be nowhere near where it is today: a quickly rising favorite for deep learning research. Our first non-Facebook contributor, Jirka, brought much-needed formatting and structure to the internals.

Then something magical happened. Tens of thousands of people adopted Lightning to build super-advanced AI models. The magic, though, is not the adoption itself but how the community worked with Lightning. Lightning became a living, breathing organism where many of the world’s top AI researchers and PhD students started contributing their own features back.

Lightning became the AI research community’s framework.

What exists today is a highly tuned, rigorously tested, world-class framework for building AI models. It helps Ph.D. students create new papers, research scientists try insanely creative ideas, and data scientists build complex, scalable AI production systems.

Lightning is more than a framework now. It’s a community.

The opensource core values.

Building on top of previous work is how we advance. (credit: pxhere with permission).

Opensource is centered around honesty, transparency, and building on top of each other’s work.

I’ve always considered myself to be an honest person with a strong internal moral compass. But as I learned as a young trainee undergoing US Navy SEAL training in 2007, what you believe about your internal compass can differ from what you actually do in situations with real consequences. SEAL training taught me a few hard lessons along the way. Integrity is what you do when no one is looking. Courage is how you react when every muscle and mental fiber of your body is aching for you to quit.

These principles are behind a lot of how I approach building Lightning and interact with the AI community. There have been many times during the development of the project where I’ve had to enforce the integrity of our project by having a zero-tolerance policy for copying code from other projects.

For example, here one of our contributors took code from another project and copy-pasted it into Lightning.

Now, it’s clear that the contributor did not have bad intentions, but nevertheless, this is not the Lightning way.

We don’t copy, we create.

This particular event was resolved positively.

Now, you might wonder why I care so much about this given that “opensource copying is fair game.” But opensource was designed so that we can build on top of each other’s work. The objective is not to pull pieces out of other projects until you end up duplicating their functionality.

Building on top of each other is why the field of AI is the fastest-growing field today. This principle is also at the core of the scientific community. Peer reviews, attribution, citations.

It is in the DNA of science.

In my opinion, slowly copying chunks of functionality instead of building on existing work runs counter to these principles and has no place in the AI research community.

Lightning is built by creators, researchers and innovators for those who want to build the next big thing in AI. I hope future projects continue to build on our work to advance the field. I’ve learned a lot from friends and contributors of Kornia, NVIDIA NeMo, and other amazing projects. Integrating as partners has helped us all deliver exponentially better experiences for users.

Let’s build the future of open source AI together.



⚡️PyTorch Lightning Creator • PhD Student, AI (NYU, Facebook AI Research).