Author Spotlight

Read and Write Obsessively: There are No Real Shortcuts to Producing Great Work

On working in public, chasing intrinsic motivation, and taking risks in writing

TDS Editors
Towards Data Science
8 min read · May 12, 2021


In the Author Spotlight series, TDS Editors chat with members of our community about their career path in data science, their writing, and their sources of inspiration. Today, we’re thrilled to present Elliot Gunn in conversation with Mark Saroufim.

Photo courtesy of Mark Saroufim

Mark is an ML Partner Engineer on the PyTorch team. In his past lives, Mark worked as an ML engineer at Graphcore, founded his own company, yuri.ai, and was a product manager at Microsoft. Mark is optimistic about a future where people forge their own education and companies.

You took a rather unconventional career path in ML: leaving your job at Microsoft to start your own game-development AI company, working at Graphcore, and now joining Facebook as a PyTorch engineer. Could you share a bit about how you found your way into the field, and through these roles?

Haha, I get asked this question a lot. So I really started to go deep in machine learning at UC San Diego, where I went all in on computer science theory. I felt like this was the one true way to understand the world.

It gave me my current confidence to pick up any math textbook and just read it, but there was also an opportunity cost because I could have started in deep learning right about when the field was starting to explode.

My first job out of school was at Microsoft as a product manager, and I was terrible at it. I only figured out what the product manager role was about after I switched over to being an applied scientist. If anything, I feel that disciplines are artificial constructs; anyone would benefit from being a better designer, coder, mathematician, writer, and speaker.

That experience encouraged me to start my own company, yuri.ai, which was a reinforcement learning service for game developers. It's one thing to do what your boss tells you to do, but it's a completely different ball game to get people to pay, especially for a half-baked project. My savings were evaporating and I needed a better way to get people's attention, which is why I ended up writing the Robot Overlord Manual. It was my first time seeing the benefits of working publicly; the book ended up being orders of magnitude more popular than my startup.

After that, at Graphcore, I was an ML engineer working directly with researchers to port their models to the IPU, which gave me tremendous insight into how to run models fast. It was really amazing to have the ability to reach out to any customer, pitch them a POC on our hardware, and then write a paper together. Even within a company, impact doesn't need to stay local to one organization; it can happen globally over the internet.

And yeah, now I’m at PyTorch, which frankly is amazing. I’m the dumbest person in the room in every room I’ve been in, which is exciting. I’m still scoping out what I should be doing exactly, but in the meantime you can see all my work publicly on GitHub.

What is a project you’re particularly proud of?

My favorite projects always started very small—it was initially just me and then interest organically grew.

The first and most memorable example was an internal telemetry tool at Microsoft that told us who was using which internal data tool.

It started with just me manually running a bunch of SQL queries and making a weekly usage dashboard, but eventually it got so well automated that we could drill down to which teams' usage was dropping, reach out to them, and fix their issues before they silently abandoned a product. We then opened it up to others and adoption exploded.
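
To make that concrete, here is a minimal sketch of that kind of weekly usage report in Python. The schema is a hypothetical stand-in (a tool_usage table with user_id, tool_name, and used_at columns), not Microsoft's actual telemetry setup:

    # Minimal sketch of a weekly usage report; the tool_usage schema
    # below is hypothetical, not the internal Microsoft setup.
    import sqlite3

    conn = sqlite3.connect("telemetry.db")

    WEEKLY_USAGE = """
        SELECT tool_name,
               strftime('%Y-%W', used_at) AS week,
               COUNT(DISTINCT user_id)    AS active_users
        FROM tool_usage
        GROUP BY tool_name, week
        ORDER BY tool_name, week;
    """

    # Flag tools whose weekly active users dropped versus the prior week,
    # so someone can reach out before a team silently abandons the product.
    prev = {}
    for tool, week, active in conn.execute(WEEKLY_USAGE):
        if tool in prev and active < prev[tool]:
            print(f"{tool}: {prev[tool]} -> {active} active users in week {week}")
        prev[tool] = active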

We growth-hacked our way by essentially hijacking the global namespace for Power BI dashboards, creating a dashboard called “!!” so that whenever anyone in the company opened any Power BI dashboard, they would first see our product page. We did get a sternly worded email from the legal team to stop.

That was a really fun time. We built something for ourselves that we liked and it organically grew into something that others liked and depended on to figure out if their products were useful to others.

I have a similarly fond memory of ramping up the graph neural network efforts at Graphcore, where it was just me initially. Once the benchmarks were promising, people organically volunteered to help me out. Intrinsic motivation beats extrinsic motivation, and I'm hoping I can make this a pattern in my work.

You have written many articles across an eclectic range of topics, from the viral “Machine Learning: The Great Stagnation”, to “Deschooling Society”, and a really fun visual history of math. You’ve also published a free textbook on robotics and machine learning, and stream explainers on Twitch and YouTube. What is your process for writing and creating so prolifically? What are the things you’ve enjoyed the most about this kind of writing?

You need a good funnel where the content you consume slowly becomes polished content that you can share.

I buy and read a lot of books. The key is to be OK with quitting the bad ones. After that I take lots of notes, some of them public on GitHub and the more private ones in a Google Doc.

When I settle on an interesting idea, I’ll A/B test on Twitter. If 0 people engage, odds are it’s not a unique insight. But if a bunch of people do, then it’s a good sign I should flesh it out in a blog post. My HuggingFace article started as a Tweet.

Writing is by far the most time-consuming social media activity I engage in. I've found that there are no real shortcuts: you just have to sit down and write a lot. What's helped is choosing only topics that are unique and that I'm fascinated by.

I also started YouTube and Twitch for selfish reasons: when I explain stuff live, it's my first time reading about the topic, so I can turn what I'm learning into assets. You can't procrastinate when you've broadcast to the world that you're about to teach them something.

To be fair, though, I did spend the first six years of my ML career reading and writing obsessively, but never in public, and I never had the output I wanted, whether in papers, blogs, or libraries. I'd been getting ready to work in public for a long time, but at this point I don't think I'll ever stop; I meet too many interesting people this way.

You have a pretty unique writing philosophy. What is your advice to readers looking to improve their writing and also stand out online as more and more engineers get into writing? What kind of writing in DS/ML would you like to see more of?

Every writer goes through an evolution where they slowly find their style. Initially, you have no style and it’s easy to overthink what your style should be. Don’t! At this stage it’s important that you write a lot to get in the habit of explaining things clearly. Once you have a few cookbook tutorials under your belt, it’s time to take a few risks and start injecting more of your personal life, lessons, and domain knowledge into your posts.

The reason this helps is that it's really hard to be the best in the world at one thing; I don't want to compete with Yann LeCun. So my realization was that when people read something, they're not motivated only by learning. No one says a dictionary is their favorite book.

People want to laugh, be inspired, be challenged, be afraid, so think a bit more about what it's like to read your posts and how you're offering a unique experience to your reader. Cookbook tutorials that discuss how to solve problem X with technology Y don't have staying power; contributing to the official docs is a better alternative.

The best technical posts I’ve read are not purely technical posts; they tend to be technical posts and something else. That something else could be abstractive summaries, illustrations, business insights, personal details, combining several technical fields, etc. Writing is inherently introspective. You need to dig out seemingly irrelevant strengths and make them shine.

One of the best writing tips I can give is to write a technical post in the voice of your favorite author. My favorites are Nassim Nicholas Taleb for the irreverent hot takes, George Orwell for the dystopian future as commentary, and H. P. Lovecraft, who helps me realize how small an impact humans actually have. If you borrow from all your interests, you'll find your own unique style very quickly.

In my case, I’ve worked as a Product Manager, Engineer, and Scientist and try to discuss something from at least those three angles. I’m also an amateur comedian and let it come through with memes I make myself. I want to make it easy for someone to skim my article by only looking at the memes. I want them to have fun and feel smart and challenge them by saying things their coworkers wouldn’t.

What are your hopes for the DS community in the next couple of years? Are there any events or trends that make you feel more optimistic about the future of ML?

I mentioned in “Machine Learning: The Great Stagnation” that while core ML has stagnated, the intersection with other fields is booming. The most obvious area where I see an explosion of innovation is productionizing ML models, where there's seemingly a new startup coming out every day. This is a good thing: it means the community hasn't yet settled on the best abstractions for doing ML, and I'll be keeping a close eye on everyone to see where I can help.

One exception on the modeling side is graph neural networks, which I'm very excited about because they encode invariances in the model itself. Data augmentation was always a hack; it turns out that with more modern math we can increase the representational power of the architectures themselves.
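
To illustrate the invariance point, here is a minimal message-passing layer in plain PyTorch (our sketch, not code from Mark's work). Because each node aggregates its neighbors with a sum, relabeling the nodes leaves every node's output unchanged: the permutation invariance of graphs is built into the architecture instead of being approximated with data augmentation:

    import torch
    import torch.nn as nn

    class MessagePassing(nn.Module):
        def __init__(self, dim):
            super().__init__()
            self.msg = nn.Linear(dim, dim)         # transform neighbor features
            self.update = nn.Linear(2 * dim, dim)  # combine node + aggregate

        def forward(self, x, adj):
            # x: (num_nodes, dim) node features; adj: (num_nodes, num_nodes) 0/1.
            # Summing messages over neighbors is order-independent, which is
            # exactly the invariance encoded in the model itself.
            agg = adj @ self.msg(x)
            return torch.relu(self.update(torch.cat([x, agg], dim=-1)))

    x = torch.randn(5, 16)                  # 5 nodes, 16 features each
    adj = (torch.rand(5, 5) > 0.5).float()  # random adjacency matrix
    out = MessagePassing(16)(x, adj)        # (5, 16) updated node features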

Finally, I have a soft spot for game simulations: games essentially generate datasets one observation at a time. Reinforcement learning is slowly becoming more mainstream in the game developer community, but it's surprisingly rare to find people with both the ML and game-dev skill sets (if you are one of those people, please DM me, I'd love to chat). The cost of generating new observations is essentially free, and with the growing popularity of self-supervised learning we can go way beyond Atari and make game simulations mainstream.
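
As a rough sketch of what "generating a dataset one observation at a time" looks like in code, here is a random-policy rollout written against the Gymnasium API (our choice of environment and API for illustration; any game loop exposing reset() and step() works the same way):

    import gymnasium as gym

    env = gym.make("CartPole-v1")
    obs, info = env.reset(seed=0)

    dataset = []  # every step of play yields one transition, essentially for free
    for _ in range(1000):
        action = env.action_space.sample()  # stand-in for a learned policy
        next_obs, reward, terminated, truncated, info = env.step(action)
        dataset.append((obs, action, reward, next_obs))
        obs = next_obs
        if terminated or truncated:
            obs, info = env.reset()

    env.close()
    print(f"collected {len(dataset)} transitions")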

Curious to learn more about Mark’s work and data science interests? You’ll find his writing on his Medium profile, on his Substack, Breaking the Stagnation, as well as on his Twitter account. He also streams his writing and tutorials on his Twitch channel, and uploads video explainers to his YouTube channel. Here are some of our recent favourites.

  • Machine Learning: The Great Stagnation (TDS, December 2020): Mark’s viral hit catapulted him to the top of Hacker News. He explains how incentives have created a culture where ML researchers chase incremental research over more risky routes that lead to breakthroughs.
  • Exterior Product (TDS, November 2019): An article that dives into how geometric algebra may be the better mathematical tool for the future of deep learning and robotics.
  • How To Turn Physics into an Optimization Problem (TDS, November 2019): Mark suggests how many physics problems can be solved with machine learning tools.

Stay tuned for our next featured author, coming soon. If you have suggestions for people you’d like to see in this space, drop us a note in the comments!
