Technical Writing for Data Scientists

A four-part formula for radically adequate technical writing

Notes from Industry

Original art by the author

Data science is like driving in a new town: You might arrive at the destination, but explaining how you got there is another story. After spending weeks or months doing painstaking technical work, the writing phase of a data science project can be discouraging. You know you did good work – and learned some cool stuff! – but the words are sometimes slow to find their way to the page.

Fortunately, this is very solvable. Over the course of several years working with data scientists, I’ve devised a set of pointed questions that, if answered in order, will essentially write an article for you. I’d suggest answering each question in isolation first, then pasting the answers together into a final piece. As a bonus, you can string together one-sentence versions of each answer to write a simple abstract or summary.
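If you like to think in code, the workflow above can be sketched in a few lines of Python. This is only an illustration, not a tool: the names `Section`, `build_article`, and `build_abstract` are hypothetical, and the sample text is borrowed from the cupcake examples later in this article.

```python
from dataclasses import dataclass

# The four questions from this article, in the order they should be answered.
QUESTIONS = (
    "What's the problem?",
    "What do we know about it?",
    "What are we doing about it?",
    "Why should anyone care?",
)


@dataclass
class Section:
    """Hypothetical container for one answered question."""
    question: str  # one of QUESTIONS
    answer: str    # the full answer, written in isolation
    summary: str   # a one-sentence version of the answer


def build_article(sections):
    """Paste the full answers together, each under its question."""
    return "\n\n".join(f"{s.question}\n\n{s.answer}" for s in sections)


def build_abstract(sections):
    """String the one-sentence versions together into a short abstract."""
    return " ".join(s.summary.strip() for s in sections)


# Illustrative input: sentences taken from the cupcake examples in this article.
draft = [
    Section(
        QUESTIONS[0],
        "Cupcake sales were down last quarter, causing us to miss our target "
        "by $4M. The issue manifested mainly in the Pacific Northwest.",
        "Cupcake sales were down last quarter, causing us to miss our target by $4M.",
    ),
    Section(
        QUESTIONS[1],
        "Sales have historically held steady around $10M per quarter. "
        "Over the winter we saw a precipitous decline to only $6M.",
        "Sales have historically held steady around $10M per quarter.",
    ),
    Section(
        QUESTIONS[2],
        "Based on the results, we will be increasing the advertising efforts "
        "for our cupcake products in Washington and Oregon.",
        "We will be increasing advertising in Washington and Oregon.",
    ),
    Section(
        QUESTIONS[3],
        "Cupcakes are the sole remaining source of good in this world.",
        "Cupcakes are the sole remaining source of good in this world.",
    ),
]

print(build_abstract(draft))
print()
print(build_article(draft))
```

Keeping the full answer and its one-sentence summary side by side means the abstract falls out of the same material as the article, which is the whole point of answering each question in isolation.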

A note on style: Of course we’d all love to be the most eloquent kid on the block. When it comes to technical writing, though, the premium is on clarity and effectiveness over élan. Master the art of saying what you need to say before bogging yourself down in flair. Totally adequate technical writing is a huge accomplishment in itself. And anyway, being a good writer is like being beautiful; both are more fun if you have something interesting to say.

1. What’s the problem?

Somewhere in your work you’re addressing a problem or problems. In many cases, something is expensive or losing money. In others, something is inconvenient, inefficient, or making people sick. Isolate the main problem(s) and describe the scale, location, affected population, etc. of the issue. For example,

"Cupcake sales were down last quarter, causing us to miss our target by $4M. Worryingly, we estimate these decreases were externalized predominantly to the children who consume our pastry products. The issue manifested mainly in the Pacific Northwest, sparking concern over the compounding psychological effects of dreary weather and low-sugar diets. In the interest of our business and our customers’ wellbeing, we are eager to resolve this pressing matter."

2. What do we know about it?

Use this section to give background information on the subject at hand. This includes previous work done on the topic and peripheral factors the reader should understand. Don’t forget to cite your sources!

"Sales have historically held steady around $10M per quarter. Over the winter we saw a precipitous decline to only $6M, the likes of which has not occurred since the late nineties during a bout of uniquely effective advertising by our competitors. According to our customer surveys, 98% of customer households report that residents under 18 consume our cupcake products. Analysis of sales trends predicts that no children will have access to cupcakes by 2022 if the decrease continues."

3. What are we doing about it?

This is akin to a methods section. Explain the data collection process, how you conducted the analysis, the results, and the proposed solution to the problem you’ve set out to solve. Describe the limitations of your work, and elaborate on future research and next steps.

"Customer surveys were conducted by a team of trained survey methodologists by phone using contact information submitted by sweepstakes participants and triangulated data from the National Database for Trivial Nonsense. Our university intern ran a time series analysis for his final project using a small open-source Python package. It is plausible that the analysis was conducted until something significant showed up. Based on the results, we will be increasing the advertising efforts for our cupcake products in Washington and Oregon."

4. Why should anyone care?

None of the previous three sections matter if what you’re saying isn’t relevant to the readers. Your job in this section is therefore to spell out exactly why your solution, discovery, or innovation is useful to them. Naturally, some work will be irrelevant to all but the most niche audiences; you don’t need to convince everyone, just the folks you hope are reading.

"Cupcakes are the sole remaining source of good in this world, and we owe it to our employees and our customers to spread this joy."

This method might not win you a Nobel Prize in Literature, but radically adequate technical writing is all about telling the reader what they should know about your work. By answering their questions before they even ask, you’ll have done a fantastic job at achieving what you set out to do: learn cool stuff, and tell people about it. Happy writing!
