How to organize code in Python if you are a scientist

Workflow for Reportable, Reusable and Reproducible Computational Research

Victor Serban
Towards Data Science

If you work in computational research or data science and don’t come from a Computer Science background, you have probably generated amazing pieces of scientific knowledge, but the code you wrote to get there is not quite up to academic standard. You might need a week just to make your own scripts produce the same results again. Even if you are an experienced coder, you might find it hard to do both the job of a developer and that of a scientist.

I faced this challenge myself as a research student. Although I took classes in Software Engineering, I found it hard to step back from the research questions I was handed and think about how I could turn my code into useful tools. After some trial and error, I designed a recipe that helps me make the transition seamlessly, and I now use it in my exploratory work as a Data Scientist.

In this article, I share the way I organize my coding workflow, give you some tips and tricks, and show you the tool stack I use. The goal is to make the transition from experimentation to tool development easier.

Jupyter Notebook: a digital lab notebook

Responses (4)

Nice post. Thanks! Have you thought about working a tool like dvc into your workflow?

--

Good advice. Check out my article on a related topic; feel free to refer to it if you find it useful.

--

Fantastic article, Victor! I usually split my scripts/notebooks in several stages and then organize them in a pipeline. I made a tool to automate this process: https://github.com/ploomber/ploomber I'd love to hear your comments.

--