Can Machine Learning Answer the Question “Where Did We Come From?”

Bob Lindner
Towards Data Science
Mar 7, 2017

In order to understand where we came from, scientists start by exploring why galaxies are born, how they grow up, and why they die.

We live in a galaxy, and the life cycle of every galaxy is quite a story, always with these three phases: birth, growth, and death. It is easy to *see* galaxies in each phase, but without a better understanding of star formation, of how gas is turned into stars, we will never know WHY they age and why they die.

For over 50 years, astronomers have been manually measuring and modeling star formation, but future understanding depends on analyzing more data from larger, next-generation telescopes. Unable to process the volume of data needed to measure star formation, theorists have relied instead on models that are often compelling but unconfirmed.

All of that is about to change.

Last week, I released a machine learning code base to the public that greatly speeds up the analysis of data from next-generation telescopes.

I developed the algorithm, and corresponding code, while I was a post-doctoral researcher at the University of Wisconsin. The code base allows the practical application of an algorithm outlined in a paper published in 2015. The work was funded by a National Science Foundation grant held by Snezana Stanimirovic, Professor in the Astronomy Department at the University of Wisconsin-Madison.

Gausspy allows scientists, for the first time, to test theories against the larger volumes of data coming from bigger telescopes: to find out why stars form, why they age and die, and to get much closer to answering the most fundamental question of why we are here.

So, why did it take 50 years to move from scientists sitting in labs manually analyzing spectra to machine learning?

Science needed someone to solve this problem from a perspective outside of the field that needed it.

Radio astronomers needed it to understand their spectra. My background was machine learning applied to messy problems that needed humans to fix. I personally picked this project not just for the science, but because I saw a huge problem they had and knew I could put a big dent in it: a problem that had been plaguing radio astronomers for decades.

Early adoption has been impressive and validates just how important this innovation is to the future of astronomy.

Although the code base was only released last week, I completed Gausspy in 2015 and began looking for the next machine learning challenge. Just as I was leaving my postdoc, I met my partner and formed VEDA Data Solutions.

I’ve always had the most fun jumping from field to field, solving long-standing data problems to make disruptive discoveries, then moving on to the next most “volatile” field. As a result, my knowledge became very broad. I’ve brought with me the best techniques for solving problems because I latched onto the best data analysis styles from every field I saw and applied them in other fields that had not considered those approaches.

My style at VEDA is just the next progression in this chain, jumping not just fields, but sectors and verticals as well. I’m thrilled to share Gausspy with the scientific community and look forward to solving more problems and sharing new discoveries through the work we are doing at VEDA.

Founder & CTO @VEDA_Data; data scientist; published astrophysicist; Physics Ph.D. Rutgers University, mathematics and astrophysics at UW-Madison