By Isaac Faber – 10 min read
An obscure controversy has reared its ugly head again this past month. Two icons of the quantitative analysis community have locked horns on the greatest of public stages, Twitter.
Tutorial: Build a lane detector
By Chuan En Lin 林傳恩 – 10 min read
Waymo’s self-driving taxi service just hit the road this month – but how do autonomous vehicles even work? The lines drawn on roads show human drivers where the lanes are, act as a guiding reference for which direction to steer the vehicle, and set a convention for how vehicle agents interact harmoniously on the road.
How to Learn Data Science: Staying Motivated
By Harrison Jansma – 11 min read
Over the last few weeks, I’ve taken a break from writing to focus on applying to internships. But as I was driving to class today, a question began to bother me.
Get Smarter with Data Science – Tackling Real Enterprise Challenges
By Dipanjan (DJ) Sarkar – 17 min read
The ‘Data Science Strategic Guide – Get Smarter with Data Science’ is envisioned as a series of articles that serves as a strategic guide to the essential challenges, pitfalls and principles to keep in mind when implementing and executing data science projects in the real world.
Synthetic data generation – a must-have skill for new data scientists
By Tirthajyoti Sarkar – 11 min read
A brief rundown of methods, packages, and ideas for generating synthetic data for self-driven data science projects and for deep dives into machine learning methods.
Analyzing Hacker News book suggestions in Python
By Alessandro Mozzato – 6 min read
An analysis of a Hacker News thread, using Python, the Hacker News API and the Goodreads API, and the definitive top 20 book suggestion list!
Develop a NLP Model in Python & Deploy It with Flask, Step by Step
By Susan Li – 6 min read
So far, we have developed many machine learning models, generated numeric predictions on the testing data, and tested the results. And we did everything offline. In reality, generating predictions is only part of a machine learning project, although it is the most important part in my opinion.
Generating New Ideas for Machine Learning Projects Through Machine Learning
By Paras Chopra – 19 min read
Let’s do a quick Turing Test. Below, you’ll see ten machine learning project ideas. Five of them are generated by a human and five of them are generated by a neural network. Your task is to tell them apart.
Deconstructing BERT: Distilling 6 Patterns from 100 Million Parameters
By Jesse Vig – 6 min read
The year 2018 marked a turning point for the field of Natural Language Processing, with a series of deep-learning models achieving state-of-the-art results on NLP tasks ranging from question answering to sentiment classification.
A Brief Introduction to PySpark
By Ben Weber – 15 min read
PySpark, the Python API for Apache Spark, is a great tool for performing exploratory data analysis at scale, building machine learning pipelines, and creating ETLs for a data platform. If you’re already familiar with Python and libraries such as Pandas, then PySpark is a great tool to learn in order to create more scalable analyses and pipelines.
ProGAN: How NVIDIA Generated Images of Unprecedented Quality
By Sarah Wolf – 10 min read
The people in the high-resolution images above may look real, but they are actually not – they were synthesized by a ProGAN trained on millions of celebrity images. "ProGAN" is the colloquial term for a type of generative adversarial network that was pioneered at NVIDIA.