Weekly Selection — Dec 21, 2018

TDS Editors
Towards Data Science
3 min read · Dec 21, 2018

Why you should care about the Nate Silver vs. Nassim Taleb Twitter war

By Isaac Faber — 10 min read

An obscure controversy has reared its ugly head again this past month. Two icons of the quantitative analysis community have locked horns on the greatest of public stages, Twitter.

Tutorial: Build a lane detector

By Chuan En Lin 林傳恩 — 10 min read

Waymo’s self-driving taxi service just hit the road this month, but how do autonomous vehicles even work? The lines drawn on roads show human drivers where the lanes are, serve as a guiding reference for which direction to steer, and establish a convention for how vehicles interact harmoniously on the road.

How to Learn Data Science: Staying Motivated

By Harrison Jansma — 11 min read

Over the last few weeks, I’ve taken a break from writing to focus on applying to internships. But as I was driving to class today, a question began to bother me.

Get Smarter with Data Science — Tackling Real Enterprise Challenges

By Dipanjan (DJ) Sarkar — 17 min read

The ‘Data Science Strategic Guide — Get Smarter with Data Science’ is envisioned as a series of articles that serves as a strategic guide to the essential challenges, pitfalls, and principles to keep in mind when implementing and executing data science projects in the real world.

Synthetic data generation — a must-have skill for new data scientists

By Tirthajyoti Sarkar — 11 min read

A brief rundown of methods, packages, and ideas for generating synthetic data, both for self-driven data science projects and for diving deep into machine learning methods.

Analyzing Hacker News book suggestions in Python

By Alessandro Mozzato — 6 min read

An analysis of a Hacker News thread, using Python, the Hacker News API, and the Goodreads API, plus the definitive top 20 book suggestion list!

Develop a NLP Model in Python & Deploy It with Flask, Step by Step

By Susan Li — 6 min read

So far, we have developed many machine learning models, generated numeric predictions on the testing data, and tested the results. And we did everything offline. In reality, generating predictions is only part of a machine learning project, although it is the most important part in my opinion.

Generating New Ideas for Machine Learning Projects Through Machine Learning

By Paras Chopra — 19 min read

Let’s do a quick Turing Test. Below, you’ll see ten machine learning project ideas. Five of them are generated by a human and five of them are generated by a neural network. Your task is to tell them apart.

Deconstructing BERT: Distilling 6 Patterns from 100 Million Parameters

By Jesse Vig — 6 min read

The year 2018 marked a turning point for the field of Natural Language Processing, with a series of deep-learning models achieving state-of-the-art results on NLP tasks ranging from question answering to sentiment classification.

A Brief Introduction to PySpark

By Ben Weber — 15 min read

PySpark, the Python API for Apache Spark, is a great tool for performing exploratory data analysis at scale, building machine learning pipelines, and creating ETLs for a data platform. If you’re already familiar with Python and libraries such as Pandas, then PySpark is well worth learning in order to create more scalable analyses and pipelines.

ProGAN: How NVIDIA Generated Images of Unprecedented Quality

By Sarah Wolf — 10 min read

The people in the high resolution images above may look real, but they are actually not — they were synthesized by a ProGAN trained on millions of celebrity images. “ProGAN” is the colloquial term for a type of generative adversarial network that was pioneered at NVIDIA.

Building a vibrant data science and machine learning community. Share your insights and projects with our global audience: bit.ly/write-for-tds