Norm Niemer in DataDrivenInvestor · Rapid Prototyping for Quantitative Investing with d6tflow · Quant investing research involves complex data dependencies and optimizing strategy parameters. d6tflow makes it easy to manage this… · 6 min read · Oct 24, 2020
Norm Niemer in DataSeries · Pyspark quickstart for pandas users · The fastest way to get up and running with pyspark! · 2 min read · Oct 19, 2020
Norm Niemer in Towards Data Science · AutoML Faceoff: 15 Humans VS 2 Machines. Who won? · 15 students worked on 8 datasets, trying to beat the performance of 2 leading data science platforms. Find out who won! · 6 min read · Jul 31, 2020
Norm Niemer · Fuzzy joins in python with d6tjoin · Combining different data sources is a time suck! · 5 min read · Jul 10, 2020
Norm Niemer in Towards Data Science · Explaining “Blackbox” ML Models — Practical Application of SHAP · Train a “blackbox” GBM model on a real dataset and make it explainable with SHAP. · 5 min read · Apr 27, 2020
Norm Niemer in Towards Data Science · Can we trust AutoML to go on full autopilot? · Why you still need expert data scientists even with AutoML — a real-life case study · 7 min read · Aug 15, 2019
Norm Niemer in Towards Data Science · How to use airflow-style DAGs for highly effective data science workflows · Airflow and Luigi are great for data engineering but not optimized for data science. d6tflow brings airflow-style DAGs to data science. · 4 min read · Jul 24, 2019
Norm Niemer in Towards Data Science · Top 10 Statistics Mistakes Made by Data Scientists · Avoid those mistakes, some of which could derail your career. Especially useful for data science coders without a statistics background · 7 min read · Jun 17, 2019
Norm Niemer in Towards Data Science · Top 10 Coding Mistakes Made by Data Scientists · A data scientist is “… better at software engineering than any statistician”. Learn common coding mistakes to avoid · 5 min read · Apr 20, 2019
Norm Niemer in Towards Data Science · 4 Reasons Why Your Machine Learning Code is Probably Bad · Your current ML workflow probably chains together several functions executed linearly. Instead of linearly chaining functions, data… · 3 min read · Mar 27, 2019