By Joseph Rocca and Baptiste Rocca – 20 min read
"Unity is strength". This old saying expresses pretty well the underlying idea that rules the very powerful "ensemble methods" in machine learning.
Democratising Machine learning with H2O
By Parul Pandey – 9 min read
Overview of H2O: the open source, distributed in-memory machine learning platform
5 Advanced Features of Python and How to Use Them
By George Seif – 4 min read
Python is a beautiful language. Simple to use yet powerfully expressive. But are you using everything that it has to offer?
Detecting Malaria with Deep Learning
By Dipanjan (DJ) Sarkar – 16 min read
Welcome to the AI for Social Good Series, where we will be focusing on different aspects of how Artificial Intelligence (AI), coupled with popular open-source tools, technologies and frameworks, is being used for the development and betterment of our society.
Linear programming and discrete optimization with Python using PuLP
By Tirthajyoti Sarkar – 11 min read
Linear and integer programming are key techniques for discrete optimization problems and they pop up pretty much everywhere in modern business and technology sectors.
A Radiologist’s Exploration of the Stanford ML Group’s MRNet data
By Walter Wiggins – 8 min read
This post reviews the recently released Stanford MRNet knee MRI data set and competition. As I am a senior radiology resident, I will focus on exploring the data through basic domain knowledge – addressing aspects of the data distribution that non-physicians may find perplexing.
Top 10 Coding Mistakes Made by Data Scientists
By Norm Niemer – 5 min read
A data scientist is a "person who is better at statistics than any software engineer and better at software engineering than any statistician". Many data scientists have a statistics background and little experience with software engineering.
Machine learning for anomaly detection and condition monitoring
By Vegard Flovik – 10 min read
The current article focuses mostly on the technical aspects, and includes all the code needed to set up anomaly detection models based on multivariate statistical analysis and autoencoder neural networks.
Simplifying Deep Learning with Fast.ai
By Andrei Lyskov – 7 min read
Deep learning is a field notorious for gatekeeping. If you try to find answers online on how to break into the field, you’ll likely find yourself overwhelmed with a long list of requirements.
Making the Mueller Report Searchable with OCR and Elasticsearch
By Kyle Gallatin – 6 min read
April 18th marked the full release of the Mueller Report – a document outlining the investigation of potential Russian interference in the 2016 presidential election. Like most government documents, it is long (448 pages) and would be painfully tedious to read.