Bogdan Cojocar · How to read data from s3 using PySpark and IAM roles · In this tutorial we will go over the steps to read data from S3 using an IAM role in AWS. · 1 min read · Nov 7, 2022
Bogdan Cojocar · PySpark integration with the native python package of XGBoost · In this tutorial we will highlight how to use the latest XGBoost library version 1.7.0 that works natively with PySpark · 4 min read · Oct 21, 2022
Bogdan Cojocar · How to read data from AWS S3 and Athena in pandas with column validation · This is a step by step tutorial on reading data from AWS S3 and Athena into a pandas DataFrame and doing column validation to assess the… · 2 min read · Oct 5, 2022
Bogdan Cojocar · PySpark ML and XGBoost setup using a docker image · In this tutorial we will build and test a docker image where we will be able to run a jupyter notebook with xgboost fully integrated. · 2 min read · Oct 3, 2022
Bogdan Cojocar · Predicting similar political donors for UK parties using graph data · In this tutorial we will train a ML graph algorithm that will find similar likely political donors based on their UK companies donations to… · 6 min read · Sep 16, 2022
Bogdan Cojocar in Towards Data Science · Building a Health Entity labelling service using Azure Kubernetes Service, Seldon Core and Azure… · In this tutorial we will build an inference service entirely in Kubernetes in the Azure ecosystem · 8 min read · Jun 16, 2022
Bogdan Cojocar in Towards Data Science · Building a Serverless Azure ML Service Using Cognitive and CDKTF · In this tutorial we will go over using cloud services such as Azure Functions and Cognitive to build a sentiment analysis service · 7 min read · May 26, 2022
Bogdan Cojocar in Towards Data Science · Building a Credit Card Fraud Detection Online Training Pipeline with River ML and Apache Flink · In this tutorial, we will go over writing real time python Apache Flink applications to train an online model · 8 min read · Apr 30, 2022
Bogdan Cojocar · How to read parquet data from S3 using the S3A protocol and temporary credentials in PySpark · When we access AWS, sometimes, for security reasons, we might need to use temporary credentials, using AWS STS instead of the same AWS… · 2 min read · Jul 21, 2020
Bogdan Cojocar in Towards Data Science · How to run a PySpark job in Kubernetes (AWS EKS) · A complete tutorial on deploying an EKS cluster with Terraform and running a PySpark job using the Spark Operator · 6 min read · Jul 16, 2020