Here Is How You Can Apply Software Development Best Practices to Analytics Pipelines
Using the Data Build Tool — dbt
I have worked closely with the Data & Analytics domain for almost a decade and have seen a lot of interesting trends around analytics, big data, and data engineering in general. Being a hardcore software engineer, I always wondered how to bring software engineering best practices to the analytics world. Recently I came across a very interesting open-source tool: dbt (Data Build Tool).
dbt applies the following software engineering principles to analytics code:
- Version Control & Code Review
- Automated Testing
- Sandboxing & Environments
- Documentation
- Modularity
- Package Management
dbt is an open-source project under the Apache 2.0 license.
One thing to note: dbt does NOT help you ingest data. dbt’s magic comes alive once your data is already sitting in your data warehouse.
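To make that idea a bit more concrete, here is a minimal sketch of the "transform data that is already loaded" workflow. It is not dbt's API: it uses Python's built-in `sqlite3` module as a stand-in warehouse, and the table `raw_orders`, the model names `stg_orders` and `customer_totals`, and the helpers `build_models` and `failing_rows` are all hypothetical names made up for illustration. It only mirrors two ideas dbt is built on: a model is a SELECT statement that builds on other models, and a test is a query that returns the rows violating an expectation.

```python
import sqlite3

# Stand-in "warehouse": an in-memory SQLite database (an assumption for this sketch).
conn = sqlite3.connect(":memory:")

# The raw data is assumed to have been loaded already by some other ingestion tool.
conn.execute("CREATE TABLE raw_orders (id INTEGER, customer TEXT, amount_cents INTEGER)")
conn.executemany(
    "INSERT INTO raw_orders VALUES (?, ?, ?)",
    [(1, "alice", 1250), (2, "bob", 800), (3, "alice", 450)],
)

# Each hypothetical "model" is just a SELECT statement; later models build on earlier ones.
MODELS = {
    "stg_orders": "SELECT id, customer, amount_cents / 100.0 AS amount FROM raw_orders",
    "customer_totals": "SELECT customer, SUM(amount) AS total FROM stg_orders GROUP BY customer",
}


def build_models(connection, models):
    """Materialize every model as a view, in the order the models are listed."""
    for name, select_sql in models.items():
        connection.execute(f"CREATE VIEW {name} AS {select_sql}")


def failing_rows(connection, test_sql):
    """Run a test query; an empty result means the expectation holds."""
    return connection.execute(test_sql).fetchall()


build_models(conn, MODELS)

# A test selects the rows that break a rule, here "id must never be NULL".
not_null_test = "SELECT * FROM stg_orders WHERE id IS NULL"
totals = dict(conn.execute("SELECT customer, total FROM customer_totals").fetchall())
```

Notice that nothing in the sketch fetches data from an outside system; it only reshapes and checks what is already in the database, which is the part of the pipeline dbt focuses on.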
dbt currently supports (via core and community contributions) the following databases/data…