dbt (Data Build Tool)

Just finished an excellent dbt (Data Build Tool) course on Udemy by Zoltan C. Toth.

#dbt is a tool for developing data models. It applies many software engineering best practices to the workflow of transforming your data.

What I really liked about dbt:

– Version control is easy to add to a dbt project, since the project is just a directory of SQL and .yml files that can live in a Git repository. For certain cloud data warehouse providers, adding version control can be challenging.

– Seamless integration with several cloud data warehouse providers (#Snowflake, Redshift, etc.)

– Easy addition of data quality tests to your models. For example, checking if a column contains unique values is as simple as adding a few lines to a .yml file.

– Integrated and simple generation of documentation. Adding descriptions (and, if you like, some pictures) to a .yml file lets dbt generate a browsable documentation site for your models, including data lineage diagrams, which you can serve from a local web server. As creating documentation is everyone’s favorite pastime, this is a great selling point for dbt.

– No boilerplate DML or DDL code. In other words, you do not have to write “CREATE VIEW V_F_SOMEAWESOMEVIEW AS …” every single time you create a view. You write only the SELECT statement, and dbt wraps it in the appropriate DDL for you.

– … And a lot of other cool features.
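
To give a feel for the testing and documentation points above, here is a minimal sketch of what such a .yml file might look like. The file name, the model name (dim_customers) and the column name (customer_id) are made up for illustration; unique and not_null are two of dbt's built-in generic tests.

```yaml
# models/schema.yml (hypothetical file name and location)
version: 2

models:
  - name: dim_customers              # hypothetical model name
    description: "One row per customer."   # shown in the generated docs
    columns:
      - name: customer_id            # hypothetical column name
        description: "Primary key of the customer."
        tests:                       # newer dbt versions also accept data_tests:
          - unique                   # fails if any value appears more than once
          - not_null                 # fails if any value is NULL
```

With a file like this in place, running dbt test executes the checks against the warehouse, and dbt docs generate followed by dbt docs serve builds the documentation site and serves it locally.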
