Clustering depth of tables in Snowflake: a potential indicator of clustering health of a table

Clustering depth indicates how much micro-partitions overlap with one another. Ideally, micro-partitions do not overlap and are contiguous, meaning that one micro-partition connects seamlessly with the next. Sometimes, however, micro-partitions do overlap. One reason for this could be a suboptimal clustering key. Whatever the reason, this is not ideal, because now…
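To make the idea of overlap concrete, here is a minimal Python sketch. It is a simplified model, not Snowflake's actual computation: each micro-partition is assumed to be just a (min, max) range of clustering key values, and the names `overlap_depth` and `max_overlap_depth` are hypothetical helpers made up for illustration. In Snowflake itself you would query a function such as SYSTEM$CLUSTERING_DEPTH rather than compute this by hand.

```python
def overlap_depth(partitions, value):
    """Count how many (min, max) ranges contain the given key value."""
    return sum(1 for lo, hi in partitions if lo <= value <= hi)


def max_overlap_depth(partitions):
    """Return the largest number of ranges that overlap at any single point.

    For closed ranges, the deepest overlap always occurs at the lower
    bound of some range, so checking only those points is enough.
    """
    return max(overlap_depth(partitions, lo) for lo, _ in partitions)


# Assumed example data: (min, max) of the clustering key per micro-partition.
well_clustered = [(1, 10), (11, 20), (21, 30)]    # contiguous, no overlap
poorly_clustered = [(1, 25), (5, 30), (8, 20)]    # every range overlaps

print(max_overlap_depth(well_clustered))    # 1
print(max_overlap_depth(poorly_clustered))  # 3
```

A depth of 1 means a lookup on a key value touches a single micro-partition; a higher depth means more micro-partitions may have to be read for the same value.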

Micro-partitions and clustering in Snowflake

Tables are stored within Snowflake as micro-partitions. Put simply, micro-partitions are horizontal slices of the table that divide it into subsets of rows. How the rows are grouped across these slices is called “clustering”. Good clustering lets a query skip micro-partitions it does not need, which makes queries faster and cheaper.…
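As a rough illustration of skipping micro-partitions, the Python sketch below keeps only the slices whose value range can match a filter. It assumes each micro-partition is described by the minimum and maximum value of one column; the partition names and the `partitions_to_scan` helper are hypothetical, and this is a toy model rather than how Snowflake implements it.

```python
def partitions_to_scan(partitions, lo, hi):
    """Return the names of partitions whose (min, max) range
    intersects the filter range [lo, hi]; all others can be skipped."""
    return [
        name
        for name, (pmin, pmax) in partitions.items()
        if pmax >= lo and pmin <= hi
    ]


# Assumed metadata: column min/max per micro-partition.
partitions = {"mp1": (1, 10), "mp2": (11, 20), "mp3": (21, 30)}

# A filter such as "WHERE col BETWEEN 12 AND 18" only needs mp2.
print(partitions_to_scan(partitions, 12, 18))  # ['mp2']
```

With well-separated ranges, only one of the three slices is read. If the ranges overlapped heavily, the same filter would match more of them.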

Tasks in Snowflake

Tasks are an excellent scheduling tool in Snowflake. You can use tasks for many different purposes. Below are some examples: The scheduling can be done using CRON (for example, run a task every hour; the minimum schedule is one minute) or using a trigger. You could trigger a task to run after another task…

Streams in Snowflake

A stream is a great way to track changes to a source table in Snowflake. These tracked changes can then be used to create an incremental load, for example. There are three types of streams:
– Standard: tracks all data manipulation changes to an object.
– Append-only: tracks only inserts.
– Insert-only: tracks only inserts…