{"id":431,"date":"2024-06-28T11:00:17","date_gmt":"2024-06-28T11:00:17","guid":{"rendered":"https:\/\/datadandies.nl\/?p=431"},"modified":"2024-06-28T11:00:17","modified_gmt":"2024-06-28T11:00:17","slug":"micro-partitions-and-clustering-in-snowflake","status":"publish","type":"post","link":"https:\/\/datadandies.nl\/index.php\/2024\/06\/28\/micro-partitions-and-clustering-in-snowflake\/","title":{"rendered":"Micro-partitions and clustering in Snowflake"},"content":{"rendered":"\n<p>Snowflake stores tables as micro-partitions. Put simply, micro-partitions are horizontal slices of the table, each holding a subset of its rows. The way rows are grouped across these slices is called \u201cclustering\u201d.<\/p>\n\n\n\n<p>Clustering is done to avoid unnecessarily reading micro-partitions that are not needed for the query. This leads to more efficient, faster and cheaper results.<\/p>\n\n\n\n<p>A picture is worth a thousand words, and the Snowflake documentation has a handy one that visualizes the clustering process. It also shows that the data within each micro-partition is stored in a columnar fashion.<\/p>\n\n\n\n<figure class=\"wp-block-embed\"><div class=\"wp-block-embed__wrapper\">\nhttps:\/\/docs.snowflake.com\/en\/_images\/tables-clustered1.png\n<\/div><\/figure>\n\n\n\n<p>Using the Query Profile, you can check the number of partitions scanned for the query and the total number of partitions. If your query uses a filter and you find that a large number of partitions were scanned for a small number of resulting records, you might consider defining an explicit clustering key.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Snowflake stores tables as micro-partitions. Put simply, micro-partitions are horizontal slices of the table, each holding a subset of its rows. The way rows are grouped across these slices is called \u201cclustering\u201d. 
Clustering is done to avoid unnecessarily reading micro-partitions that are not needed for the query. This leads to more efficient, faster and cheaper results.&hellip;<\/p>\n<p class=\"more-link\"><a href=\"https:\/\/datadandies.nl\/index.php\/2024\/06\/28\/micro-partitions-and-clustering-in-snowflake\/\" class=\"themebutton\">Read More<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[40],"class_list":["post-431","post","type-post","status-publish","format-standard","hentry","category-blog","tag-snowflake"],"_links":{"self":[{"href":"https:\/\/datadandies.nl\/index.php\/wp-json\/wp\/v2\/posts\/431","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/datadandies.nl\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/datadandies.nl\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/datadandies.nl\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/datadandies.nl\/index.php\/wp-json\/wp\/v2\/comments?post=431"}],"version-history":[{"count":1,"href":"https:\/\/datadandies.nl\/index.php\/wp-json\/wp\/v2\/posts\/431\/revisions"}],"predecessor-version":[{"id":432,"href":"https:\/\/datadandies.nl\/index.php\/wp-json\/wp\/v2\/posts\/431\/revisions\/432"}],"wp:attachment":[{"href":"https:\/\/datadandies.nl\/index.php\/wp-json\/wp\/v2\/media?parent=431"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/datadandies.nl\/index.php\/wp-json\/wp\/v2\/categories?post=431"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/datadandies.nl\/index.php\/wp-json\/wp\/v2\/tags?post=431"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}