Append files in a folder to a single table in Snowflake

Want to append the data of several files into ONE table?

You could do this using Power BI Dataflows. However, you can also do this in Snowflake using the COPY INTO command.

Imagine you have an AWS S3 bucket with a folder containing a lot of files that need to be appended into a single table. The command below will do exactly that.

COPY INTO database.schema.table

FROM @my_stage/directory

FILE_FORMAT = ‘myfileformat’ If new files arrive in the S3 bucket every day, you could schedule the above command using a Task. Using the metadata from the table in Snowflake and the files in the stage referencing the S3 bucket, Snowflake will not duplicate the data in the table, unless you force it to.

--Only one file is uploaded to the AWS S3 bucket.
--First the database and schema where the stage is located
use test.test
;

--Check which files are present in the stage; currently only 1
list @my_s3_stage
;

--Check out contents of the file
select $1, $2 from @my_s3_stage
;

--Create table to move the results to
create table test.test.append_test
(kolom1 int,
kolom2 varchar)
;

--Check content of empty table
select * from test.test.append_test
;

--Create file format 
create file format fileformat_csv_comma
    TYPE = 'CSV'
    FIELD_DELIMITER = ','
    SKIP_HEADER = 1
;

copy into test.test.append_test
    from @my_s3_stage/appendtest 
    file_format = 'fileformat_csv_comma'
;

--Check content of table filled with first file
select * from test.test.append_test
;

--First add second file that needs to be appended to S3 bucket
--Then COPY INTO again
--No extra records; using the metadata of the table (which is available for 64 days), the files have not added to the table again, which would cause duplicate records
--After 64 days the LOAD_UNCERTAIN_FILES will make sure that already loaded files are not loaded again

Leave Comment

Het e-mailadres wordt niet gepubliceerd. Vereiste velden zijn gemarkeerd met *