Wednesday, 27 March 2019

Clean up files by built-in delete activity in Azure Data Factory

Azure Data Factory (ADF) is a fully managed data integration service in Azure that allows you to iteratively build, orchestrate, and monitor your Extract Transform Load (ETL) workflows. As part of data integration, you will periodically need to clean up files on on-premises or cloud storage servers once they become out of date. For example, you may have a staging area or landing zone, an intermediate storage area used for data processing during your ETL process. The staging area sits between the data source stores and the data destination store. Because data in a staging area is transient by nature, you need to clean it up periodically after the ETL process has completed.

We are excited to share the ADF built-in delete activity, which can be part of your ETL workflow to delete undesired files without writing code. You can use ADF to delete folders or files from Azure Blob Storage, Azure Data Lake Storage Gen1, Azure Data Lake Storage Gen2, File System, FTP Server, SFTP Server, and Amazon S3.

You can find the ADF delete activity under the “General” section of the ADF UI to get started.

1. You can choose to delete either individual files or an entire folder. The names of the deleted files and folders can be logged in a CSV file.

2. The file or folder name to be deleted can be parameterized, so that you have the flexibility to control the behavior of the delete activity in your data integration flow.

3. You can delete only expired files rather than all the files in a folder. For example, you may want to delete only the files that were last modified more than 30 days ago.

4. You can start from the ADF template gallery to quickly deploy common use cases involving the delete activity.
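
Taken together, points 1 to 3 can be sketched as a single delete activity definition. The Python snippet below is only an illustration, not the definition the ADF UI generates: it builds the activity as a plain dictionary that can be dumped to JSON. The property names (`storeSettings`, `modifiedDatetimeEnd`, `enableLogging`, `logStorageSettings`) follow the Delete activity JSON schema as we understand it and may differ between ADF versions, so check them against the current reference before use. The activity, dataset, and linked service names, the `folderPath` parameter, and the log path are all hypothetical.

```python
import json
from datetime import datetime, timedelta, timezone


def build_delete_activity(now: datetime, retention_days: int = 30) -> dict:
    """Return a Delete activity definition as a JSON-serializable dict.

    `now` should be a timezone-aware datetime. All resource names below
    are hypothetical placeholders.
    """
    # Point 3: files last modified before this cutoff count as expired.
    cutoff = now.astimezone(timezone.utc) - timedelta(days=retention_days)

    return {
        "name": "DeleteExpiredStagingFiles",  # hypothetical activity name
        "type": "Delete",
        "typeProperties": {
            "dataset": {
                "referenceName": "StagingFolderDataset",  # hypothetical dataset
                "type": "DatasetReference",
                # Point 2: the folder to clean is passed in as a pipeline
                # parameter (assumes the dataset declares `folderPath`).
                "parameters": {
                    "folderPath": {
                        "value": "@pipeline().parameters.folderPath",
                        "type": "Expression",
                    }
                },
            },
            "storeSettings": {
                # Assumed read-settings type for Azure Blob Storage.
                "type": "AzureBlobStorageReadSettings",
                "recursive": True,
                # Only files modified before the cutoff are selected.
                "modifiedDatetimeEnd": cutoff.strftime("%Y-%m-%dT%H:%M:%SZ"),
            },
            # Point 1: log the names of deleted files and folders.
            "enableLogging": True,
            "logStorageSettings": {
                "linkedServiceName": {
                    "referenceName": "LogStorageLinkedService",  # hypothetical
                    "type": "LinkedServiceReference",
                },
                "path": "deletelogs",  # hypothetical log folder
            },
        },
    }


if __name__ == "__main__":
    activity = build_delete_activity(datetime.now(timezone.utc))
    print(json.dumps(activity, indent=2))
```

Note that a cutoff computed in Python is fixed at the moment the JSON is generated. For a pipeline that runs on a schedule, an ADF expression that computes the date at run time would probably be the better fit.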
