Our projects involve sharing data between Snowflake databases and shared Windows folders, where custom applications perform certain tasks and return results that have to be loaded back into Snowflake. The difficult part is the cleanup of all those folders/blobs. We accomplish these tasks as follows:
- Using Azure Blob Storage Unload, we export the data from Snowflake to an Azure blob.
- Using a File Iterator and a Data Transfer component, we transfer the files to an INPUT Windows file share.
- Our app executes, consumes the provided files and returns the results to an OUTPUT file share. Files in the INPUT share are deleted by our app.
- Using a File Iterator and a Data Transfer component, we transfer the result files from the OUTPUT file share to an Azure blob.
- Using Azure Blob Storage Load, we load the files from the blob into Snowflake and at the same time delete the loaded files (Purge Files = true).
- Using a Python script, we then delete all files from the file share (a minimal sketch follows below).
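For context, the share cleanup in that last step is just a few lines. Here is a minimal sketch; the mount point is a placeholder for our actual share path:

```python
import os

# Placeholder for the path where the OUTPUT share is mounted on the
# Matillion server.
SHARE_PATH = "/mnt/output_share"

# Remove every regular file left in the share after the load completes.
for name in os.listdir(SHARE_PATH):
    path = os.path.join(SHARE_PATH, name)
    if os.path.isfile(path):
        os.remove(path)
```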
The only part that is missing is cleaning up the files in the blob from step #1. I came across a few online write-ups on how to accomplish that in Python. For example:
https://stackoverflow.com/questions/58900507/upload-and-delete-azure-storage-blob-using-azure-storage-blob-or-azure-storage
which works very well, but only in Python on Windows. Even though, as per the article, we have installed the latest version of the azure.storage.blob library, the first line of code that tries to import it:
from azure.storage.blob import ContainerClient
fails consistently. Since the approach works on Windows, there has to be something specific to either the Linux installation on the Matillion server or to accessing the library from within Matillion.
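For reference, the rest of the script follows the pattern from that answer. Here is a minimal sketch, with the connection string, container name, and blob prefix as placeholders; it runs fine on Windows but never gets past the import on the Matillion server:

```python
from azure.storage.blob import ContainerClient

# Placeholders for our storage account connection string and container.
CONNECTION_STRING = "<storage-account-connection-string>"
CONTAINER_NAME = "<container-name>"

container = ContainerClient.from_connection_string(
    CONNECTION_STRING, container_name=CONTAINER_NAME
)

# Delete every blob under the prefix the unload wrote to in step #1
# (the "unload/" prefix here is illustrative).
for blob in container.list_blobs(name_starts_with="unload/"):
    container.delete_blob(blob.name)
```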
Does anyone have any suggestions on how to clean up the Azure blob, in Python or otherwise?
Thank you,
Aristotle