What needs to be configured in Matillion to connect to an Azure storage container

We have an Azure storage container with multiple text files. What we want to do is connect to the storage container with Matillion to load the files into several tables in Snowflake. How do we go about connecting to the containers in Matillion?

Hi @DF214566, I will preface this answer by saying that we are an AWS customer and thus use Matillion within AWS, but I do have some experience with Azure and Blob storage. I would definitely start here: https://docs.snowflake.com/en/user-guide/data-load-azure-config.html

If you have a Snowflake stage created that points to your container, loading the data stored in those files becomes trivial: it is a simple "COPY INTO" Snowflake statement. The COPY INTO can be issued manually via a SQL Script component, or via a load component. In the AWS version of Matillion there is an S3 Load component that facilitates the COPY INTO. I hope this helps. Please let us know more about your situation if this doesn't help and maybe someone else can jump in here. Thank you for contributing!
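To make that concrete, here is a minimal sketch of such a statement. The table name (my_table), stage name (my_azure_stage), path, and file format below are placeholders, not anything specific to your setup:

    -- Illustrative only: table, stage, path, and format are placeholder names
    COPY INTO my_table
      FROM @my_azure_stage/path/to/files/
      FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1);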

Thanks @Bryan

We are running Matillion on an Azure VM, so instead of an S3 Load component would you recommend an Azure Blob Load Generator or an Azure Blob Storage Load? Also, I'm having issues getting either of those components to connect to the blob storage. I believe it's the way I'm structuring the container path. Can you give me any advice on that?

Hi @DF214566, you would want to use the Azure Blob Storage Load component. I could be mistaken, but I am about 90% sure that the Container Path includes the Snowflake stage. Meaning, you need the Snowflake stage created, which allows Snowflake to connect to your Azure storage container. The path you are supplying in the component is the stage, followed by the subfolder(s) within the stage/container where the file(s) exist.

Think of it this way: Matillion is only facilitating Snowflake SQL statements. Matillion isn't connecting to the Azure storage container; Snowflake is actually connecting to the container. Matillion is just creating a Snowflake SQL statement (i.e. "COPY INTO") to issue to Snowflake, which will use the stage to go get the files in the path within the stage and load them into the table you specify. So, if you can successfully run the "COPY INTO" statement within Snowflake to load files from Azure storage via the Snowflake external stage, then you will be able to use the component within Matillion, because Matillion is just building that COPY INTO statement. This is why the link I sent you in my first reply is important. That is the procedure you need to follow to create a Snowflake external stage, which is Snowflake's connection to your Azure storage.
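To sketch what that setup looks like end to end (all names, the container URL, and the tenant ID below are placeholders to substitute with your own values; the linked doc is the authoritative procedure):

    -- One-time setup, typically by an account admin; all values are placeholders
    CREATE STORAGE INTEGRATION azure_int
      TYPE = EXTERNAL_STAGE
      STORAGE_PROVIDER = 'AZURE'
      ENABLED = TRUE
      AZURE_TENANT_ID = '<tenant-id>'
      STORAGE_ALLOWED_LOCATIONS = ('azure://myaccount.blob.core.windows.net/mycontainer/');

    -- The external stage: Snowflake's pointer to your container
    CREATE STAGE my_azure_stage
      STORAGE_INTEGRATION = azure_int
      URL = 'azure://myaccount.blob.core.windows.net/mycontainer/'
      FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1);

    -- Roughly the statement the Matillion load component builds and issues
    COPY INTO my_table
      FROM @my_azure_stage/subfolder/;

If that COPY INTO runs cleanly in a Snowflake worksheet, the Azure Blob Storage Load component should work with the same stage and path.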

Does this make sense? I hope I explained that well enough. Please let us know if you need more clarification.

Thanks @Bryan again for your answer. I'm trying to comprehend both platforms, being new to Matillion and Snowflake. So regardless, I would need to establish the Snowflake external stage from the link you sent in order to connect to the Azure storage container via Matillion? Would I even need Matillion to copy the files from the blob storage to Snowflake once I set up the procedure you sent in Snowflake?

No problem @DF214566, we were all new to both products at one time and I know how daunting it can seem when you first get started. You are correct in that you will want to get the Snowflake external stage set up regardless. Once that is set up, you will find that a lot of things get much easier.

The answer to your second question, around whether you would need Matillion at all to copy the files from Blob Storage into Snowflake, is no, with a huge BUT... If you only want to load the files once, then you can simply do that directly from Snowflake using a COPY INTO statement. If you want to schedule that load on an ongoing basis, and perhaps do data manipulation or transformation, or move the data across multiple databases, schemas, and/or tables, that's where the power of Matillion comes in. It gives you some visibility into the data lineage as the data is initially ingested into Snowflake and is then moved or transformed thereafter.

Another piece of functionality you get after you have created that external stage is that you can query the data in the files within the storage container before you even ingest it into Snowflake, using typical SQL syntax. This is helpful if you are looking to do some high-level analytics on the data before you start doing full-fledged ELT work.
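As a sketch, that kind of ad hoc query looks something like this (again assuming the placeholder my_azure_stage stage from earlier; $1, $2, ... are positional references to the columns in each staged file):

    -- Peek at staged files without loading them;
    -- relies on the FILE_FORMAT attached to the stage
    SELECT $1, $2, $3
    FROM @my_azure_stage/subfolder/
    LIMIT 10;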

I hope this helps you with your understanding of the process and what is going on behind the scenes.

Got it @Bryan, I appreciate you taking the time to fill in the gaps I had. For this project it's a one-time load, so I think we will be able to leverage the COPY INTO statement directly from Snowflake. At some point this will allow us to use Matillion to schedule loads and do some data manipulation if need be. Thanks again for your help.