Is there a way to parallelize Matillion Transformation jobs for Snowflake?

I created two Matillion schedules. Let's call them schedule A and B. Each schedule is configured with the following:

  1. A different Snowflake compute warehouse.
  2. A different Matillion version.
  3. A different Matillion environment. Matillion environment variables are set up so that the two scheduled jobs run all transformations in different Snowflake databases.
  4. The same Matillion branch checked out.

 

In my head, there should be no reason why these two schedules would interfere with one another. But when run alone, schedule A takes 15 minutes (it processes much more data) and schedule B takes 1 minute. When run together, both schedule A and B take 16 minutes.
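As a quick sanity check on those numbers, here is a minimal sketch (plain arithmetic only; the function names are made up for illustration and nothing here calls Matillion or Snowflake). If the two schedules were fully independent, the combined wall-clock time should be close to the longer run; if the work were effectively serialized somewhere, it should be close to the sum.

```python
def expected_parallel_minutes(durations):
    """Wall-clock time if every run proceeds independently."""
    return max(durations)


def expected_serial_minutes(durations):
    """Wall-clock time if the runs are processed one after another."""
    return sum(durations)


# Standalone timings reported above, in minutes.
standalone = {"A": 15, "B": 1}
observed_together = 16

parallel = expected_parallel_minutes(standalone.values())
serial = expected_serial_minutes(standalone.values())

# The observed time is closer to the serial estimate than the parallel one.
looks_serialized = abs(observed_together - serial) < abs(observed_together - parallel)
print(f"parallel={parallel} serial={serial} looks_serialized={looks_serialized}")
```

The observed 16 minutes matches the serial estimate, which is why I suspect something shared between the two schedules is acting as a bottleneck.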

 

Am I missing something? Is there a way to have Matillion transformations for Snowflake run in parallel?

 

The tasks view shows both scheduled runs happening at the same time.

  • The main thing you need to look at here is the size of your Matillion VM. Check the number of cores (vCPUs) it is running on, and how many parallel queries the VM allows at any point in time.
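To illustrate why a concurrency cap would matter, here is a rough simulation. It assumes the instance runs at most a fixed number of tasks at once (`slots` is a hypothetical parameter, not a Matillion setting), and that each task takes a fixed time. Whether and how that cap relates to vCPUs on a given instance should be confirmed against the Matillion documentation.

```python
import heapq


def makespan(task_minutes, slots):
    """Simulated total elapsed time when at most `slots` tasks run at once.

    Tasks are started in the order given; each new task takes the slot
    that frees up earliest.
    """
    if slots < 1:
        raise ValueError("slots must be at least 1")
    finish_times = []  # min-heap of times at which busy slots become free
    for minutes in task_minutes:
        if len(finish_times) < slots:
            start = 0  # a slot is still idle, so the task starts immediately
        else:
            start = heapq.heappop(finish_times)  # wait for the earliest free slot
        heapq.heappush(finish_times, start + minutes)
    return max(finish_times, default=0)


# Illustrative timings in minutes, taken from the question.
one_slot = makespan([15, 1], slots=1)
two_slots = makespan([15, 1], slots=2)
print(one_slot, two_slots)
```

With a single slot the two runs take the sum of their durations; with two slots the total drops to the longer run. This is only a model of a capped queue, not a description of how Matillion schedules work internally.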

Hi @ivan.liao, and thanks for posting!

This documentation here may help you further. A named transformation job will not run in parallel, but you can give it that ability by wrapping the transformation logic inside a Shared Job.

I hope that helps! Let us know how you get on.

Many thanks,

Claire