I just want to know if anyone has already done something like this, or can give any advice on how to start this task.
Thank you.
@NeelamMacwan My suggestion: create a dedicated warehouse for Matillion and use it as Matillion's default warehouse.
Thanks Ganesh,
We do have a dedicated warehouse for Matillion. What I really need is credit consumption per job/pipeline, so I can understand which one costs the most.
Neelam
We've handled this by creating dedicated warehouses for our bigger METL jobs (assessed by looking at runtime + rows affected) which allow us to track those individually. Smaller jobs use a shared generic METL warehouse at the expense of granular tracking.
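With dedicated warehouses per job, credit consumption can be broken out from the account usage views. A minimal sketch (assuming a role with access to the SNOWFLAKE.ACCOUNT_USAGE share; the 30-day window is an assumption):

```sql
-- Credits consumed per warehouse over the last 30 days.
-- Each dedicated METL warehouse maps to one job, so this gives per-job cost.
SELECT
    warehouse_name,
    SUM(credits_used) AS total_credits
FROM snowflake.account_usage.warehouse_metering_history
WHERE start_time >= DATEADD('day', -30, CURRENT_TIMESTAMP())
GROUP BY warehouse_name
ORDER BY total_credits DESC;
```

Jobs sharing the generic METL warehouse appear as a single row here, which is the granularity trade-off mentioned above.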
Thanks, gshenanigan
@NeelamMacwan Snowflake has connection level tagging for this very purpose.
You can alter the QUERY_TAG session variable and set it to the Matillion job name or some other unique identifier for the job.
ALTER SESSION SET QUERY_TAG = 'metl:customer_incr_load';
https://docs.snowflake.com/en/sql-reference/sql/alter-session
The tag will then be available in the QUERY_HISTORY view.
SELECT * FROM snowflake.account_usage.query_history
WHERE query_tag = 'metl:customer_incr_load'
LIMIT 100
You can build a dashboard in Snowflake on top of this view to identify the most time-consuming jobs.
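As a starting point for such a dashboard, here is a sketch that aggregates per tag (the `metl:` prefix and 30-day window are assumptions to adapt; note that warehouse compute credits are billed per warehouse rather than per query, so elapsed time serves as a proxy for relative cost):

```sql
-- Rank Matillion jobs by total runtime and cloud-services credits.
-- TOTAL_ELAPSED_TIME is in milliseconds; converted to hours below.
SELECT
    query_tag,
    COUNT(*)                               AS query_count,
    SUM(total_elapsed_time) / 1000 / 3600  AS total_hours,
    SUM(credits_used_cloud_services)       AS cloud_services_credits
FROM snowflake.account_usage.query_history
WHERE query_tag LIKE 'metl:%'
  AND start_time >= DATEADD('day', -30, CURRENT_TIMESTAMP())
GROUP BY query_tag
ORDER BY total_hours DESC;
```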
Note: Make sure to unset the QUERY_TAG once the job completes, so later queries in the same session aren't mis-attributed.
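For example, as the final step of the job:

```sql
-- Clear the tag so subsequent queries in this session
-- aren't attributed to the completed job.
ALTER SESSION UNSET QUERY_TAG;
```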