I set up the new Matillion Error Handling feature, expecting it to alert on errors several layers deep in a job. However, when a component fails in a transformation job nested within my orchestration job, the Component Message does not come through.

When a component fails inside a transformation job that sits inside the orchestration job, all I get in my webhook is the name of the transformation job as the Component Name, and nothing for "Component Message", which should hold the actual error. The error is blank when it should be "Parameter validation failure: join expression....". From what I have read from Matillion, error messages are supposed to come through from several levels deep.

A similar problem occurs when a component validates OK but causes the job to fail: the error is not captured in the webhook. For example (with screenshots below), the Component Name is End Failure 0 and the Component Message is blank, when the error should be "ORA-00936: missing expression".

This is how the webhook is configured:

"text": ":warning:*The following job failed:*:warning:\n\n*Job Name:* ${job_name}\n *Component:* ${component_name} \n *Error Message:* ${component_message}"
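To check what the receiving side sees when the message variable comes through empty, here is a minimal local sketch in Python. It does not reproduce Matillion's own variable substitution; it only renders the same template text with `string.Template`, which happens to use the same `${name}` syntax. The function name `render_payload`, the job name `orch_main`, and the placeholder text are made up for illustration.

```python
import json
from string import Template

# Same text as the webhook body above, rendered locally for testing only.
TEMPLATE = Template(
    ":warning:*The following job failed:*:warning:\n\n"
    "*Job Name:* ${job_name}\n *Component:* ${component_name} \n"
    " *Error Message:* ${component_message}"
)


def render_payload(job_name, component_name, component_message):
    """Build the JSON body, making an empty message visible instead of blank."""
    # Hypothetical placeholder so a missing message is obvious in the channel.
    message = component_message or "(no message received)"
    text = TEMPLATE.substitute(
        job_name=job_name,
        component_name=component_name,
        component_message=message,
    )
    return json.dumps({"text": text})


# Simulates the second case described above: component name present, message empty.
print(render_payload("orch_main", "End Failure 0", ""))
```

This only helps confirm that the blank field originates upstream of the webhook receiver, not in the template itself.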

Any thoughts are appreciated.

Hi @bwiechelman​ ,

What version of Matillion are you using? Also, is it for Snowflake or some other CDW? I ask because I vaguely remember that older versions of Matillion didn't bubble failures up appropriately and I believe that was fixed in later versions.

@Michael​, am I remembering correctly that you use the error handling feature? I am not familiar with it, as we handle failure notifications very differently. If you do use it, is there any knowledge you could pass on that applies here?

Thanks for posting and thanks for the attention @Michael​ !

Hi @bwiechelman​ ,

I can give you a quick answer to the second part: if you use the "manual error handling" approach with "End Failure" and/or "End Success" components, the global error handling is disabled for that job. I am very surprised that this is not part of the documentation. I think this limitation was only mentioned in the release notes at the time:

"New Manage Error Reporting feature empowers users to post a message or webhook if an error occurs in a component or job that does not have any other error handling enabled." (https://matillion-docs.s3.eu-west-1.amazonaws.com/release-notes/1.51.html)

I will have to check for the first part of your question.

Michael

Hi Bryan! It says Version: 1.59.9 (build 551efe94dc8/eee1701f-0) AMI Version: 1.59.9 and it's for Snowflake.

Thanks Michael, that's helpful to know. But if a job failed that didn't have the End Failure components, would the task history for that job show that it succeeded or that it failed? And thanks for still looking into the first part!

Hi @bwiechelman​.

I am trying to find documentation on how to set up the webhook on the Snowflake side.

May I ask how you set it up?

I am currently trying to set up the same thing to log the errors into Snowflake.

Thank you.

I think I followed this: https://www.matillion.com/resources/blog/integrating-slack-with-matillion

In our experience, the job would show as succeeded, but the task that actually failed would show as failed. The only way to know in Matillion is by looking at all the tasks. Here's an example where one of our jobs succeeded but did in fact have a failure on a sub-task in a dependent orchestration:

We handle this very differently because Matillion’s implementation of error handling is not very robust. We export all Matillion job history to a set of tables in Snowflake and build a dashboard around the data which bubbles those failures up in a nice dashboard interface. It also allows us to drill into trouble areas. Honestly, Matillion’s job reporting could be much better when it comes to failures. :disappointed:
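As a rough illustration of the idea, here is a minimal Python sketch that finds jobs marked as succeeded that still contain a failed task. It assumes the job history has already been exported to flat rows; the column names (`job_name`, `job_state`, `task_name`, `task_state`), the state values, and the sample data are all hypothetical and not Matillion's actual export schema.

```python
from collections import defaultdict


def hidden_failures(rows):
    """Map each job reported as SUCCESS to the names of its FAILED tasks."""
    failed = defaultdict(list)
    for row in rows:
        # A "hidden" failure: the job looks fine, but one of its tasks did not.
        if row["job_state"] == "SUCCESS" and row["task_state"] == "FAILED":
            failed[row["job_name"]].append(row["task_name"])
    return dict(failed)


# Hypothetical exported history rows, one per task.
rows = [
    {"job_name": "nightly_load", "job_state": "SUCCESS",
     "task_name": "stage_orders", "task_state": "SUCCESS"},
    {"job_name": "nightly_load", "job_state": "SUCCESS",
     "task_name": "load_orders", "task_state": "FAILED"},
    {"job_name": "weekly_report", "job_state": "FAILED",
     "task_name": "build_report", "task_state": "FAILED"},
]

print(hidden_failures(rows))
```

The same filter could equally be written as a query over the exported tables; the point is only that failures are detected at the task level rather than trusting the job-level status.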

Hopefully this area gets better over time.

Yeah, that's a bummer. But thanks for sharing!