I'm a back-end .NET/C# developer who was given the task of calling the Matillion API to get task status information and history for a reporting dashboard, and I need help understanding some Matillion terms used in the URLs. Can anyone help?

I had never used or really heard about Matillion until today, other than anecdotal evidence that my company is using it to migrate from SSIS to Snowflake, and I have no plans to use the ETL portion. My sole task is to probe the API, monitor tasks for running/failure/etc., and present the results on a web page or for consumption by other tools. I'm also the only .NET/C# developer available to our data warehouse.

I'm trying to understand what a group and a project are based on the API call patterns for tasks, but I'm not getting anywhere. The "project" tree in our data warehouse implementation is pretty deep, and I'm not sure what counts as a group, a project, or a task. What I've seen so far in video training has been pretty sparse and seems to assume that you know way more than I do about Matillion (which is next to zip, nada, nil).

Anyway, pointers to better info, direct help, etc. would be greatly appreciated.

Thanks

I would sit down with the ETL developers and have them give you an example of a project and a project group; that way you could call the API and look for that project and group. I think once you see this you will have a better understanding of how the pieces fit together. In the meantime, this link might help: https://documentation.matillion.com/docs/2819274

I finally figured out project and group by doing some more searching, and I have some working queries running in Postman. I can get task history for an entire project, which results in hundreds of MB of JSON. I only need to monitor 6 jobs at the moment, and I only need to see whether each run succeeded, which appears in the top-level piece of the JSON. I was able to query one job and the response was still 40 MB. I don't need all that "data". On top of that, the query time against the API is not very quick. I can test the main job by doing a MAX against the audit date way faster than by using the API. I was hoping it would be much, much faster. So what this tells me is that I'm probably doing my API calls wrong.

This massive history result seems to happen in the UI as well, often resulting in OOM errors and dropped connections, but I'll chat with the devs as suggested.

The API allows you to grab the history as of a date, so that would limit the amount of data in the JSON. Maybe limit it to the date of your last check. (See DirkZ's response to this post from a couple of months ago: https://matillioncommunity.discourse.group/t/in-matillion-we-have-a-task-history-which-is-really-having-detailed-data-is-there-any-way-i-can-use-those-data-and-trigger-an-email-like-failure-jobs-task-history-or-some-specific-job-task-history-curious-to-know-any-suggestions-please/1372)

Thanks. I tried the pattern from DirkZ and that resulted in a 401 (Unauthorized).

Here's my request that works but yields way too much data: "https://matillion.XXXXXXX.com/rest/v1/group/name/XXXXXXXX/project/name/XXXXXXXX/task/filter/by/start/after/date/2021-08-25". It results in a 40 MB JSON response, but this is querying the entire project. I can't figure out the URL pattern for filtering down to a single job within the group and project.
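For the "date since the last check" idea, here is a minimal sketch of building that same URL pattern from a date. It is written in Python for brevity, and the idea ports directly to C#. The helper name `task_history_url` and the example host, group, and project values are hypothetical; only the path pattern comes from the request above.

```python
from datetime import date
from urllib.parse import quote


def task_history_url(base, group, project, start_after):
    """Build the 'start after date' task filter URL quoted in this thread.

    base, group and project are placeholders for your own instance;
    start_after is a datetime.date (for example, the date of the last check).
    """
    return (
        f"{base.rstrip('/')}/rest/v1"
        f"/group/name/{quote(group, safe='')}"      # percent-encode spaces etc.
        f"/project/name/{quote(project, safe='')}"
        f"/task/filter/by/start/after/date/{start_after.isoformat()}"
    )


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(task_history_url("https://matillion.example.com", "My Group",
                           "My Project", date(2021, 8, 25)))
```

Passing the date of the previous poll instead of a fixed date keeps each response limited to what has started since then.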

I played around using Postman and I could not find a way to combine filters, so I could not find a way to filter on job name and start after date. To get the data for a single job, using your example above:

"https://matillion.XXXXXXX.com/rest/v1/group/name/XXXXXXXX/project/name/XXXXXXXX/task/filter/by/job/name/{MyJob}"

{MyJob} would be the job that you are looking for (https://documentation.matillion.com/docs/2972278). Unfortunately I could not find a way to limit this to a date range.
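Since the two filters apparently can't be combined in one URL, one workaround is to request by job name and apply the date cutoff on the client. A minimal Python sketch follows (the same idea ports to C#). It assumes each history entry is a JSON object with a `startTime` field in epoch milliseconds; that is an assumption about the response shape and should be checked against a real response. `job_history_url` and `started_after` are hypothetical helper names.

```python
from datetime import datetime, timezone
from urllib.parse import quote


def job_history_url(base, group, project, job_name):
    """Build the 'filter by job name' URL pattern shown above."""
    return (
        f"{base.rstrip('/')}/rest/v1"
        f"/group/name/{quote(group, safe='')}"
        f"/project/name/{quote(project, safe='')}"
        f"/task/filter/by/job/name/{quote(job_name, safe='')}"
    )


def started_after(entries, cutoff):
    """Keep only entries that started at or after `cutoff`.

    ASSUMPTION: each entry has 'startTime' in epoch milliseconds.
    `cutoff` must be a timezone-aware datetime.
    """
    cutoff_ms = cutoff.timestamp() * 1000
    return [e for e in entries if e.get("startTime", 0) >= cutoff_ms]


if __name__ == "__main__":
    cutoff = datetime(2021, 8, 25, tzinfo=timezone.utc)
    cutoff_ms = int(cutoff.timestamp() * 1000)
    sample = [  # made-up entries, for illustration only
        {"jobName": "MyJob", "startTime": cutoff_ms - 1},
        {"jobName": "MyJob", "startTime": cutoff_ms + 1},
    ]
    print(len(started_after(sample, cutoff)))
```

This does not shrink the download itself; it only narrows what the rest of the code has to look at.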

Yeah, this looks like a serious shortcoming in external access to the API, and unfortunately it's exactly what I need to be able to do.

I just added an idea in the Ideas portal to this effect if you would like to upvote it: https://metlcommunity.matillion.com/s/idea/0874G000000kBXeQAM/detail

Done. I can execute the URL for the GET, but the code chokes on the massive response (nearly 40 MB), and I can't even sift through the data now to find the info for the job I want.
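One way to cope with a response this size is to stream the body to disk rather than building one large string, then reduce it to just the top-level status per job. A rough Python sketch follows (the approach carries over to C#). It assumes the response is a JSON array of entries with top-level `jobName`, `state`, and `startTime` fields; those field names are assumptions to verify against a real response. The helper names are hypothetical, and the `Authorization` header value is whatever your instance expects.

```python
import json
import shutil
import urllib.request


def download_to_file(url, path, auth_header):
    """Stream the response body to disk in chunks instead of one big string."""
    request = urllib.request.Request(url, headers={"Authorization": auth_header})
    with urllib.request.urlopen(request) as response, open(path, "wb") as out:
        shutil.copyfileobj(response, out)


def latest_state_by_job(entries, wanted_jobs):
    """Return {job: {'startTime', 'state'}} for the most recent run of each job.

    ASSUMPTION: entries carry top-level 'jobName', 'state' and 'startTime';
    any nested detail in an entry is simply ignored.
    """
    latest = {}
    for entry in entries:
        job = entry.get("jobName")
        if job not in wanted_jobs:
            continue
        start = entry.get("startTime", 0)
        if job not in latest or start > latest[job]["startTime"]:
            latest[job] = {"startTime": start, "state": entry.get("state")}
    return latest


def load_summary(path, wanted_jobs):
    """Parse the saved file and keep only the per-job summary."""
    with open(path, "rb") as f:
        entries = json.load(f)  # ASSUMPTION: top level is a JSON array
    return latest_state_by_job(entries, wanted_jobs)
```

Only the small summary dictionary needs to reach the dashboard; the saved file can be deleted after each poll.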