I've just started looking at the Matillion API and would like to be able to get a list of all jobs (orchestration and transformation) within a project (default version) with the path that they are saved under. I know this data is available in the /job/export endpoint on a job-by-job basis, but it looks like the list of all jobs endpoint (/group/name/<group_name>/project/name/<project_name>/version/name/default/job) is the only endpoint available and it just lists the jobs with no paths.
Is this possible via a different endpoint, perhaps?
Job paths aren't exposed natively through the API, but you can get the info; it's just a bit cumbersome. The only way I know to do this is to export the project/version itself. So, instead of going down to the job, back up to the project version. The endpoint would be something like /group/name/<group_name>/project/name/<project_name>/version/name/default/export. This will give you far more than you want, in JSON format. You would then have to take the results and use Python to walk through the levels to determine where jobs exist in the project. This can be quite cumbersome and complicated if you haven't spent much time in Python.
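To give a rough idea of what "walking through the levels" could look like, here is a minimal Python sketch. The layout of the real export isn't shown in this thread, so the sample document and the "name" key below are purely hypothetical; the walker itself makes no assumption about the structure and simply reports where a given key occurs in nested JSON.

```python
import json


def find_key_paths(node, wanted_key, trail=()):
    """Yield (trail, value) for every occurrence of wanted_key in nested JSON.

    trail is a tuple of dict keys and list indexes leading to the match.
    """
    if isinstance(node, dict):
        for key, value in node.items():
            if key == wanted_key:
                yield trail + (key,), value
            # Keep descending: matches may be nested at any depth.
            yield from find_key_paths(value, wanted_key, trail + (key,))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_key_paths(item, wanted_key, trail + (index,))


# Hypothetical stand-in for an export response; the real layout will differ.
sample = json.loads('{"a": [{"name": "x"}, {"b": {"name": "y"}}]}')

for trail, value in find_key_paths(sample, "name"):
    print(trail, value)
```

Running something like this against a real export should show which keys hold folder and job names, after which the walker can be narrowed down to just those.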
Perhaps a better approach might be to find out what you are ultimately trying to accomplish which may lead to alternatives. Just a thought... Hopefully this sheds some light.
I raised a support ticket about this - and it turns out that the Metadata (lineage) API is only available from version 1.54 of Matillion, while we are on 1.50.6. So that explains my 404s.
Thanks again for being so helpful on this thread - hopefully we will be able to upgrade and I'll be able to use the endpoint for what I need.
I had already thought of that as an option but, as you said, it's a bit cumbersome just to get a list of jobs with their paths!
What I'm trying to do is create an application where users can select a job from a drop-down (preferably with the paths included) and the app would then get the JSON for the job from the API. The application would then parse that JSON to identify table inputs and outputs, which can be used in database documentation (to show that the job affects those tables). I would prefer to avoid having to load the JSON for the entire project/version if I can!
Incidentally, I wouldn't have to use Python...other languages are available (I'm a .NET developer)
I feel ashamed that I didn't even think to bring up version. I knew that the Data Lineage feature and API addition was in newer versions.
Most users posting here are typically newer customers, so I tend to make some assumptions about versions. Sorry for not drawing that conclusion on my end.
On the flip side of things, going to a newer version of Matillion will give you some pretty nice features that were not available in the older versions. I would definitely consider it. Check out this page to see all the changes that have taken place since version 1.50.6: https://documentation.matillion.com/docs/2804617
Thanks for posting back! This helps me keep the version being used top of mind when helping others.
I did find another possible option. I will preface this by saying I haven't used it myself, but if you are running Enterprise, you get access to Data Lineage. This particular API does output information such as the path of the job, along with the SQL statements used within the job. Matillion uses this API in the Data Lineage feature within the product. The documentation for this is here: https://documentation.matillion.com/docs/9907241
I did run it through Postman for one of my projects and it seems to have the information you are looking for but it will need to be pulled apart to get the pieces you are after. Hopefully this helps!
I'm struggling to get the Data Lineage API call to work - the documentation says the endpoint is http(s)://<host>/rest/v1/group/name/<projectGroupName>/project/name/<projectName>/environment/name/<environmentName>/lineage?startTimestamp=<value> (since endTimestamp is optional) but I get a 404 when I call that (after substituting in my group name, project name and environment name).
What you have looks correct. A couple of key items that need to be correct in the API call are the host and the startTimestamp format.
If you are calling the API within a Matillion job then the host should be 127.0.0.1:8080. If you are calling it externally then the host should be the IP without the port number.
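For anyone assembling the URL by hand, a small Python sketch along these lines may help rule out typos. It follows the endpoint pattern quoted from the documentation earlier in the thread. The function name and its parameters are made up for illustration, and percent-encoding the names is just general URL practice, not something confirmed here as the cause of any 404.

```python
from urllib.parse import quote, urlencode


def lineage_url(host, group, project, environment,
                start_timestamp, end_timestamp=None, scheme="https"):
    """Build a lineage URL following the pattern quoted in this thread.

    host is e.g. "127.0.0.1:8080" inside a Matillion job, or the
    instance address (without a port) when calling externally.
    """
    # Percent-encode each name so spaces or slashes cannot break the path.
    path = "/rest/v1/group/name/{}/project/name/{}/environment/name/{}/lineage".format(
        quote(group, safe=""),
        quote(project, safe=""),
        quote(environment, safe=""),
    )
    params = {"startTimestamp": start_timestamp}
    if end_timestamp is not None:  # endTimestamp is optional
        params["endTimestamp"] = end_timestamp
    return f"{scheme}://{host}{path}?{urlencode(params)}"


# Placeholder names; substitute your own group, project and environment.
print(lineage_url("127.0.0.1:8080", "My Group", "Proj", "Dev",
                  1633406697415, scheme="http"))
```

Printing the final URL before sending it makes it easier to compare against one that is known to work.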
The startTimestamp has to be in epoch/Unix format (the example below is in milliseconds). You can use a tool like the one in the link to generate a startTimestamp for testing (https://www.epochconverter.com/). If you are doing this work in a Matillion job, you will need to use something like a Python or Bash script to give you the value to pass into the API call. This is an example of a valid epoch timestamp: 1633406697415, which is equivalent to Monday, October 4, 2021 9:04:57.415 PM GMT-07:00 DST.
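If it helps, here is one way to produce such a value in Python. This is only a sketch: the helper name is made up, and it assumes the API wants milliseconds, which is what the 13-digit example above suggests.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(moment):
    """Convert a timezone-aware datetime to whole milliseconds since the Unix epoch."""
    if moment.tzinfo is None:
        raise ValueError("pass a timezone-aware datetime")
    # Integer division of timedeltas avoids floating-point rounding.
    return (moment - EPOCH) // timedelta(milliseconds=1)


# The example from this post: Monday, October 4, 2021 9:04:57.415 PM GMT-07:00.
pdt = timezone(timedelta(hours=-7))
example = datetime(2021, 10, 4, 21, 4, 57, 415000, tzinfo=pdt)
print(to_epoch_millis(example))  # 1633406697415

# A start timestamp 24 hours before now, as a starting point for testing.
print(to_epoch_millis(datetime.now(timezone.utc) - timedelta(days=1)))
```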
Hopefully this helps a bit more. Post back if not. Thanks!
Thanks for the info. That all sounds solid to me. Something occurred to me while reading your message though. Is your Matillion instance running Enterprise? If not, the lineage API is not available.
I believe this matches up with what you are executing though. I did get 404s if any of the group, project, or environment names were incorrect at all. Although it returned a 404, it passed back a message indicating what the issue was.
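Some HTTP clients hide the response body when the status is an error, so the message that comes back with a 404 can be easy to miss. Below is a minimal Python sketch that keeps the body either way. The function name is made up, and sending the credentials as HTTP basic auth is an assumption on my part; adjust it to however your instance authenticates.

```python
import base64
import urllib.error
import urllib.request


def fetch_with_body(url, username=None, password=None, timeout=30):
    """Return (status_code, body_text), keeping the body even on HTTP errors."""
    request = urllib.request.Request(url)
    if username is not None:
        # Assumption: the API accepts HTTP basic authentication.
        raw = f"{username}:{password}".encode("utf-8")
        token = base64.b64encode(raw).decode("ascii")
        request.add_header("Authorization", f"Basic {token}")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # A 404 raises HTTPError, but the server's message is still readable.
        return err.code, err.read().decode("utf-8", errors="replace")
```

Calling it with the full lineage URL and printing both returned values shows the status and whatever explanation the server sent back, if any.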
Well, I'm thoroughly confused. I took your full URL from above and substituted in our domain, group, project and environment names (also updated http to https) and I'm still getting a 404 with no explanation. This is very frustrating...but thank you for all you've done to help.