Also, is there any way to switch a full load job to incremental and vice versa?
Hi @gpreetsingh, I'm so sorry your post was missed initially. If you haven't already found the answer, could I suggest taking a look at our Academy course here for Matillion ETL Best Practices, as I think it may be of use?
Many thanks,
Claire
Hi @gpreetsingh,
I can speak to what we have done and the patterns we follow. See my answers below:
1.) Job naming conventions. How should the job be named and prefix and postfix to be followed?
We try to name the initial orchestration of the project "__start". This saves people from having to figure out where the orchestration flow starts when several dependent orchestrations are involved. Once you know where the flow starts, you can follow it all the way through.
We also try to create clearly defined guardrails for specific use cases by creating folders. The folders could be based on a subject area, loading pattern, etc.
2.) Best practice to design the matillion ETL orch jobs?
The best practice course @ClaireSeniorCommunityManager provided is a good place to start. In my opinion it's lacking in places where your orchestration flows get more complex, and version management should be part of the best practices. For instance, in every Matillion project we have a version called CURRENT, which is considered our released version of the project. Any schedules that are created only ever use orchestrations that exist in the CURRENT version of the project. This prevents mishaps where someone accidentally deletes a version that a schedule was using without knowing it. Another best practice would be to create environment variables only where absolutely needed, which keeps the list of environment variables shorter and much more manageable. Environment variables are often overused because devs become lazy or don't understand the difference between job variables and environment variables.
3.) Best practice to handle the errors?
This is a slightly tougher question to answer as far as best practices go, because there are a few ways of handling errors effectively and the right one depends on the situation and use case. Generally speaking, we log every error to a table, which allows us to create dashboards around errors. In most of our orchestrations we catch errors individually on the components that are critical and require a greater level of granular detail. We also try to catch most other errors by using the error connector on each component and routing those to a single OR component, which then connects to a logging component that we built ourselves. Either way, catch as many errors as you can. It will pay off in the end.
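To make the "log every error to a table" idea a bit more concrete, here is a rough Python sketch of what such a log table and a dashboard-style query could look like. It is not the custom logging component described above: the table name, column names, and function names are all made up for illustration, and sqlite3 only stands in for whatever warehouse you actually load into. The job and component names are just example values.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical table name; the real log table layout is not described in this thread.
ERROR_TABLE = "etl_error_log"


def ensure_error_table(conn):
    """Create the error log table if it does not exist yet."""
    conn.execute(
        f"""CREATE TABLE IF NOT EXISTS {ERROR_TABLE} (
            logged_at TEXT NOT NULL,
            job_name TEXT NOT NULL,
            component_name TEXT NOT NULL,
            error_message TEXT NOT NULL
        )"""
    )


def log_error(conn, job_name, component_name, error_message):
    """Insert one error row, timestamped in UTC."""
    conn.execute(
        f"INSERT INTO {ERROR_TABLE} "
        "(logged_at, job_name, component_name, error_message) "
        "VALUES (?, ?, ?, ?)",
        (
            datetime.now(timezone.utc).isoformat(),
            job_name,
            component_name,
            error_message,
        ),
    )
    conn.commit()


def errors_per_job(conn):
    """Return (job_name, error_count) pairs, the kind of query a dashboard might run."""
    return conn.execute(
        f"SELECT job_name, COUNT(*) FROM {ERROR_TABLE} "
        "GROUP BY job_name ORDER BY job_name"
    ).fetchall()


# In-memory database used purely as a stand-in for the warehouse.
conn = sqlite3.connect(":memory:")
ensure_error_table(conn)
log_error(conn, "__start", "S3 Load", "Access denied")
log_error(conn, "__start", "Run Transformation", "Column not found")
log_error(conn, "load_orders", "Table Input", "Timeout")
print(errors_per_job(conn))
```

Funnelling every failure path into one small function like log_error is the code equivalent of joining the error connectors into a single OR component: the log rows all have the same shape, so the dashboard queries stay simple.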
I am not certain of the last question on full load job versus incremental. If I had more detail on this, I could provide more info. I hope this helps in some way.