Matillion official recommendation on upgrades

It seems a few of us were given different best practices and recommended approaches to upgrades. It would be great to get the official word from Matillion here in the discussion board so that everyone is on the same page about which approach to use.

A couple of the approaches that have been thrown out there between the discussions and ideas portals are:

  1. You can do an in-place upgrade, but only for major.minor.minor updates. For a major.minor release, you would need to build out a new instance and migrate over to it.
  2. You should always build out a new instance and migrate to it for all updates.
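To make approach 1 concrete, here is a minimal sketch of that rule as a version comparison. It assumes versions are three dot-separated integers (reading "major.minor.minor" as major.minor.patch), and the function names and version numbers are purely illustrative. It encodes the rule as described in this thread, not an official Matillion policy.

```python
def parse_version(version: str) -> tuple[int, int, int]:
    """Split a 'major.minor.patch' string into three integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def upgrade_path(current: str, target: str) -> str:
    """Apply approach 1: in-place only when major and minor are unchanged."""
    cur = parse_version(current)
    tgt = parse_version(target)
    if tgt <= cur:
        return "none"          # nothing newer to move to
    if cur[:2] == tgt[:2]:
        return "in-place"      # only the last component changed
    return "new-instance"      # major or minor changed: build new and migrate


# Hypothetical version numbers, for illustration only.
print(upgrade_path("1.47.3", "1.47.9"))
print(upgrade_path("1.47.3", "1.48.0"))
```

Under approach 2 the function would simply return "new-instance" for every newer target.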

At the time of this discussion post, these are some posts that reference this subject:

https://metlcommunity.matillion.com/s/idea/0874G000000kAteQAE/detail

https://metlcommunity.matillion.com/s/question/0D54G00007AyKIoSAN/configure-sslsocketfactory-class

https://metlcommunity.matillion.com/s/idea/0874G000000kAtfQAE/detail

Referencing Matillion resources to get attention to this topic

@MatillionProductTeam​ , @JoeCommunityManager

I agree with @Bryan. I am looking for Matillion to provide best practices on how to apply upgrades. My thought is to build a separate instance to test the upgrade before applying an in-place upgrade to the main instance. I am happy to jump on a call to discuss if need be.

Hi @Bryan​ and @SSabnss​ ,

Good to see your posts. When it comes to upgrading, we always recommend that you first back up your instance and then launch a new Matillion ETL instance with the latest release (see the link to our documentation site below).

https://documentation.matillion.com/docs/2975839#

However, I am very interested to hear your feedback and experiences with upgrading. Is there anything specific which our documentation does not cover?

Thanks in advance.

Regards,

Nick

I would appreciate some guidance as well since I spent the better part of the day today trying to recover an unavailable instance resulting from a suggestion previously available in that page to revert to a previous version.

Thanks @Bryan​, @matteo.fiorillo1586365319990​ , @SSabnss​ ,

We are discussing this internally as your feedback is very important to us. We do have a blog which details the option of an in-place upgrade, though I can see the confusion that this may bring:

https://www.matillion.com/resources/blog/best-practices-update-matillion/

I will update the thread with a full response. Please do let me know if there are any other areas of confusion regarding upgrading so that I can ensure I address these fully.

Regards,

Nick

Thanks Nick! I can't speak for others, but my only concern with the blog post for new customers is: how do they know that blog post even exists? When the customer goes to the update screen, nowhere on that screen does it lead them to think they should do anything but click the update button. Thanks again for your help getting us squared away on the confusion around this topic.

Hi @Bryan, @SSabnss & @matteo.fiorillo1586365319990

Thank you for raising such a good topic of discussion. Your views and opinions have been really helpful and created some healthy conversations within the @MatillionProductTeam​. However, we would like to call on you, and the rest of the community for some further assistance.

We have begun working on a "Best Practices" document and would love to have your feedback as to whether this would assist you with your upgrade paths or if there is further information you would require. We will then look to have an official article added to our documentation site.

Matillion Upgrade Options - Pros and Cons

Matillion provides software as a Machine Image which leads to the creation of one or more Virtual Machines running in AWS, Azure or GCP.

Upgrades are offered both as:

  • New RPM packages which can be applied as “in-place” upgrades
  • New machine images and migration support

Both of these methods have some advantages and disadvantages.

All software upgrades are inherently risky. Matillion integrates with over 100 external services which are changing all the time - improvements and fixes for the latest updates often risk causing issues with older versions of systems and/or services, therefore there is both a need to upgrade and a risk in doing so.

Your attitude to risk should steer the upgrade path and the effort put into it.

1. In-Place Upgrade

Method: YUM update over SSH or Update Software from the Matillion UI

Pro: Can be done via the UI by a non-specialist. The RPM upgrade is performed in the background and the service restarted.

Con: There is no automatic backup.

You should take backups (from within Matillion or another method) of the VM so you can recreate it if the upgrade causes issues. You should test restoring a working system from that backup too.

Pro: Does not require SSH access to Matillion Virtual Machine

Con: In-place upgrade cannot always be used. For example, moving from Amazon Linux to CentOS is an OS change, which can never be offered as an in-place upgrade.

2. Create-New and Migrate

Method: Launch new Virtual Machines and migrate the workload to them. The migration can use the Matillion UI (Server Migration Wizard, or Export/Import tools) or the API - although using the API is quite involved it can be customized and made repeatable.

Pro: There is no need to back up the old infrastructure. Once you verify your workloads work on the new software, you can enable the scheduling of the new infrastructure and remove/retire the existing infrastructure.

Con: System modifications are not preserved, e.g. JVM settings, changes to JDBC settings. The user configuration is also not migrated.

*Some of the items not migrated from the UI can be done from the API, but not all.

Pro: You can test out the new versions over a period of time while the old software is still running production jobs.

Con: Requires additional effort from users to maintain a pre-prod environment and perform testing (although this is good practice!)

3. Best Practices

  • Based on the information in the above sections, decide the appropriate update method and frequency for your situation
  • Test and verify your chosen update method before implementing on your live server(s)
  • Ensure that you have a robust test strategy so that after updating the Matillion software you'll quickly know if there are any problems

4. Best practices if you choose to use an In-Place upgrade

  • Make sure you have taken a backup prior to making any changes
  • Make sure you are able to quickly restore from the backup in the event of any problems

5. Best practices if you choose to use Create-New and Migrate

  • Ensure you have a working and tested migration strategy for features which don't get automatically migrated (including non-standard JDBC drivers, git branches, local user configuration, custom licenses, SSL certs)
  • Ensure that your new server is given the same cloud privileges as the original, including feature access rights and firewall rules
  • If you are relying on any OS customization (e.g. you have added libraries), ensure that you can replicate those customizations on the new server
  • If you rely on such customizations, consider creating your own AMI based on ours with your customizations “baked in”. Then, for each release, re-make your AMI from our latest one and test it. This could also be used to ensure drivers, user configuration, and so on are “baked in”.

I look forward to hearing your thoughts and opinions, as with your input we can ensure our users have a clear route to upgrading.

Thanks, Joe

@JoeCommunityManager​ , from my perspective, I think the document is much clearer. The only thing I would add to the cons for the In-place upgrade would be that if something goes wrong during the upgrade, the downtime could be extensive. Obviously, in the Create-New and Migrate situation, you are never down so there is zero risk there.

This got me thinking. Does Matillion have documentation out there for recovering from a backup should anything go wrong during the in place upgrade?

The last thing I would say is that this is great from a documentation and knowledge base standpoint, but is there anything that will be changed in the Matillion product that will lead the end user to make the right decision? If the end user doesn't know this documentation exists, the assumption will be that the only way to upgrade is the in-place method. I say that from personal experience. It wasn't until I started having issues with the in-place upgrade that I started looking for information on the upgrade process.

I would like to say that the documentation looks great and I am grateful to Matillion for making efforts to make this important process clearer. It's much appreciated!

Thanks for the reply @NickSayce-MatillionProductManager. It's much appreciated. Speaking from a Matillion for Snowflake perspective, the document you provided is the one I have always used, and its recommended approach is to back up and stand up a new instance. I am sure others are thinking the same thing I am: why is there an update button to do an in-place upgrade if that is not the recommended method for upgrading? When should we use or not use that update button?

Thanks again for your input. It's much appreciated!

Thanks @Bryan​,

This is great feedback and I agree with your point in regards to the in-place method for upgrading. After all, this does seem like the logical step if you are unaware of any alternatives and their accompanying documentation.

@SSabnss​ and @matteo.fiorillo1586365319990​, please feel free to add any additional points to this thread, as well as any other community members. I am going to take this feedback away and discuss internally as I think this is a prime opportunity to ensure our customers are fully educated prior to upgrading.

I will post an update on this thread. Once again, thank you for taking time out to respond Bryan. This is exactly the kind of feedback we want to receive from our customers.

Regards,

Nick

@JoeCommunityManager, @Bryan, @NickSayce-MatillionProductManager

From my perspective, I would add cloud/edition specific details regarding

  • Backup/recovery strategy of the VM
  • IAM related steps that the admin should take care of (e.g., service accounts) while updating
  • Git/versioning related details

In our case it was quite painful to downgrade and recover from an erroneous version (even with backups in place and an extensive versioning strategy), because the problem was detected after one week of development on the new version. It was not easy to downgrade to the previous version without losing all the intermediate development work.

Given that, I think an even more cautious strategy is necessary, e.g. setting up the new instance and testing it in a separate environment while continuing development on the old version.

I recently went through a new instance upgrade and would recommend the following:

  1. Include guidance on release notes whenever there is some underlying change like an OS change that would 'require' a new instance upgrade vs an in-place upgrade.
  2. List out where items that cannot be migrated via the GUI do have API options -- for example, internal user config can be migrated via the API but not the GUI
  3. Provide some Git workflow recommendations for how to maintain continuity across instances or improve the integration for instance migrations. This was one of our major pain points. In using Git and versions together, we ran into Git flagging 'false positive' changes due to changes in the underlying job file JSON creation/update dates due to how the migrate/fetch/clone operations work.
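On the 'false positive' changes in point 3, one possible workaround is to compare job files after removing the fields that change on every migrate/fetch/clone. The sketch below is only an illustration: the key names `created` and `updated` are hypothetical placeholders, not the actual Matillion job JSON schema, so you would need to substitute whatever fields show up in your own diffs.

```python
import json

# Hypothetical names for the timestamp fields; replace with the real ones
# seen in your own job file diffs.
VOLATILE_KEYS = frozenset({"created", "updated"})


def strip_volatile(node, volatile_keys=VOLATILE_KEYS):
    """Return a copy of parsed JSON with volatile keys removed at every level."""
    if isinstance(node, dict):
        return {
            key: strip_volatile(value, volatile_keys)
            for key, value in node.items()
            if key not in volatile_keys
        }
    if isinstance(node, list):
        return [strip_volatile(item, volatile_keys) for item in node]
    return node


def jobs_equivalent(old_json: str, new_json: str) -> bool:
    """True when two job exports differ only in the volatile fields."""
    return strip_volatile(json.loads(old_json)) == strip_volatile(json.loads(new_json))


# Made-up job content, for illustration only.
before = '{"name": "load_orders", "updated": 1600000000, "steps": [{"id": 1, "created": 1}]}'
after = '{"name": "load_orders", "updated": 1700000000, "steps": [{"id": 1, "created": 2}]}'
print(jobs_equivalent(before, after))
```

A check like this could run before committing after a migration, so that only jobs with real changes are staged.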

Hi @StanT​ ,

I totally agree with your points. I have submitted two tickets to the ideas portal about including internal user config (https://metlcommunity.matillion.com/s/idea/0874G000000kAtoQAE/detail) and git config (https://metlcommunity.matillion.com/s/idea/0874G000000kAqLQAU/detail) in the migration tool. Please upvote these tickets if you want to see this included in a future release.

Thanks

Michael

@JoeCommunityManager, @StanT, @Michael, @Bryan, @MatillionProductTeam

One additional thing, as JSON repository changes were already mentioned as a pain point for downgrading:

  1. Is there a general rule (e.g. tied to the major.minor.hotfix versioning scheme) for telling whether a JSON repository export from version x is backward compatible with version y (given x > y)?
  2. If 1. doesn't exist: is it possible to extend the release notes with an explicit notice about breaking JSON changes or even backward compatibility?

This might help us to judge risk and plan version upgrades better when it comes to trade-off effort vs. benefits of an upgrade.

This topic is interesting!

We have only once followed the officially suggested way of upgrading Matillion, and that was because of the AWS image OS change.

However, our process is now only to use the magic update button and hope for the best.

Our reasons for not migrating to a new instance are the following:

  1. The high rate of updates and the annoying sticky notification in Matillion. We want to keep our instance up to date, and developers get an outdated feeling when this notification stays for a longer period. (And now with the latest, it seems to be stuck?)
  2. Some of the problems already mentioned above, but
    1. user migration (we are now just copy-pasting users at the terminal level)
    2. the Matillion execution run_id goes back to the 'beginning', or zero. We track our executions using the run_id supplied by Matillion, and if this gets reset every time it gets a bit confusing. However, we have now combined the date with this key to solve this :P
  3. The process itself is quite technical, so it's a bit of a strain to do this every week with the 'long process'. We are running Matillion in two instances (Production and Development), so if we upgrade Dev, we need to upgrade Prod before we deploy anything there. The deployment pipeline with these fast-paced tools moves quickly, so we need to keep the versions the same.
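The run_id workaround in point 2 (combining the date with the ID) could look something like the sketch below. The function name and key format are illustrative assumptions, not something Matillion provides.

```python
from datetime import date


def execution_key(run_date: date, run_id: int) -> str:
    """Build a tracking key that stays distinct after run_id restarts from zero."""
    return f"{run_date:%Y%m%d}-{run_id}"


# The same run_id before and after a migration yields different keys,
# as long as the two runs fall on different dates.
print(execution_key(date(2021, 3, 4), 7))
print(execution_key(date(2021, 6, 1), 7))
```

Note that this only avoids collisions if the same run_id is not reused on the same day, for example when the old and new instances both run jobs on the migration date.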

 

However, we are looking for a good process to get the instance migrated, and will try to script our way through the rest of the obstacles to make this an efficient process. Anything Matillion can do to help with this would be awesome.

Thanks.

Br,

Michael

Updates have been a struggle for me. We run regular Nessus scans and we have a few vulnerabilities that running only security updates does not address. Running a full migration to a new instance isn't a sustainable path. Is it safe to run the updates from the updates repo only? We can lock the packages from the Matillion repo so those don't get updated.

Hello @Bodo, @StanT, @Michael

Thank you so much for your valuable feedback. With your help we can ensure our users have the best experience when using Matillion.

The points you have raised have been taken on board by the @MatillionProductTeam​ who are working on creating an 'official' upgrade article which you have helped us design. I will post a link to this on here once it is live, as your feedback is very important to us.

In relation to the in-product upgrade, we have noted your points regarding the confusion. In the short term we will be updating our messaging and linking users to the aforementioned upgrade guide, prior to selecting the in-product upgrade. We are also planning further enhancements to the upgrade path, though we are just finalising our ideas.

I will be sure to keep you posted in relation to all of the above. Again, thank you for your continued support, and as always please do let me know if you have any further questions or ideas.

Kind Regards, Joe

Hi all, this is a great discussion. We have a backlog item to review how we can improve the migration tool and as part of this we will run a focus group. I'll get this set up sooner rather than later and I would love it if you would like to contribute. Thanks, Laura, Matillion Product Manager

Hello @Bryan, @SSabnss, @matteo.fiorillo1586365319990, @Bodo, @StanT & @MichaelBoucht

We wanted to loop back and keep you all updated on this and share with you the document we have created in relation to our upgrade process.

https://documentation.matillion.com/docs/6688312

The document is a work in progress and will continue to be updated as we receive further customer feedback. This will ensure we have one central location for this information to be stored. We have also listened to your thoughts regarding the 'In-Product' upgrade process and the confusion it could potentially cause. Thanks to this we will be improving our in-product messaging and again linking it out to the aforementioned document.

As always, all users are welcomed and encouraged to share feedback with us.

Kind regards, Matillion Product Team

Hi @StanT​ ,

I was going through this page and found your comment about Git and versions.

We are doing the same thing (we use both versions and Git).

Could you please let us know the approach you follow to migrate these items during a version upgrade?

Thanks,

Shivani