Validation takes over 1 h

Anyone else have problems with validation taking a loooooong time? Is there a limit on components in a job for example? Got approximatly 40+ components in a job that never validates through the last step. Every step before goes through without problems and between 1s to 2m

Hi @rka1621004583744​ ,

I'm assuming you have lots of components on a page with lots of table inputs?

If so, I believe the validation is from the database having to perform metadata queries (to obtain field names and data types) for each database connected component on the job.

A workaround, is to (for the lack of a better word) "precompile" a view using the final component prior to a write operation in your job. From this component, get the "expanded" SQL source and create a database view out of it.

Create a new job to replace the old one with two components, a Table Input calling the single "precompiled" view and the write component. This will mean only the single view will be queried for its metadata instead of each component in the job.

Hope this is making sense!

Cheers,

Mick

Yes I do. This is an ETL tool after all, and the point is?

Seems to me, after putting a write view as a intermediate destinasjon and the resulting view script, that Matillion takes every component and add it to one horrible sql for the end target to get the data from.

 

It does not help either by the way. Removed the offending component that led to endless validation and put it into a table. Still running after 1,5 h. Its a small amount of test data, but that should not be a problem since we are talking about meta data. A support answer was that the data amount was too small

The component that the validation hangs on may not be the actual culprit. For example, we've seen long validations happening on a rewrite component but the root cause was because of a prior window function generating component.

 

After partial validation, take a look at the SQL tab in a component near the end of where your job does finish validation and click 'expand references', you'll probably see a huge SQL and something there may give you an idea of what's a problem.

 

Something to try for debugging and ultimately might improve the performance too is to test breaking up the job into parts at different points with each intermediate part outputting an intermediate table (not views).

I'll try it, but I wish management weren't having a hard on for cloud solutions. In SSIS ordinary lookups would not be a problem

Found the culprit. Takes under 2 min now. Obviously need to rethink some processess when using Matillion.

 

The point is, some use different design patterns with their jobs. Some may use multiple jobs (to update an intermediate table) and may never experience delays, instead of a single job, etc.

 

I've found depending on the infrastructure, validation times can vary (eg. on our staging environment, with less resources, it takes twice as long compared to our prod). Having many components multiplies this time out, although admittingly, I've never had validation go for 2 hours.

 

Two things I would start looking at:

 

  1. Reducing the number of components reduces the time it takes to validate. Precompiling a view does just that. Just to be clear, I'm not saying use a "write view" every time the job runs but just once to see what difference it might make.
  2. This may apply if you have shared "Transformation Jobs", I believe chatting with a Matillion guy, these are actually executed from a queue instead of parallel threads. So, something to look at if it applies.