Unable to load data from Oracle to Redshift using DB Query Component

I am trying to load an Oracle table into Redshift using the Database Query component. The table has nearly 130 million rows, and I am able to identify the table in the Oracle DB and even sample its data. However, when I run the job that performs the lift, it fails after nearly 4 hours and 39 minutes, having lifted almost all of the data from the Oracle DB, with a "This connection has been closed." error.

This is not the first time this run has been performed, nor the first time I have run this kind of lift job. Other, comparatively smaller tables (<100 million rows) have been loading fine; the failure only happens with this particular table.

I have seen this behaviour in some projects in the past. It is not necessarily an issue with Matillion; other tools behave the same way. Usually, the problem is on the source database side. In this case, your Oracle DB might be closing the connection once a session has been open for too long and a maximum session time is reached.
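
If you (or your DBA) can query the data dictionary, you can check whether such a limit is actually set on the profile of the user Matillion connects as. A minimal sketch, assuming the python-oracledb driver and read access to DBA_PROFILES/DBA_USERS; the connection details and MATILLION_USER are placeholders for your environment:

```python
# Minimal sketch: check whether the Oracle profile of the loading user has a
# CONNECT_TIME or IDLE_TIME limit that would terminate long-running sessions.
# Assumes python-oracledb; user/password/dsn and MATILLION_USER are placeholders.
import oracledb

conn = oracledb.connect(user="dba_user", password="secret", dsn="oracle-host:1521/ORCL")
with conn.cursor() as cur:
    cur.execute(
        """
        SELECT p.profile, p.resource_name, p.limit
        FROM dba_profiles p
        JOIN dba_users u ON u.profile = p.profile
        WHERE u.username = :uname
          AND p.resource_name IN ('CONNECT_TIME', 'IDLE_TIME')
        """,
        uname="MATILLION_USER",
    )
    for profile, resource_name, limit in cur:
        # LIMIT is in minutes, or UNLIMITED / DEFAULT
        print(profile, resource_name, limit)
conn.close()
```

If CONNECT_TIME is set, a session running past that many minutes is terminated by the database, which could surface on the Matillion side exactly as "This connection has been closed."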

You could try partitioning the data in this huge table and loading the smaller partitions with Matillion one after another. Depending on your data, you could partition the table by year, product category, country, etc., and use an Iterator component in Matillion to loop over these partitions, as in the sketch below.
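
In Matillion itself this would be an Iterator component setting a job variable that is substituted into the Database Query component's SQL, but the idea is easy to show in plain Python. A rough sketch, again assuming python-oracledb; BIG_TABLE, CREATED_AT, the year range, and the connection details are all hypothetical:

```python
# Rough sketch of the partition-and-iterate idea. BIG_TABLE, CREATED_AT, the
# year range, and the connection details are placeholders; in Matillion the
# outer loop would be an Iterator component feeding the Database Query SQL.
import oracledb

for year in range(2015, 2025):
    # A fresh connection per partition keeps each session well under any
    # CONNECT_TIME limit on the Oracle side.
    with oracledb.connect(
        user="src_user", password="secret", dsn="oracle-host:1521/ORCL"
    ) as conn, conn.cursor() as cur:
        cur.arraysize = 10_000  # fetch in large batches for throughput
        cur.execute(
            "SELECT * FROM big_table WHERE EXTRACT(YEAR FROM created_at) = :y",
            y=year,
        )
        total = 0
        while True:
            rows = cur.fetchmany()  # fetches `arraysize` rows per round trip
            if not rows:
                break
            total += len(rows)
            # ... stage this batch (e.g. to S3 for a Redshift COPY) ...
        print(f"year {year}: {total} rows")
```

Each partition then runs as its own, much shorter session, so no single connection has to stay open for four-plus hours.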

There are multiple factors involved when fetching data from on-premise databases: primarily the source database's configuration for data transfer, as Michael highlighted above, and the network bandwidth. Check the throughput of your VPN and make sure it allows for higher volumes.