I am trying to load data from Salesforce using the Salesforce Query component. The result set differs hugely when I set "UseBulkApi" in the connection options compared to running without "UseBulkApi". I am using Matillion version 1.42 and loading the data into Snowflake. Can someone please help me here?
Hi @RK160269,
A couple of things come to mind as I think about your question. Since you are on an old version, you may be hitting a bug that has been fixed in a later version. In the release notes I see at least 3 upgrades to the Salesforce driver that Matillion uses since version 1.42. I wonder if it would help to upgrade, or to stand up a separate instance next to your current one to test this specific scenario? I don't know if this will help determine what is going on, but you might check the Catalina server logs during and/or after the Salesforce component run to see if there are any indicators of what happens when UseBulkApi is set to true versus false. I do know that setting UseBulkApi to true uses a different set of Salesforce APIs than setting it to false.
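Alongside the logs, it may help to pin down exactly which records differ between the two runs. Here is a minimal Python sketch, assuming you have exported the record Id column from each target table in Snowflake to a CSV file. The file names, the `Id` column name, and the helper names are placeholders of mine, not anything Matillion produces for you.

```python
import csv


def read_ids(path, id_column="Id"):
    """Read one column of record Ids from a CSV export (path is a placeholder)."""
    with open(path, newline="", encoding="utf-8") as f:
        return [row[id_column] for row in csv.DictReader(f)]


def compare_ids(bulk_ids, rest_ids):
    """Summarise how two lists of record Ids differ."""
    bulk_set, rest_set = set(bulk_ids), set(rest_ids)
    return {
        # Records loaded only when UseBulkApi was true
        "only_in_bulk": sorted(bulk_set - rest_set),
        # Records loaded only when UseBulkApi was false
        "only_in_rest": sorted(rest_set - bulk_set),
        # Rows beyond the first occurrence of each Id
        "duplicates_in_bulk": len(bulk_ids) - len(bulk_set),
        "duplicates_in_rest": len(rest_ids) - len(rest_set),
    }
```

You would call it as `compare_ids(read_ids("bulk_run.csv"), read_ids("rest_run.csv"))`. Whether the gap is missing rows, extra rows, or duplicates should give Matillion support something concrete to work with.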
The other thing to watch is the Salesforce side. If the number of records is high, the load may require more API calls, and every Salesforce account is limited in the number of calls it can make in a 24-hour period. The catch with using the Bulk API is whether Matillion 1.42 uses Salesforce's Bulk API v1 or v2. For instance, Bulk API v1 has a hard stop after 2 minutes of processing your query, meaning you can't supply a query that returns a lot of data very slowly or it will time out. As far as I know, this isn't the case with Bulk API v2.
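If you want to see how close the org is to its daily call limit, the Salesforce REST API has a `limits` resource that reports `Max` and `Remaining` for `DailyApiRequests`. A rough Python sketch, assuming you already have an instance URL and an OAuth access token for the org. The API version string and the function names below are my assumptions, so adjust them to whatever your org supports.

```python
import json
import urllib.request


def fetch_limits(instance_url, access_token, api_version="v52.0"):
    """Call the REST 'limits' resource. api_version is an assumed placeholder."""
    url = f"{instance_url}/services/data/{api_version}/limits/"
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {access_token}"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def remaining_fraction(limits, name="DailyApiRequests"):
    """Return the share of the named limit that is still available (0.0 to 1.0)."""
    entry = limits[name]
    if entry["Max"] == 0:
        return 0.0
    return entry["Remaining"] / entry["Max"]
```

Checking `remaining_fraction(fetch_limits(...))` before and after each run would show roughly how many calls the run consumed with UseBulkApi on versus off.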
There are a lot of variables at play here, but I would start by running a newer instance of Matillion and seeing if that fixes the issue. If not, I would open a case with Matillion and see if they can hunt it down. I hope this info helps in some way.