I'm trying to load data from MongoDB to Redshift. As I know, it will try to guess the schema by reading the first 100 documents returned from Mongo.
However, there're new columns added to the Mongo collection, which means that MongoDB Query can't find the new columns and it says: Invalid column name for the table
There's a am article about fixed schema for Mongo collection, but it hasn't been updated for the new Matillion version
The RowScanDepth option that @ganesh_pitchai mentioned is good, but it doesn't work beyond an undocumented limit (which seems to be fairly low).
But I can confirm that the Fixed Schema Definitions workaround definitely does work in the current latest version of Matillion ETL, v1.54. Although agreed with @anhkhoadx's comment that the instructions are now out of date.
Now you have to go to Project / Manage API Profiles / Manage Query Profiles
.. and to get to the table definition you have to toggle the Advanced Mode button.
Here's an updated screenshot. I copied line 7 and pasted a new line 8 containing a new column that it did not autodetect. The "rarecolumn" does not occur in the first 1 million documents of the collection.
Note you also have to set the other:bsonpath = "$.rarecolumn" at the end of that line.
After running a MongoDB component once where you have set Location and GenerateSchemaFiles, you must create a second MongoDB component containing only the same Location parameter.
After that the component should show your manually added column even in Basic mode, for example like this:
You have to restart Matillion after any change to the RSD files
The QueryPassthrough option looks promising, but itself suffers from the same problem: it doesn't list columns that it doesn't find in the first handful of documents.
All new features and improvements like this come via the Ideas Portal. Currently there's nothing there about making the MongoDB interface simpler, so I would ask please add a new Idea there. To our other community members... please upvote if it's relevant to your use case.