I am getting a .xlsx file from SharePoint using the Data Transfer job. I then use the Excel Query component to load this data into a Snowflake table. However, on the Metadata Explorer tab, Matillion detected one of the columns as Float, size 15, but this column is supposed to contain Item Codes, which can also be varchars. As a result, I cannot load this file and get the following error:
Error obtaining value '741.46547OH' for column 'Item Number'. Failed to convert to datatype: 'double'. The original datatype of the column is: 'double'. Original Exception: Error parsing double value [741.46547OH].
When loading from Excel, data types are by default guessed from the cell formatting, not the cell contents. This behaviour is controlled by the connection option "type detection scheme", which can be set to ColumnFormat (the default, which examines cell formatting), RowScan (which scans the first 15 rows of data and guesses each column's type from the values found), or None (treat everything as text). A second connection option, "row scan depth", controls how many rows are scanned when determining column types.
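For intuition, here is a minimal Python sketch of how a RowScan-style guess behaves. This is illustrative only, not Matillion's actual driver logic: if every scanned value in "Item Number" parses as a number, the column is typed as double, and a later value like "741.46547OH" then breaks the load.

```python
# Illustrative sketch of RowScan-style type detection. This is NOT
# Matillion's driver code; it only shows why a shallow scan depth
# can mis-type a column.

def infer_column_type(values, scan_depth=15):
    """Guess a column type from the first `scan_depth` values."""
    for value in values[:scan_depth]:
        try:
            float(value)          # value looks numeric so far
        except ValueError:
            return "VARCHAR"      # any non-numeric value forces text
    return "DOUBLE"               # every scanned value parsed as a number

# A varchar code hiding below the default scan depth:
item_numbers = ["741.46547"] * 20 + ["741.46547OH"]

print(infer_column_type(item_numbers))                 # DOUBLE -> load fails later
print(infer_column_type(item_numbers, scan_depth=21))  # VARCHAR -> safe
```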
You can set these in the Connection Options of the Excel Query component: add "type detection scheme" with the value "RowScan", and set "row scan depth" to a value at least as large as the number of rows in the file. This should let the driver determine the data types more accurately.
You could also add a special first row to your Excel file. This effectively defines the data types manually and ensures that RowScan always "guesses" the correct types: just enter "AAA" in every VARCHAR column, "123" in every INTEGER column, and so on. You can filter this row out again after the load, as sketched below.
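Here is a hedged sketch of that sentinel-row trick, reusing the hypothetical infer_column_type helper from the sketch above. In Matillion you would drop the sentinel row with a Filter component in a transformation job rather than in Python; the list comprehension at the end only stands in for that step.

```python
# Sketch of the sentinel-row trick: a non-numeric value in row 1
# means even a shallow scan sees text and types the column VARCHAR.
# infer_column_type is the same illustrative helper as above, not a
# real Matillion API.

def infer_column_type(values, scan_depth=15):
    for value in values[:scan_depth]:
        try:
            float(value)
        except ValueError:
            return "VARCHAR"
    return "DOUBLE"

item_numbers = ["AAA"] + ["741.46547"] * 20 + ["741.46547OH"]

print(infer_column_type(item_numbers))  # VARCHAR, even at the default depth

# After loading, filter the sentinel row back out (in Matillion this
# would be a Filter component; here plain Python stands in for it):
loaded_rows = [v for v in item_numbers if v != "AAA"]
```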