Hi @chris.odell1617208923357 ,
Hi Chris, there are a couple approaches to take on this but the most out of the box approach would be to leverage the power of Snowflake as much as possible. Implement the pattern you spoke about in this statement , "If I use S3 Load without any modifications to the file, I end up with the entire XML file loaded into a variant column". Once the data is in snowflake you have several options for handling the flattening which will allow you to materialize the data in to tables or simply expose the data via a view. This is what the SQL statement would look like to query that data once it's loaded into a variant column as a single row.
select
get(xmlget(r.$1, 'title'), '$')::string as FEED_TITLE,
get(xmlget(r.$1, 'link'), '$')::string as FEED_LINK,
get(xmlget(r.$1, 'description'), '$')::string as FEED_DESCRIPTION,
get(xmlget(parse_xml(f.THIS), 'g:id'), '$')::string as ITEM_ID,
get(xmlget(parse_xml(f.THIS), 'g:gtin'), '$')::string as ITEM_GTIN,
get(xmlget(parse_xml(f.THIS), 'title'), '$')::string as ITEM_TITLE,
get(xmlget(parse_xml(f.THIS), 'description'), '$')::string as ITEM_DESCRIPTION,
get(xmlget(parse_xml(f.THIS), 'g:brand'), '$')::string as ITEM_BRAND,
get(xmlget(parse_xml(f.THIS), 'g:google_product_category'), '$')::string as ITEM_GOOGLE_PRODUCT_CATEGORY,
get(xmlget(parse_xml(f.THIS), 'g:product_type'), '$')::string as ITEM_PRODUCT_TYPE,
get(xmlget(parse_xml(f.THIS), 'g:item_group_id'), '$')::string as ITEM_GROUP_ID,
get(xmlget(parse_xml(f.THIS), 'g:color'), '$')::string as ITEM_COLOR,
get(xmlget(parse_xml(f.THIS), 'g:size'), '$')::string as ITEM_SIZE,
get(xmlget(parse_xml(f.THIS), 'g:image_link'), '$')::string as ITEM_IMAGE_LINK,
get(xmlget(parse_xml(f.THIS), 'g:additional_image_link'), '$')::string as ITEM_ADDITIONAL_IMAGE_LINK,
get(xmlget(parse_xml(f.THIS), 'g:price'), '$')::string as ITEM_PRICE,
get(xmlget(parse_xml(f.THIS), 'g:availability'), '$')::string as ITEM_AVAILABILITY,
get(xmlget(parse_xml(f.THIS), 'g:condition'), '$')::string as ITEM_CONDITION,
get(xmlget(parse_xml(f.THIS), 'g:adwords_labels'), '$')::string as ITEM_ADWORDS_LABELS,
get(xmlget(parse_xml(f.THIS), 'g:shipping_label'), '$')::string as ITEM_SHIPPING_LABEL
from YOUR_SCHEMA.YOUR_TABLE r,
table(flatten(input => r.$1, recursive => true)) f
where f.KEY = '@' and f.VALUE = 'item';
I hope this helps.