Issue with flattening and extracting JSON data using flatten variant

I am having issues with trying to extract values from a JSON Array and utilizing the flatten variant component. Below is the JSON format:

[

  {

    "key": "ga_session_id",

    "value": {

      "double_value": null,

      "float_value": null,

      "int_value": 1698415322,

      "string_value": null

    }

  },

  {

    "key": "firebase_event_origin",

    "value": {

      "double_value": null,

      "float_value": null,

      "int_value": null,

      "string_value": "auto"

    }

  },

  {

    "key": "engagement_time_msec",

    "value": {

      "double_value": null,

      "float_value": null,

      "int_value": 1318,

      "string_value": null

    }

  },

  {

    "key": "ga_session_number",

    "value": {

      "double_value": null,

      "float_value": null,

      "int_value": 1,

      "string_value": null

    }

  }

]

My desired output would be a new column for each object in the array above:

ga_session_id | firebase_event_origin | engagement_time_msec | ga_session_number

---------------------------------------------------------------------------------------------------------------------------------------------

1698415322 | auto | 1318 | 1

Here is how the "flatten variant" component is setup:

I am able to get the four rows flattened but any time i try to extract the “key” value or value within, it get NULLs.

I’ve also tried the “extract nested data” component which identifies the “event_params” as a json structure but nothing occurs when I select autofill.

Any help/tips would be appreciated!

Hello,

if using Flatten Variant, you can get your JSON pulled into separate columns first,

soemthing like this:

 

but this will have 4 different rows from your data, use case-when or coalesce, or similiar, to identify the correct value per name.

which you would then need to transpose to your desired output format.

 

Or if the dataset is ‘static’,

you can always dig it out, using Snowflake SQL on a Calculator component, like this:

If it’s always the same order of 4 keys,

“event_params”[0]:value:int_value as ga_session_id

 

or if it’s not guaranteed the same order, but always 4,

 

CASE

WHEN “event_params”[0]:key::VARCHAR = ‘ga_session_id’ THEN “event_params”[0]:value:int_value

WHEN “event_params”[1]:key::VARCHAR = ‘ga_session_id’ THEN “event_params”[1]:value:int_value

WHEN “event_params”[2]:key::VARCHAR = ‘ga_session_id’ THEN “event_params”[2]:value:int_value

WHEN “event_params”[3]:key::VARCHAR = ‘ga_session_id’ THEN “event_params”[3]:value:int_value

  ELSE NULL

END as ga_session_id

 

I would never do unless 100% sure it’s always like that. to keep generic and easily add more, I would use flatten variant as in example one, and then just transpose, then you can semi-automate the column genereration, or fully automate it using grid variables, just keep all the values as varchar until they are actually used somewhere.

 

Hope this makes sense and helps you forward :slight_smile:

 

-Michael

Hey Michael,

 

Thanks for the response and explanation! Unfortunately the JSON is dynamic and there is no set structure i can use to call out the indexes of the JSON which is why I was hoping to utilize Matillion for this since the code to write it in snowflake would require multiple stored procs (willing to do if unable to figure out).

 

I am still struggling to get the key value and their corresponding values from your screenshot you provided. I always receive back NULL values and just one row (not the four you mentioned). Not sure what I am doing wrong in Matillion but I have ensured that the JSON array columns are converted to VARIANT type before pulling in the flatten component variable.

 

Thank you~!