Grid Variable Data into Snowflake Table

Hi,

 

I have been trying to load data from grid variable into Snowflake (Not Create/Replace but append rows to pre-existing snowflake table).

 

Below is the birds-eye view of what I am trying to achieve.

 

  • Load unstructured data from s3 bucket into Grid Variable using Python Component.

Why: I had .log file and couldnt find any other way than using Python Script to split data in columns, and then storing in grid variable

Status: Achieved

 

Code:

import boto3

import re

 

s3 = boto3.client('s3')

obj = s3.get_object(Bucket='mat-stage-dev', Key='data_science/Tabpy/Tabpy_logs/tabpy_log.log') 

 

result = []

i = 1

 

for line1 in obj['Body'].readlines():

  

  line = line1.decode('utf-8')

   

  matching_level = re.search('\[(.*?)\]',line)

  matching_file = re.search('\((.*)\.py\:(.*)\:(\d{3}|\d{2}|\d{4})\):',line)

  matching_message = re.split(r'\((.*)\.py\:(.*)\:(\d{3}|\d{2}|\d{4})\):', line)[-1]

  

  if None in (matching_level,matching_file,matching_message):

     pass   

  else:

     

     result.append([["date",line[0:10]],[ "time",line[11:19]],["levelname",matching_level.group()],["file",matching_file.group()],["message",matching_message]])

 

print(result)     

#Grid Variables are accessible via the context:

 

# Updating a grid variable:

context.updateGridVariable('log_table', result)

 

 

 

  • From Grid Variable load/append in Pre-exisitng Snowflake table

Status: Unable to find the right component to do that. I tried table output in transformation job but I dont know how to map it/ make it work.

 

 

 

 

 

Your approach looks quite complicated to me. And I think this "ETL" approach for processing data on the Matillion server is not the easiest with Matillion and Snowflake. You might also quickly run into performance issues if your log files are larger.

 

I'd go for a true ELT solution: Use Matillion to load the log file directly to Snowflake. Hint: The file can be loaded as CSV file with just one column. Splitting the data to columns and further String transformation can then happen in pure Snowflake SQL and there is no need to transfer the data back and forth.