Hi Everyone, I am using the RDS Bulk Output component to load data from a Redshift table to a Postgres table. I want to do an insert/update. I have specified the primary key column and set the update strategy to "Replace", but I am getting "ERROR: duplicate key value violates unique constraint".

The source data has the same key values as the target, but with changes in the other columns, so I want to update those columns in the target. Instead of updating the target table, the component throws this error. Can anyone please help?
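For reference, this is the insert/update behaviour I am expecting on the Postgres side, written as plain SQL (target_table, id, col_a, and col_b are made-up names standing in for my real table):

```sql
-- Insert new rows; where the key already exists, replace the other columns.
-- target_table, id, col_a, col_b are placeholder names for illustration.
INSERT INTO target_table (id, col_a, col_b)
VALUES (1, 'new_a', 'new_b')
ON CONFLICT (id)                      -- id is the primary key column
DO UPDATE SET col_a = EXCLUDED.col_a,
              col_b = EXCLUDED.col_b;
```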

Hi.

Have you tried checking the data in the Redshift table to ensure there are no duplicates? I had the same error and found this to be the case.
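If it helps, a quick way to check is to group on the key column. Something like this (source_table and id are placeholders for your own table and key column):

```sql
-- List key values that occur more than once in the Redshift source.
SELECT id, COUNT(*) AS row_count
FROM source_table
GROUP BY id
HAVING COUNT(*) > 1
ORDER BY row_count DESC;
```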


Hi there, I'm running into the same issue. Was a resolution ever identified?


We have no duplicate records in the source table in Redshift. However, the index in the Postgres database is throwing this error. We've tried every possible combination of properties on the RDS Bulk Output component to no avail. Plus, if we set the target table to "truncate", one would presume that the truncation occurs before any records are written, and a table truncation should also clear the index beforehand. We don't understand why this error is being raised. Any clues?

Hi @manuelM

Redshift does not enforce uniqueness (primary and foreign keys are informational only), so I'm going to go with Occam's Razor: there are duplicates in the data, since the unique index sits on the target. Is there any way you can screen the data before loading, checking it against the columns the index covers?
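As a sketch, something along these lines would screen the source down to one row per key before the load (again, source_table and id are placeholders; partition by whatever columns your Postgres index covers):

```sql
-- Keep exactly one row per key value. ROW_NUMBER() picks one arbitrarily
-- unless the ORDER BY is changed to prefer, say, the most recent record.
SELECT *
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY id) AS rn
    FROM source_table t
) dedup
WHERE rn = 1;
```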

Truncating the table would not matter, and in fact it supports the assertion that there are duplicates with respect to the index: an empty target table still enforces the uniqueness rule on incoming rows, so the error points at there being duplicates within the source batch itself.
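You can see this directly in Postgres; even a freshly truncated table rejects a batch that repeats a key (demo is a throwaway table name for illustration):

```sql
-- A unique index fires even when the table starts out empty.
CREATE TABLE demo (id INT PRIMARY KEY, val TEXT);
TRUNCATE demo;  -- the table and its index are now empty

-- Fails with: ERROR: duplicate key value violates unique constraint "demo_pkey"
-- because the batch itself repeats id = 1.
INSERT INTO demo (id, val) VALUES (1, 'a'), (1, 'b');
```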

Hope this helps,

Chika