Hi Everyone, I am using the RDS Bulk Output component to load data from a Redshift table to a Postgres table. I want to do an insert/update. I have specified the primary key column and set the update strategy to "Replace", but I am getting "ERROR: duplicate key value violates unique constraint".

The source data has the same key values as the target, but with changes in the other columns, so I want to update those columns in the target. Instead of updating the target table, the component throws this error. Can anyone please help?
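For reference, this is the insert/update behaviour I am expecting on the Postgres side, written as plain SQL (target_table, id, col_a, and col_b are made-up names standing in for my real table):

```sql
-- Insert new rows; where the key already exists, replace the other columns.
-- target_table, id, col_a, col_b are placeholder names for illustration.
INSERT INTO target_table (id, col_a, col_b)
VALUES (1, 'new_a', 'new_b')
ON CONFLICT (id)                      -- id is the primary key column
DO UPDATE SET col_a = EXCLUDED.col_a,
              col_b = EXCLUDED.col_b;
```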

Hi.

Have you tried checking the data in the Redshift table to ensure there are no duplicates? I had the same error and found this to be the case.
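If it helps, a quick way to check is to group on the key column. Something like this (source_table and id are placeholders for your own table and key column):

```sql
-- List key values that occur more than once in the Redshift source.
SELECT id, COUNT(*) AS row_count
FROM source_table
GROUP BY id
HAVING COUNT(*) > 1
ORDER BY row_count DESC;
```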


Hi there, I'm running into the same issue. Was a resolution ever identified?


We have no duplicate records in the source table in Redshift. However, the index in the Postgres database is throwing this error. We've tried every possible combination of properties on the RDS Bulk Output component to no avail. Plus, if we set the target table to "truncate", one would presume that the truncation occurs before any records are written, and a table truncation should also clear the index beforehand. We don't understand why this error is being raised. Any clues?

Hi @manuelM

Redshift does not enforce uniqueness (primary and foreign keys are informational only), so I'm going to go with Occam's Razor: there are duplicates in the data, since the unique index sits on the target. Is there any way you can screen the data before loading, checking it against the columns the index covers?
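As a sketch, something along these lines would screen the source down to one row per key before the load (again, source_table and id are placeholders; partition by whatever columns your Postgres index covers):

```sql
-- Keep exactly one row per key value. ROW_NUMBER() picks one arbitrarily
-- unless the ORDER BY is changed to prefer, say, the most recent record.
SELECT *
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY id) AS rn
    FROM source_table t
) dedup
WHERE rn = 1;
```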

Truncating the table would not matter, and in fact it supports the assertion that there are duplicates with respect to the index: an empty target table still enforces the uniqueness rule on incoming rows, so the error points at there being duplicates within the source batch itself.
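You can see this directly in Postgres; even a freshly truncated table rejects a batch that repeats a key (demo is a throwaway table name for illustration):

```sql
-- A unique index fires even when the table starts out empty.
CREATE TABLE demo (id INT PRIMARY KEY, val TEXT);
TRUNCATE demo;  -- the table and its index are now empty

-- Fails with: ERROR: duplicate key value violates unique constraint "demo_pkey"
-- because the batch itself repeats id = 1.
INSERT INTO demo (id, val) VALUES (1, 'a'), (1, 'b');
```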

Hope this helps,

Chika