Remove Duplicates in DB2 Columnar Format

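I had duplicate data in my OLAP table: the columnar rows were duplicated by event ID because I had loaded the data twice. To tell the copies apart and remove the duplicates, I assigned a row_number() over a partition by event ID. The ORDER BY uses the same column as the partition, so the numbering inside each group is arbitrary, which is fine here because the copies are identical.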

I hope this helps you. First, I wrote the row number into the APP_NM column, overwriting its previous value:
db2 "update (select OME.*, row_number() over(partition by IDN_EVENT_ID order by IDN_EVENT_ID) as rnk from X.OLAP OME) set APP_NM = rnk"

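Then I removed the duplicates. Because I had loaded the data exactly twice, the extra copy in each group is the row marked 2:
db2 "DELETE FROM X.OLAP OME WHERE APP_NM = 2"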

I recommend the two-phase approach because you can, in theory, run each step in batch or asynchronously and double-check the marked rows before deleting, rather than running a single statement and hoping it works.
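
If you want to rehearse the mark-then-delete idea away from the real table, here is a minimal sketch in Python using the standard-library sqlite3 module with an in-memory database. This is SQLite, not DB2: it computes the row numbers with a SELECT and writes them back by rowid instead of updating through a subselect. The table olap, its two columns, and the sample rows are made up for illustration, and the sketch assumes a SQLite build with window functions (3.25 or newer).

```python
import sqlite3

# In-memory stand-in for the real table (hypothetical schema and data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE olap (idn_event_id INTEGER, app_nm TEXT)")
rows = [(1, "a"), (2, "b"), (3, "c")]
conn.executemany("INSERT INTO olap VALUES (?, ?)", rows)
conn.executemany("INSERT INTO olap VALUES (?, ?)", rows)  # simulate the double load

# Phase 1: number the rows within each event ID and store the number in app_nm.
ranked = conn.execute(
    "SELECT rowid, "
    "row_number() OVER (PARTITION BY idn_event_id ORDER BY idn_event_id) "
    "FROM olap"
).fetchall()
conn.executemany(
    "UPDATE olap SET app_nm = ? WHERE rowid = ?",
    [(str(rnk), rid) for rid, rnk in ranked],
)

# Double check: how many rows carry each mark before anything is deleted.
marked = dict(
    conn.execute("SELECT app_nm, COUNT(*) FROM olap GROUP BY app_nm").fetchall()
)
print(marked)

# Phase 2: delete only the rows marked as the second copy.
conn.execute("DELETE FROM olap WHERE app_nm = '2'")
remaining = conn.execute(
    "SELECT idn_event_id FROM olap ORDER BY idn_event_id"
).fetchall()
print(remaining)
```

The count per mark is the check between the two phases: if the number of rows marked 2 is not what you expect, you can stop before the delete.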
