delete duplicate rows in a table

来源:互联网 发布:mac的flash player 编辑:程序博客网 时间:2024/05/21 14:07

Purpose:
========

The purpose of this article is to provide you with an example of how to
delete duplicate rows in a table.

 
Scope & Application:
====================

This article is most useful for Oracle Developers.


How to Delete Duplicate Rows in a Table:
========================================
 
1. Table dept_copy is a copy of the dept table and contains a few duplicate
   values.

   SQL> select * from dept_copy;
 

     DEPTNO DNAME          LOC
   ---------- -------------- -------------
           10 ACCOUNTING     NEW YORK
           20 RESEARCH       DALLAS
           30 SALES          CHICAGO
           40 OPERATIONS     BOSTON
           10 ACCOUNTING     NEW YORK
           20 RESEARCH       DALLAS
           30 SALES          CHICAGO
           40 OPERATIONS     BOSTON
           10 ACCOUNTING     NEW YORK
           20 RESEARCH       DALLAS
           30 SALES          CHICAGO
           40 OPERATIONS     BOSTON
           10 ACCOUNTING     NEW YORK
           20 RESEARCH       DALLAS
           30 SALES          CHICAGO
           40 OPERATIONS     BOSTON
 
   16 rows selected.
   SQL>


2. Drop the duplicate rows using the rowid because rowid is the only
   difference between these rows.
 
   SQL> delete from dept_copy                                            
        where rowid in
        (select rowid from dept_copy
        minus
        select max(rowid) from dept_copy
        group by DEPTNO, DNAME, LOC);

   12 rows deleted.


3. Verify the table is correct by doing the following:

   SQL> select *  from dept_copy
        group by DEPTNO, DNAME, LOC having count(*) > 1;
 
   no rows selected

   Note that this query shows you another method to display duplicate rows.


References:
===========

Oracle 8i Release 2 (8.1.6).
SQL Reference, Volume 1 and Volume 2.
Note 65080.1 Using SQL to Delete Duplicate Rows in a Table.
Note 1015631.6 HOW TO SELECT DUPLICATE ROWS WITHOUT USING ROWID.    
Note 1004425.6 HOW TO FIND OR DELETE DUPLICATE ROWS IN TABLE.

Keywords
========
duplicate
delete
rows
ora-1

 

 

 

 

 

 

 

 

 

 

 

 

There's an Oracle-specific trick to delete the duplicates from a table.

Suppose your table FOO has the following fields: bar, baz, blat, blong, blork

To get rid of the dupes, you want to identify the rows that are identical on -all- the fields in the table. So the query to ID the dupes would be something like

select bar, baz, blat, blong, blork
from FOO
group by bar, baz, blat, blong, blork
having count(*) > 1 ;

So now you have a list of dupes. Fine and dandy. Now there's a neat trick you can use to get rid of all but 1 of each of these rows! Oracle provides a "pseudocolumn," ROWID, which is a representation of the physical location of each row and so is unique for each row in the table. So the following query gets you the rows that you want to keep:

select min(rowid), bar, baz, blat, blong, blork
from FOO
group by bar, baz, blat, blong, blork ;

For rows that AREN'T dupes, this will just return the row with its rowid. For rows that ARE dupes, this will return the row data along with the rowid for ONE copy of that row, no matter how many copies are in the table. Neat eh? So all you have to do is get rid of all the rows that AREN'T in this resultset:

delete from FOO
where rowid not in
  ( select min(rowid)
    from FOO
    group by bar, baz, blat, blong, blork
  )
;

 

 

 

To detect duplicates
 select x.rowid, x.a,x.b ....
 from <tbl> x
 where x.rowid>any (
  select y.rowid
  from <tbl> y
  where x.a = y.a
  )
To delete dupliocates
 delete from <tbl> x
 where x.rowid > any
  (select y.rowid
  from <tbl> y where y.a=x.a

 

Other ways to detect duplicates

 select <key cols>,count(<key cols>)
 from <tbl>
 group by <key cols>
 having count(<key cols>) > 1

 select *
 from <tbl> a, <tbl> b
 where a.key = b.key
 and a.rowid != b.rowid