How To Find Duplicate Records In Oracle Using Rowid

Yesterday a friend came across an Oracle query problem:

By using ROWID, Oracle Optimizer would select the fastest path, which is a direct route to a single row of data. The following code selects the most current record for every duplicate pair of col1 and col2.
Finding duplicate rows using the aggregate function: to find duplicate rows in the fruits table, you first list the fruit name and color columns in both the SELECT and GROUP BY clauses. Then you count how many times each combination appears with the COUNT(*) function.
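A minimal sketch of the aggregate approach, assuming a fruits table whose columns are named fruit_name and color (both names are assumptions; the text only describes them loosely). The HAVING clause keeps only the combinations that appear more than once.

```sql
-- Assumed schema: fruits(fruit_name, color); adjust the names to the real table.
SELECT fruit_name,
       color,
       COUNT(*) AS occurrences   -- number of rows sharing this combination
FROM   fruits
GROUP  BY fruit_name, color
HAVING COUNT(*) > 1;             -- keep only combinations that repeat
```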
Oracle has a pseudocolumn called ROWID, which is unique for each row in a table, so we can use it to delete duplicate records. A ROWID-based delete finds only one ROWID for each non-duplicate record and leaves that record alone. For duplicate records, it deletes the record with the lower ROWID. If there are more than two columns that make up the unique key for a record, then you.
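A ROWID-based delete of this kind might look like the sketch below. The names tablename, col1 and col2 are placeholders, not taken from a real schema. It removes every row that has a twin with a higher ROWID, so the row with the highest ROWID in each group survives.

```sql
-- Placeholder names: tablename, col1, col2.
-- A row is deleted when another row with the same (col1, col2)
-- has a higher ROWID; rows without such a twin are left alone.
DELETE FROM tablename a
WHERE  a.rowid < (SELECT MAX(b.rowid)
                  FROM   tablename b
                  WHERE  b.col1 = a.col1
                  AND    b.col2 = a.col2);
```

Rows with NULL in col1 or col2 never satisfy the equality test, so this sketch leaves them untouched. It is worth checking the reported row count before issuing COMMIT.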

By specifying in a sub-query how many times a value must appear, we can obtain either the non-duplicate or the duplicate records from a table. Duplicates are records that appear two or more times in the table. Queries on this page use the Oracle Northwind database that was converted from the popular Access Northwind database.

Consider the table below:

cid  cname  ....  cdata
1    x            xxxx
1    x            xxxx ..
2    xzzz         fjnd
3    a            evddd

The problem was that cid is repeated, and the cdata column has some extra characters even though the data is considered the same (e.g. xxxx and xxxx ..).
So when he ran
SELECT cid,cname....cdata FROM TABLENAME ....;
he kept getting duplicated records, whereas he wanted only one row per duplicated cid.
He wanted to get this result through the query, not by changing the table structure.

I suggested:
SELECT cid,cname....cdata FROM TABLENAME GROUP BY cid;
As far as I know this works with MySQL, but in Oracle it gave an error, and I learned that every selected column has to be included in the GROUP BY clause.
This is Oracle 🙂
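Oracle reports this kind of query as ORA-00979 (not a GROUP BY expression). One way to keep GROUP BY cid valid is to wrap every other selected column in an aggregate. The sketch below lists only cid, cname and cdata; the columns elided in the original query would need the same treatment.

```sql
-- Each non-grouped column is aggregated so the query is valid in Oracle.
-- MIN is applied per column, so cname and cdata may come from different rows.
SELECT cid,
       MIN(cname) AS cname,
       MIN(cdata) AS cdata
FROM   TABLENAME
GROUP  BY cid;
```

Because the aggregates are evaluated independently, this returns one row per cid but not necessarily one of the original rows, which is why a row-identifier approach is more attractive here.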

Later, back at my place, I remembered something in Oracle called ROWID; maybe that could help!

Then I created a similar table in MySQL to “simulate” Oracle's ROWID:


rowId  cid  cname  ....  cdata
1      1    x            xxxx
2      1    x            xxxx ..
3      2    xzzz         fjnd
4      3    a            evddd

And finally I got the solution as:

select cid,cname….cdata from tablename where rowID in (select min(rowID) from tablename group by cid);


So finally I decided to give it a try on Oracle and it worked 🙂

select * from TABLENAME WHERE rowid in (select min(rowid) from TABLENAME group by DUPLICATECOLUMN);
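The query above keeps one row per value of DUPLICATECOLUMN. To list the extra copies it filters out, the condition can be inverted. This is a sketch using the same placeholder names as the query above.

```sql
-- Rows that are NOT the lowest-ROWID row of their group,
-- i.e. the duplicates that the previous query leaves out.
SELECT *
FROM   TABLENAME
WHERE  rowid NOT IN (SELECT MIN(rowid)
                     FROM   TABLENAME
                     GROUP  BY DUPLICATECOLUMN);
```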


Ranch Hand

posted 14 years ago

Speaking as an Oracle database administrator... DON'T DO THIS
The only time any application should be using Oracle ROWIDs is when performance is so important that you'd cut off your hand and sell your grandmother into slavery to get a 5% performance improvement. Nothing, absolutely nothing, about a web-based application can ever justify using ROWID in this manner.
Instead, create a proper primary key column and use that.
Oracle has lots of features that enable easy maintenance on the database, even while other applications are using it; many of these features absolutely depend on being able to change a ROWID during a table reorganization. If you code your program in this manner using ROWID, then your program will need to be stopped whenever some of these maintenance operations are required.
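As a sketch of that advice, the same "one row per group" query can be written against a key column instead of ROWID. The column name id is hypothetical; it stands for whatever primary key the table is given.

```sql
-- Hypothetical: TABLENAME has a primary key column named id.
-- Unlike ROWID, the key value does not change when the table is reorganized.
SELECT *
FROM   TABLENAME
WHERE  id IN (SELECT MIN(id)
              FROM   TABLENAME
              GROUP  BY DUPLICATECOLUMN);
```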

Ranch Hand

posted 14 years ago

Also, you might want to consider doing yourself a favour and letting JDBC deal with handling date conversion issues for you, and just pull the column from the result set as the appropriate java.sql.Date/Time/Timestamp type you need. You can always use a date formatter to do whatever you want within Java anyway, and the code will be more portable across databases. I wouldn't even be surprised if it was a smidgeon faster (it should take fewer bits to transmit a date over the wire in its native format than to transmit the stringified version).

Greenhorn

posted 14 years ago

hello,
thanks for that enlightening info. now i have dropped the idea of using ROWID. but i dont have any primary key in the table. so i decided to use IDENTITY columns instead. i used the following query in oracle 9i:
create table kbtable (RELATEDTO VARCHAR2(100) NOT NULL,
GENERALSUBTYPE VARCHAR2(200) NOT NULL,
ISSUEOPENDATE DATE NOT NULL,
ISSUESTATUS VARCHAR2(7) NOT NULL,
ISSUE VARCHAR2(1000) NOT NULL,
UPDATEGIVENBY VARCHAR2(100),
ADDEDBY VARCHAR2(2) NOT NULL,
ADDEDONDATE DATE NOT NULL,
LASTMODIFIED DATE,
LASTMODIFIEDBY VARCHAR2(2),
ISSUE_ID INT IDENTITY PRIMARY KEY )
but i am getting an error:
missing or invalid option
Please help
regards
sameer
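
For context: "missing or invalid option" is ORA-00922, and IDENTITY columns were only added in Oracle Database 12c, so Oracle 9i does not accept that keyword. A common substitute on older versions is a sequence plus a BEFORE INSERT trigger. The sketch below is not from the thread; the sequence and trigger names are made up for illustration, and the table is cut down to two of the posted columns.

```sql
-- Reduced version of the posted table; the other columns are omitted for brevity.
CREATE TABLE kbtable (
  ISSUE    VARCHAR2(1000) NOT NULL,
  ISSUE_ID NUMBER PRIMARY KEY
);

-- Hypothetical sequence name; it supplies the key values.
CREATE SEQUENCE kbtable_issue_id_seq START WITH 1 INCREMENT BY 1;

-- Hypothetical trigger name; it fills ISSUE_ID on every insert.
-- SELECT ... INTO from DUAL is used because older releases do not
-- allow assigning seq.NEXTVAL directly in PL/SQL.
CREATE OR REPLACE TRIGGER kbtable_issue_id_trg
BEFORE INSERT ON kbtable
FOR EACH ROW
BEGIN
  SELECT kbtable_issue_id_seq.NEXTVAL INTO :NEW.ISSUE_ID FROM dual;
END;
/
```

With this in place, an insert such as INSERT INTO kbtable (ISSUE) VALUES ('example') would get its ISSUE_ID from the sequence.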