Compare table contents with the ‘minus’ SQL operator

Here’s a quick and dirty method to compare the contents of a couple tables:

See all the rows in hr.employees which do not appear in oe.employees

SQL> select * from hr.employees
2 minus
3 select * from oe.employees;

EMPLOYEE_ID FIRST_NAME LAST_NAME
----------- -------------------- -------------------------
EMAIL PHONE_NUMBER HIRE_DATE JOB_ID SALARY
------------------------- -------------------- --------- ---------- ----------
COMMISSION_PCT MANAGER_ID DEPARTMENT_ID
-------------- ---------- -------------
210 Groucho Marx
gm@lifeaftercoffee.com 26-JUL-07 ST_MAN

And the converse can be done by switching the order of the tables:

See all the rows in oe.employees which do not appear in hr.employees

SQL> select * from oe.employees
2 minus
3 select * from hr.employees;

EMPLOYEE_ID FIRST_NAME LAST_NAME
----------- -------------------- -------------------------
EMAIL PHONE_NUMBER HIRE_DATE JOB_ID SALARY
------------------------- -------------------- --------- ---------- ----------
COMMISSION_PCT MANAGER_ID DEPARTMENT_ID
-------------- ---------- -------------
207 Jon Emmons
jon@lifeaftercoffee.com 26-JUL-07 AD_VP

208 Gwen Emmons
gwen@lifeaftercoffee.com 26-JUL-07 AD_VP

This can be very useful if you need to, say, determine what has changed between two different databases. I have been using this over a database link recently to compare tables in a development instance to the tables in a production instance.

The caveats: The two tables must have the exact same column layout. If they are close but not exact you should still be able to make this work by specifying all the fields which are the same instead of ‘*’.

If you want to exclude a field (like an activity date) you will need to specify all the fields you want to include from both tables. The types and order of the fields must be identical between the two select statements.

I describe this as “quick and dirty” because it is. You should not use a query like this in a production database without first looking at how it will affect your database. This will cause full table scans and could kill performance.

oracle, database, dba, database administration, sql

6 thoughts on “Compare table contents with the ‘minus’ SQL operator”

  1. I describe this as “quick and dirty” because it is. You should not use a query like this in a production database without first looking at how it will affect your database. This will cause full table scans and could kill performance.

    So what other techniques besides a minus can be used to compare data between two tables without impacting performance?

    I’ve been using the minus to write SQL verifying that an ETL process transforms data properly within a data mart environment. I really can’t think of any other procedure that would have less impact on the database and still compare the two tables.

  2. Blake,

    Here are a couple options you could consider to compare table contents instead of the ‘minus’ operator:

    Oracle change tables are probably the best for tracking changes internal to the database. You can see an example of them here.

    This is what most Oracle data warehouse ETL is built with.

    Custom triggers on insert, update and delete can enable similar functionality to change tables (change tables actually work with internal triggers.) You can have the triggers insert your data into another table or just keep some metadata in a separate table to indicate which rows have changed.

    Simply adding an activity date to your tables, populated with a trigger may be enough to track changes and keep databases in sync. If you know, for example, that the database was cloned at midnight on the 1st of the month, then to see what has changed since then you just need to examine all rows with an activity date after that. The activity date can be useful for several other things as well making this a fairly desirable approach.

    Hope this helps. Thanks for the question!

  3. Jon,

    I think your second example here is reversed. The snippet:

    SQL> select * from oe.employees
    2 minus
    3 select * from hr.employees;

    will show all the rows in oe.employees which aren’t in hr.employees.

    Thanks for the great site!

  4. Jon
    I want to put a flag on oe.employees table which indicate that the record should be process.

    how can i do that ?

    Thanks

  5. Oly,

    I’d suggest adding a column to your table for ‘processed’. When a new field is created you can leave the column null to indicate that it should be processed, then once it is processed change the column to ‘Y’ or some other value to indicate that it has been processed. If updates need to be processed in addition to inserts you could add an ‘on update’ trigger to make the value null.

    Hope this helps.

Leave a Reply

Your email address will not be published. Required fields are marked *