An SQL Hint

Date last modified: 19 April, 2013

Back to home page

When to use select distinct

In SQL query exercises on M359 it is never a mistake to include the word DISTINCT between SELECT and the list of column specifications that follow in the SELECT clause!  On the other hand, it is sometimes a mistake to miss out DISTINCT, because our exercises are always phrased such that a result in which the same row appears more than once is a logical mistake.

Of course, if you follow my advice in real life you will sometimes get penalised by poor performance, because inclusion of DISTINCT requires the DBMS to make an extra effort to look for and eliminate redundant duplicates, and that extra effort is incurred even when there are no duplicates.  But in M359 we don't ask for the best solution from a performance point of view—we just want a solution that is guaranteed to give the correct result.

This advice, always to include DISTINCT (and never to write UNION ALL instead of just UNION), doesn't apply to exercises in creating updatable views.  For example:

CREATE VIEW AllButSalary AS
SELECT EmployeeNumber, Name, Address, Phone, HireDate, DateOfBirth, Manager
FROM   Employee;

GRANT UPDATE ON AllButSalary TO Admin;

The purpose here is to allow Admin the change some but not all of an employee's details.  If DISTINCT had been included, the resulting view would be non-updatable under SQL's somewhat arbitrary rules for updatability.

Back to home page