Webucator's Free Advanced Oracle SQL Queries Tutorial

# Lesson: Using Set Operators

Welcome to our free Advanced Oracle SQL Queries tutorial. This tutorial is based on Webucator's Advanced Oracle SQL Queries course.

Set operators are used to combine the result sets returned by two separate queries into a single result set. The set operators for SQL are MINUS, INTERSECT, UNION and UNION ALL.

Lesson Goals

• Learn about the set operators.
• Review set operator examples.

## Set Operators Defined

A set operator appears between two independent queries. It is used to reconcile the results of the two queries into a single result set. These operations are based upon principles in elementry mathematical set theory.

• `MINUS` is used to produce the difference between two result sets.
• `INTERSECT` is used to show the commonalities between two result sets.
• `UNION` combines two results sets, but removes any duplicates.
• `UNION ALL` combines two results sets, but preserves any duplicates.

These relationships can be visualized using Venn Diagrams which represent the logical relations between sets in a visual form. A Venn diagram uses circles to show areas of commonality and exclusivity between sets.

## Relationship to Mathematical Set Theory

The diagrams that follow show the correlation between the Oracle set operator, the mathematical notation and a Venn diagram representing the relationship.

`MINUS` operation is analagous to a difference:

`INTERSECT` named the same as the mathematical operation:

`UNION` is also named the same as the mathematical operation:

`UNION ALL` does not have an analagous operation in elementary set theory where there is no notion of duplicates.

Other mathematical operations can be emulated by combining SQL operations. For instance, creating a symetric difference can be accomplished by the following combination:

### UNION - MINUS - INTERSECT

```(<query 1> UNION <query 2>)

MINUS

(<query 1> INTERSECT <query 2>)```

## Restrictions on Set Operators

There are a number of restrictions related to the use of set operators. The number and datatypes of the columns in each query must be the same, but the column lengths can be different. The names of the columns in the result set are the names used in first query, so aliases specified in the first query apply for the second query. In terms of order of operation, all set operators have the same precedence. Operators with the same precedence are evaluated from left to right. Parenthesis can be used to change the order in which the queries are evaluated.

In many cases, the functionality provided by the set operators is available by writing the query differently. However, the use of set operations can (at times) provide a clearer representation of the intention of the programmer creating the query than any alternative.

Within the HR Table, there are two tables that include a `job_id` field:

1. JOB
2. JOB_HISTORY

The following queries compare the `job_id`s in these two fields using the set operators.

First, we retrieve a count of distinct `job_id`s from the JOB and the JOB_HISTORY tables. This information gives us a baseline for the queries that will use the set operators to reconcile `job_id`s in the two tables.

## Code Sample:

Using-Set-Operators/Demos/count_of_job_ids.sql
```SELECT count(distinct job_id)
FROM jobs;

SELECT count(distinct job_id)
FROM job_history;```

This next example using the `INTERSECT` operator is equivalent to a query that selects distinct rows from the JOB and the JOB_HISTORY tables related by an inner join.

## Code Sample:

Using-Set-Operators/Demos/intersect.sql
```SELECT job_id
FROM jobs
INTERSECT
SELECT job_id
FROM job_history;

--
--  This result is equivalent to using an INNER JOIN and DISTINCT
--
SELECT DISTINCT j.job_id
FROM jobs j
INNER JOIN job_history jh
ON j.job_id = jh.job_id;```

This next example using the `MINUS` operator is equivalent to using an antijoin (`NOT IN` clause) between the two tables.

## Code Sample:

Using-Set-Operators/Demos/minus.sql
```SELECT distinct job_id
FROM jobs
MINUS
SELECT distinct job_id
FROM job_history;

--
-- This query produces the same results as an antijoin
-- between jobs and job_history
--
SELECT job_id
FROM jobs
WHERE job_id NOT IN (
SELECT job_id
FROM job_history
);```

This next example using the `UNION` operator, which does not preserve duplicates, is shown in comparison to an analagous query that uses a full outer join and the distinct operator to remove duplicates.

## Code Sample:

Using-Set-Operators/Demos/union.sql
```SELECT distinct job_id
FROM jobs
UNION
SELECT distinct job_id
FROM job_history;

--
-- This is equivalent to a full outer join with a distinct
-- operation to eliminate duplicates
--
SELECT DISTINCT j.job_id
FROM jobs j
FULL OUTER JOIN job_history jh
ON j.job_id = jh.job_id;```

The `UNION ALL` operator preserves duplicates between both of the tables. Because it preserves all duplicates and does not involve a join, there is no simple similar query using a join to provide the same results:

## Code Sample:

Using-Set-Operators/Demos/union_all.sql
```SELECT distinct job_id
FROM jobs
UNION ALL
SELECT distinct job_id
FROM job_history
ORDER BY job_id;```

## Set Operators Exercise

Duration: 20 to 30 minutes.
1. Write a simple query to retrieve `city` and `state_province` data from the `locations` table.
2. Using a set operator, write a query to determine names that appear in both the `city` and the `state_province` fields of the `locations` table.
3. Write a query that returns the same results without using a set operator.
4. Write a query (using an inline view) that returns a count of how many distinct entries appear in either the `city` and the `state_province` fields.
5. Write a query (using an inline view) that returns a count of how many total entries appear in the `city` and the `state_province` fields combined.

## Solution:

Using-Set-Operators/Solutions/set_operations_solutions.sql
```--
-- Write a simple query to retrieve city and
-- state_province data from the locations table.
--
SELECT city, state_province
FROM locations;

--
-- Using a set operator, write a query to determine
-- names that appear in both the city and the
-- state_province fields of the location table.
--
SELECT city
FROM locations
INTERSECT
SELECT state_province
FROM locations;

--
-- Write a query that returns the same results without
-- using a set operator.
--
SELECT city
FROM locations
WHERE city = state_province;

--
-- Write a query (using an inline view) that returns a
-- count of how many distinct entries appear in either
-- the city and the state_province fields.
--
SELECT count(*)
FROM
(
SELECT city
FROM locations
UNION
SELECT state_province
FROM locations
);

--
-- Write a query (using an inline view) that returns a
-- count of how many <em>total</em>entries appear in the
-- city and the state_province fields combined.
--
SELECT count(*)
FROM
(
SELECT city
FROM locations
UNION ALL
SELECT state_province
FROM locations
);

--
-- (Besides the duplicates identified earlier, there are a number of nulls):
--
SELECT count(*)
FROM
(
SELECT city AS place
FROM locations
UNION ALL
SELECT state_province
FROM locations
)
WHERE place IS NULL;```