Welcome to our free Advanced Oracle SQL Queries tutorial. This tutorial is based on Webucator's Advanced Oracle SQL Queries course.
Oracle SQL queries are not limited to selecting data from a single table. Several tables can be accessed within a query and the data aggregated into a single result set which is returned to the user. The relationship between two tables is known as a join and consists of a mapping of values between specified columns in each table. In this lesson, you will learn about various types of joins and their application to data retrieval.
A join is the means of relating the data in two tables together.
An equijoin has a join condition containing an equality operator. If the joined columns have the same name, the join can be referred to as a natural join. While it is possible to join a table using columns other than primary or foreign keys (a non-key join),, the majority of the time, joins are made between two tables with a foreign key to primary key relationship. A single table can even be joined to itself if the two keys are in that table (a reflexive join).
It is even possible to join tables without using the equality operator (e.g., an anti-join or semi-join) or to join with the results of a query rather than a table (named subqueries).
A number of different types of joins use the equals operator. The most common is the inner join. Oracle has an "old" syntax for joins but also supports ANSI standard SQL joins.
There are actually some additional distinctions for different types of equijoins. Oracle's "old" syntax for the inner join conforms to earlier ANSI join syntax. Oracle's "old" syntax for the outer join is proprietary. It is based on the earlier ANSI inner join syntax. The earlier ANSI inner join syntax is often referred to as a "Theta Join". The newer ANSI join syntax is referred to as a "SQL99 Join".
With an inner join, columns with a matching row in each
joined table are returned. An inner join is the default
type of join, so the
may be omitted in the SQL99 version.
For example, the following query shows how to list country and region information.
-- Earlier ANSI Join Syntax (Theta Join) SELECT * FROM regions r, countries c WHERE r.region_id = c.region_id; -- New (ANSI SQL99) Syntax - the INNER keyword is optional SELECT * FROM regions r INNER JOIN countries c ON r.region_id = c.region_id;
An inner join causes columns that have a matching row in each joined table to be returned.
Traditionally this type of query was specified in Oracle as part of the
WHERE clause using
the earlier ANSI standard (Theta Join) syntax.
While the old syntax is still supported, the newer SQL99 ANSI standard join syntax has been supported for many, many years (since Oracle 9i) and is now widely used. You should stick to this newer syntax in any new code you write, but you don't necessarily have to go back through old code you come upon changing it to the newer syntax.
The code below illustrates how some data can be "lost" when using inner joins.
-- -- This query returns a count of 107 records in the employees table -- SELECT count(*) FROM employees; -- --When we make an inner join to departments, we only get 106! -- SELECT count(*) FROM employees e INNER JOIN departments d ON e.department_id = d.department_id; -- -- What happened to the one record? -- As it turns out, one record has a null department_id. -- SELECT * FROM employees WHERE department_id IS NULL;
Start by checking how many records are in the
Next, create an inner join with the department table and check
the count again. There appears to be a record missing.
The missing record has a
null value in the
department_id field, and
therefore no corresponding row in the
By using a left outer join, this record is returned as well.
In order to restrict the number of rows returned in the following queries, filtering predicate expressions (criteria)
are added to each
WHERE clause. However, keep in mind that an outer query
returns all of the records returned by an inner query PLUS certain additional
records from one or both of the tables.
-- -- The employee can be retrieved when the department_id is ignored... -- SELECT first_name, last_name FROM employees e WHERE e.employee_id = 178; -- -- ...but the employee is not returned when a join is made to the -- department table because there is no matching record in the -- department table -- SELECT first_name, last_name FROM employees e INNER JOIN departments d ON d.department_id = e.department_id WHERE e.employee_id = 178; -- -- By using a LEFT OUTER JOIN we include all records that are included -- in an inner join plus all records in the employees table. -- SELECT first_name, last_name FROM employees e LEFT OUTER JOIN departments d ON d.department_id = e.department_id WHERE e.employee_id = 178; -- -- In order to demonstrate the right and full outer joins, it is necessary -- to add a record to the departments table that does not have any -- employee records associated with it. -- INSERT INTO departments ( department_id, department_name, manager_id, location_id ) VALUES ( 999,'Vacant Dept', NULL, NULL ); -- -- This query returns no records -- SELECT first_name, last_name, department_name FROM employees e INNER JOIN departments d ON d.department_id = e.department_id WHERE e.employee_id = 178 OR d.department_id = 999; -- -- Using a LEFT OUTER JOIN records are returned from the employee -- table with no associated record in the department table -- SELECT first_name, last_name, department_name FROM employees e LEFT OUTER JOIN departments d ON d.department_id = e.department_id WHERE e.employee_id = 178 OR d.department_id = 999; -- -- Using a RIGHT OUTER JOIN records are returned from the deparment -- table with no associated record in the employees table -- SELECT first_name, last_name, department_name FROM employees e RIGHT OUTER JOIN departments d ON d.department_id = e.department_id WHERE e.employee_id = 178 OR d.department_id = 999; -- -- This query returns records in the deparment table with no associated -- record in the employees table as well as records in the employees -- table with no record in the departments table -- SELECT first_name, last_name, department_name FROM employees e FULL OUTER JOIN departments d ON d.department_id = e.department_id WHERE e.employee_id = 178 OR d.department_id = 999; -- -- Using Oracle proprietary syntax, a left outer join can also be specified using (+) to indicate -- the table column where a null is acceptable -- Note, however, that this syntax is now discouraged in favor of the SQL99 syntax. SELECT first_name, last_name, department_name FROM employees e , departments d WHERE (e.employee_id = 178 OR d.department_id = 999) AND d.department_id = e.department_id(+)
A generally accepted "Best Practice" is to avoid nulls in the database. However, real world situations result in occasions where parent records have no associated child records. An employee may not have an assigned department (if they have retired), and a department may not have an associated employee (if the department is no longer in use but still has relevant data associated with it).
These situations require the use of outer joins as illustrated in these examples.
A cross join (or Cartesian Product) will return a result set where each row from the first table is combined with each row from the second table. This is not often required in practice and is used intentionally rather sparingly.
-- Older Syntax SELECT first_name, last_name, department_name FROM employees, departments ORDER BY first_name, last_name, department_name; -- Newer Syntax SELECT first_name, last_name, department_name FROM employees CROSS JOIN departments ORDER BY first_name, last_name, department_name;
A reflexive join (or self-join), is a relationship in which a table is joined to itself.
The employees table has both employees and employee managers in it.
A self join can be made using this table and using the
manager_id and the
employee_id in the join.
SELECT first_name, last_name FROM employees WHERE employee_id=103; SELECT first_name, last_name FROM employees WHERE manager_id=103; -- --If we want to return information about the manager and his employees --in the same query, we must use a self-join or reflexive join -- SELECT e.first_name, e.last_name, m.first_name AS manager_first_name, m.last_name AS manager_last_name FROM employees e INNER JOIN employees m ON e.manager_id = m.employee_id;
The first query is a simple query to return the manager with employee id 103's first name and last name. The next query shows the names of the people that are managed by this manager. The third query is a reflexive query in which the employee table is joined to itself in order to return each employee's name and their manager name in the same row.
A special type of self join is available in Oracle called a hierarchical query. This type of query is useful for representing recursive relationships which might exist within the data.
SELECT employee_id, manager_id FROM employees START WITH manager_id is null CONNECT BY PRIOR employee_id = manager_id; SELECT employee_id, manager_id, sys_connect_by_path(last_name,'->') AS path FROM employees START WITH manager_id is null CONNECT BY PRIOR employee_id = manager_id;
The first query produces hierarchical query
results by using the
CONNECT BY PRIOR clause. It
orders the results of the employee table so that
employees are nested under their assigned managers.
The next query uses the
to provide a representation of the hierarchy using
Joins are usually specified on primary key and foreign key columns. However, SQL is not limited to joining on such columns. Any columns can be joined as long as the datatypes match. In fact, even a function can be used as part of join.
Just for fun let's see if the first two characters of an employee's first name happen to match any country code; uppercase those two characters since country codes are uppercased.
SELECT country_id, country_name, first_name FROM employees e INNER JOIN countries c ON c.country_id = upper(substr(e.first_name,1,2));
This is a somewhat contrived example since the HR schema is well designed and relatively simple. It involves the use of functions in the joined column. Be aware, there is a potential for performance problems if appropriate indexes have not been defined. Oracle has a feature called function based indexes which can be helpful when using deterministic functions in this manner under some circumstances.
A natural join involves tables joined by columns
with the same name. The
NATURAL JOIN clause implicitly
creates a join on columns with the same column name
between the joined tables. The returned results
(in the case of a
SELECT *) contain only one column for
each pair of equally-named columns.
--Query using INNER JOIN syntax SELECT * FROM regions r INNER JOIN countries c ON c.region_id = r.region_id; --Same rows return using NATURAL JOIN syntax (***NOT RECOMENDED***) SELECT * FROM regions NATURAL JOIN countries;
Although it is useful to be aware of
NATURAL JOIN syntax, it
is best to avoid its use in practice. At best, the columns
being used to make the join are not explicitly identified,
and this ambiguity can lead to confusion when attempting
to debug a query. If a column is renamed, a query referencing
the column in a
NATURAL JOIN will remain valid, however the
results of the query will change.
A semi-join returns rows from a table where matches are
found in the joined table. It uses the
An anti-join returns rows from a table for which there are
no corresponding rows in the joined table. It uses the
NOT EXISTS keywords.
For example, the following query lists all of the job ids for all of the jobs that have ever been assigned to an employee.
SELECT * FROM jobs WHERE job_id IN ( SELECT job_id FROM job_history);
This example is essentially the same as an inner query. However,
it provides a different type of flexibility unavailable to an
inner join by adding criteria to the
WHERE clause of the
And here's the inverse: a query that lists all of the job ids for all of the jobs that have never been assigned to an employee.
SELECT * FROM jobs WHERE job_id NOT IN ( SELECT job_id FROM job_history );
And here is how you would write the same queries using
SELECT * FROM jobs j WHERE EXISTS ( SELECT job_id FROM job_history h WHERE j.job_id = h.job_id); SELECT * FROM jobs j WHERE NOT EXISTS ( SELECT job_id FROM job_history h WHERE j.job_id = h.job_id);
A named subquery is also known as inline view. It can be joined in the same manner as a regular table.
-- -- The following query returns a list of employee managers as well as -- the department id to which each is assigned. Note that the employee -- managers utilize the self relationship in the EMPLOYEES table. Also -- note that an employee manager is different than a department manager. -- SELECT first_name, last_name, department_id FROM employees WHERE manager_id IS NULL OR employee_id IN (SELECT manager_id FROM employees); -- -- This query can be used as a named subquery with an alias of managers. -- It can then be joined to the departments table to see a list of each -- employee manager and the name of the department to which each is assigned. -- SELECT managers.first_name, managers.last_name, d.department_name FROM ( SELECT first_name, last_name, department_id FROM employees WHERE manager_id IS NULL OR employee_id IN ( SELECT manager_id FROM employees ) ) managers INNER JOIN departments d ON d.department_id = managers.department_id;
This example starts with a query that makes sense for use as an inline view. It is a listing of all employee managers. This query is then used in the second query in the place of a table and is joined with the departments table to provide the final set of results.
eto the table.
department_idcolumns (thus utilizing the department assignment relationship). Note that this is a natural join (but does not use the ambiguous
-- -- All employee records -- SELECT * FROM employees e; -- -- An inner join -- SELECT e.* FROM employees e INNER JOIN departments d ON e.department_id = d.department_id; -- -- The same results from a semijoin -- SELECT e.* FROM employees e WHERE e.department_id IN ( SELECT d.department_id FROM departments d );